If BOLT can work directly on binaries, is there anything stopping it from being integrated as a kernel module into the OS, so that binaries are continually being profiled and optimized?
It seems to me that an optimized OS image could also be created.
That doesn't seem like it would be profitable. Profiling the running process, processing the profile, and then relinking the binary on a single host wouldn't pay off compared to profiling across a large fleet, relinking the program once, and redeploying it at scale. Peak optimization is expensive.
My phone runs approximately zero new binaries every day. I'd happily let my phone optimize itself while charging so that Netflix, YouTube, or other heavy CPU users use less battery.