"TurboQuant proved it can quantize the key-value cache to just 3 bits without requiring training or fine-tuning and causing any compromise in model accuracy" -- what do each 3 bits correspond to? Hardly individual keys or values, since it would limit each of them to 8 different vectors.
With a 0$ Bluetooth-to-3.5mm jack dongle from AliExpress one can have the best of both worlds, or at least keep using wired cans with a phone whose charging port has stopped working.
No, but Apple has been putting its weight behind services. Some of these services are platform-agnostic, but they do work best on a Mac. Apple's success story is the efficiency of the closed ecosystem, something that Android and Windows are converging on.
However, these service revenues (App Store tax, iCloud storage, even the deal with Google) are still anchored to users' loyalty to the platform. So Apple needs users to (1) stay on the platform and (2) buy new devices. Making the user experience painful everywhere except on the newest devices works against (1) and for (2).
That said, Microsoft's trade-offs regarding the quality of its software are rather different, and its solution is even weirder: high-quality user-facing software is not in competition with its B2B sales, so, OK, there is no reason to spend too many resources on it, but there is also no evident reason to make it noticeably worse.
As a result, Microsoft's approach of regressive evolution probably lets Apple get away with barely caring, or even with going down a path of slower regressive evolution itself.
Not really anymore. Their silicon is impressive, but I would guess most users don't use it in any meaningful sense. If hardware is your main goal as a customer, you're building a machine with better hardware.