DS4 Pro/Flash were post-trained with QAT, so they are already quantized to FP4 for the most part. That's why the downloaded weights are much smaller than they would be at FP8 or FP16. For example, Flash is a 284B model, but its weights are only ~160GB. Of course, maybe DeepInfra went even further, but there is no proof of that.
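To make the size argument concrete, here is a rough sketch of the arithmetic. It assumes size is simply parameters times bits per parameter, and that anything not stored at 4 bits is stored at 8 bits. The function names are made up for illustration, and scale factors and metadata are ignored.

```python
# Back-of-the-envelope checkpoint size: params * bits / 8 bytes.
# Ignores quantization scale factors, any tensors kept at higher precision,
# and file metadata, so real checkpoints run somewhat larger.

def weight_size_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


def fp4_fraction(params_billion: float, observed_gb: float) -> float:
    """Share of parameters stored at 4 bits, assuming the rest are at 8 bits."""
    fp4_gb = weight_size_gb(params_billion, 4)
    fp8_gb = weight_size_gb(params_billion, 8)
    return (fp8_gb - observed_gb) / (fp8_gb - fp4_gb)


if __name__ == "__main__":
    for name, bits in [("FP4", 4), ("FP8", 8), ("FP16", 16)]:
        print(f"{name}: {weight_size_gb(284, bits):.0f} GB")
    print(f"implied FP4 share at ~160 GB: {fp4_fraction(284, 160):.0%}")
```

For a 284B model this gives roughly 142 GB at FP4, 284 GB at FP8, and 568 GB at FP16, so ~160GB lines up with a checkpoint that is mostly FP4 with a smaller FP8 portion.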
I presume the providers are the ones giving the info to OpenRouter? I mean, technically it is a mix of fp8 and fp4 (although it is predominantly fp4), so I don't think either label is inaccurate.
Pricing for DeepSeek V4 Flash is $0.14 in/$0.28 out across basically every provider, or close to it. It seems most providers just follow the model creator and set their prices to match. V4 Pro was set at $1.74 in/$3.48 out when DeepSeek first announced it, and all providers priced it at about that level. Now DeepSeek has dropped its own pricing to $0.435 in/$0.87 out. I don't know if this is special pricing, or them following through on the promise to drop the price once they got more Huawei cards online. It seems that providers like ParaSail, Together, and Novita just set the price when the model comes out and don't compete.
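A quick sketch of what those numbers mean per request. This assumes the quoted prices are USD per 1M tokens, which the thread doesn't actually state; the model keys and function names are made up for illustration.

```python
# Prices quoted in the thread, as (input, output).
# ASSUMPTION: units are USD per 1,000,000 tokens.
PRICES = {
    "v4-flash": (0.14, 0.28),
    "v4-pro-announced": (1.74, 3.48),
    "v4-pro-current": (0.435, 0.87),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the quoted per-million-token prices."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def price_ratio(new_model: str, old_model: str) -> tuple[float, float]:
    """Ratio of new to old price, for input and output separately."""
    new_in, new_out = PRICES[new_model]
    old_in, old_out = PRICES[old_model]
    return new_in / old_in, new_out / old_out


if __name__ == "__main__":
    # Hypothetical request: 10k tokens in, 2k tokens out.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 10_000, 2_000):.5f}")
    print("current / announced:", price_ratio("v4-pro-current", "v4-pro-announced"))
```

One thing the ratio makes obvious: the current V4 Pro price is a quarter of the announced price on both input and output.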
No one has yet turned a profit from LLMs. I don't understand why we need to look so intently at everybody's pricing when the most important number is their losses. That is the number that tells us what they're really doing.
Why would these 3rd-party providers be taking losses? Together, Novita, etc. are not losing money on inference services; they are profiting. You can easily do napkin math with current and last-gen Nvidia cards to calculate the cost to host and serve these models. I would also doubt that any 1st-party providers like OpenAI and Anthropic lose money on per-token billing. There is almost undoubtedly a healthy margin being made on that.
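The napkin math can be sketched roughly like this. Everything here is an assumption: the function names are made up, and the GPU rental price, GPU count, and throughput in the example are placeholders, not measurements of any real provider or card. It also ignores idle capacity, networking, staff, and everything else that isn't the GPU bill.

```python
def cost_per_million_tokens(gpu_hourly_usd: float, num_gpus: int,
                            tokens_per_second: float) -> float:
    """Serving cost in USD per 1M tokens for a node running at full load.

    tokens_per_second is the aggregate throughput of the whole node
    across all concurrent requests, not the speed of a single stream.
    """
    hourly_cost = gpu_hourly_usd * num_gpus
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost / tokens_per_hour * 1_000_000


def gross_margin(price_per_million: float, cost_per_million: float) -> float:
    """Fraction of the price left over after the GPU cost."""
    return (price_per_million - cost_per_million) / price_per_million


if __name__ == "__main__":
    # PLACEHOLDER inputs: swap in real rental prices and measured throughput.
    cost = cost_per_million_tokens(gpu_hourly_usd=2.0, num_gpus=8,
                                   tokens_per_second=50_000)
    print(f"cost per 1M tokens: ${cost:.4f}")
    print(f"margin at a $0.28 price: {gross_margin(0.28, cost):.0%}")
```

The result swings almost entirely on the throughput figure, so that is the number worth measuring before arguing about margins either way.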
HDR playback in Chrome on KDE works as expected from what I can tell. On GNOME 49.2 it does not: it currently doesn't reach the luminance that it should. 49.3 may fix this.
Nvidium uses GL_NV_mesh_shader, which is only available on NVIDIA cards. This mod is the only game/mod I know of that uses mesh shaders and is OpenGL. So the new GL extension will let users of other vendors' cards use the mod, if it gets updated to use the new extension.
No, having the bare minimum "HDR support" does not mean it works fine. I have a 27-inch 4K 144Hz monitor with P3 wide color gamut and HDR600. This monitor is connected to 2 PCs, one running Arch Linux with GNOME as the DE and one with Windows 11.
Since Windows 11 24H2, with the new color management feature turned on, I can get correct colors on the monitor in both SDR and HDR modes. So it ends up with HDR on at all times, and mpv can play HDR videos with no color or brightness issues.
GNOME, on the other hand, is stuck with sRGB output in SDR mode, so you get oversaturated colors. With HDR on, SDR content will no longer be oversaturated, but if you play HDR videos with mpv, the image looks darkened and wrong. I've tried setting target-peak and target-contrast to match the auto-detected values on Windows, but the video still looks off.
Their GUI is closed-source. If someone wants an easy-to-use, easy-to-set-up app, they may as well use LMStudio, which doesn't try to pretend to be OSS. Or use ramalama, which basically just containerizes LLMs and the relevant bits, pretty damn similar to ollama. Or just go back to "basics" and use llama.cpp or vllm.