
wirybeige · 2026-05-28T20:39:18 1780000758

DS4 Pro/Flash were post-trained with QAT, so they are already quantized to FP4 for the most part. That's why the downloaded weights are much smaller than they would be at FP8 or FP16. For example, Flash is a 284B model, but its download is only ~160GB. Of course, maybe DeepInfra went even further, but there is no proof of that.
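A rough sanity check of that size claim, as a sketch only: it uses the 284B and ~160GB figures quoted above, counts decimal gigabytes, and ignores quantization scales, metadata, and anything else stored alongside the weights. The helper name `size_gb` is made up for illustration.

```python
# Back-of-the-envelope weight sizes for a 284B-parameter model.
# Assumptions: decimal GB, no scale factors or metadata overhead.
PARAMS = 284e9        # parameter count quoted in the comment
DOWNLOAD_GB = 160     # approximate download size quoted in the comment

def size_gb(params: float, bits_per_weight: float) -> float:
    """Storage needed for the weights alone, in decimal GB."""
    return params * bits_per_weight / 8 / 1e9

sizes = {name: size_gb(PARAMS, bits)
         for name, bits in [("fp16", 16), ("fp8", 8), ("fp4", 4)]}

# Average bits per weight implied by the quoted download size.
avg_bits = DOWNLOAD_GB * 1e9 * 8 / PARAMS

for name, gb in sizes.items():
    print(f"{name}: {gb:.0f} GB")
print(f"implied average: {avg_bits:.2f} bits/weight")
```

Under those assumptions, pure FP4 would be 142 GB and pure FP8 would be 284 GB, so ~160GB works out to roughly 4.5 bits per weight, which is much closer to FP4 than to FP8.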

pants2 · 2026-05-29T04:02:13 1780027333

Interesting then that OpenRouter[1] tags many providers as FP8 and DeepInfra as FP4.

1. https://openrouter.ai/deepseek/deepseek-v4-pro

wirybeige · 2026-05-29T16:49:47 1780073387

I presume the providers are the ones giving the info to OpenRouter? I mean, technically it is a mix of FP8 and FP4 (although it is predominantly FP4), so I don't think either tag is inaccurate.
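To put a rough number on "predominantly", here is a sketch that assumes every weight is stored as either exactly 4 or exactly 8 bits, with no scale factors or other overhead, and reuses the 284B / ~160GB figures from earlier in the thread. The real layout may differ, so treat the result as an estimate, not a measurement.

```python
# Estimate the share of weights stored at 4 bits, assuming a pure
# two-format mix (4-bit and 8-bit) and no storage overhead.
PARAMS = 284e9        # parameter count quoted earlier in the thread
DOWNLOAD_GB = 160     # approximate download size quoted earlier

total_bits = DOWNLOAD_GB * 1e9 * 8
avg_bits = total_bits / PARAMS

# Solve 4*f + 8*(1 - f) = avg_bits for f, the 4-bit fraction.
fp4_fraction = (8 - avg_bits) / 4

print(f"estimated 4-bit share: {fp4_fraction:.0%}")
```

With those assumptions the 4-bit share comes out a little under 90%, which fits a "mostly FP4, some FP8" description.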

wirybeige · 2026-05-18T15:46:58 1779119218

Personal anecdote --- Proton Pass very quickly went from worse than Bitwarden to better with more reliable auto-fill.

wirybeige · 2026-05-02T15:05:58 1777734358

These were trained on NVIDIA GPUs. Inference is running on Huawei.

wirybeige · 2026-05-01T15:15:08 1777648508

Pricing for DeepSeek V4 Flash is $0.14 in/$0.28 out across basically every provider, or close to it. It seems most providers just follow the model creator and set their prices to match. V4 Pro was set at $1.74 in/$3.48 out when DeepSeek first announced it; all providers set their prices at about that level, & now DeepSeek has set its own pricing to $0.435 in/$0.87 out. I don't know if this is special pricing, or the price drop they promised for when they get more Huawei cards online. It seems that providers like ParaSail, Together, and Novita just set the price when the model comes out and don't compete.
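For scale, a small sketch comparing those prices. It assumes the figures are USD per 1M tokens (the comment does not state the unit), and the workload of 2M input / 500k output tokens is a made-up example, as are the dictionary keys and the `cost_usd` helper.

```python
# Compare the quoted prices. Assumption: USD per 1M tokens.
PRICES = {
    "v4-flash": (0.14, 0.28),          # (input, output)
    "v4-pro-launch": (1.74, 3.48),
    "v4-pro-current": (0.435, 0.87),
}

def cost_usd(price_in: float, price_out: float,
             tokens_in: int, tokens_out: int) -> float:
    """Bill for a workload at the given per-1M-token prices."""
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000

# Fractional price cut on V4 Pro input tokens since launch.
pro_cut = 1 - PRICES["v4-pro-current"][0] / PRICES["v4-pro-launch"][0]

# Hypothetical workload: 2M input tokens, 500k output tokens.
bills = {name: cost_usd(p_in, p_out, 2_000_000, 500_000)
         for name, (p_in, p_out) in PRICES.items()}

print(f"V4 Pro input price cut: {pro_cut:.0%}")
for name, usd in bills.items():
    print(f"{name}: ${usd:.2f}")
```

The new V4 Pro price is a quarter of the launch price on both input and output, so providers still charging the launch price are about 4x DeepSeek's own rate.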

philistine · 2026-05-01T15:26:50 1777649210

No one has yet turned a profit from LLMs. I don't understand why we need to look so intently at everybody's pricing, when the most important number is their losses. That is the number that tells us what they're really doing.

wirybeige · 2026-05-01T15:56:21 1777650981

Why would these 3rd-party providers be taking losses? Together, Novita, etc. are not losing money on inference services; they are profiting. You can easily do napkin math with current & last-gen NVIDIA cards to calculate the cost to host/serve these models. I would also doubt that any 1st-party providers like OpenAI and Anthropic lose money on per-token billing. There is almost undoubtedly a healthy margin being made on that.

andriy_koval · 2026-05-01T17:58:34 1777658314

> Why would these 3rd-party providers be taking losses?

We are in the market-capture phase. Domestically hosted Chinese LLMs are a decent market to capture.

nickthegreek · 2026-05-01T15:43:33 1777650213

OpenRouter isn't turning a profit?

wirybeige · 2026-02-28T22:32:46 1772317966

The Vulkan backend for llama.cpp isn't that far behind ROCm for pp and tp speeds.

wirybeige · 2026-02-08T20:25:26 1770582326

Sodium batteries don't yet have the scale that LiFePO4 batteries have. I'd expect we will see them get cheaper.

wirybeige · 2026-01-02T02:23:29 1767320609

HDR playback in Chrome on KDE works as expected from what I can tell. On GNOME 49.2 it does not: it doesn't reach the luminance that it should at this time. 49.3 may fix this.

wirybeige · 2025-10-10T14:16:43 1760105803

The post links to this: https://github.com/MCRcortex/nvidium

nvidium uses GL_NV_mesh_shader, which is only available on NVIDIA cards. This mod is the only game/mod I know of that uses mesh shaders & is OpenGL. So the new GL extension will let users of other vendors' cards use the mod, if it gets updated to use the new extension.

wirybeige · 2025-10-07T13:19:48 1759843188

GNOME has both the color management and color representation protocols implemented. HDR works fine on it.

database64128 · 2025-10-07T14:10:30 1759846230

No, having the bare minimum "HDR support" does not mean it works fine. I have a 27-inch 4K 144Hz monitor with P3 wide color gamut and HDR600. This monitor is connected to 2 PCs, one running Arch Linux with GNOME as the DE and one with Windows 11.

Since Windows 11 24H2, with the new color management feature turned on, I can get correct colors on the monitor in both SDR and HDR modes. So it ends up with HDR on at all times, and mpv can play HDR videos with no color or brightness issues.

GNOME, on the other hand, is stuck with sRGB output in SDR mode, so you get oversaturated colors. With HDR on, SDR content will no longer be oversaturated, but if you play HDR videos with mpv, the image looks darkened and wrong. I've tried setting target-peak and target-contrast to match the auto-detected values on Windows, but the video still looks off.

wirybeige · 2025-10-07T14:23:14 1759846994

Sorry it doesn't work for you. I don't have that issue. GNOME looks proper in HDR mode for both HDR and SDR content for me.

wirybeige · 2025-09-25T22:59:30 1758841170

Their GUI is closed-source. If someone wants an app that is easy to use & easy to set up, they may as well use LMStudio, which doesn't try to pretend to be OSS. Or use ramalama, which is basically just containerizing LLMs and the relevant bits, pretty damn similar to ollama. Or just go back to "basics" and use llama.cpp or vLLM.