Hacker News

durhamg · 2026-02-15T04:14:03 1771128843

This sounds exactly like Claude wrote it. I've noticed Claude saying "genuinely" a lot lately, and the "real killer feature" segue just feels like Claude being asked to review something.

Terretta · 2026-02-15T14:10:33 1771164633

> The fact that you're getting 15-30 tok/s for text gen on phone hardware is wild — that's basically usable for real conversations.

Wild how bad it is compared to, say, Russet for iOS/iPadOS, which runs these same models at 110 tok/s.

ali_chherawalla · 2026-02-15T04:12:36 1771128756

I've added a section for recommended models, so basically you can choose from there.

I'd recommend going for any quantized 1B-parameter model. You can look at Llama 3.2 1B, Gemma 3 1B, or Qwen3 VL 2B (if you'd like vision).

Appreciate the kind words!

add-sub-mul-div · 2026-02-15T04:11:35 1771128695

> that's basically usable for real conversations.

That's using the word "real" very loosely.