Windows 95 was a vast improvement in looks over 3.x. Tastes differ, of course, but I found it very aesthetic, not ugly at all, and used the classic look until Windows 7 was EOL'd.
I asked it to recite "potato" 100 times because I wanted to benchmark CPU vs GPU speed. It's on line 150 of planning. It has already recited the requested thing 4 times and started drafting the 5th response.
Qwen3.5 pretty much requires a long system prompt; otherwise it goes into a weird planning mode where it reasons for minutes about what to do and double- and triple-checks everything it does. Both Gemini's and Claude Opus 4.6's prompts work pretty well, but they're so long that whatever you're using to run the model has to support prompt caching. Asking it to "Say the word "potato" 100 times, once per line, numbered.", for example, results in the following reasoning, followed by the word "potato" on 100 numbered lines, using the smallest (and therefore dumbest) quant, unsloth/Qwen3.5-35B-A3B-GGUF:UD-IQ2_XXS:
"User is asking me to repeat the word "potato" 100 times, numbered. This is a simple request - I can comply with this request. Let me create a response that includes the word "potato" 100 times, numbered from 1 to 100.
I'll need to be careful about formatting - the user wants it numbered and once per line. I should use minimal formatting as per my instructions."
Good to know, thanks. I just ran ollama with qwen3.5:27b. Currently it's stuck on picking a format:
Let's write.
Wait, I'll write the response.
Wait, I'll check if I should use a table.
No, text is fine.
Okay.
Let's write.
Wait, I'll write the response.
Wait, I'll check if I should use a bullet list.
No, just lines.
Okay.
Let's write.
Wait, I'll write the response.
Wait, I'll check if I should use a numbered list.
No, lines are fine.
Okay.
Let's write.
Wait, I'll write the response.
Wait, I'll check if I should use a code block.
Yes.
Okay.
Let's write.
Wait, I'll write the response.
Wait, I'll check if I should use a pre block.
Code block is better.
Yeah, it tends to get stuck in loops like that a lot with everything set to defaults. I wonder if they distilled Gemini at some point; I've seen it get stuck in a similar "I will now do [thing]. I am preparing to do [thing]. I will do it." failure mode a couple of times as well.
I don't quite get the low temperature coupled with the high penalty. We get thinking loops due to low temperature, and then counter them with a high penalty. That seems backward.
For Qwen3.5 27B, I got good results with --temp 1.0 --top-p 1.0 --top-k 40 --min-p 0.2 and no penalty. It lets the model explore (temp, top-p, top-k) without going off the rails (min-p) during reasoning. No loops so far.
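For intuition on why min-p keeps exploration from going off the rails even at temp 1.0: it filters tokens relative to the best candidate rather than against an absolute threshold. A minimal sketch (`min_p_filter` is a hypothetical helper, simplified from what real samplers like llama.cpp actually do):

```python
def min_p_filter(probs, min_p=0.2):
    """Keep only tokens whose probability is at least min_p times the
    top token's probability, then renormalize the survivors."""
    cutoff = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= cutoff}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

# High temperature flattens the distribution, but min-p still prunes
# the long tail *relative to the best candidate*.
probs = {"potato": 0.5, "tomato": 0.3, "Wait": 0.15, "garbage": 0.05}
filtered = min_p_filter(probs, min_p=0.2)  # cutoff = 0.2 * 0.5 = 0.1
```

Because the cutoff scales with the top token's probability, the filter stays permissive when the model is genuinely uncertain and strict when one continuation clearly dominates.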
The guidelines are a little hard to interpret. At https://huggingface.co/Qwen/Qwen3.5-27B, Qwen says to use temp 0.6, pres 0.0, rep 1.0 for "thinking mode for precise coding tasks," and temp 1.0, pres 1.5, rep 1.0 for "thinking mode for general tasks." Those parameters swing wildly between the two, and I don't know whether printing potato 100 times counts as more of a "precise coding task" or a "general task."
When setting up the batch file for some previous tests, I decided to split the difference between 0.6 and 1.0 for temperature and use the larger recommended values for presence and repetition. For this prompt, it probably isn't a good idea to discourage repetition, I guess. But keeping the existing parameters worked well enough, so I didn't mess with them.
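For anyone unsure what those presence and repetition knobs actually do to the next-token distribution, here's a rough sketch. This is simplified for illustration (`apply_penalties` is a hypothetical helper; real samplers differ in details), combining the CTRL-style repetition penalty with a flat presence penalty:

```python
def apply_penalties(logits, generated,
                    presence_penalty=0.0, repetition_penalty=1.0):
    """Sketch of two common anti-repetition mechanisms applied to
    raw logits before sampling. Simplified for illustration."""
    out = dict(logits)
    for tok in set(generated):
        if tok not in out:
            continue
        # CTRL-style repetition penalty: shrink positive logits of
        # already-seen tokens, push negative ones further down.
        if out[tok] > 0:
            out[tok] /= repetition_penalty
        else:
            out[tok] *= repetition_penalty
        # Presence penalty: flat subtraction for any token that has
        # appeared at least once, regardless of how often.
        out[tok] -= presence_penalty
    return out

# With the "general tasks" numbers above (pres 1.5, rep 1.0), a token
# the model keeps emitting ("Wait") gets knocked down hard.
penalized = apply_penalties({"Wait": 2.0, "potato": 1.0},
                            generated=["Wait", "Wait"],
                            presence_penalty=1.5,
                            repetition_penalty=1.0)
```

Which is why a presence penalty is an odd fit for a prompt whose correct answer is the same word 100 times: it punishes exactly the repetition the user asked for.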
well hold on now, maybe it’s onto something. do you really know what it means to “recite” “potato” “100” “times”? each of those words could be pulled apart into a dissertation-level thesis and analysis of language, history, and communication.
either that, or it has a delusional level of instruction following. doesn’t mean it can’t code like sonnet though
It's still amusing to see how seemingly simple things can put it into a loop
it is still going
> do you really know what it means to “recite” “potato” “100” “times”?
Asking the user a question is an option. Sonnet did that a bunch when I was trying to debug a network issue. It also forgot facts that had been checked and told to it before...
I wonder how much certain models have been trained to avoid asking too many questions. I’ve had coworkers who’ll complete an entire project before asking a single additional question to management, and it has never gone well for them. Unsurprising that the same would be true for the “managing AI” era of programming.
The thing I struggle most with, honestly, is when AI (usually GPT5.3-Codex) asks me a question and I genuinely don’t know the answer. I’m just like “well, uh… follow industry best practice, please? unless best practice is dumb, I guess. do a good. please do a good.” And then I get to find out what the answer should’ve been the hard way.
Still, I would probably abandon the name for trademark-enforcement reasons. It's low-hanging fruit for them if they want to kill you.
(this is also why the Pentium was called the Pentium instead of a number like earlier processors, since bare numbers couldn't be trademarked.. and why the Nintendo logo was embedded into Game Boy cartridge ROMs)
I don't use it a lot, but when I do it's pretty much 2 patterns:
* "search on steroids" - get me to the thing I need, or tell me whether the thing I need exists; give me a few examples and I can get it running.
* getting the trivial and uninteresting parts out of the way, like writing some helper function for stuff I'm doing now. I'll just call the AI, let it do its thing, and continue writing code in the meantime, then look back, check if it makes sense, and use it.
So I'm not really cheating myself out of the learning process, just outsourcing the parts I know well enough to check for correctness while saving time writing them.
> Can you create incredibly useful code without that knowledge today?
You could do that without that knowledge back in the day too; we've had languages higher-level than assembler forever.
It's just that the range of knowledge needed to maximize machine usage is far smaller now. Before, you had to know how to write a ton of optimizations yourself; nowadays you have to know how to write your code so the compiler has an easy job optimizing it.
Before, you had to manage memory accesses; nowadays, making sure you're not jumping across memory too much and being aware of how the cache works is enough.
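The cache-awareness point above mostly comes down to access order. A sketch of the two traversal patterns (illustration only: Python lists of lists aren't contiguous in memory, but in C or numpy the row-wise loop walks memory sequentially while the column-wise loop jumps a full row-length stride on every access):

```python
N = 4
matrix = [[row * N + col for col in range(N)] for row in range(N)]

def sum_row_major(m):
    """Visit elements in storage order: sequential, cache-friendly."""
    total = 0
    for row in m:
        for x in row:
            total += x
    return total

def sum_col_major(m):
    """Visit elements column by column: one row-length stride per
    access, which thrashes the cache in contiguous-array languages."""
    total = 0
    for col in range(len(m[0])):
        for row in m:
            total += row[col]
    return total
```

Both compute the same sum; the only difference is the order memory gets touched, which is exactly the kind of thing the hardware now rewards you for getting right instead of hand-written optimizations.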
Or more so: machines have gotten so fast, with so much disk and memory, that people can ship slopware filled with bloat and the UX is almost as responsive as Windows 3.1 was
If 2FA means "use a second factor that's on the same device as the first factor" (as with phone apps in many cases: password plus 2FA from email/SMS/an authenticator app on the same device), I disagree.
It's actually extremely similar: the agent has to figure out a way to associate the next logical steps with the (often disconnected or nonsensical) directives the executive gave them.
It might be a little easier with a dog though. With a dog, you just give it treats and it doesn't care how you interpret what it typed.
They were functionally just fine; good, even, compared to some modern abominations.
But the look was just plain and ugly, even compared to some alternatives at the time.
> Things started going downhill, in my opinion, with the Windows XP "Fisher-Price" Luna interface and the Microsoft Office 2007 ribbon.
Yeah, I just ran it with the 2000-compatible look; still ugly, but at least it doesn't waste screen space.