Interesting. I've only dipped my toe in the AI waters but my initial experience ...

wild_egg · 2026-02-14T13:11:29 1771074689

You need to be telling it to create reproduction test cases first and iterate until it's truly solved. There's no need for you to manually be testing that sort of thing.

The key to success with agents is tight, correct feedback loops so they can validate their own work. Go has great tooling for debugging race conditions. Tell it to leverage those properly and it shouldn't have any problems solving it unless you steer it off course.

epolanski · 2026-02-14T14:23:21 1771079001

+1 half the time I see such posts the answer is "harness".

Put the LLM in a situation where it can test and reason about its results.

JetSetIlly · 2026-02-14T14:29:02 1771079342

I do have a test harness. That's how I could show that the code suggested was poor.

If you mean, put the LLM in the test harness. Sure, I accept that that's the best way to use the tools. The problem is that there's nothing requiring me or anyone else to do that.

Someone · 2026-02-14T15:51:52 1771084312

If that’s what you have to do that makes LLMs look more like advanced fuzzers that take textual descriptions as input (“find code that segfaults calling x from multiple threads”, followed by “find changes that make the tests succeed again”) than as truly intelligent. Or, maybe, we should see them as diligent juniors who never get tired.

wild_egg · 2026-02-14T18:01:24 1771092084

I don't see any problems with either of those framings.

It really doesn't matter at all whether these things are "truly intelligent". They give me functioning code that meets my requirements. If standard fuzzers or search algorithms could do the same, I would use those too.

kitd · 2026-02-14T19:36:36 1771097796

TDD and the coding agent: a match made in heaven.

It is Valentine's Day after all.

JetSetIlly · 2026-02-14T14:20:26 1771078826

I accept what you say about the best way to use these agents. But my worry is that there is nothing that requires people to use them in that way. I was deliberately vague and general in my test. I don't think how Claude responded under those conditions was good at all.

I guess I just don't see what the point of these tools are. If I was to guide the tool in the way you describe, I don't see how that's better than just thinking about and writing the code myself.

I'm prepared to be shown differently of course, but I remain highly sceptical.

wild_egg · 2026-02-14T18:41:37 1771094497

Just want to say upfront: this mindset is completely baffling to me.

Someone gives you a hammer. You've never seen one before. They tell you it's a great new tool with so many ways to use it. So you hook a bag on both ends and use it to carry your groceries home.

You hear lots of people are using their own hammers to make furniture and fix things around the home.

Your response is "I accept what you say about the best way to use these hammers. But my worry is that there is nothing that requires people to use them in that way."

These things are not intelligent. They're just tools. If you don't use a guide with your band saw, you aren't going to get straight cuts. If you want straight cuts from your AI, you need the right structure around it to keep it on track.

Incidentally, those structures are also the sorts of things that greatly benefit human programmers.

JetSetIlly · 2026-02-14T19:21:55 1771096915

"These things are not intelligent. They're just tools."

Correct. But they are being marketed as being intelligent and can easily convince a casual observer that they are through the confidence of their responses. I think that's a problem. I think AI companies are encouraging people to use these tools irresponsibly. I think the tools should be improved so they can't be misused.

"Incidentally, those structures are also the sorts of things that greatly benefit human programmers."

Correct. And that's why I have testing in place and why I used it to show that the race condition had been introduced.

strawhatguy · 2026-02-14T15:03:37 1771081417

Okay. If you’re being vague, you get vague results.

Golang and Claude have worked well for me, on existing production codebases, because I tell it precisely what I want and it does it.

I’ve never found generic “find performance issues” just by reading the code helpful.

Write specifications, give it freedom to implement, and it can surprise you.

Hell once it thought of how to backfill existing data with the change I was making, completely unasked. And I’m like that’s awesome

JetSetIlly · 2026-02-14T15:20:16 1771082416

"Okay. If you’re being vague, you get vague results."

No. I was vague and got a concrete suggestion.

I have no issue with people using Claude in an optimal way. The problem is that it's too easy to use in a poor way.

My example was to test my own curiosity about whether these tools live up to the claims that they'll be replacing programmers. On the evidence I've seen I don't believe they will and I don't see how Go is any different to any other language in that regard.

IMO, for tools like Claude to be truly useful, they need to understand their own limitations and refuse to work unless the conditions are correct. As you say, it works best when you tell it precisely what you want. So why doesn't Claude recognise when you're not being precise and refuse to work until you are?

To reiterate, I think coding assistants are great when used in the optimal way.

treyd · 2026-02-14T13:25:11 1771075511

If only there was a way to prevent race conditions by design as part if the language's type system, and in a way that provides rich and detailed error messages that allow coding agents to troubleshoot issues directly (without having to be prompted to write/run tests that just check for race conditions).