
LLMs are great at reviewing. This is not stupid at all if it's what you want; you can still derive benefit from LLMs this way. I like to have them review at the design level where I write a spec document, and the LLM reviews and advises. I don't like having the LLM actually write the document, even though they are capable of it. I do like them writing the code, but I totally get it; it's no different than me and the spec documents.


Right, I'd say this is the best value I've gotten out of it so far: I'm planning to build this thing in this way, does that seem like a good idea to you? Sometimes I get good feedback that something else would be better.


If LLMs are great at reviewing, why do they produce the quality of code they produce?


Reviewing is the easier task: it only has to point me in the right direction. It's also easy to ignore incorrect review suggestions.


Imho it's because you worked before asking the LLM for input, thus you already have information and an opinion about what the code should look like. You can recognize good suggestions and quickly discard bad ones.

It's like reading: for better learning and understanding, it's advised that you think about and question the text before reading it, and then again after just skimming it.

Whereas if you ask for the answer first, you are less prepared for the topic, and it's harder to form a different opinion.

It's my perception.


It's also because they are only as good as their given skills allow. If you tell them "code <advanced project> and make no x and y mistakes" they will still make those mistakes. But if you say "perform a code review and look specifically for x and y", then they have some notion of what to do. That's my experience with using them for both writing and reviewing the same code in different passes.


Steel-manning the idea, perhaps they would ship object files (.o/.a) and the apt-get equivalent would link the system? I believe this arrangement was common in the days before dynamic linking. You don't have to redownload everything, but you do have to relink everything.


> Steel-manning the idea, perhaps they would ship object files (.o/.a) and the apt-get equivalent would link the system? I believe this arrangement was common in the days before dynamic linking. You don't have to redownload everything, but you do have to relink everything.

This was indeed common for Unix. The only way to tune the system (or even change the timezone) was to edit the very few source files and run make, which compiled those files and then linked them into a new binary.

Linking-only is (or was) much faster than recompiling.


But if I have to relink everything, I need all the makefiles, linker scripts and source code structure. I might as well compile it outright. On the other hand, I might as well just link it whenever I run it, like, dynamically ;)


And then how would this be any different in practice from dynamic linking?


My hobby language[1] also has no reference semantics, very similar to Herd. I think this is a really interesting point in the design space. A lot of complexity goes away when it's only values, and there are real languages like classic APL that work this way. But there are some serious downsides.

In practice I have found that it's very painful to thread state through your program. I ended up offering global variables, which provide something similar to but worse than generalized reference semantics. My language aims for simplicity so I think this may still be a good tradeoff, but it's tricky to imagine this working well in a larger user codebase.

I like that having only value semantics allows us, internally, to use reference counted immutable objects to cut down on copying; we both pass-by-reference internally and present it as pass-by-value to the programmer. No cycle detection needed because it's not possible to construct cycles. I use an immutable data structures library[2] so that modifications are reasonably efficient. I recommend trying that in Herd; it's almost always better than copy-on-write. Think about the Big-O of modifying a single element in an array, or building up a list by repeatedly appending to it. With pure COW it's hard to have a large array at all--it takes too long to do anything with it!
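The Big-O point above can be sketched in a few lines. This is a minimal Python illustration (not TMBASIC's or immer's actual implementation): `build_cow` copies the whole array on every modification, the way naive copy-on-write behaves, while `build_shared` stands in for a structure-sharing append with amortized O(1) cost per step.

```python
def build_cow(n):
    """Naive copy-on-write: every 'append' copies the whole array first.
    Each step is O(len), so building n elements is O(n^2) overall."""
    xs = []
    for i in range(n):
        xs = xs + [i]          # full copy of the existing elements every time
    return xs

def build_shared(n):
    """Stand-in for a persistent/structure-sharing append:
    amortized O(1) per step, O(n) overall."""
    xs = []
    for i in range(n):
        xs.append(i)           # no copy of the existing elements
    return xs

# Same result, very different asymptotics as n grows.
assert build_cow(1000) == build_shared(1000)
```

With pure COW, the quadratic blowup is why large arrays become unusable; a persistent structure with sharing keeps each modification cheap.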

For the programmer, missing reference semantics can be a negative. Sometimes people want circular linked lists, or to implement custom data structures. It's tough to build new data structures in a language without reference semantics. For the most part, the programmer has to simulate them with arrays. This works for APL because it's an array language, but my BASIC has less of an excuse.
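To make the "simulate them with arrays" point concrete, here's a hedged Python sketch (helper names are mine, not from any particular language) of a circular singly linked list built from plain arrays: indices stand in for the references the language doesn't have.

```python
# 'data[i]' holds the value at node i; 'nxt[i]' is the index of its successor.
# Indices play the role of pointers/references.

def make_circular(values):
    data = list(values)
    nxt = [(i + 1) % len(values) for i in range(len(values))]
    return data, nxt

def walk(data, nxt, start, steps):
    """Follow 'next' indices, wrapping around the cycle."""
    out, i = [], start
    for _ in range(steps):
        out.append(data[i])
        i = nxt[i]
    return out

data, nxt = make_circular(["a", "b", "c"])
assert walk(data, nxt, 0, 5) == ["a", "b", "c", "a", "b"]
```

It works, but every "pointer" update is manual index bookkeeping, which is the ergonomic cost of dropping reference semantics.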

I was able to avoid nearly all reference counting overhead by being single threaded only. My reference counts aren't atomic so I don't pay anything but the inc/dec. For a simple language like TMBASIC this was sensible, but in a language with multithreading that has to pay for atomic refcounts, it's a tough performance pill to swallow. You may want to consider a tracing GC for Herd.

[1] https://tmbasic.com

[2] https://github.com/arximboldi/immer


How do I square "he has debunked that" with the article about his brain fMRI and the results about his amygdala, linked above in this subthread? It's full of direct quotes from both Honnold and the doctors. Where did he debunk it... and how? He's got a more accurate analysis than the fMRI? Do you have a link?



There is no contradiction.


There's a mention on Wikipedia [1] that the Internet Archive maintains international mirror sites in Egypt and the Netherlands, in addition to several domestic sites within North America.

[1] https://en.wikipedia.org/wiki/Internet_Archive#Operations


My feedback: it seems like this tool isn't really like aws-nuke, but the copy keeps comparing it to aws-nuke, and the comparison continues into this HN post. aws-nuke doesn't need delete permissions (you just can't run the "delete" step without them, obviously), aws-nuke makes you decide what to delete, aws-nuke doesn't need confidence scoring since it shows you everything in the account, and aws-nuke is open source. Of your listed key differences, the only one aws-nuke doesn't already cover is the one that doesn't make sense for aws-nuke. This is, IMO, a problem with your list and not with the app: there are differentiating things CleanCloud does that you could focus on instead.

IMO, don't mention aws-nuke at all. This isn't the same kind of product as aws-nuke, which is explicitly the "One-click cleanup workflows" category in your "Not designed for" box. Your tool is for accounts that I'm not trying to nuke. So why invite the comparison? These tools are not intended for the same use case.

Spitballing here, I'd think you would want to lean into the cost savings aspect of deleting orphaned resources. aws-nuke is about cleaning out disposable AWS accounts. CleanCloud is about cloud cost optimization on real production/staging accounts.

A final note: it seems like the name CleanCloud is already used by a laundry service provider. You still have time to pick a different name for which you can take the top Google spot.


This is fair feedback — thanks for calling it out.

You’re right that CleanCloud is not the same category as aws-nuke, and comparing them directly is misleading. aws-nuke is great for disposable accounts; CleanCloud is explicitly for long-lived production and regulated environments where destructive access isn’t acceptable.

The intent with CleanCloud is read-only hygiene evaluation: identifying cost waste and risky misconfigurations with evidence and confidence, so teams can act safely through their normal change processes.

I’ll update the copy to remove the aws-nuke comparison and make that distinction clearer.

Many thanks, Suresh


EC2 instances can hibernate, too. You stop paying for the instance while it's hibernated; you pay the EBS storage cost only.

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Hibernat...


XFCE is X11-only, isn't it? Wayland support is still in development/experimental. I personally use XFCE with X11 to this day.


OpenSUSE Leap 16 has Wayland-only Xfce 4.20, using LabWC as the WM/compositor.

It works but keyboard-driven window management is broken: LabWC doesn't understand the standard (i.e. Windows) keystrokes.


Can you explain tmux's contribution here? I'm confused why this process wouldn't work just the same if CC directly executed the program rather than involving tmux. Are you just using tmux to trick the program under test into running its TUI instead of operating in a dumb-stdout mode?


It allows Claude to take screenshots and generate keyboard inputs. It's like TUI Playwright.


Maybe I'm not understanding it (totally possible!) but could Claude just do that by reading standard out and writing to standard in?


I had a really hard time getting anything like that to work (you can't just read stdout and write stdin, because you're driving a terminal in raw mode), but it took about three sentences' worth of prompt to get Claude to use tmux to do this reliably.
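The "driving a terminal in raw mode" problem can be seen with Python's stdlib `pty` module. This is a minimal sketch (my own illustration, not what Claude actually runs): the child only behaves like a TUI when it sees a real terminal, and then its output is escape-code soup rather than clean lines.

```python
import os
import pty
import subprocess

# Attach a pseudo-terminal, the way tmux does, so the child
# believes it is talking to a real terminal.
master, slave = pty.openpty()
assert os.isatty(slave)  # the child will see a TTY

proc = subprocess.run(
    ["python3", "-c", "import sys; print(sys.stdout.isatty())"],
    stdin=slave, stdout=slave, stderr=slave,
)
os.close(slave)
output = os.read(master, 1024).decode()
os.close(master)

# The child saw a terminal ("True"), so a fullscreen TUI would start
# drawing with cursor-movement escape codes -- raw bytes an LLM would
# have to interpret itself. tmux's pane rendering does that
# interpretation for you, which is what capture-pane exposes.
```

Note also that a pty translates `\n` to `\r\n`, one more terminal detail you'd otherwise have to handle yourself.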


I tell Claude Code to use an existing tmux session to interact with, e.g., a rails console, and it uses tmux send-keys and capture-pane for IO. It gets tripped up if a pager is invoked, but otherwise it works pretty well. It didn't occur to me to tell it to take screenshots.


`tmux capture-pane`.


I would love to see your prompt if you ever post it anywhere.


For Claude, it's enough to prompt "use tmux to test", that usually does the work out of the box. If colors are important I also add "use -e option with capture-pane to see colors". It just works. I used it regularly with Claude and my TUI. For other agents other than Claude I need to use a more specific set of instructions ("use send-keys, capture-pane and mouse control via tmux" etc.)

Since I have e2e tests, I only use the agent for: guiding it on how to write the e2e test ("use tmux to try the new UI and then write a test") or to evaluate its overall usability (fake user testing, before actual user testing): "use tmux to evaluate the feature X and compile a list of usability issues"
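For anyone who hasn't driven tmux this way: the agent ends up issuing commands like these two. This is a hedged sketch with hypothetical helper names; `send-keys` types into the pane and `capture-pane -p` prints its current contents (`-e` keeps the color escape codes, as mentioned above).

```python
import subprocess

def tmux_send(session, keys):
    """Type 'keys' into the target session/pane and press Enter."""
    return ["tmux", "send-keys", "-t", session, keys, "Enter"]

def tmux_capture(session, colors=False):
    """Dump the pane's rendered text; -e preserves SGR color escapes."""
    cmd = ["tmux", "capture-pane", "-p", "-t", session]
    if colors:
        cmd.insert(2, "-e")
    return cmd

def run(cmd):
    return subprocess.run(cmd, capture_output=True, text=True).stdout

# e.g. run(tmux_send("repl", "1 + 1")); print(run(tmux_capture("repl")))
```

The "screenshot" is just the pane's rendered text, already de-escaped by tmux, which is exactly what makes it LLM-friendly.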


Thank you!


Also many CLIs act differently when invoked connected to a terminal (TUI/interactive) vs not. So you’d run into issues there where Claude could only test the non-interactive things.
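The usual mechanism behind that behavior is an isatty check at startup. A minimal Python sketch (the pattern, not any specific CLI's code):

```python
import os
import sys

def output_mode():
    """What many CLIs do at startup: enable interactive/TUI behavior
    only when stdout is a real terminal; plain output otherwise."""
    return "interactive" if sys.stdout.isatty() else "plain"

# A pipe is never a terminal, which is why an agent reading a
# program's stdout directly only ever exercises the non-interactive path.
r, w = os.pipe()
assert not os.isatty(w)
os.close(r)
os.close(w)
```

Running the program inside tmux gives it a real pty, so `isatty` returns true and the interactive path is reachable.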


So by screenshots you mean tmux capture-pane, not actual screenshots. So in essence it is using stdout, just not Claude’s own.


"In essence," yes, but terminals do work to render stdout that you do not want an LLM to have to replicate, I think. If your TUI runs fullscreen or otherwise emits a bunch of control codes, that is simple work for a terminal but potentially intractable for an LLM.


My website's contact form has a reCAPTCHA and it still gets spam sent through it (though vastly less). They pass the reCAPTCHA somehow. My contact form literally only emails me and they still do it.

