The biggest problem with LLM reviews for me is not false positives, but authority. Younger devs are used to accepting bot comments as the ultimate truth, even when they are clearly questionable.
I alluded to this in a separate comment, but the problem I have is that it is also really hard to get through to them about it.
Upskilling a junior dev required you to spend time in the code with them, sharing knowledge, pairing and such like. LLMs have abstracted a good part of that away and, in doing so, broken a line of communication. There are still many other topics a mentor can tackle, but the one most relevant to an upstart junior is effective programming, and they are now more likely to disappear into Claude Code for extended lengths of time than to reach out for help.
This is difficult to work with because you’ll need to do more frequent check-ins, akin to managing. And coaching someone through a prompt and a fancy MCP setup isn’t the same as walking through a codebase, giving context, advising on idiomatic language use and such like.
Yes, I've found some really interesting bugs using LLM feedback, but the accuracy rate is about 40%, and the misses are mostly cases where it highlights things that are noncritical (for example, we don't need to worry about portability in a single-architecture app that runs on a specific OS).