
This is the critical bit (paraphrasing):

Humans have worked out the amplitudes for integer n up to n = 6 by hand, obtaining very complicated expressions, which correspond to a “Feynman diagram expansion” whose complexity grows superexponentially in n. But no one has been able to greatly reduce the complexity of these expressions, providing much simpler forms. And from these base cases, no one was then able to spot a pattern and posit a formula valid for all n. GPT did that.

Basically, they used GPT to refactor a formula and then generalize it for all n. Then verified it themselves.

I think this was all already figured out in 1986 though: https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.56... see also https://en.wikipedia.org/wiki/MHV_amplitudes




  > I think this was all already figured out in 1986 though
They cite that paper in the third paragraph...

  Naively, the n-gluon scattering amplitude involves order n! terms. Famously, for the special case of MHV (maximally helicity violating) tree amplitudes, Parke and Taylor [11] gave a simple and beautiful, closed-form, single-term expression for all n.
It also seems to be a main talking point.

I think this is a prime example of how easy it is to conclude that something is already solved when looking at it from a high level, and to reach an erroneous conclusion from a lack of domain expertise. Classic "Reviewer 2" move. Though I'm not a domain expert either, so if there really is no novelty over Parke and Taylor, I'm pretty sure this will get thrashed in review.


You're right. Parke & Taylor showed the simplest nonzero amplitudes have two minus helicities, while one-minus amplitudes vanish (generically). This paper claims that vanishing theorem has a loophole: a new hidden sector exists, and one-minus amplitudes are secretly there, but distributional.

> simplest nonzero amplitudes have two minus helicities while one-minus amplitudes vanish

Sorry, but I just have to point out how this field of maths reads like Star Trek technobabble to me.


Where do you think Star Trek got its technobabble from?


Cool idea, but the AI readme text is so cringey in places: “This is FUN, not FEAR”.


[flagged]


Be careful, in the strength of your passions, that you don't become a stochastic word generator yourself.

  > Am I getting that right?
My comment was a response to the specific claim I quoted. Any inference you have made about my feelings about OpenAI is your own. You can search my comment history if you want to verify or reject your suspicion. I don't think you'll be able to verify it...


[flagged]


[flagged]


I feel for you because you kinda got baited into this by the language in the first couple of comments. But whatever’s going on in your comment is so emotional that it’s hard to tell what you’re asking for that you haven’t already been able to read. tl;dr: a proof stuck at n=4 for years now covers arbitrary n.

Yeah, I kind of fell for it. I was hoping to be pleasantly surprised by a particle physicist in the OpenAI victory-lap thread, or by someone with insight into what “GPT 5.2 originally conjectured this” means exactly, because the way it’s phrased in the preprint makes it sound like they were all doing bong rips with ChatGPT and it went “man, do you guys ever think about gluon tree amplitudes?” But my empty post getting downvoted hours after being made empty makes it pretty clear that this is a strictly victory-lap-only thread.

Fwiw I'm not trying to celebrate on OpenAI's behalf. The press piece definitely makes bolder claims than the paper.

I was just stating the facts and correcting a reaction that went too far in the other direction. Taking my comment as supporting or validating OpenAI's claim is just as bad, an error of the same magnitude.

I feel like I've been quoting Feynman a lot this week: the first principle is that you must not fool yourself, and you are the easiest person to fool. You're the easiest person for you to fool because you're exactly as smart as yourself, and self-deception is easier than proof. We all fall for these traps, and the smartest people in the world (or in history) are not immune to them. But it's interesting to see it on a section of the internet that prides itself on its intelligence. I think we just love blinders, which is only human.



It bears repeating that modern LLMs are incredibly capable, and relentless, at solving problems that have a verification test suite. It seems like this problem did (at least for some finite subset of n)!

This result, by itself, does not generalize to open-ended problems, though, whether in business or in research in general. Discovering the specification to build is often the majority of the battle. LLMs aren't bad at this, per se, but they're nowhere near as reliably groundbreaking as they are on verifiable problems.
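
As a toy illustration of that verify-loop dynamic (purely hypothetical, nothing to do with the actual amplitude computation): brute-force a quantity for small n, let the model conjecture a closed form, and let the test suite arbitrate.

  # Toy sketch of the conjecture-then-verify loop; the "expensive" computation
  # here is just an explicit sum, and the conjecture is a candidate closed form.
  from fractions import Fraction

  def brute_force(n):
      # Ground truth: explicit sum of k^3 for k = 1..n, in exact arithmetic.
      return sum(Fraction(k) ** 3 for k in range(1, n + 1))

  def conjectured_closed_form(n):
      # Candidate formula a model (or a human) might propose from small cases.
      return (Fraction(n) * (n + 1) / 2) ** 2

  # The verification suite: check the candidate against ground truth for small n.
  assert all(brute_force(n) == conjectured_closed_form(n) for n in range(1, 50))
  print("conjecture matches brute force for n = 1..49")

Passing a finite suite of course proves nothing for general n, which is why the authors still had to verify the generalization themselves.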


> modern LLMs are incredibly capable, and relentless, at solving problems that have a verification test suite.

Feels like it's a bit of what I tried to express a few weeks ago (https://news.ycombinator.com/item?id=46791642), namely that we are just pouring computational resources into verifiable problems and then acting astonished when it sometimes works. Sure, LLMs have a slight bias, in that they rely on statistics, so it's not purely brute force, but the approach is still pretty much the same: throw stuff at the wall, see what sticks, and once something finally does, report it as grandiose and claim it's "intelligent".


> throw stuff at the wall, see what sticks, and once something finally does, report it as grandiose and claim it's "intelligent"

What do we think humans are doing? I think it’s not unfair to say our minds are constantly trying to assemble the pieces available to them in various ways, whether we’re actively thinking about a problem or it happens in the background as we go about our day.

Every once in a while the pieces fit together in an interesting way and it feels like inspiration.

The techniques we’ve learned likely influence the strategies we attempt, but beyond all this what else could there be but brute force when it comes to “novel” insights?

If it’s just a matter of following a predefined formula, it’s not intelligence.

If it’s a matter of assembling these formulas and strategies in an interesting way, again what else do we have but brute force?


See what I replied just earlier (https://news.ycombinator.com/item?id=47011884), namely the different regimes: working within a paradigm versus challenging it by going back to first principles. The ability to notice that something is off beyond "just" assembling existing pieces, to backtrack when the failures pile up, and to actually understand the relationships is precisely the difference.

So I don’t really see why this would be a difference in kind. We’re effectively just talking about how high up the stack we’re attempting to brute force solutions, right?

How many people have tried to figure out a new maths, a GUT in physics, a more perfect human language (Esperanto for ex.) or programming language, only to fail in the vast majority of their attempts?

Do we think that anything but the majority of the attempts at a paradigm shift will end in failure?

If the majority end in failure, how is that not the same brute-force methodology? (Brute force doesn’t mean you can’t respond to feedback from your failed experiments or from failures in the prevailing paradigm; I take it to fundamentally mean trying “new” things with the tools and information available to you, with the majority of attempts ending in failure, until something clicks, or doesn’t and you give up.)


While I don't think anyone has a plausible theory that goes to this level of detail on how humans actually think, there's still a major difference. I think it's fair to say that if we are doing a brute force search, we are still astonishingly more energy efficient at it than these LLMs. The amount of energy that goes into running an LLM for 12h straight is vastly higher than what it takes for humans to think about similar problems.

At similar quality, NN speed is increasing by ~5-10x per year. Nothing SOTA is efficient; it's the preview of what will be efficient in 2-3 years.

In the research group I'm in, we usually try a few approaches to each problem. Let's say we get:

Method A) 30% speed reduction and 80% precision decrease

Method B) 50% speed reduction and 5% precision increase

Method C) 740% speed reduction and 1% precision increase

and we only publish B. It's not brute force[1], but throwing noodles at the wall and seeing what sticks, like the GP said. We don't throw spoons[1], but everything that looks like a noodle has a high chance of being thrown. It's a mix of experience[1] and not enough time to try everything.

[1] citation needed :)


I always call it the "Wacky Wallwalker" method (if you're of a certain age, this will make sense to you).

The field of medicine, specifically pharmacology and drug discovery, is an optimized version of that. It works a bit like this:

Instead of brute-forcing with infinite options, reduce the problem space by starting with some hunch about the mechanism. Then the hard part that can take decades: synthesize compounds with the necessary traits to alter the mechanism in a favourable way, while minimizing unintended side-effects.

Then try on a live or lab grown specimen and note effectiveness. Repeat the cycle, and with every success, push to more realistic forms of testing until it reaches human trials.

Many drugs that reach the last stage, human trials, end up being used for something completely different from what they were designed for! One example is minoxidil: designed to regulate blood pressure, used for regrowing hair!


It’s almost like the iteration loop refines itself through, *checks notes*, search and learning, per Sutton.

That's also what most grad students are doing. Even in the unlikely case LLMs completely stop improving, it's still a massive deal.

Once heard someone call it "graduate student descent" and I've never heard a more apt term!

Yes, this is where I just cannot imagine completely AI-driven software development of anything novel and complicated without extensive human input. I'm currently working in a space where none of our data models are particularly complex, but the trick is all in defining the rules for how things should work.

Our actual software implementation is usually pretty simple; often writing up the design spec takes significantly longer than building the software, because the software isn't the hard part - the requirements are. I suspect the same folks who are terrible at describing their problems are going to need help from expert folks who are somewhere between SWE, product manager, and interaction designer.


Even more generally than verification: just being tied to a loss function that represents something we actually care about, e.g. compiler and test errors, Lean verification in Aristotle, basic physics energy configs in AlphaFold, or win conditions in RL, as in AlphaGo.

RLHF is an attempt to push LLMs pre-trained with a dopey reconstruction loss toward something we actually care about: imagine if we could find a pre-training criterion that actually cared about truth and/or plausibility in the first place!
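
A minimal sketch of what a verifiable training signal can look like (names and setup here are illustrative only, not any lab's actual pipeline): reward a candidate answer only if it passes an executable check, instead of rewarding plausible-sounding text.

  # Hypothetical verifier-based reward: 1.0 only if the candidate formula
  # reproduces known values; 0.0 otherwise, including anything that crashes.
  def verifiable_reward(candidate_expr: str, test_cases) -> float:
      try:
          f = eval(f"lambda n: {candidate_expr}")  # candidate is a formula in n
          return float(all(f(n) == expected for n, expected in test_cases))
      except Exception:
          return 0.0

  # Known values of 1 + 2 + ... + n stand in for "something we actually care about".
  tests = [(n, n * (n + 1) // 2) for n in range(1, 20)]
  print(verifiable_reward("n * (n + 1) // 2", tests))  # 1.0
  print(verifiable_reward("n ** 2", tests))            # 0.0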


There's been active work in this space, including TruthRL: https://arxiv.org/html/2509.25760v1. It's absolutely not a solved problem, but reducing hallucinations is a key focus of all the labs.

That paper from the 80s (which is cited in the new one) is about "MHV amplitudes" with two negative-helicity gluons, so "double-minus amplitudes". The main significance of this new paper is to point out that "single-minus amplitudes" which had previously been thought to vanish are actually nontrivial. Moreover, GPT-5.2 Pro computed a simple formula for the single-minus amplitudes that is the analogue of the Parke-Taylor formula for the double-minus "MHV" amplitudes.
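
For reference, the Parke-Taylor formula in question, for the color-ordered MHV tree amplitude with gluons i and j carrying negative helicity, is (schematically, in spinor-helicity notation, with the coupling and the momentum-conserving delta function stripped off):

  A_n(1^+, \ldots, i^-, \ldots, j^-, \ldots, n^+) = \frac{\langle i\,j \rangle^4}{\langle 1\,2 \rangle \langle 2\,3 \rangle \cdots \langle n\,1 \rangle}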

You should probably email the authors if you think that's true. I highly doubt they didn't do a literature search first though...

You should be more skeptical of marketing releases like this. This is an advertisement.

It's hard to get someone to do a literature search first when they get free publicity by skipping it and claiming some major AI-assisted breakthrough...

Heck, it's hard to get authors to do a literature search, period. Never mind not thoroughly looking for prior art; even well-known disgraced papers continue to get positive citations all the time...


They also reference Parke and Taylor. Several times...

Don't underestimate the willingness of physicists to skimp on literature review.

After last month’s handling of the Erdős problems by LLMs, everyone writing papers should be aware by now that literature checks are approximately free, even physicists.

> But no one has been able to greatly reduce the complexity of these expressions, providing much simpler forms.

Slightly OT, but wasn't this supposed to be largely solved with amplituhedrons?


Still pretty awesome though, if you ask me.

I think even a “non-intelligent” solver like Mathematica is cool, so hell yes, this is cool.

Big difference between “derives new result” and “reproduces something likely in its training dataset”.

Sounds somewhat similar to the groundbreaking application of a computer to prove the four color theorem. There, the researchers wrote a program to find and formally check the numerous particular cases; here, the computer finds a simplifying pattern.

I'm not sure if GPT's ability goes beyond a formal math package's in this regard, or if it's just way more convenient to ask ChatGPT than to use that software.


