
That PutnamBench graph (the middle one) shows a 49/658 solve rate.

> The resulting model, DeepSeek-Prover-V2-671B, achieves state-of-the-art performance in neural theorem proving, reaching 88.9% pass ratio on the MiniF2F-test and solving 49 out of 658 problems from PutnamBench.

Which is 0.07% (edit: 7%) for PutnamBench



49/658 is 7%


Sorry, forgot to multiply by 100
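For the record, the arithmetic behind the correction (49/658 is the raw ratio; the percentage is that ratio times 100):

```python
# Solve rate on PutnamBench: raw fraction vs. percentage.
solved, total = 49, 658
ratio = solved / total       # raw ratio, ~0.0745 (the "0.07" in the parent)
percent = 100 * ratio        # ~7.4%, the corrected figure
print(f"{ratio:.4f} -> {percent:.1f}%")
```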


I bet DeepSeek-Prover-V2 wouldn't have made that mistake


classic human hallucination


How likely is it that Putnam answers were in DeepSeek's training data?


The solutions weren't published anywhere. There is also no good automatic way to generate solutions as far as I know, even an expensive one (the previous SOTA was 10 solved problems, and the one before that was 8, using pass@3200 with a 7B model). The developers could potentially have paid people who are good at both Putnam-level math and Lean to write solutions for the LLM, but it's hard to estimate the likelihood of that, and it sounds like a waste of money for a relatively marginal benchmark.


AoPS seems to have a forum dedicated to Putnam (including 2024): https://artofproblemsolving.com/community/c3249_putnam and here is a pdf with solutions to Putnam 2023: https://kskedlaya.org/putnam-archive/2023s.pdf


These still need to be formalized in Lean, which can sometimes be harder than solving the problem itself.
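To illustrate what "formalized in Lean" means here: a natural-language claim has to be restated as a precise theorem, and the proof must be machine-checked step by step. A toy example (not an actual Putnam problem), using Mathlib's `Even` (where `Even a` means `∃ r, a = r + r`):

```lean
import Mathlib.Algebra.Group.Even

-- Informal claim: "the sum of two even integers is even."
-- The formal version must name every hypothesis and justify every step.
theorem even_add_even (a b : ℤ) (ha : Even a) (hb : Even b) :
    Even (a + b) := by
  obtain ⟨m, hm⟩ := ha   -- a = m + m
  obtain ⟨n, hn⟩ := hb   -- b = n + n
  exact ⟨m + n, by rw [hm, hn]; ring⟩
```

Even for a statement this simple, the gap between the prose solution and an accepted formal proof is real; for competition problems the formalization effort can dwarf the original solve.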



