Brown further explained that approximately two months earlier, the IMO had invited OpenAI to participate in a formal version of the competition based on Lean, a programming language and proof assistant designed for writing machine-checkable mathematical proofs. The company declined because it was “focused on general reasoning in natural language without the constraints of Lean.” Brown added that OpenAI was “never approached about a natural language math option.”
However, an IMO coordinator told X user Mikhail Samin that OpenAI actually announced its results before the closing ceremony, contradicting Brown’s claim. The coordinator called OpenAI’s actions “rude and inappropriate,” noting that OpenAI “wasn’t one of the AI companies that cooperated with the IMO on testing their models.”
Hard math since 1959
The International Mathematical Olympiad, which has been running since 1959, represents one of the most challenging tests of mathematical reasoning. More than 100 countries send six participants each, and contestants face six proof-based problems across two 4.5-hour sessions. The problems typically require deep mathematical insight and creativity rather than raw computational power. The exact problems from the 2025 Olympiad are posted online.
For example, problem one asks students to imagine a triangular grid of dots (like a triangular pegboard) and figure out how to cover all the dots using exactly n straight lines. The twist is that some lines are called “sunny”: these are lines that don’t run parallel to any of the triangle’s three sides, meaning they aren’t horizontal, vertical, or tilted at the 45° angle of the diagonal edge. The challenge is to prove that no matter how big your triangle is, you can only ever build a valid covering with exactly 0, 1, or 3 sunny lines, never 2, never 4, never any other number.
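For readers who want the precise formulation, here is a compact paraphrase of the problem in standard mathematical notation; it is not the official wording, which is posted with the rest of the 2025 problems.

```latex
% Compact paraphrase of IMO 2025 Problem 1 (not the official wording).
% A line is "sunny" if it is parallel to none of the x-axis, the y-axis,
% and the line x + y = 0 (the direction of the triangle's diagonal edge).
\textbf{Problem 1.} Let $n \ge 3$ be an integer. Determine all nonnegative
integers $k$ such that there exist $n$ distinct lines in the plane with:
\begin{itemize}
  \item every point $(a, b)$, where $a$ and $b$ are positive integers with
        $a + b \le n + 1$, lying on at least one of the lines; and
  \item exactly $k$ of the $n$ lines being sunny.
\end{itemize}
% Contestants must prove that the only possible values are k in {0, 1, 3}.
```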
The timing of the OpenAI results surprised some observers: prediction markets had assigned only around an 18 percent probability to any AI system winning IMO gold by 2025.
Following OpenAI’s announcement and our initial publication of this article, Google DeepMind released its own IMO results, also claiming gold medal performance with its Gemini Deep Think model solving five of the six problems. Unlike OpenAI, Google worked directly with IMO organizers and had its results officially graded and certified by IMO coordinators. Google planned to adhere to the July 28 embargo but moved up its announcement after OpenAI’s early release.
DeepMind senior scientist Thang Luong told Ars Technica, “We confirmed with the IMO organization that we actually solved five perfectly. I think anyone who didn’t go through that process, we don’t know, they might have lost one point and gotten silver.”
This post was updated on July 22 with information about a new statement from Noam Brown and results from Google DeepMind’s IMO announcement.