OpenAI’s general-purpose reasoning model is different because it was not a custom geometry machine: it took a single prompt and found a new load-bearing route through Erdős’s planar unit-distance problem.
The problem sounds like something a bored carpenter might draw on plywood during lunch: put n points on a flat plane, then count how many pairs sit exactly one unit apart. Not “close enough for government work.” Exactly one.
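To make "exactly one" concrete, here is a minimal Python sketch. It assumes integer coordinates and compares squared distances, so no floating-point rounding sneaks in. The function name is made up for illustration and does not come from any of the papers.

```python
from itertools import combinations

def count_pairs_at_squared_distance(points, d2):
    """Count unordered pairs of integer points whose squared distance is exactly d2."""
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == d2
    )

# Corners of a unit square: the four sides have length 1,
# while the two diagonals have squared length 2 and are not counted.
unit_square = [(0, 0), (1, 0), (0, 1), (1, 1)]
print(count_pairs_at_squared_distance(unit_square, 1))  # 4
```

Four points, four unit pairs. The whole problem is asking how much better than that you can do as n grows.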
Paul Erdős looked at this in 1946 and basically said, “The square-grid approach is probably the best framing plan.” More formally, he conjectured that the maximum number of unit-distance pairs grows only a little faster than linear, around n^(1 + O(1 / log log n)). That is math notation for “yes, it grows, but don’t start pouring a skyscraper foundation.”
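Here is a rough sketch of the grid idea, assuming a brute-force search in place of the number theory Erdős actually used. It builds a k-by-k integer grid, tallies every squared distance, and reports the most common one. Shrinking the grid by the square root of that value turns those pairs into unit-distance pairs. The function name and the brute-force approach are illustrative assumptions, and the search is only practical for small k.

```python
from collections import Counter
from itertools import combinations

def most_common_squared_distance(k):
    """Return (d2, pair_count) for the most frequent squared distance in a k x k integer grid."""
    points = [(x, y) for x in range(k) for y in range(k)]
    tally = Counter(
        (x1 - x2) ** 2 + (y1 - y2) ** 2
        for (x1, y1), (x2, y2) in combinations(points, 2)
    )
    # most_common(1) returns a one-element list of (value, count)
    d2, pair_count = tally.most_common(1)[0]
    return d2, pair_count

for k in (3, 5):
    d2, pair_count = most_common_squared_distance(k)
    print(f"{k * k} points: squared distance {d2} occurs {pair_count} times")
```

On a 5-by-5 grid the winner is squared distance 5 (the knight's-move offsets), with 48 pairs among 25 points. That already beats the 40 pairs you get from nearest neighbors, which is the whole trick: pick the spacing that the grid repeats most often, then call it one unit.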
What the Machine Actually Built
The model did not just throw dots at graph paper like a caffeinated apprentice with a nail gun. The proof uses algebraic number theory, a branch of math where numbers come with extra structure and everyone pretends that is a normal way to spend a Tuesday.
The short version: Erdős’s grid construction was one kind of scaffold. The AI found a different scaffold, built from more exotic coordinates, that creates more one-unit connections than the old design allowed. A later human-verified writeup describes the OpenAI-generated counterexample in cleaner form (arXiv:2605.20695). Will Sawin also sharpened the idea into an explicit lower bound, proving constructions with more than n^1.014 unit distances (arXiv:2605.20579).
That number may look tiny. In asymptotic math, tiny exponents can be steel beams. Once the exponent is fixed above 1, the old roofline is gone.
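For anyone who wants to see the beam calculation, here is a short worked inequality. It assumes the conjectured bound has the form n^(1 + C / log log n) for some constant C (the symbol C is notation introduced here, not taken from the papers), sets δ = 0.014 to match the quoted lower bound, and assumes that lower bound holds for arbitrarily large n.

```latex
% For n large enough that \log\log n > 0:
n^{1 + C/\log\log n} < n^{1+\delta}
\iff \frac{C}{\log\log n} < \delta
\iff \log\log n > \frac{C}{\delta}
\iff n > \exp\!\bigl(\exp(C/\delta)\bigr).
```

So no matter how generous the constant C is, a fixed exponent of 1.014 eventually passes the conjectured ceiling. The crossover point can be astronomically far out, but asymptotic statements only care that it exists.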
The Building Inspector Still Matters
This is the part where we keep our hard hat on. OpenAI has not released every detail about the model or its full internal workflow. The result is not “AI is now a tenured mathematician with bad coffee habits.” It is closer to: a reasoning model generated a serious mathematical construction, and human experts inspected the beams.
That inspection matters. Math is not a vibes-based permitting office. A proof either holds or it does not. The OpenAI result gained credibility because mathematicians such as Timothy Gowers, Noga Alon, Will Sawin, Arul Shankar, and others examined the structure and found it sound enough to discuss as real research, not just a flashy demo with fresh paint.
This fits a larger pattern. DeepMind’s AlphaGeometry solved Olympiad geometry problems by combining neural language models with symbolic search (Nature, DOI: 10.1038/s41586-023-06747-5). FunSearch used language models plus program search to discover new mathematical and algorithmic constructions (Nature, DOI: 10.1038/s41586-023-06924-6). AlphaProof pushed formal reasoning in Lean toward Olympiad-level theorem proving (Nature, DOI: 10.1038/s41586-025-09833-y). FrontierMath, meanwhile, was built to test whether models can handle problems beyond classroom drills (arXiv:2411.04872).
Different crews, same job site: make AI useful for serious reasoning without letting it sign off its own inspection report.
Why This One Has People Staring at the Plans
The unit-distance problem is not a toy benchmark. It sits in discrete geometry, where simple-looking questions often hide rebar made of combinatorics, number theory, and geometry. Wikipedia’s background on automated theorem proving and proof assistants helps explain the split here: some systems mechanically check proofs, while others try to discover them. OpenAI’s claim lands on the discovery side, which is the part that makes researchers put down the tape measure.
If this generalizes, AI could become a serious drafting partner for mathematicians: suggesting constructions, connecting distant methods, and grinding through technical cases that humans would rather not carry up six flights of stairs. Tools like mapb2.io are a humble cousin of that idea for everyday thinking: lay out the structure, see the dependencies, find the weak joists before the whole thing sags.
But the limitations are real. Models can still hallucinate. They can produce confident nonsense with the swagger of a contractor who forgot the foundation. The difference here is that the output met human mathematical scrutiny. That is the standard to watch.
References
- Castelvecchi, D. “AI cracks 80-year-old mathematics challenge - researchers are astonished.” Nature (2026). DOI: 10.1038/d41586-026-01651-0. PMID: 42174172.
- OpenAI. “An OpenAI model has disproved a central conjecture in discrete geometry” (2026). OpenAI research post.
- Bloom, T. et al. “Remarks on the disproof of the unit distance conjecture” (2026). arXiv: 2605.20695.
- Sawin, W. “An explicit lower bound for the unit distance problem” (2026). arXiv: 2605.20579.
- Trinh, T. H. et al. “Solving olympiad geometry without human demonstrations.” Nature 625, 476-482 (2024). DOI: 10.1038/s41586-023-06747-5.
- Romera-Paredes, B. et al. “Mathematical discoveries from program search with large language models.” Nature 625, 468-475 (2024). DOI: 10.1038/s41586-023-06924-6.
- Hubert, T. et al. “Olympiad-level formal mathematical reasoning with reinforcement learning.” Nature 645, 633-638 (2025). DOI: 10.1038/s41586-025-09833-y.
- Glazer, E. et al. “FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI” (2024). arXiv: 2411.04872.
Disclaimer: This blog post is a simplified summary of published research for educational purposes. The accompanying illustration is artistic and does not depict actual model architectures, data, or experimental results. Always refer to the original paper for technical details.