The International Mathematical Olympiad (IMO), widely regarded as one of the most demanding mathematics competitions for high school students, draws top young mathematicians from around the world every year. Recently, an artificial intelligence (AI) program named “AlphaGeometry” has drawn attention in this field: its ability to solve geometric proof problems nearly matches that of the best human contestants, a result that has surprised the mathematical community.
The breakthrough came from a collaboration between Google DeepMind and researchers at New York University. As reported in a recent magazine article, the program can solve geometry problems from past IMO competitions, and in some cases AlphaGeometry not only solved a problem but also found a more general solution that human contestants had missed.
IMO problems are not only difficult; they often reward contestants who find a simple, elegant proof through ingenuity, which is exactly what makes them attractive to artificial intelligence researchers. However, translating mathematical proofs into a language that computers can process is extremely difficult, and earlier formalizations of geometry problems were limited because they could not draw on tools from other areas of mathematics.
Against this background, Trieu Trinh and his team took a new approach: they created a purely machine-generated dataset for the AI to learn from, which removed the need to translate human proofs into a formal language and saved considerable manual effort. They began by using algorithms to generate initial geometric figures and then attached the relevant properties of those figures, laying the foundation for the AI to learn independently and demonstrate problem-solving ability.
Studying a geometric object such as a triangle means keeping track of many related elements, for example its altitudes and the point where they intersect. Working out these relationships is precisely what the key module of the AlphaGeometry framework, known as “Deductive Database and Algebraic Rules” (DDAR), is built for. The engine has two parts. The first is the Deductive Database (DD), which contains basic geometric principles and theorems, such as “the segment joining the midpoints of two sides of a triangle is parallel to the third side.” The second is the Algebraic Rules (AR) component, which performs algebraic operations and transformations on relations among lengths and angles.
Starting from a given geometric figure, the reasoning engine can thus compute what is called the “deductive closure”: the set of all geometric conclusions that follow without any additional constructions. These conclusions might state that certain angles are equal or that four specific points lie on a circle. In this way the research team obtained a rich dataset of “geometric theorems” and their corresponding “proofs,” laying the groundwork for the next step.
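The idea of a deductive closure can be illustrated with a toy forward-chaining loop. The sketch below is not AlphaGeometry's engine: the fact encoding and the helper names (`line`, `para`, `midsegment_rule`, `parallel_transitivity_rule`, `deductive_closure`) are assumptions made for illustration, and only two rules are implemented. It simply applies every rule repeatedly until no new fact appears.

```python
from itertools import permutations

def line(p, q):
    """Name a line by its two points in sorted order, e.g. line('N', 'M') == 'MN'."""
    return "".join(sorted((p, q)))

def para(l1, l2):
    """Canonical, order-independent fact 'l1 is parallel to l2'."""
    return ("para", *sorted((l1, l2)))

def midsegment_rule(facts):
    """If M, N are midpoints of two sides sharing a vertex, MN is parallel to the third side."""
    mids = [f for f in facts if f[0] == "midpoint"]
    derived = set()
    for (_, m, a, b), (_, n, c, d) in permutations(mids, 2):
        shared = {a, b} & {c, d}
        if len(shared) == 1:
            (p,) = {a, b} - shared
            (q,) = {c, d} - shared
            derived.add(para(line(m, n), line(p, q)))
    return derived

def parallel_transitivity_rule(facts):
    """If x is parallel to s and s is parallel to y, then x is parallel to y."""
    paras = [f for f in facts if f[0] == "para"]
    derived = set()
    for (_, l1, l2), (_, l3, l4) in permutations(paras, 2):
        shared = {l1, l2} & {l3, l4}
        if len(shared) == 1:
            (x,) = {l1, l2} - shared
            (y,) = {l3, l4} - shared
            derived.add(para(x, y))
    return derived

def deductive_closure(premises, rules):
    """Apply all rules until a fixed point is reached (no new facts)."""
    facts = set(premises)
    while True:
        new = set().union(*(rule(facts) for rule in rules)) - facts
        if not new:
            return facts
        facts |= new

# Toy figure: M and N are midpoints of AB and AC, and BC is parallel to DE.
premises = {
    ("midpoint", "M", "A", "B"),
    ("midpoint", "N", "A", "C"),
    para("BC", "DE"),
}
closure = deductive_closure(premises, [midsegment_rule, parallel_transitivity_rule])
for fact in sorted(closure - premises):
    print(fact)
```

On this toy input the loop derives that MN is parallel to BC (midsegment rule) and then that MN is parallel to DE (transitivity), after which nothing new can be added and the closure is complete.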
However, for high-level competition problems such as those of the IMO, this deduction procedure alone is not enough. Hard geometry problems often require contestants to introduce new auxiliary constructions creatively in order to complete a proof. As the research group led by Trieu Trinh wrote in their paper: “How to introduce necessary auxiliary constructions in the proof is key to the research.”
For high school students, adding auxiliary points or lines that are not part of the original problem is often hard, and it is just as hard for computer systems. Traditional algorithms handle this poorly, but modern AI methods, especially large language models (LLMs), are well suited to it.
Large language models work by predicting the probability of the next word from the words that came before, and a model of this kind is the other core module of the AlphaGeometry architecture. The research team trained it on the “geometric theorem and proof” dataset in two stages. The first is pre-training, in which the model learns from about one hundred million proof examples in the dataset, giving it a basis for proposing auxiliary constructions. The second is fine-tuning, in which researchers use a traceback algorithm to work out which constructions the final proof actually relies on and discard all those that are irrelevant to the theorem, leaving only the most streamlined, essential structures.
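The pruning step described above can be pictured as a backward walk over a proof graph. The following is a minimal sketch under stated assumptions: the `traceback` function, the `parents` dictionary, and the string encoding of facts are all hypothetical and chosen for readability, not taken from the AlphaGeometry code. Each derived fact records the facts it was deduced from, and anything the goal never reaches is discarded.

```python
def traceback(goal, parents):
    """Collect the premises that `goal` depends on by walking the proof graph backward.

    `parents` maps each derived fact to the facts it was deduced from;
    facts that do not appear as keys are treated as premises.
    """
    needed, seen, stack = set(), set(), [goal]
    while stack:
        fact = stack.pop()
        if fact in seen:
            continue
        seen.add(fact)
        if fact in parents:
            stack.extend(parents[fact])  # keep walking toward the premises
        else:
            needed.add(fact)             # reached a premise the goal relies on
    return needed

# Hypothetical proof graph for a figure with three constructions.
parents = {
    "MN || BC": ["M is midpoint of AB", "N is midpoint of AC"],
    "angle AMN = angle ABC": ["MN || BC"],
    "AH perp BC": ["H is foot of altitude from A"],
}
all_premises = {
    "M is midpoint of AB",
    "N is midpoint of AC",
    "H is foot of altitude from A",
}

needed = traceback("angle AMN = angle ABC", parents)
unused = all_premises - needed
print(sorted(needed))
print(sorted(unused))
```

Here the goal depends only on the two midpoints, so the altitude foot H is reported as unused and would be dropped from the training example.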
This process equips the system to identify and carry out the auxiliary constructions a geometric proof needs, which is what allows it to solve very difficult problems.
Language models, however advanced, cannot supply the rigorous deduction steps that a geometric proof requires, so those steps are still carried out by the dedicated reasoning engine. Within the overall architecture, the trained language model is limited to one job: proposing auxiliary objects such as points and lines. With the key components in place, the workflow of the AlphaGeometry system becomes easier to follow.
When AlphaGeometry receives a geometry problem, the reasoning engine first analyzes the basic properties of the figure and computes its deductive closure. If the closure does not contain the conclusion to be proven, the language model steps in. For example, it may decide to add a new point to triangle ABC in the problem: point D, the midpoint of side BC. This kind of “creativity” is something the model has acquired from its large body of training data. The new construction gives the reasoning engine more to work with, so it can deduce further properties. The reasoning engine and the language model alternate in this way until the target conclusion is proven.
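The alternation described above can be sketched as a short loop. Everything in this example is a stand-in: `closure` is a tiny rule applier rather than the real reasoning engine, `propose_stub` replaces the trained language model with a hard-coded suggestion, and the rule list and fact strings are invented for an isosceles triangle ABC with AB = AC.

```python
def closure(facts, rules):
    """Toy deduction engine: fire rules until no new fact can be added."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for needed, conclusion in rules:
            if needed <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def solve(premises, goal, rules, propose, max_constructions=3):
    """Alternate deduction and construction proposals until the goal is derived."""
    facts = closure(premises, rules)
    added = []
    for _ in range(max_constructions):
        if goal in facts:
            break
        construction = propose(facts, goal)  # stand-in for the language model
        if construction is None:
            break
        added.append(construction)
        facts = closure(facts | {construction}, rules)
    return goal in facts, added

# Hypothetical rules: the goal is unreachable until the midpoint D is introduced.
RULES = [
    ({"AB = AC", "D is midpoint of BC"}, "triangle ABD is congruent to triangle ACD"),
    ({"triangle ABD is congruent to triangle ACD"}, "angle ABC = angle ACB"),
]

def propose_stub(facts, goal):
    """Hard-coded proposal; a trained model would rank many candidate constructions."""
    candidate = "D is midpoint of BC"
    return None if candidate in facts else candidate

solved, added = solve({"AB = AC"}, "angle ABC = angle ACB", RULES, propose_stub)
print(solved, added)
```

With the stub in place, the first deduction pass stalls, the midpoint D is added, and the second pass reaches the goal. Passing a proposer that always returns `None` leaves the problem unsolved, which mirrors the division of labor between the two components.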
“This method seems quite reasonable, and to some extent it is similar to the training process of mathematical competition contestants,” commented Peter Scholze, a three-time IMO gold medalist and Fields Medal recipient.
To test AlphaGeometry, the research team selected 30 geometry problems from IMO competitions held since 2000. These 30 problems account for 75% of all the geometry problems in those competitions; the rest, such as geometric inequalities and combinatorial geometry problems, could not be represented in the system and were excluded. AlphaGeometry solved 25 of the 30, whereas the powerful language model GPT-4 solved none. By comparison, an average IMO contestant solves 15.2 of these problems, a bronze medalist 19.3, and a gold medalist 25.9, slightly more than AlphaGeometry. On this benchmark, AlphaGeometry’s problem-solving ability therefore exceeds that of most IMO contestants.
It is worth noting that well before AlphaGeometry, the renowned Chinese mathematician Wu Wenjun developed a geometric theorem-proving algorithm known as “Wu’s Method.” In theory, Wu’s Method is powerful enough to prove any conclusion that can be demonstrated from Euclidean laws. In practice it faces two major limitations: first, the proofs it generates are not human-readable, and second, it is extremely inefficient, solving only 10 of the 30 problems even with 4.5 hours available for each. Even when the time limit was increased tenfold, its performance did not improve significantly.
Fortunately, the mathematical proofs generated by AlphaGeometry are easy for humans to understand and read. Upon careful review of the AI-generated solutions to geometry problems, researchers even found some highly creative methods of proof.
AlphaGeometry marks a milestone for artificial intelligence in mathematics. The system has shown problem-solving skill comparable to that of top human contestants on geometry problems at IMO level. In some cases it even proposed a more general solution that does not rely on all the conditions of the original problem, which points to AI’s potential for understanding mathematics and generalizing problems.
Problems that demand great effort from human contestants are hard for the AI as well, and intricate problems generally require it to produce longer, more detailed proofs. AlphaGeometry cannot yet take part in the IMO, because the competition is not limited to geometry, but the ability it has shown is already attracting attention.
AlphaGeometry’s strong performance on geometry problems is attributed to its purpose-built architecture, which relies on three key components: a sampler for constructing diagrams, a symbolic reasoning engine, and a traceback algorithm that identifies the auxiliary constructions a proof depends on. Designing and implementing each of them was a significant challenge that required great effort and a deep understanding of mathematics.
For AI to excel in areas of mathematics beyond geometry, researchers must define the concepts of those areas precisely at the code level and work out suitable architectures. For example, the deductive database used by AlphaGeometry builds on years of research, and its traceback algorithm draws on mathematical expertise that researchers refined over four years.
With the rapid development of AI, we have reason to believe that in the future, there may be silicon-based competitors participating in the International Mathematical Olympiad (IMO), competing with humans and even possibly attaining the highest honor.