Since 2014, Exact Science's teachers have hosted unique camps in Mathematics and Programming, offering a rich mix of learning materials, problem sets, and a diverse group of teachers and participants. This year, Exact Science launched its first Olympiad Mathematics Camp.
Over six days (28–30 December 2023 and 2–4 January 2024), our students attended daily seminars to understand approaches to Olympiad problems in various mathematical fields, and to see how AI, specifically ChatGPT, attempts to solve them.
Our excitement peaked with the AIMO Prize: a $10 million challenge fund by XTX Markets to develop AI capable of winning a gold medal at the International Mathematical Olympiad (IMO).
Today, I want to demonstrate how ChatGPT attempts to solve Olympiad problems for ages 10–12 in four topics.
1. Cryptarithms
We started with a classic problem:
In this cryptarithm: BAO × BA × B = 2002. What are the values of B, A, and O?
ChatGPT's first step already led to an incorrect conclusion. The correct first step is to find the prime factorisation: 2002 = 2 × 1001 = 2 × 7 × 11 × 13.
1001 is easily mistaken for a prime, but it is not. From the factorisation, logic and a few attempts lead to the unique solution. Try to find it yourself!
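For readers who want to check their answer after trying by hand, here is a minimal Python sketch. It assumes the usual cryptarithm conventions (different letters stand for different digits, and the leading digit B is not zero), which the problem statement does not spell out. The function names prime_factors and solve_cryptarithm are our own, chosen for illustration.

```python
from itertools import permutations


def prime_factors(n):
    """Return the prime factors of n in ascending order, by trial division."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)  # whatever is left over is itself prime
    return factors


def solve_cryptarithm(target=2002):
    """Return every (B, A, O) with BAO * BA * B == target.

    Assumption: B, A and O are distinct digits and B is non-zero.
    """
    solutions = []
    for b, a, o in permutations(range(10), 3):
        if b == 0:
            continue  # a number cannot start with zero
        bao = 100 * b + 10 * a + o
        ba = 10 * b + a
        if bao * ba * b == target:
            solutions.append((b, a, o))
    return solutions


if __name__ == "__main__":
    print("Prime factors of 2002:", prime_factors(2002))
    print("Solutions (B, A, O):", solve_cryptarithm())
```

The search space is only 648 digit triples, so brute force is instant. The point of the exercise for students is the opposite: the factorisation narrows the candidates so far that no computer is needed.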
2. Parity
Numbers from 1 to 10 are written in a row. Is it possible to place "+" and "−" signs between them so that the value of the resulting expression is zero?
ChatGPT attempted to iterate through combinations and produced:
The correct answer is No, it is not possible. The reasoning is simple using parity:
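The sum 1 + 2 + ... + 10 = 55 is odd, because the row contains five odd numbers. Changing the sign in front of any number k from "+" to "−" changes the total by 2k, which is even, so the parity never changes. Every possible expression is therefore odd, and zero is even. We checked.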
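The parity argument can also be confirmed exhaustively. The sketch below is our own illustration, not ChatGPT's output: it assumes the signs go only between the numbers, so the leading 1 is always positive and there are 2^9 = 512 expressions. The name signed_totals is hypothetical.

```python
from itertools import product


def signed_totals(n=10):
    """Yield the value of 1 ± 2 ± ... ± n for every choice of signs."""
    for signs in product((1, -1), repeat=n - 1):
        # The first number has no sign in front of it, so it is always +1.
        yield 1 + sum(sign * k for sign, k in zip(signs, range(2, n + 1)))


totals = list(signed_totals())

if __name__ == "__main__":
    print("Expressions checked:", len(totals))
    print("Any total equal to zero?", 0 in totals)
    print("All totals odd?", all(t % 2 == 1 for t in totals))
```

The brute-force check only covers the numbers 1 to 10. The parity argument is stronger, because it explains why no choice of signs could ever work.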
3. River Crossing Riddles
A farmer wants to cross a river with a wolf, a goat, and a cabbage. The boat fits himself plus one item. Wolf eats goat; goat eats cabbage if left alone. How does he get everything across?
ChatGPT managed the first two logical steps correctly but lost track after "The farmer leaves the goat on the other side and takes the wolf back." A classic case of the model failing to maintain state over multiple reasoning steps.
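Keeping track of state is exactly what a short program does well. Below is a minimal sketch of a breadth-first search over the possible bank configurations. The representation (a frozenset of who is on the starting bank) and the names is_safe and solve_river_crossing are our own choices for illustration, not something taken from ChatGPT's answer.

```python
from collections import deque

ITEMS = ("wolf", "goat", "cabbage")
EVERYONE = frozenset(ITEMS) | {"farmer"}


def is_safe(bank):
    """A bank is safe if the farmer is there, or the goat has no one to eat or be eaten by."""
    if "farmer" in bank:
        return True
    return not ("goat" in bank and ("wolf" in bank or "cabbage" in bank))


def solve_river_crossing():
    """Return a shortest list of (cargo, direction) moves, found by breadth-first search."""
    start = EVERYONE      # everyone begins on the starting bank
    goal = frozenset()    # done when the starting bank is empty
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        left, path = queue.popleft()
        if left == goal:
            return path
        farmer_on_left = "farmer" in left
        here = left if farmer_on_left else EVERYONE - left
        for cargo in (None, *ITEMS):
            if cargo is not None and cargo not in here:
                continue  # the farmer can only take what is on his bank
            moving = {"farmer"} | ({cargo} if cargo else set())
            new_left = frozenset(left - moving if farmer_on_left else left | moving)
            if new_left in seen:
                continue
            if is_safe(new_left) and is_safe(EVERYONE - new_left):
                seen.add(new_left)
                direction = "across" if farmer_on_left else "back"
                queue.append((new_left, path + [(cargo or "nothing", direction)]))
    return None


if __name__ == "__main__":
    for step, (cargo, direction) in enumerate(solve_river_crossing(), start=1):
        print(f"{step}. Farmer rows {direction} with {cargo}")
```

Because the search records every configuration it has visited, it cannot lose track of who is on which bank, which is where the model's answer went wrong.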
4. Olympiad Geometry
The problem was from the Junior Mathematics Challenge 2018, Question 3 — one of the simpler problems at our camp. ChatGPT was on the right track but failed to reach the correct conclusion. It was fortunate the question was multiple-choice, or it might not have noticed its error.
The answer is clearly .
Verdict
As for AI solving the International Mathematical Olympiad, it seems unlikely to happen in the near future. In our sessions, even entry-level Olympiad problems exposed gaps in ChatGPT's reasoning.
