o1 and Claude Flop Across the Board! Terence Tao and 60+ Other Top Mathematicians Join Forces on a New Math Benchmark

Epoch AI has launched FrontierMath, a new math benchmark that tests the mathematical reasoning of leading large models. Models including GPT-4o have solved fewer than 2% of the problems on the benchmark. Designed by more than 60 leading mathematicians, the benchmark covers areas ranging from number theory to algebraic geometry and is extremely difficult. The team plans to add more problems and refine the evaluation process. (Quantum Bits)
