o1 and Claude Both Flop! Terence Tao and 60+ other top mathematicians join forces to propose a new math benchmark

Epoch AI has launched FrontierMath, a new math benchmark that tests the mathematical reasoning of leading large models. Models including GPT-4o have solved fewer than 2% of the problems on the benchmark. Designed by more than 60 leading mathematicians, the benchmark covers areas ranging from number theory to algebraic geometry and is extremely difficult. The team plans to add more questions and refine the evaluation process. (Quantum Bits)
