Google released a technical report last week saying that the Gemini 1.5 Pro model significantly improved its math scores after being trained specifically on mathematics, and that it successfully solved some problems from the International Mathematical Olympiad.
Google trained the Gemini 1.5 Pro model specifically for mathematical scenarios and tested it with the MATH benchmark, the American Invitational Mathematics Examination (AIME), and Google's internal HiddenMath benchmark.
According to Google, the math-specialized Gemini 1.5 Pro performs “on par with human experts” on math benchmarks, solving significantly more AIME problems than the standard Gemini 1.5 Pro and achieving improved scores on the other benchmarks as well.
Of the three examples Google shared, two were solved correctly by the math-specialized Gemini 1.5 Pro, while the third shows the standard Gemini 1.5 Pro answering incorrectly. These problems typically require solvers to recall basic algebraic formulas and apply them, along with other mathematical rules, to arrive at the correct answer.
In addition to the example problems, Google also shared benchmark results for the math-specialized Gemini 1.5 Pro, which show it ahead of GPT-4 Turbo and Anthropic's Claude on all five benchmark scores.
Google said that the math-specialized variant of Gemini 1.5 Pro reaches 80.6% accuracy on the MATH benchmark with a single sample, and that accuracy rises to 91.1% when 256 solutions are sampled and a single candidate answer is selected (rm@256).
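The rm@256 figure appears to describe a sample-and-rerank setup: generate many candidate solutions and let a scoring model pick one. Google has not published the exact mechanism, so the sketch below is only a minimal illustration of that general idea; `generate_solution` and `reward_model_score` are hypothetical placeholder functions, not Google APIs.

```python
# Hedged sketch of "sample n candidates, pick one via a scorer" (rm@n-style selection).
# The callables passed in are hypothetical stand-ins; the real Gemini pipeline is not public.
import random
from typing import Callable, List


def rm_at_n(problem: str,
            generate_solution: Callable[[str], str],
            reward_model_score: Callable[[str, str], float],
            n: int = 256) -> str:
    """Sample n candidate solutions and return the one the scorer ranks highest."""
    candidates: List[str] = [generate_solution(problem) for _ in range(n)]
    return max(candidates, key=lambda sol: reward_model_score(problem, sol))


# Toy usage with stand-in functions, only to show the control flow.
if __name__ == "__main__":
    toy_generate = lambda p: f"answer={random.randint(0, 9)}"
    toy_score = lambda p, s: float(s.endswith("7"))  # pretend the scorer prefers answers ending in 7
    print(rm_at_n("toy problem", toy_generate, toy_score, n=16))
```

The point of the example is simply that selecting among many samples can recover correct answers the model produces only occasionally, which is consistent with the gap between the single-sample and rm@256 scores Google reported.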