Tech media outlet The Decoder published a blog post reporting that OpenAI's new AI models, o1-preview and o1-mini, topped the Chatbot Arena leaderboard.
Introduction to Chatbot Arena
Chatbot Arena, a platform for comparing AI models head-to-head, evaluated the new OpenAI models using more than 6,000 community votes.
Results
The results show that o1-preview and o1-mini excel especially at math tasks, complex prompts, and programming.
The math-category charts provided by LMSYS clearly show that o1-preview and o1-mini scored over 1,360 points, well above the other models.
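For readers wondering what a score like 1,360 means: Arena-style leaderboards are derived from pairwise community votes, traditionally with an Elo-style rating update (LMSYS's exact methodology has evolved over time, so the sketch below is only illustrative; the starting ratings and K-factor are assumptions, not LMSYS's actual parameters).

```python
# Minimal Elo-style rating sketch for pairwise chatbot votes.
# The ratings and K=32 below are illustrative assumptions, not LMSYS's parameters.

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a: float, r_b: float, a_won: float, k: float = 32.0) -> tuple[float, float]:
    """Update both ratings after one vote (a_won: 1.0 win, 0.5 tie, 0.0 loss)."""
    e_a = expected_score(r_a, r_b)
    r_a_new = r_a + k * (a_won - e_a)
    r_b_new = r_b + k * ((1.0 - a_won) - (1.0 - e_a))
    return r_a_new, r_b_new

# Hypothetical example: a model rated 1360 against one rated 1260.
print(expected_score(1360, 1260))  # ~0.64: a 100-point gap implies ~64% expected win rate
print(update(1360, 1260, 1.0))     # the favored model gains a little for winning
```

The key point is that a rating is only meaningful relative to other models: under this scheme, a 100-point gap corresponds to roughly a 64% expected win rate in head-to-head votes.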
The goal of o1 is to set a new standard for AI reasoning: the model "thinks" longer before answering.
However, the o1 models are not superior to GPT-4o in all respects: many tasks do not require complex logical reasoning, and GPT-4o often responds faster.
Caveats
The vote counts for o1-preview and o1-mini, each with fewer than 3,000 votes, are much lower than those for established models such as GPT-4o or Anthropic's Claude 3.5, and such small samples may not accurately reflect real-world performance, limiting the significance of the results.
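Since the ratings come from pairwise votes, a back-of-the-envelope confidence interval on a win rate gives a feel for how much uncertainty a few thousand votes leaves. The sketch below uses a standard normal approximation; the vote counts and win rates are purely illustrative, not Arena data.

```python
# Rough illustration of sampling uncertainty in head-to-head win rates.
# The vote counts and win rates below are illustrative assumptions, not Arena data.
import math

def win_rate_ci(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% normal-approximation confidence interval for an observed win rate."""
    p = wins / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# With ~3,000 votes, an observed 55% win rate carries roughly +/-1.8 points of uncertainty.
print(win_rate_ci(1650, 3000))   # ~ (0.532, 0.568)
# Quadrupling the sample size halves the interval width.
print(win_rate_ci(6600, 12000))  # ~ (0.541, 0.559)
```

This is why rankings based on only a few thousand votes should be read as preliminary: the interval shrinks only with the square root of the number of votes.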