Google DeepMind's latest experimental Gemini model (Exp-1114) has achieved impressive results on the Chatbot Arena platform. After more than a week of community testing and over 6,000 accumulated votes, the new model outperforms the competition by a significant margin, showing remarkable strength in several key areas.
In overall scoring, Gemini-Exp-1114 climbed more than 40 points to tie for first place with GPT-4-latest, surpassing the previously leading GPT-4-preview version. Even more striking, the model topped the core categories of math, complex prompts, and creative writing, demonstrating broad overall strength.
Specifically, Gemini-Exp-1114's progress is striking:
Jumped from 3rd to 1st in the overall rankings
Moved from 3rd to 1st in the math evaluation
Climbed from 4th to 1st in complex prompt handling
Improved from 2nd to 1st in creative writing
Also ranks 1st in visual processing
Rose from 5th to 3rd in programming
Google AI Studio has officially made the new version available for users to try. However, the community has also raised concerns about some specifics, such as whether the 1,000-token limit still applies and how the model handles practical issues like very long text output.
Industry analysts believe this breakthrough shows that Google's long-term investment in AI is beginning to pay off. Interestingly, the model retains its 4th-place ranking in style control, which may suggest the development team relied primarily on new post-training methods rather than changes to the pre-trained model.
Some have suggested this could herald the arrival of Gemini 2, and that Google is becoming significantly more competitive in the large-model space.