Data shows that xAI's Grok-2 and Grok-2-Mini models have officially entered the LMSYS Chatbot Arena leaderboard, where Grok-2 took second place, surpassing the May version of OpenAI's GPT-4o and tying with the latest Gemini model, backed by more than 6,000 community votes.
Grok-2 performed particularly well on math tasks, taking first place in that category, and placed second in several others, including complex prompts, coding, and instruction following. Grok-2-Mini, meanwhile, entered the rankings in fifth place, a strong showing in its own right.
Grok-2-Mini has also become significantly faster and now runs twice as fast as before. The improvement comes from xAI's inference team, which completely rewrote the inference stack using SGLang to achieve more efficient multi-host inference and improved accuracy. The team also introduced new computation and communication kernels, along with better batch scheduling and quantization techniques, to further improve the models' overall performance.
Although some people remain skeptical of Grok-2's performance and believe OpenAI's GPT-4o is better, many users report that in actual use Grok-2 performs quite well on programming and math tasks. The Grok-2 series of models was released in beta this month, and users can try it through the X platform. The model also supports image creation through the FLUX.1 image generation model.