Claude 3 tops Stanford's large model evaluation leaderboard; Chinese models Alibaba's Qwen2 and 01.AI's Yi Large enter the top ten

Stanford University's Center for Research on Foundation Models (CRFM) released the Massive Multitask Language Understanding (MMLU) on HELM leaderboard on June 11. Two of the top ten language models come from Chinese companies: Alibaba's Qwen2 Instruct (72B) and Yi Large (Preview) from 01.AI (Zero One Everything).

The Massive Multitask Language Understanding evaluation (MMLU on HELM) uses the test proposed by Dan Hendrycks et al. to measure a text model's multitask accuracy. The test covers 57 tasks, including elementary mathematics, US history, computer science, law, and other fields. To score well on it, a model must have extensive world knowledge and problem-solving ability. IT Home lists the rankings below:

▲ Image source: official website of Stanford University's Center for Research on Foundation Models

  • 1. Claude 3 Opus (20240229): Anthropic (USA, backed by Amazon)
  • 2. GPT-4o (2024-05-13): OpenAI (USA)
  • 3. Gemini 1.5 Pro: Google (USA)
  • 4. GPT-4 (0613): OpenAI (USA)
  • 5. Qwen2 Instruct (72B): Alibaba (China)
  • 6. GPT-4 Turbo (2024-04-09): OpenAI (USA)
  • 7. Gemini 1.5 Pro (0409 preview): Google (USA)
  • 8. GPT-4 Turbo (1106 preview): OpenAI (USA)
  • 9. Llama 3 (70B): Meta (USA)
  • 10. Yi Large (Preview): 01.AI (China)
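
As a rough illustration of how a multi-subject accuracy score of this kind can be computed, here is a minimal Python sketch. It assumes each question is multiple choice with a single correct answer and that the headline score is the unweighted mean of per-subject accuracies; the function names and the toy data are hypothetical, and HELM's actual scoring code may differ.

```python
from collections import defaultdict


def per_subject_accuracy(records):
    """Return accuracy per subject.

    records: iterable of (subject, predicted_choice, correct_choice) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subject, predicted, answer in records:
        total[subject] += 1
        if predicted == answer:
            correct[subject] += 1
    return {subject: correct[subject] / total[subject] for subject in total}


def mean_accuracy(records):
    """Unweighted mean of per-subject accuracies (an assumed aggregation)."""
    scores = per_subject_accuracy(records)
    return sum(scores.values()) / len(scores)


# Hypothetical toy data: two subjects, two questions each.
sample = [
    ("us_history", "A", "A"),
    ("us_history", "B", "C"),
    ("computer_science", "D", "D"),
    ("computer_science", "B", "B"),
]

print(per_subject_accuracy(sample))
print(mean_accuracy(sample))
```

Averaging per subject rather than per question keeps subjects with many questions from dominating the score, which is one reason a benchmark spanning 57 tasks might report it that way.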

Qwen2 is an open-source large language model developed by Alibaba and released on June 6 this year. The Qwen2 series includes pre-trained and instruction-tuned models in five sizes: Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-57B-A14B, and Qwen2-72B. Its training data covers 27 languages in addition to English and Chinese, and Qwen2-7B-Instruct and Qwen2-72B-Instruct support context lengths of up to 128K tokens.

Yi Large is a closed-source large model developed by 01.AI (Zero One Everything). The Yi model series is based on 6B and 34B pre-trained language models, which were then extended to chat models, 200K long-context models, depth-upscaled models, and vision-language models. The company claims that "it outperforms leading models such as GPT-4 and Claude 3 Opus in key benchmark scores."

 
