Stanford University's Center for Research on Foundation Models (CRFM) released the Massive Multitask Language Understanding (MMLU) on HELM ranking on June 11. Among the top ten language models, two come from Chinese companies: Alibaba's Qwen2 Instruct (72B) and Zero One Everything's (01.AI) Yi Large (Preview).
The MMLU on HELM evaluation uses a test methodology proposed by Dan Hendrycks et al. to measure a text model's accuracy across multi-task learning. The benchmark spans 57 tasks covering elementary mathematics, US history, computer science, law, and other fields. To score highly, a model must possess extensive world knowledge and problem-solving ability. The rankings, as attached by IT Home, are as follows:
▲ Image source: official website of the Stanford Center for Research on Foundation Models
- 1. Claude 3 Opus (20240229): Anthropic (USA, Amazon investment)
- 2. GPT-4o (2024-05-13): OpenAI (USA)
- 3. Gemini 1.5 Pro: Google (USA)
- 4. GPT-4 (0613): OpenAI (USA)
- 5. Qwen2 Instruct (72B): Alibaba (China)
- 6. GPT-4 Turbo (2024-04-09): OpenAI (USA)
- 7. Gemini 1.5 Pro (0409 preview): Google (USA)
- 8. GPT-4 Turbo (1106 preview): OpenAI (USA)
- 9. Llama 3 (70B): Meta (USA)
- 10. Yi Large (Preview): Zero One Everything (China)
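The MMLU-style scoring described above can be sketched in a few lines: grade each multiple-choice answer per subject task, then average accuracy across tasks. This is an illustrative toy sketch, not the actual HELM implementation; the task names and answer data below are hypothetical.

```python
# Illustrative sketch (not the HELM code): MMLU-style evaluation grades
# multiple-choice answers per task, then macro-averages across tasks.
from collections import defaultdict

def mmlu_macro_accuracy(results):
    """results: list of (task_name, predicted_choice, correct_choice).
    Returns (per-task accuracy dict, macro average across tasks)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, pred, gold in results:
        total[task] += 1
        correct[task] += int(pred == gold)
    per_task = {t: correct[t] / total[t] for t in total}
    macro = sum(per_task.values()) / len(per_task)
    return per_task, macro

# Hypothetical toy answers for two of the 57 subject tasks:
results = [
    ("us_history", "A", "A"),
    ("us_history", "B", "C"),
    ("computer_science", "D", "D"),
    ("computer_science", "D", "D"),
]
per_task, macro = mmlu_macro_accuracy(results)
print(per_task)  # {'us_history': 0.5, 'computer_science': 1.0}
print(macro)     # 0.75
```

Macro-averaging (rather than pooling all questions) keeps small subject areas from being drowned out by large ones.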
Qwen2 is an open-source large language model series developed by Alibaba and released on June 6 this year. The series includes pre-trained and instruction-tuned models in five sizes: Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-57B-A14B, and Qwen2-72B. It is trained on data in 27 languages beyond English and Chinese, and Qwen2-7B-Instruct and Qwen2-72B-Instruct support context lengths of up to 128K tokens.
Yi Large is a closed-source large model developed by Zero One Everything. The Yi model series is built on 6B and 34B pre-trained language models, later extended into chat models, 200K long-context models, depth-upgraded models, and vision-language models. The company claims that it "outperforms leading models such as GPT-4 and Claude 3 Opus in key benchmark scores."