Alibaba open-sources 110 billion parameter Qwen1.5-110B model, comparable to Meta Llama3-70B

Alibaba recently announced the open-sourcing of Qwen1.5-110B, the first model in the Qwen1.5 series with over 100 billion parameters. The model is comparable to Meta's Llama3-70B in base-capability evaluations and performs well in chat evaluations, including MT-Bench and AlpacaEval 2.0.

Main content:

According to the announcement, Qwen1.5-110B uses the same Transformer decoder architecture as the other Qwen1.5 models. It includes grouped-query attention (GQA), which makes inference more efficient, and supports a context length of 32K tokens. Like its siblings, it is multilingual, supporting English, Chinese, French, Spanish, German, Russian, Japanese, Korean, Vietnamese, Arabic, and other languages.
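To illustrate the GQA mechanism mentioned above, here is a minimal, self-contained sketch in PyTorch. The head counts and dimensions are illustrative only and do not reflect Qwen1.5-110B's actual configuration.

```python
# Minimal sketch of grouped-query attention (GQA): several query heads
# share a single key/value head, shrinking the KV cache during inference.
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v, num_kv_groups):
    # q: (batch, num_q_heads, seq, head_dim)
    # k, v: (batch, num_kv_groups, seq, head_dim) -- fewer KV heads than Q heads
    batch, num_q_heads, seq, head_dim = q.shape
    heads_per_group = num_q_heads // num_kv_groups
    # Repeat each KV head so it is shared by a whole group of query heads.
    k = k.repeat_interleave(heads_per_group, dim=1)
    v = v.repeat_interleave(heads_per_group, dim=1)
    scores = q @ k.transpose(-2, -1) / head_dim ** 0.5
    weights = F.softmax(scores, dim=-1)
    return weights @ v

# Example: 8 query heads share 2 KV heads, a 4x smaller KV cache
# than standard multi-head attention with 8 KV heads.
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 2, 16, 64)
v = torch.randn(1, 2, 16, 64)
out = grouped_query_attention(q, k, v, num_kv_groups=2)
print(out.shape)  # torch.Size([1, 8, 16, 64])
```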

Alibaba compared Qwen1.5-110B with the recent SOTA language models Meta-Llama3-70B and Mixtral-8x22B. The results are as follows:

[Table: base-capability benchmark results for Qwen1.5-110B, Llama-3-70B, and Mixtral-8x22B]

The above results show that the new 110B model is at least comparable to Llama-3-70B in terms of base capabilities. Since Alibaba did not make major changes to the pre-training method for this model, it attributes the performance improvement over the 72B model mainly to the increase in model size.

Alibaba also conducted a chat evaluation on MT-Bench and AlpacaEval 2.0. The results are as follows:

[Chart: MT-Bench and AlpacaEval 2.0 results for the Qwen1.5-110B and Qwen1.5-72B chat models]

Alibaba said that in benchmark evaluations of the two chat models, the 110B model performs significantly better than the previously released 72B model. The consistent improvement in evaluation results suggests that a stronger, larger base language model leads to a better chat model, even without drastically changing the post-training approach.
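For readers who want to try the chat variant, the following sketch shows the standard Hugging Face transformers usage pattern for Qwen1.5 chat checkpoints; the prompt and generation settings are illustrative, and running a 110B model requires substantial GPU memory.

```python
# Minimal sketch: generating a chat response with Qwen1.5-110B-Chat
# via Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-110B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to large language models."},
]
# Qwen1.5 chat models ship a chat template in the tokenizer config,
# so apply_chat_template builds the correctly formatted prompt.
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
# Strip the prompt tokens, keeping only the newly generated reply.
response = tokenizer.decode(
    output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
)
print(response)
```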

Finally, Alibaba noted that Qwen1.5-110B is the largest model in the Qwen1.5 series and the first in the series with more than 100 billion parameters. It performs well against the recently released SOTA model Llama-3-70B and significantly outperforms the 72B model.
