OpenBuddy open source large language model team releases a Chinese version of the Llama3.1-8B model

Meta recently released Llama3.1, a new generation of its open source model series, including a 405B-parameter version that approaches or even surpasses closed-source models such as GPT-4 on some benchmarks. Llama3.1-8B-Instruct is the 8B-parameter version in the series. It supports English, German, French, Italian, Portuguese, Spanish, Hindi, and Thai, has a context length of up to 131072 tokens, and has a knowledge cutoff updated to December 2023.

To enhance the capabilities of Llama3.1-8B-Instruct, Meta used more than 25 million pieces of synthetic data generated by the larger 405B model in training. As a result, Llama3.1-8B-Instruct demonstrates cognitive and reasoning capabilities similar to GPT-3.5 Turbo in tests such as code and mathematics.

Building on the Llama3.1-8B-Instruct model, OpenBuddy released OpenBuddy-Llama3.1-8B-v22.1-131K by training on a small amount of Chinese data. It is a new-generation open source cross-language model with Chinese question answering and cross-language translation capabilities. Although Llama3.1 itself does not have Chinese capabilities, after training the model can, on some questions that are prone to conceptual confusion, generate answers that usually only larger models produce, showing stronger cognitive potential.

However, because of limited training data and time, OpenBuddy-Llama3.1-8B-v22.1 still has gaps in Chinese knowledge, especially knowledge of traditional culture. Nevertheless, the model performs relatively stably on tasks such as long-text comprehension, which benefits from the base model's native long-text capability.

In the future, OpenBuddy plans to conduct larger-scale training of the 8B and 70B models to enhance their Chinese knowledge, long-text capabilities, and cognitive abilities, and to explore the possibility of fine-tuning the 405B model.

Project address: https://modelscope.cn/models/OpenBuddy/openbuddy-llama3.1-8b-v22.1-131k

 
