Meta recently released a new generation of open-source models, the Llama 3.1 series, including a 405B-parameter version that approaches or even surpasses closed-source models such as GPT-4 on some benchmarks. Llama3.1-8B-Instruct is the 8B-parameter version in the series. It supports English, German, French, Italian, Portuguese, Spanish, Hindi, and Thai, offers a context length of up to 131072 tokens, and has a knowledge cutoff of December 2023.
To enhance the capabilities of Llama3.1-8B-Instruct, Meta trained it on more than 25 million synthetic examples generated by the larger 405B model. As a result, Llama3.1-8B-Instruct demonstrates reasoning capabilities comparable to GPT-3.5 Turbo on tests such as code and mathematics.
Building on Llama3.1-8B-Instruct, OpenBuddy trained on a small amount of Chinese data and released OpenBuddy-Llama3.1-8B-v22.1-131K, a new generation of open-source cross-lingual model with Chinese question-answering and cross-language translation capabilities. Although Llama3.1 itself lacks Chinese capabilities, after training the model can, on questions prone to conceptual confusion, generate answers usually only produced by larger models, showing stronger cognitive potential.
However, due to limits on training data and time, OpenBuddy-Llama3.1-8B-v22.1 still has gaps in Chinese knowledge, especially traditional cultural knowledge. Nevertheless, the model performs relatively stably on tasks such as long-text comprehension, benefiting from the base model's native long-context capability.
In the future, OpenBuddy plans larger-scale training of the 8B and 70B models to strengthen their Chinese knowledge, long-text capabilities, and cognitive abilities, and to explore the possibility of fine-tuning the 405B model.
Project address: https://modelscope.cn/models/OpenBuddy/openbuddy-llama3.1-8b-v22.1-131k
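For readers who want to try the model, a minimal inference sketch using the `transformers` library is shown below. The model ID mirrors the ModelScope path above and is an assumption (it may need to be fetched via ModelScope's own download tooling instead); running it requires the weights to be available locally or downloadable, plus a GPU with sufficient memory.

```python
# Minimal chat-inference sketch for OpenBuddy-Llama3.1-8B-v22.1-131K.
# Assumption: the model is reachable under this ID via transformers'
# from_pretrained; adjust the path if you download from ModelScope manually.
MODEL_ID = "OpenBuddy/openbuddy-llama3.1-8b-v22.1-131k"

def chat(messages, model_id=MODEL_ID, max_new_tokens=256):
    """Generate a reply for a list of {'role', 'content'} chat messages."""
    # Lazy import so the module loads even where transformers is absent.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Build the model's expected prompt from the chat messages.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:],
                            skip_special_tokens=True)

if __name__ == "__main__":
    # Example: a Chinese question, playing to the model's cross-lingual focus.
    print(chat([{"role": "user", "content": "用中文介绍一下你自己。"}]))
```

The 131K context length means very long documents can be passed in a single user message, though memory usage grows with input length.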