AI2 releases open language model OLMo, claiming performance comparable to Llama 2 on many tasks

AI2 recently released the Open Language Model (OLMo) framework, which is designed to promote research and experimentation on large-scale language models. By providing training code, models, and evaluation code on Hugging Face and GitHub, AI2 aims to enable academics and researchers to jointly study the science of language models, explore the impact of new pre-training data subsets on downstream performance, and investigate new pre-training methods and training stability.


The first batch of models in the project includes four final 7B-scale variants corresponding to different architectures, optimizers, and training hardware, plus a 1B-scale model, all trained on at least 2T tokens. This is the first step of a long-term plan: AI2 intends to keep releasing larger models, instruction-tuned models, and more variants.

Each model is provided with its complete training data, including the code used to produce that data, as well as AI2's Dolma and WIMBD tools for analyzing the pre-training data. In addition, complete model weights, training code, training logs, training metrics in the form of Weights & Biases logs, and inference code are also provided. More than 500 checkpoints from each model's training run are available as revisions on Hugging Face.
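As an illustration of how those checkpoint revisions can be pulled down, here is a minimal sketch using the Hugging Face `transformers` library; the repo id "allenai/OLMo-7B" and the revision name are assumptions for the example, not details taken from the announcement.

```python
# Minimal sketch (assumed repo id and revision name, not from the AI2 post):
# intermediate training checkpoints published as repo revisions can be
# selected via the `revision` argument.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-7B"        # assumed Hugging Face repo id
revision = "step1000-tokens4B"      # hypothetical checkpoint revision name

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    revision=revision,          # pick one of the ~500 published training checkpoints
    trust_remote_code=True,     # OLMo modeling code may live in the repo itself
)

inputs = tokenizer("Language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```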

In building a strong open model, AI2 drew on many other open and partially open models and used them as competitive baselines for OLMo. The project's technical report notes that the OLMo 7B model surpasses Llama 2 on generation and reading-comprehension tasks (such as TruthfulQA), but lags slightly behind on popular question-answering tasks such as MMLU or Big-Bench Hard.

For the 1B OLMo model, an analysis was performed using AI2's Paloma benchmark and the checkpoints available on GitHub to explore the relationship between the model's language-prediction performance and factors such as model size. AI2 emphasized that Paloma's approach attempts to provide a more balanced representation of the many domains in which language models are used, by sampling each domain evenly.
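To make the "evenly sampled domains" idea concrete, the sketch below computes per-domain perplexity over an equal number of documents from each domain, so no single domain dominates the aggregate. This is only an illustration of the general approach, not Paloma's actual implementation; the model id and the `domains` texts are placeholders.

```python
# Illustrative sketch only: per-domain perplexity with an equal number of
# sampled documents per domain. Model id and domain texts are placeholders.
import math
import random
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-1B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True).eval()

domains = {  # placeholder texts standing in for per-domain corpora
    "web":  ["An example web document about language models.", "Another short web page."],
    "code": ["def add(a, b):\n    return a + b", "print('hello world')"],
}
docs_per_domain = 2  # sample the same number of documents from every domain

for name, docs in domains.items():
    sample = random.sample(docs, k=min(docs_per_domain, len(docs)))
    total_nll, total_tokens = 0.0, 0
    for text in sample:
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            out = model(**enc, labels=enc["input_ids"])
        n_predicted = enc["input_ids"].size(1) - 1  # causal LM predicts shifted tokens
        total_nll += out.loss.item() * n_predicted
        total_tokens += n_predicted
    print(f"{name}: perplexity ~ {math.exp(total_nll / total_tokens):.2f}")
```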

The OLMo framework adopts many recent trends from the literature, including omitting bias terms (as in PaLM, for training stability), the SwiGLU activation function used by PaLM and Llama, Rotary Positional Embeddings (RoPE), and a modified version of GPT-NeoX-20B's BPE-based tokenizer that aims to reduce personally identifiable information.
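The following PyTorch sketch illustrates two of the choices named above, a SwiGLU feed-forward block and linear layers without bias terms; it is an assumption-level illustration of the techniques, not OLMo's actual implementation.

```python
# Minimal sketch (not OLMo's code): a SwiGLU feed-forward block with bias-free
# linear layers, mirroring the "no bias" and SwiGLU choices described above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUFeedForward(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        # bias=False mirrors the bias-free layers used for stability
        self.w_gate = nn.Linear(d_model, d_hidden, bias=False)
        self.w_up = nn.Linear(d_model, d_hidden, bias=False)
        self.w_down = nn.Linear(d_hidden, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU: SiLU(x W_gate) elementwise-multiplied by x W_up, then projected back
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))

block = SwiGLUFeedForward(d_model=128, d_hidden=512)
print(block(torch.randn(2, 16, 128)).shape)  # torch.Size([2, 16, 128])
```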

This release is just the beginning for OLMo and its framework; future work is planned across different scales, modalities, datasets, safety measures, and evaluations. AI2 encourages use of the OLMo models, provides simple installation steps and usage examples, and says it will later release features such as instruction-tuned models, complete training logs, and wandb reports.

Blog URL: https://blog.allenai.org/olmo-open-language-model-87ccfc95f58
