Before WWDC24, Apple released an "efficient language model with an open source training and inference framework" called OpenELM on the Hugging Face platform.
True to its name, it is fully open source: the source code, pre-trained model weights, and training recipes are all available in Apple's GitHub repository.
The official introduction is translated as follows:
The reproducibility and transparency of large language models are critical to advancing open research, ensuring the trustworthiness of results, and enabling investigation into data and model biases as well as potential risks. To this end, we release OpenELM, a state-of-the-art open source language model.
OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each layer of the Transformer model, thereby improving accuracy. For example, with a parameter budget of about one billion, OpenELM improves accuracy by 2.36% compared to OLMo while requiring only half as many pre-training tokens.
Unlike previous practices that provide only model weights and inference code and pre-train on private datasets, our release includes the complete framework for training and evaluating language models on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations.
We also release code to convert models to the MLX library for inference and fine-tuning on Apple devices. This comprehensive release is intended to empower and strengthen the open research community and pave the way for future open research.
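To make the layer-wise scaling idea from the introduction above more concrete, here is a minimal Python sketch. It is not Apple's implementation; the function name `layer_wise_config` and the scaling constants are illustrative assumptions. The point is simply that the number of attention heads and the FFN width grow with depth instead of being identical in every layer, so a fixed parameter budget is redistributed across the stack.

```python
# A minimal sketch (not Apple's actual implementation) of layer-wise scaling:
# instead of giving every Transformer layer the same number of attention heads
# and the same FFN width, both grow linearly with depth, redistributing
# parameters toward deeper layers while the total budget stays comparable.

def layer_wise_config(num_layers, model_dim, head_dim,
                      alpha_min=0.5, alpha_max=1.0,
                      beta_min=0.5, beta_max=4.0):
    """Return per-layer (num_heads, ffn_dim) under a linear scaling rule.

    alpha_* scale the attention width, beta_* scale the FFN multiplier.
    The specific constants here are illustrative, not OpenELM's values.
    """
    configs = []
    for i in range(num_layers):
        t = i / max(num_layers - 1, 1)      # 0.0 at the first layer, 1.0 at the last
        alpha = alpha_min + t * (alpha_max - alpha_min)
        beta = beta_min + t * (beta_max - beta_min)
        num_heads = max(1, round(alpha * model_dim / head_dim))
        ffn_dim = int(round(beta * model_dim))
        configs.append((num_heads, ffn_dim))
    return configs

if __name__ == "__main__":
    # Example: a 16-layer model with model dimension 2048 and head size 64.
    for layer, (heads, ffn) in enumerate(layer_wise_config(16, 2048, 64)):
        print(f"layer {layer:2d}: heads={heads:3d}  ffn_dim={ffn}")
```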
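As for what the MLX conversion enables in practice, the following sketch runs an MLX-converted OpenELM checkpoint on an Apple Silicon Mac using the community mlx-lm package (`pip install mlx-lm`). The checkpoint name is an assumption; point `load` at whichever MLX-converted OpenELM model you actually have locally or on the Hub.

```python
# A minimal sketch of on-device inference with an MLX-converted OpenELM model.
# The repository name below is assumed for illustration; substitute the
# MLX-converted checkpoint you actually use.

from mlx_lm import load, generate

model, tokenizer = load("mlx-community/OpenELM-270M-Instruct")  # assumed repo name

prompt = "Once upon a time there was"
text = generate(model, tokenizer, prompt=prompt, max_tokens=100)
print(text)
```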