Apple releases OpenELM, an efficient language model based on an open source training and inference framework

Ahead of WWDC24, Apple released OpenELM, an "efficient language model with an open source training and inference framework", on the Hugging Face platform.

OpenELM is an open source language model: its source code, pre-trained model weights, and training recipes are available in Apple's GitHub repository.

The official introduction is translated as follows:

Reproducibility and transparency of large language models are critical to advancing open research, ensuring trustworthiness of results, and investigating data and model biases and potential risks. To this end, we release OpenELM, a state-of-the-art open source language model.

OpenELM uses a layer-wise scaling strategy to allocate parameters efficiently within each layer of the Transformer model, thereby improving accuracy. For example, with a budget of about 1 billion parameters, OpenELM improves accuracy by 2.36% compared to OLMo while requiring only half as many pre-training tokens.
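The idea of giving different layers different parameter budgets can be illustrated with a minimal Python sketch. It assumes a simple linear interpolation of two per-layer multipliers, one for attention width and one for feed-forward width. The function name, the multiplier ranges, and the model dimensions are hypothetical values chosen for illustration; they are not taken from the article and are not OpenELM's actual configuration.

```python
def layerwise_scaling(num_layers, d_model, head_dim,
                      alpha=(0.5, 1.0), beta=(0.5, 4.0)):
    """Return a (num_heads, ffn_dim) pair for each layer.

    Illustrative assumption: both multipliers grow linearly from their
    minimum (first layer) to their maximum (last layer).
    """
    configs = []
    for i in range(num_layers):
        # Position of this layer in [0, 1]; a single-layer model uses the minimum.
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        a = alpha[0] + (alpha[1] - alpha[0]) * t  # attention width multiplier
        b = beta[0] + (beta[1] - beta[0]) * t     # feed-forward width multiplier
        num_heads = max(1, round(a * d_model / head_dim))
        ffn_dim = round(b * d_model)
        configs.append((num_heads, ffn_dim))
    return configs


# Hypothetical 4-layer model: early layers are narrower, later layers wider.
configs = layerwise_scaling(num_layers=4, d_model=1024, head_dim=64)
for layer, (heads, ffn) in enumerate(configs):
    print(f"layer {layer}: heads={heads}, ffn_dim={ffn}")
```

Compared with a uniform design, where every layer has the same number of heads and the same feed-forward width, this kind of schedule spends fewer parameters in early layers and more in later ones while keeping the total budget adjustable through the two ranges.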

Unlike prior practice, in which only model weights and inference code are provided and pre-training is done on private datasets, our release includes a complete framework for training and evaluating the language model on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations.

We also release code to convert the models to the MLX library for inference and fine-tuning on Apple devices. This comprehensive release is intended to empower and strengthen the open research community and pave the way for future open research work.
