Microsoft releases Phi-3.5 series AI models: 128K context window, first mixture-of-experts model in the series

Microsoft has released the Phi-3.5 series of AI models. The most notable addition is Phi-3.5-MoE, the first mixture-of-experts (MoE) model in the series.


The Phi-3.5 series comprises three lightweight AI models: Phi-3.5-MoE, Phi-3.5-vision, and Phi-3.5-mini. All three are trained on synthetic data and filtered public web data, support a 128K context window, and are now available on Hugging Face under the MIT license. IT Home attaches the relevant descriptions below:

Phi-3.5-MoE: the first mixture-of-experts model

Phi-3.5-MoE is the first model in the Phi family to use mixture-of-experts (MoE) technology. It combines 16 experts of 3.8B parameters each; with 2 experts active per token, only 6.6 billion parameters are activated at inference time. The model was trained on 4.9T tokens using 512 H100 GPUs.
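To illustrate the top-2 routing described above, here is a minimal, hypothetical sketch of an MoE layer in NumPy: a learned gate scores all 16 experts per token, only the 2 highest-scoring experts run, and their outputs are mixed by softmax weights. This is a simplified illustration of the general technique, not Microsoft's actual implementation; all names and shapes here are assumptions for the demo.

```python
import numpy as np

def top2_moe_layer(x, gate_w, experts):
    """Simplified top-2 MoE routing (illustrative, not the real Phi-3.5 code).

    x:       (tokens, d) token activations
    gate_w:  (d, n_experts) router weights
    experts: list of n_experts callables, each mapping (d,) -> (d,)
    """
    logits = x @ gate_w                           # router scores per expert
    top2 = np.argsort(logits, axis=-1)[:, -2:]    # indices of the 2 best experts
    sel = np.take_along_axis(logits, top2, axis=-1)
    # softmax over just the two selected scores -> mixing weights
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                   # only 2 of 16 experts run per token
        for slot in range(2):
            e = top2[t, slot]
            out[t] += w[t, slot] * experts[e](x[t])
    return out

# toy demo: 16 experts (simple scalings stand in for expert FFNs), as in a
# 16-expert, top-2 layer
rng = np.random.default_rng(0)
d, n_experts = 8, 16
experts = [(lambda s: (lambda v: v * s))(i + 1) for i in range(n_experts)]
x = rng.normal(size=(4, d))
y = top2_moe_layer(x, rng.normal(size=(d, n_experts)), experts)
print(y.shape)  # one output vector per token, same width as the input
```

The point of the sketch is the cost model: per token, only the 2 routed experts execute, which is why a 16 x 3.8B model can run with far fewer active parameters than its total size.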

The Microsoft research team designed the model from scratch to further improve its performance. In standard AI benchmarks, Phi-3.5-MoE outperforms Llama-3.1 8B, Gemma-2-9B, and Gemini-1.5-Flash, and is close to the current leader, GPT-4o-mini.

Phi-3.5-vision: enhanced multi-frame image understanding

Phi-3.5-vision has 4.2 billion parameters in total and was trained on 500B tokens using 256 A100 GPUs; it now supports multi-frame image understanding and reasoning.

Phi-3.5-vision has improved performance on MMMU (from 40.2 to 43.0), MMBench (from 80.5 to 81.9), and the document understanding benchmark TextVQA (from 70.9 to 72.0).

Phi-3.5-mini: lightweight but capable

Phi-3.5-mini is a 3.8-billion-parameter model that outperforms Llama-3.1 8B and Mistral 7B, and even rivals Mistral NeMo 12B.

The model was trained on 3.4T tokens using 512 H100 GPUs. Despite having only 3.8B active parameters, it is competitive on multilingual tasks with LLMs that have far more.

In addition, Phi-3.5-mini now supports a 128K context window, while its main competitor, the Gemma-2 series, supports only 8K.
