Microsoft Releases Phi-3.5 Series of AI Models: 128K Context Window and the Series' First Mixture-of-Experts Model

Microsoft has released the Phi-3.5 series of AI models. The most notable addition is Phi-3.5-MoE, the first mixture-of-experts (MoE) model in the series.

The Phi-3.5 series includes three lightweight AI models: Phi-3.5-MoE, Phi-3.5-vision, and Phi-3.5-mini. They were trained on synthetic data and filtered public websites, support a 128K context window, and are all available on Hugging Face under the MIT license. IT Home has attached the relevant descriptions below:

Phi-3.5-MoE: the first mixture-of-experts model

Phi-3.5-MoE is the first model in the Phi family to use the mixture-of-experts (MoE) technique. It is a 16 x 3.8B MoE model that activates only 6.6 billion parameters per token by routing through 2 of its 16 experts, and it was trained on 4.9T tokens using 512 H100 GPUs.
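To make the "2 of 16 experts" idea concrete, here is a minimal sketch of generic top-2 softmax gating in plain Python. It is an illustration of the general MoE pattern, not Microsoft's actual router: the function names, the toy scalar "experts", and the random router logits are all assumptions made for this example.

```python
import math
import random

NUM_EXPERTS = 16  # from the article: 16 experts
TOP_K = 2         # from the article: 2 experts active per token


def top_k_route(router_logits, k=TOP_K):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    peak = max(router_logits[i] for i in top)  # subtract max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, router_logits):
    """Run only the selected experts and blend their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits))


# Toy experts: each is a scalar multiplier (hypothetical stand-ins for FFN blocks).
experts = [lambda x, scale=s: scale * x for s in range(1, NUM_EXPERTS + 1)]

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # made-up router scores
print(top_k_route(logits))
print(moe_forward(1.0, experts, logits))
```

Because only the two selected experts are evaluated for each token, compute per token scales with the active parameters rather than with the full parameter count, which is the reason the article quotes 6.6 billion active parameters for a 16-expert model.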

The Microsoft research team designed the model from scratch to further improve its performance. On standard AI benchmarks, Phi-3.5-MoE outperforms Llama-3.1 8B, Gemma-2-9B, and Gemini-1.5-Flash, and comes close to the current leader, GPT-4o-mini.

Phi-3.5-vision: enhanced multi-frame image understanding

Phi-3.5-vision has 4.2 billion parameters in total. It was trained on 500B tokens using 256 A100 GPUs and now supports multi-frame image understanding and reasoning.

Phi-3.5-vision has improved performance on MMMU (from 40.2 to 43.0), MMBench (from 80.5 to 81.9), and the document understanding benchmark TextVQA (from 70.9 to 72.0).
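The size of these improvements is easier to compare once each pair of scores is turned into an absolute and a relative gain. The short sketch below does that arithmetic using only the figures quoted above; the `gains` helper and the dictionary layout are assumptions made for illustration.

```python
# Scores quoted in the article: (earlier score, Phi-3.5-vision score).
scores = {
    "MMMU": (40.2, 43.0),
    "MMBench": (80.5, 81.9),
    "TextVQA": (70.9, 72.0),
}


def gains(before, after):
    """Return (absolute gain in points, relative gain in percent), rounded to 1 decimal."""
    diff = after - before
    return round(diff, 1), round(100 * diff / before, 1)


for name, (before, after) in scores.items():
    abs_gain, rel_gain = gains(before, after)
    print(f"{name}: +{abs_gain} points ({rel_gain}% relative)")
```

By this measure the MMMU gain (2.8 points) is the largest of the three, while MMBench and TextVQA each improve by less than 1.5 points.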

Phi-3.5-mini: lightweight yet capable

Phi-3.5-mini is a 3.8-billion-parameter model that surpasses Llama-3.1 8B and Mistral 7B and even rivals Mistral NeMo 12B.

The model was trained on 3.4T tokens using 512 H100 GPUs. With only 3.8B active parameters, it is competitive on multilingual tasks with LLMs that have more active parameters.

In addition, Phi-3.5-mini now supports a 128K context window, while its main competitor, the Gemma-2 series, supports only 8K.
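As a rough illustration of what that gap means in practice, the sketch below compares the two window sizes and checks whether a prompt of a given length would fit. It assumes "K" means 1,024 tokens; the `fits` helper, the dictionary, and the 50,000-token prompt are hypothetical and exist only for this example.

```python
# Assumption: "K" means 1,024 tokens, so 128K = 131,072 and 8K = 8,192.
CONTEXT_WINDOWS = {
    "Phi-3.5-mini": 128 * 1024,
    "Gemma-2": 8 * 1024,
}


def fits(model, prompt_tokens, reserved_for_output=0):
    """True if the prompt plus the reserved output budget fits in the window."""
    return prompt_tokens + reserved_for_output <= CONTEXT_WINDOWS[model]


# How many times larger the Phi-3.5-mini window is.
ratio = CONTEXT_WINDOWS["Phi-3.5-mini"] // CONTEXT_WINDOWS["Gemma-2"]
print(ratio)

# A hypothetical 50,000-token document.
print(fits("Phi-3.5-mini", 50_000), fits("Gemma-2", 50_000))
```

Under these assumptions the Phi-3.5-mini window is 16 times larger, so a long document that would have to be split into chunks for an 8K model can be passed in a single prompt.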
