December 16: Infinigence AI announced today that it has open-sourced Megrez-3B-Omni, a compact omni-modal understanding model from its on-device solution, along with its language-only version, Megrez-3B-Instruct.
According to the company, Megrez-3B-Omni is an omni-modal understanding model built for on-device use, able to process image, audio, and text data in a single model:
- Image understanding: Megrez-3B-Omni is currently among the most accurate image understanding models on several mainstream test sets, including OpenCompass, MME, MMMU, and OCRBench.
- Text understanding: Megrez-3B-Omni achieves the best accuracy among on-device models on several authoritative test sets, including C-EVAL, MMLU / MMLU Pro, and AlignBench.
- Speech understanding: Megrez-3B-Omni supports voice input in both Chinese and English, handles complex multi-turn dialogue, and accepts spoken questions about input images or text, so users can switch freely between modalities.
The company claims that the single-modality version, Megrez-3B-Instruct, delivers significantly faster inference than its predecessor and other on-device large language models. Its maximum inference speed is said to lead models of the same accuracy by up to 300%.
The relevant links are as follows:
- HuggingFace: https://huggingface.co/Infinigence/Megrez-3B-Omni
- Infini-AI Heterogeneous Cloud: https://cloud.infini-ai.com/genstudio/model/mo-c73owqiotql7lozr
- Modelers: https://modelers.cn/models/INFINIGENCE-AI/Megrez-3B-Omni
- ModelScope: https://www.modelscope.cn/models/InfiniAI/Megrez-3B-Omni