Meta AI develops MobileLLM, a compact 350-million-parameter language model for mobile devices

Meta AI researchers have introduced MobileLLM, a language model designed for efficient operation on smartphones and other resource-constrained devices. The study, published on June 27, 2024, challenges prevailing assumptions about how large an AI model must be to perform well.

The research team, which includes members of Meta Reality Labs, PyTorch, and Meta AI Research (FAIR), focused on optimizing models with fewer than a billion parameters, a fraction of the parameters of models like GPT-4, which are estimated to have more than a trillion.

The main innovations of MobileLLM include:

  1. Prioritizing model depth over width
  2. Implementing embedding sharing and grouped-query attention
  3. Utilizing a novel immediate block-wise weight sharing technique
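The third technique, block-wise weight sharing, reuses each transformer block's weights for consecutive layers, increasing effective depth without adding parameters; applying the shared block immediately also keeps its weights resident in cache. A minimal sketch of the idea (the toy `blocks` and `repeats` values are illustrative, not Meta's implementation):

```python
def forward_with_block_sharing(x, blocks, repeats=2):
    """Apply each block `repeats` times in a row.

    With repeats=2, a model storing N distinct blocks behaves like a
    2N-layer network with no extra weight memory. Repeating a block
    immediately (rather than revisiting it later) is what keeps the
    shared weights hot in cache on a mobile device.
    """
    for block in blocks:
        for _ in range(repeats):
            x = block(x)
    return x

# Toy stand-ins for transformer blocks; real blocks would be
# attention + feed-forward modules operating on tensors.
blocks = [lambda x: x + 1, lambda x: x * 2]
y = forward_with_block_sharing(0, blocks)  # ((0 + 1) + 1) * 2 * 2 = 8
```

Here two stored blocks act as a four-layer stack, which is the memory-versus-depth trade the technique targets.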

These design choices allow MobileLLM to outperform previous models of similar size by 2.7% to 4.3% on common benchmark tasks. While these single-digit improvements may seem small, they represent significant progress in the highly competitive field of language model development.
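The depth-over-width choice can be illustrated with a back-of-the-envelope parameter count, using the common approximation of roughly 12·d² parameters per transformer layer (about 4·d² for attention plus 8·d² for a 4x feed-forward). The specific dimensions below are illustrative and are not MobileLLM's actual configuration:

```python
def approx_params(d_model, n_layers, vocab_size=32000):
    # Rough count: attention ~4*d^2 and a 4x-expansion FFN ~8*d^2
    # per layer, plus a tied input/output embedding table.
    per_layer = 12 * d_model ** 2
    embedding = vocab_size * d_model
    return n_layers * per_layer + embedding

# Two configurations with a comparable sub-350M budget:
wide_shallow = approx_params(d_model=1280, n_layers=12)  # ~0.28B
deep_thin = approx_params(d_model=800, n_layers=30)      # ~0.26B
```

At a fixed budget, parameters can buy more layers instead of wider ones; MobileLLM's finding is that the deep-and-thin allocation performs better at this scale.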

Notably, on certain API call tasks, the 350 million parameter version of MobileLLM demonstrated accuracy comparable to the much larger 7 billion parameter LLaMA-2 model, suggesting that for specific applications, compact models may provide similar functionality while using far fewer computational resources.

The development of MobileLLM coincides with growing interest in more efficient AI models. As progress on very large language models shows signs of slowing, researchers are increasingly exploring the potential of more compact, specialized designs. Despite the “LLM” in the name, the focus on efficiency and on-device deployment puts MobileLLM in the same category as what some researchers call small language models (SLMs).

While MobileLLM is not yet available to the public, Meta has open-sourced the pre-training code, allowing other researchers to build on its work. As this technology develops, it could bring more advanced AI capabilities to personal devices, although the timeline and specific features remain uncertain.
