Databricks, a big data company, recently released DBRX, a new MoE large model, setting off a wave of excitement in the open source community. DBRX beat open source models such as Grok-1 and Mixtral in benchmark tests, becoming the new open source leader. The model has 132 billion total parameters, but only 36 billion are active per token, and it generates text roughly twice as fast as Llama2-70B.
DBRX is composed of 16 expert modules, with 4 experts active on each inference, and has a context length of 32K. To train DBRX, the Databricks team rented 3,072 H100 GPUs from cloud vendors and trained the model for two months. After internal discussion, the team decided to adopt a curriculum learning approach, using high-quality data to strengthen DBRX on specific tasks. The decision paid off: DBRX reached SOTA levels in language understanding, programming, mathematics, and logic, and beat GPT-3.5 on most benchmarks.
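To make the "16 experts, 4 active" idea concrete, here is a minimal sketch of a top-k mixture-of-experts layer in PyTorch. The layer dimensions, router design, and expert definitions are illustrative assumptions for demonstration, not DBRX's actual architecture; only the 16-expert / top-4 routing pattern is taken from the article.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Illustrative MoE layer: each token is routed to the top-k of
    n_experts feed-forward experts (here 4 of 16, as in DBRX)."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=16, k=4):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                      # x: (tokens, d_model)
        logits = self.router(x)                # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalise over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):             # combine the k selected experts
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

tokens = torch.randn(8, 512)
print(TopKMoELayer()(tokens).shape)            # torch.Size([8, 512])
```

Because only 4 of the 16 experts run per token, the compute per step scales with the 36B active parameters rather than the full 132B, which is where the speed advantage over a dense model like Llama2-70B comes from.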
Databricks released two versions of DBRX: DBRX Base and DBRX Instruct. The former is the pre-trained base model, and the latter is fine-tuned on instructions. Chief Scientist Jonathan Frankle said the team plans to conduct further research on the model and explore how DBRX acquired additional skills in the "last week" of training.
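For readers who want to try the instruction-tuned version, a minimal sketch of loading it with the Hugging Face `transformers` library is shown below. The repository name `databricks/dbrx-instruct` and the chat-template usage are assumptions; in practice the full 132B-parameter model requires multi-GPU memory, so this is illustrative rather than a turnkey setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repository name for the instruction-tuned checkpoint.
model_id = "databricks/dbrx-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # 132B total parameters: expect multi-GPU memory needs
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in one sentence."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```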
Although DBRX has been welcomed by the open source community, some question whether it is truly "open source." Under the license published by Databricks, products built on DBRX must apply to Databricks for a separate license once their monthly active users exceed 700 million.