Russian tech giant Yandex has launched an open-source large language model training tool, YaFSDP, claiming speedups of up to 26% over existing tools.
According to the announcement, YaFSDP outperforms the traditional FSDP method in training speed, especially for large models. For LLM pre-training, YaFSDP is 20% faster and holds up better under high memory pressure.
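For context, the "traditional FSDP" baseline referred to here is PyTorch's Fully Sharded Data Parallel, which shards parameters, gradients, and optimizer state across GPUs. The sketch below is only a minimal illustration of that baseline setup, not YaFSDP's own API; the toy model, layer sizes, and training loop are placeholders.

```python
import functools

import torch
import torch.nn as nn
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp.wrap import size_based_auto_wrap_policy


def main():
    # One process per GPU, typically launched with torchrun.
    dist.init_process_group("nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

    # Stand-in model; a real run would use a Llama-style transformer.
    model = nn.Sequential(*[nn.Linear(4096, 4096) for _ in range(8)]).cuda()

    # Shard parameters, gradients, and optimizer state across ranks,
    # wrapping each submodule above ~1M parameters as its own FSDP unit.
    model = FSDP(
        model,
        auto_wrap_policy=functools.partial(
            size_based_auto_wrap_policy, min_num_params=1_000_000
        ),
        device_id=torch.cuda.current_device(),
    )

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    for _ in range(10):
        x = torch.randn(8, 4096, device="cuda")
        loss = model(x).pow(2).mean()  # dummy loss for illustration only
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```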
For example, YaFSDP achieves an efficiency gain of 21% for the 70-billion-parameter Llama 2 and 26% for Llama 3 at the same parameter scale. IT Home attaches the official benchmark data below:
Model | GPU count | Sequence length | Checkpointed layers | Speedup |
---|---|---|---|---|
Llama 2 7B | 64 | 2048 | 0 | 9.92% |
Llama 2 7B | 64 | 4096 | 0 | 3.43% |
Llama 2 7B | 64 | 8192 | 0 | 2.68% |
Llama 2 7B | 128 | 2048 | 0 | 9.57% |
Llama 2 7B | 128 | 4096 | 0 | 2.42% |
Llama 2 7B | 128 | 8192 | 0 | 2.32% |
Llama 2 13B | 128 | 2048 | 0 | 12.10% |
Llama 2 13B | 128 | 4096 | 0 | 3.49% |
Llama 2 34B | 128 | 2048 | 0 | 20.70% |
Llama 2 34B | 256 | 2048 | 0 | 21.99% |
Llama 2 34B | 256 | 4096 | 5 | 8.35% |
Llama 2 70B | 256 | 2048 | 10 | 21.48% |
Llama 2 70B | 256 | 4096 | 50 | 7.17% |
Llama 3 8B | 64 | 2048 | 0 | 11.91% |
Llama 3 8B | 64 | 4096 | 0 | 7.86% |
Llama 3 70B | 256 | 2048 | 20 | 26.60% |
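The checkpointed-layers column ("num-ckpt-layers" in the source data) appears to count the transformer layers run with activation checkpointing, a memory-saving technique in which activations are discarded during the forward pass and recomputed during backward. The snippet below is a generic PyTorch illustration of that technique, not code from YaFSDP:

```python
import torch
from torch.utils.checkpoint import checkpoint

# Activation checkpointing: the block's intermediate activations are not
# stored in the forward pass and are recomputed during backward, trading
# extra compute for lower peak memory.
block = torch.nn.Sequential(torch.nn.Linear(1024, 1024), torch.nn.GELU())
x = torch.randn(4, 1024, requires_grad=True)
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
```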
Yandex says that by using GPUs more efficiently, YaFSDP can save developers and companies substantial sums, potentially hundreds of thousands of dollars per month.
Mikhail Khruschev, a senior developer at Yandex and a member of the YaFSDP team, also noted: “We are currently actively experimenting with various model architectures and parameter sizes to expand the versatility of YaFSDP.”