Moonshot AI's Kimi, Tsinghua University, and Others Open-Source Mooncake, a Large-Model Inference Architecture

In 2024, Moonshot AI's Kimi and the MADSys Lab at Tsinghua University jointly published the design of Mooncake, the inference system underlying Kimi. The system is built on a KVCache-centric architecture that separates the prefill and decode stages (PD separation) and trades storage for computation, which improves inference throughput.

Recently, to further accelerate the application and adoption of this technical framework, Moonshot AI's Kimi and the MADSys Lab at Tsinghua University joined with 9#AISoft, Alibaba Cloud, Huawei Storage, ModelBest, and Approaching.AI to launch the open source project Mooncake, a KVCache-centric large-model inference architecture.

On November 28, the Mooncake technical framework was officially open-sourced. 1AI has included the repository address below:

https://github.com/kvcache-ai/Mooncake

According to the project introduction, the Mooncake open source project grows out of the paper. It is centered on an ultra-large-scale KVCache pool and improves inference throughput by substantially reducing compute overhead through the concept of trading storage for computation.
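The storage-for-computation idea can be illustrated with a minimal sketch. The code below is not Mooncake's implementation or API: the names `ToyKVCachePool`, `block_keys`, and `prefill_cost`, the block size, and the chained-hash keying scheme are all assumptions made for illustration. It only shows the general principle that, when requests share a prefix, stored KV blocks for that prefix can be looked up instead of recomputed, so fewer tokens need prefill.

```python
import hashlib

BLOCK_SIZE = 4  # tokens per cached block (illustrative value, an assumption)


def block_keys(tokens, block_size=BLOCK_SIZE):
    """Return one chained hash key per full block of tokens.

    Each key depends on all preceding blocks, so equal keys imply an
    identical prefix up to and including that block.
    """
    keys = []
    prev = ""
    full = len(tokens) - len(tokens) % block_size
    for start in range(0, full, block_size):
        block = tokens[start:start + block_size]
        payload = prev + "|" + ",".join(str(t) for t in block)
        prev = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        keys.append(prev)
    return keys


class ToyKVCachePool:
    """Hypothetical stand-in for a shared KVCache pool."""

    def __init__(self):
        self._store = {}

    def cached_prefix_tokens(self, tokens):
        """Count leading tokens whose KV blocks are already stored."""
        hits = 0
        for key in block_keys(tokens):
            if key not in self._store:
                break
            hits += 1
        return hits * BLOCK_SIZE

    def insert(self, tokens):
        """Record a placeholder KV block for every full block of the request."""
        for key in block_keys(tokens):
            self._store.setdefault(key, "kv-block-placeholder")


def prefill_cost(pool, tokens):
    """Return how many tokens must still be computed, then cache them."""
    reused = pool.cached_prefix_tokens(tokens)
    pool.insert(tokens)
    return len(tokens) - reused


pool = ToyKVCachePool()
shared_prompt = list(range(8))                # e.g. a common system prompt
request_a = shared_prompt + [100, 101, 102, 103]
request_b = shared_prompt + [200, 201, 202, 203]

print(prefill_cost(pool, request_a))  # 12: nothing cached yet
print(prefill_cost(pool, request_b))  # 4: the 8 shared tokens are reused
```

In this toy setting the second request computes only its 4 new tokens, because the 8-token shared prefix is served from the pool. A real system stores actual key/value tensors and must also handle placement, transfer, and eviction, none of which is modeled here.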

The project will be open-sourced in phases, gradually releasing the implementation of Mooncake Store, a high-performance multi-level KVCache, while adding compatibility with various inference engines and underlying storage/transfer resources. The Transfer Engine component is already open source on GitHub.

The ultimate goal of the Mooncake open source project is to establish a standard interface for a new class of high-performance memory-semantic storage in the era of large models, and to provide a reference implementation.
