In 2024, Moonshot AI's Kimi and the MADSys Lab at Tsinghua University jointly published the design of Mooncake, the inference system underlying Kimi. The system is built on a KVCache-centric architecture that separates prefill from decode (PD separation) and trades storage for computation, which improves inference throughput.
Recently, to further accelerate the adoption of this technical framework, Moonshot AI's Kimi and Tsinghua University's MADSys Lab joined with 9#AISoft, Alibaba Cloud, Huawei Storage, ModelBest, and Approaching.AI to co-launch the open source project Mooncake and jointly build a KVCache-centric inference architecture for large models.
On November 28, the Mooncake technical framework was released as open source. 1AI provides the address below:
https://github.com/kvcache-ai/Mooncake
According to the project's introduction, the Mooncake open source project builds on the paper. It is centered on an ultra-large-scale KVCache pool and improves inference throughput by trading storage for computation, which drastically reduces compute overhead.
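The storage-for-computation idea can be illustrated with a toy prefix cache. This is a minimal sketch for intuition only, not Mooncake's implementation or API: the names `PrefixKVCache`, `compute_kv`, and `prefill` are hypothetical, and the "KV entries" are placeholder strings rather than real attention tensors. It shows how a request that shares a prefix with an earlier one can reuse stored entries and compute only the new tokens.

```python
from typing import Dict, List, Tuple


class PrefixKVCache:
    """Toy stand-in for a KVCache pool (hypothetical, not Mooncake's API).

    Maps a full token sequence to the KV entries computed for it.
    """

    def __init__(self) -> None:
        self._store: Dict[Tuple[int, ...], List[str]] = {}

    def longest_prefix(self, tokens: List[int]) -> Tuple[int, List[str]]:
        """Return (hit length, cached KV) for the longest stored prefix."""
        for end in range(len(tokens), 0, -1):
            key = tuple(tokens[:end])
            if key in self._store:
                return end, self._store[key]
        return 0, []

    def put(self, tokens: List[int], kv: List[str]) -> None:
        self._store[tuple(tokens)] = list(kv)


def compute_kv(token: int) -> str:
    """Placeholder for the expensive per-token attention KV computation."""
    return f"kv({token})"


def prefill(tokens: List[int], cache: PrefixKVCache) -> Tuple[List[str], int]:
    """Build KV for `tokens`, reusing cached entries where possible.

    Returns the KV list and the number of tokens actually computed.
    """
    hit_len, cached = cache.longest_prefix(tokens)
    kv = list(cached)
    computed = 0
    for tok in tokens[hit_len:]:
        kv.append(compute_kv(tok))
        computed += 1
    cache.put(tokens, kv)  # spend storage now to save compute later
    return kv, computed


cache = PrefixKVCache()
_, first_cost = prefill([1, 2, 3], cache)          # cold: all tokens computed
_, second_cost = prefill([1, 2, 3, 4, 5], cache)   # warm: shared prefix reused
print(first_cost, second_cost)
```

In this sketch the second request computes only the two tokens that follow the cached prefix. A real system would also have to handle eviction, cache placement across memory tiers, and transfer between nodes, none of which is modeled here.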
The open-sourcing will proceed in phases, gradually releasing the implementation of Mooncake Store, a high-performance multi-level KVCache, while adding compatibility with various inference engines and underlying storage/transfer resources. The Transfer Engine component is already open source on GitHub.
The ultimate goal of the Mooncake open source project is to create a standard interface for a new type of high-performance in-memory semantic storage for the era of large models, and to provide a reference implementation.