Yesterday, Dark Side of the Moon (Moonshot AI) announced that Context Caching on its Kimi open platform has entered public beta. Without changing API prices, this technology can cut the cost of using the long-context flagship models by up to 90% for developers, while significantly improving model response speed.
Context Caching is an efficient data-management technique that lets a system pre-store large amounts of data that are likely to be requested repeatedly. When the same information is requested again, the system can serve it directly from the cache instead of recomputing it or retrieving it from the original source, saving time and resources. It is particularly well suited to scenarios with frequent requests that repeatedly reference the same large initial context, where it can significantly cut the cost of long-context models and improve efficiency.
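To make the idea concrete, here is a minimal conceptual sketch in Python of caching a fixed context and reusing it across requests. This is an illustration of the general technique, not Moonshot's implementation; the function names and the string-based "model view" are invented for the example.

```python
import hashlib

# Minimal conceptual sketch: the large, fixed context is stored once
# under a hash key, so repeated requests only pay for the new question.
_cache: dict[str, str] = {}

def cache_context(context: str) -> str:
    """Store a large, frequently reused context and return its cache ID."""
    cache_id = hashlib.sha256(context.encode("utf-8")).hexdigest()[:16]
    _cache[cache_id] = context
    return cache_id

def answer(cache_id: str, question: str) -> str:
    """Combine the cached context with an incremental question.
    In a real service, the model reuses precomputed state for the cached
    tokens instead of re-processing the full context on every request."""
    context = _cache[cache_id]  # cache hit: no need to resend the document
    return f"[model sees {len(context)} cached chars + question: {question}]"

manual_id = cache_context("(imagine a ~90,000-word product manual here)")
print(answer(manual_id, "What is the warranty period?"))
print(answer(manual_id, "How do I reset the device?"))
```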
Specifically, Context Caching applies to scenarios with frequent requests that repeatedly reference a large initial context, bringing the following two effects:
Cost reduction of up to 90%: in scenarios where many questions are asked about a fixed document, context caching saves substantial cost. For example, with a hardware product manual of roughly 90,000 words, pre-sales support staff may need to run many rounds of intensive Q&A within a short period; with context caching enabled, the cost drops to about 10% of the original (a usage sketch follows this list).
First-token latency reduced by 83%: a request to a 128k model typically takes 30 seconds to return its first token. With context caching, average first-token latency drops to 5 seconds, a reduction of about 83%.
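The fixed-document Q&A pattern might look something like the sketch below. Note that the `/caching` endpoint path, the request fields (`ttl`), the `cache_id=...;reset_ttl=...` reference mechanism, and the choice of `moonshot-v1-128k` are assumptions for illustration, not the documented interface; consult the platform docs for the real API.

```python
import os
import requests

BASE = "https://api.moonshot.cn/v1"
HEADERS = {"Authorization": f"Bearer {os.environ['MOONSHOT_API_KEY']}"}

# Assumed cache-creation call: upload the fixed manual once.
# The "/caching" path and JSON fields are illustrative, not authoritative.
cache = requests.post(
    f"{BASE}/caching",
    headers=HEADERS,
    json={
        "model": "moonshot-v1-128k",
        "messages": [{"role": "system", "content": open("manual.txt").read()}],
        "ttl": 600,  # keep the cache alive for 10 minutes (assumed field)
    },
).json()

# Each follow-up question then sends only the incremental tokens;
# referencing the cache by ID this way is likewise an assumption.
for q in ["What is the warranty period?", "How do I reset the device?"]:
    resp = requests.post(
        f"{BASE}/chat/completions",
        headers=HEADERS,
        json={
            "model": "moonshot-v1-128k",
            "messages": [
                {"role": "cache", "content": f"cache_id={cache['id']};reset_ttl=600"},
                {"role": "user", "content": q},
            ],
        },
    ).json()
    print(resp["choices"][0]["message"]["content"])
```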
The Context Caching billing model consists of the following three parts (a worked example follows the list):
Cache creation fee:
A call to the cache-creation API that successfully creates a cache is charged by the actual number of tokens in the cache: 24 yuan/M tokens.
Cache storage fee:
For as long as the cache is alive, a storage fee is charged per minute: 10 yuan/M tokens/minute.
Cache call fee:
Incremental tokens in a cache call are charged at the model's original price.
In addition, during the cache's lifetime, when a user requests a successfully created cache through the chat interface and the chat message content matches a live cache, a per-call fee is charged: 0.02 yuan/call.
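Putting the three parts together, here is a small worked calculation. The rates (24 yuan/M tokens creation, 10 yuan/M tokens/minute storage, 0.02 yuan/call) come from the announcement; the usage numbers and the 60 yuan/M-token model price are assumptions chosen purely for illustration.

```python
# Worked cost example for the three fee components.
CREATE_RATE = 24.0   # yuan per million cached tokens (one-time)
STORE_RATE = 10.0    # yuan per million cached tokens per minute
CALL_RATE = 0.02     # yuan per matched cache call
MODEL_RATE = 60.0    # assumed model price, yuan per million tokens

cached_tokens = 100_000      # ~90,000-word manual, roughly 100k tokens
lifetime_min = 10            # cache kept alive for 10 minutes
calls = 30                   # intensive Q&A session
incremental_tokens = 200     # new tokens sent per question

creation = cached_tokens / 1e6 * CREATE_RATE                  # 2.40 yuan
storage = cached_tokens / 1e6 * STORE_RATE * lifetime_min     # 10.00 yuan
per_call = calls * CALL_RATE                                  # 0.60 yuan
incremental = calls * incremental_tokens / 1e6 * MODEL_RATE   # 0.36 yuan

total_with_cache = creation + storage + per_call + incremental
# Without caching, each of the 30 questions would resend the full
# 100k-token context at the model's original price:
total_without = calls * (cached_tokens + incremental_tokens) / 1e6 * MODEL_RATE

print(f"with cache:    {total_with_cache:.2f} yuan")   # ~13.36 yuan
print(f"without cache: {total_without:.2f} yuan")      # ~180.36 yuan
```

Under these assumed numbers the cached workflow costs under 10% of resending the full context on every request, which is consistent with the "cost reduced to about 10%" figure cited above.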