Dark Side of the Moon (Moonshot AI) has officially announced that the Kimi Open Platform's Context Caching feature will begin internal testing. The feature supports long-text large models and enables context caching.
▲ Image source: Kimi Open Platform official WeChat account; the same applies below
According to the announcement, Context Caching is an advanced feature of the Kimi Open Platform. By caching repeated token content, it reduces the cost for users whose requests contain the same content.
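The announcement illustrates the principle with a diagram. As a rough stand-in, the sketch below shows the general idea behind this kind of prefix caching; it is not Kimi's actual implementation or API, and every name in it (`PrefixCache`, `process_tokens`) is hypothetical. A long shared prefix is processed once, keyed by a hash, and later requests that begin with the same prefix reuse the stored result instead of paying to process those tokens again:

```python
# Conceptual sketch of context (prefix) caching -- NOT Kimi's real API.
# All names here (PrefixCache, process_tokens) are hypothetical.
import hashlib


class PrefixCache:
    """Maps a hash of a prompt prefix to its already-processed state."""

    def __init__(self):
        self._store = {}

    def key(self, prefix: str) -> str:
        return hashlib.sha256(prefix.encode("utf-8")).hexdigest()

    def get_or_compute(self, prefix: str, compute):
        k = self.key(prefix)
        if k in self._store:      # cache hit: skip reprocessing the prefix
            return self._store[k], True
        state = compute(prefix)   # cache miss: pay the full cost once
        self._store[k] = state
        return state, False


def process_tokens(text: str) -> dict:
    # Stand-in for the expensive prefill work a model does on input tokens.
    return {"token_count": len(text.split())}


cache = PrefixCache()
shared_docs = "...large fixed document set sent with every request..."

for question in ["Q1", "Q2", "Q3"]:
    state, hit = cache.get_or_compute(shared_docs, process_tokens)
    # On a hit, only the short question is newly processed, so repeated
    # requests avoid re-paying for the large shared prefix.
    print(question, "cache hit:", hit)
```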
Officially, Context Caching can improve API response speed (the speed at which the first token is returned). The larger the scale and the more repetitive the prompts, the greater the benefit the feature brings.
Context Caching suits scenarios where frequent requests repeatedly reference a large body of initial context; reusing the cached content improves efficiency and cuts costs (a rough cost sketch follows the list). The applicable business scenarios are as follows:
- QA Bots that come with a large amount of preset content, such as the Kimi API Assistant.
- Frequent queries against a fixed set of documents, such as a Q&A tool for listed companies' information disclosures.
- Periodic analysis of static code bases or knowledge bases, such as various Copilot Agents.
- Popular AI applications that draw huge bursts of traffic, such as Honghong Simulator and LLM Riddles.
- Agent-type applications with complex interaction rules, such as the popular Kimi+ app.
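To make the cost argument concrete, the back-of-envelope calculation below uses entirely hypothetical prices and token counts, since the official billing plan had not been published at the time of this article; it only shows why long-prefix, high-repetition workloads stand to gain the most:

```python
# Back-of-envelope cost comparison -- all prices are hypothetical placeholders,
# NOT Kimi's actual billing (which was unannounced at the time of writing).

PRICE_PER_1K_INPUT_TOKENS = 0.012   # assumed normal input price, arbitrary units
CACHED_PRICE_PER_1K_TOKENS = 0.001  # assumed discounted price for cached tokens

shared_prefix_tokens = 90_000       # e.g., a fixed document set sent every time
question_tokens = 200               # the short, varying part of each request
num_requests = 1_000


def total_cost(prefix_rate: float) -> float:
    # Simplification: ignores any one-time cost of creating the cache.
    prefix_cost = num_requests * shared_prefix_tokens / 1000 * prefix_rate
    question_cost = num_requests * question_tokens / 1000 * PRICE_PER_1K_INPUT_TOKENS
    return prefix_cost + question_cost


without_cache = total_cost(PRICE_PER_1K_INPUT_TOKENS)
with_cache = total_cost(CACHED_PRICE_PER_1K_TOKENS)
print(f"without cache: {without_cache:.2f}")
print(f"with cache:    {with_cache:.2f}")
print(f"savings:       {1 - with_cache / without_cache:.0%}")
```

Under these assumed numbers the shared prefix dominates the bill, so discounting cached tokens cuts total input cost by roughly 90%; actual savings depend entirely on the real pricing once it is published.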
The platform says it will release best practices, billing plans, and technical documentation for the Context Caching feature in the future. IT Home will keep an eye on it and report further developments as soon as possible.