Kimi Multimodal Image Understanding Model API Released, 1M tokens priced from $12

January 15, 2011 - Dark Side of the Moon today released the Kimi MultimodalityImage Understanding Model APIThe new multimodal picture comprehension model moonshot-v1-vision-preview(hereinafter referred to as "Vision model") completes the multimodal capabilities of the moonshot-v1 model family.

Description of model capabilities

Image Recognition

Vision models are equipped with image recognition capabilities, recognizing complex details and nuances in images, be it food or animals, and being able to distinguish between similar but not identical objects.

In the example below, 16 similar images of blueberry muffins and chihuahuas that are harder for the human eye to distinguish have been officially pieced together, with the Vision model recognizing and labeling the image types in order.Whether it's a blueberry muffin or a Chihuahua, the model can accurately differentiate and identify the.

Text recognition and comprehension

Vision models have advanced image recognition capabilities that are more accurate than ordinary document scanning and OCR recognition software in OCR text recognition and image understanding scenarios.Handwritten scribbles such as receipts / courier bills can be accurately recognized..

Kimi Multimodal Image Understanding Model API Released, 1M tokens priced from $12

Taking this bar chart of "A student's final exam results" as an example, the official asked the model to extract and analyze the exam results and analyze the bar chart from the perspective of aesthetic style. The Vision model is also able to accurately identify the score values corresponding to each subject name in the bar chart and do a comparison of the scores, and at the same time, it can identify the style formatting and color of the bar chart.

Kimi Multimodal Image Understanding Model API Released, 1M tokens priced from $12

model billing

Vision models are billed on a per-volume basisThe price of the model call varies according to the model selected, with the following distinctions:

Model	Billing Unit	price
moonshot-v1-8k-vision-preview	1M tokens	¥12.00
moonshot-v1-32k-vision-preview	1M tokens	¥24.00
moonshot-v1-128k-vision-preview	1M tokens	¥60.00

Description of model constraints

Features supported by the Vision visual model include:

many rounds of dialogue
streaming output
Tool Call
JSON Mode
Partial Mode

The following features are not supported or partially supported at this time:

Internet search: not supported
Context Caching:Creating a Context Cache with image content is not supported.The Vision model can be called with a Cache that has already been created.
URL-formatted images: not supported, currently only base64-encoded image content is supported

Other Platform Updates

Support for organizational project management functions
Support for one business entity to authenticate multiple accounts
Add File file resource management function: intuitively manage and view file resources.
Optimize mouse hover copy for resource management list
Context Caching has been released to full users.
Cache renewals are no longer charged for creation

statement:The content of the source of public various media platforms, if the inclusion of the content violates your rights and interests, please contact the mailbox, this site will be the first time to deal with.

Kimi Multimodal Image Understanding Model API Released, 1M tokens priced from $12

XF Starfire Deep Reasoning Model X1 Released: The Only Nationally Produced Arithmetic Training, the First in China for Many Indicators

To outperform OpenAI GPT-4, Meta spares Llama 3 training using controversial data

AI Weibo

AI Applications

5000+ AI applications! Updated daily

1AICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai tiktok

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

1ai WeChat

Five minutes a day

Become a master in one year

Scan the QR code to follow

Related content:

XF Starfire Deep Reasoning Model X1 Released: The Only Nationally Produced Arithmetic Training, the First in China for Many Indicators

To outperform OpenAI GPT-4, Meta spares Llama 3 training using controversial data

Dark Side of the Moon: Kimi Large Model API now supports Tool Calling

Google Releases Multimodal Live Streaming API: Unlocking Watching, Listening, and Speaking, Opening a New Experience in AI Audio and Video Interaction

OpenAI launches Batch batch processing API: 50% discount, output results within 24 hours

Dark Side of the Moon Kimi Smart Assistant upgrade: support for new models, search result tracing

Please enter the code

....Payment confirmation in progress....

AI Applications

5000+ AI applications! Updated daily

1AICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

Five minutes a day

Become a master in one year

Scan the QR code to follow