Google Releases Multimodal Live API: Unlocking Watching, Listening, and Speaking for a New AI Audio and Video Interaction Experience

Alongside yesterday's release of Gemini 2.0, Google launched the new Multimodal Live API, which helps developers build applications with real-time audio and video streaming capabilities.

The API enables low-latency, bidirectional interaction over text, audio, and video, with text and audio output, for a more natural, fluid, human-like dialogue experience. Users can interrupt the model at any time, and can share camera input or a screen recording and ask questions about what it shows.
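As a rough illustration of what such a bidirectional session looks like in code, here is a minimal sketch using the google-genai Python SDK. The model name "gemini-2.0-flash-exp", the config shape, and the session methods reflect the launch-era SDK and should be treated as assumptions that may differ in later versions:

```python
# Minimal sketch of a bidirectional Multimodal Live session (assumptions:
# google-genai SDK at launch, model "gemini-2.0-flash-exp", text output).
import asyncio
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

async def main() -> None:
    # The API also supports audio output; "TEXT" keeps this example simple.
    config = {"response_modalities": ["TEXT"]}
    async with client.aio.live.connect(
        model="gemini-2.0-flash-exp", config=config
    ) as session:
        # Send a user turn; end_of_turn signals the model may respond.
        await session.send(input="Hello, what can you do?", end_of_turn=True)
        # Responses stream back incrementally over the same connection.
        async for response in session.receive():
            if response.text:
                print(response.text, end="")

asyncio.run(main())
```

Because the session is a persistent streaming connection rather than a request/response call, a client can cut in with new input mid-response, which is what makes the interruption behavior described above possible.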

The model's video understanding extends this interaction paradigm: users can point their camera at something or share their desktop in real time and ask questions about what the model sees. The API is now available to developers, and a demo application of the multimodal real-time assistant is available to try.
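For camera or screen input, the same live session accepts media chunks alongside text. The hedged sketch below streams a single JPEG frame and asks a question about it; the blob-style send() payload mirrors launch-era example code and is an assumption, and capturing the frame itself is left out:

```python
# Hedged sketch: stream one captured JPEG frame into a live session and ask
# about it. Assumes the same google-genai SDK as above; the dict payload
# (mime_type + data) and encoding details may differ across SDK versions.
import asyncio
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

async def ask_about_frame(jpeg_bytes: bytes, question: str) -> None:
    config = {"response_modalities": ["TEXT"]}
    async with client.aio.live.connect(
        model="gemini-2.0-flash-exp", config=config
    ) as session:
        # Send the frame as a media blob, then the question as the turn's text.
        await session.send(input={"mime_type": "image/jpeg", "data": jpeg_bytes})
        await session.send(input=question, end_of_turn=True)
        async for response in session.receive():
            if response.text:
                print(response.text, end="")

# Example usage with a pre-captured frame:
# asyncio.run(ask_about_frame(open("frame.jpg", "rb").read(), "What is on screen?"))
```

A real assistant would send frames continuously on a timer while audio and text flow over the same session, rather than one frame per connection as shown here.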
