All Tags

Voice Model

OpenAI Releases New Generation of Speech Models to Enable AI Intelligents to Speak More Naturally

March 21 news, OpenAI yesterday (March 20) released a blog post, announcing the launch of speech-to-text (speech-to-text) and text-to-speech (text-to-speech) models, to improve voice processing capabilities, support developers to build more accurate, customizable voice interaction system, and further promote the commercialization of AI voice technology applications. In terms of speech-to-text models, OpenAI has launched gpt-4o-transcribe and gpt-4o-mini-transcribe...
Information
- 768
3 weeks ago
MiniMax Halo Speech AI Product Launched: Supports 17 Languages and Up to 10,000 Characters

January 21st, MiniMax announced yesterday that it has brought the newly upgraded T2A-01 series of voice models and launched Conch AI products globally. According to the introduction, relying on the T2A-01 series of voice models, users can generate natural and smooth super humanoid voices by inputting text in Conch AI, and the maximum length of input can be up to 10,000 characters. At the same time, users can freely configure the mood, speech rate, pitch, and even adjust the timbre effect of the output voice to meet the refined needs of complex scenarios. 1AI notes that Conch Voice supports Chinese,...
Information
- 2.1k
2 months ago
Wisdom Spectrum Clear Speech Launches Emotional Speech Model GLM-4-Voice: Understanding Emotions, Emotional Expression and Empathy

Wisdom Spectrum announced the launch of GLM-4-Voice end-to-end emotional voice model. Officially, GLM-4-Voice is able to understand emotions, express and resonate emotions, self-adjust its speech rate, support multiple languages and dialects, have lower latency, and can be interrupted at any time, which can be experienced by users on the "Wisdom Spectrum Clear Speech" App from now on. According to the introduction, GLM-4-Voice has the following features: Emotional expression and emotional resonance: the voice has different emotions and subtle changes, such as happy, sad, angry, scared, etc. Adjusting speech speed: In the same round of conversation, you can ask TA to speak faster or slower...
Information
- 7.9k
5 months ago
Alibaba releases new voice model Qwen2-Audio, surpassing OpenAI Whisper

Recently, Alibaba launched a new open source voice model Qwen2-Audio based on its Qwen-Audio. This model not only performs well in voice recognition, translation and audio analysis, but also achieves significant improvements in functions and performance. Qwen2-Audio provides a basic version and a command fine-tuning version. Users can ask questions to the audio model through voice, and recognize and analyze the content. For example, users can ask a woman to say a paragraph, and Qwen2-Audio can determine her age or analyze her emotions; if a noisy voice is input…
Information
- 9.8k
8 months ago
Claiming to be better than XTTS! VoiceCraft: A voice model that supports voice cloning and modifying original audio text

Recently, a voice model called VoiceCraft has attracted widespread attention in the industry. According to official announcements, the performance of this model has surpassed XTTS, which undoubtedly brings new breakthroughs in the field of AI audio processing. Project address: https://github.com/jasonppy/VoiceCraft The biggest highlight of VoiceCraft is its powerful audio cloning ability. Users only need to provide a piece of original audio, and VoiceCraft can use deep learning technology to copy new audio that is extremely similar to the original audio.
Information
- 3.7k
1 year ago

Voice Model

OpenAI Releases New Generation of Speech Models to Enable AI Intelligents to Speak More Naturally

MiniMax Halo Speech AI Product Launched: Supports 17 Languages and Up to 10,000 Characters

Wisdom Spectrum Clear Speech Launches Emotional Speech Model GLM-4-Voice: Understanding Emotions, Emotional Expression and Empathy

Alibaba releases new voice model Qwen2-Audio, surpassing OpenAI Whisper

Claiming to be better than XTTS! VoiceCraft: A voice model that supports voice cloning and modifying original audio text

AI Weibo

AI Applications

5000+ AI applications! Updated daily

1AICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai tiktok

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

1ai WeChat

Five minutes a day

Become a master in one year

Scan the QR code to follow

Voice Model

OpenAI Releases New Generation of Speech Models to Enable AI Intelligents to Speak More Naturally

MiniMax Halo Speech AI Product Launched: Supports 17 Languages and Up to 10,000 Characters

Wisdom Spectrum Clear Speech Launches Emotional Speech Model GLM-4-Voice: Understanding Emotions, Emotional Expression and Empathy

Alibaba releases new voice model Qwen2-Audio, surpassing OpenAI Whisper

Claiming to be better than XTTS! VoiceCraft: A voice model that supports voice cloning and modifying original audio text

Please enter the code

公告

....Payment confirmation in progress....

AI Applications

5000+ AI applications! Updated daily

1AICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

Five minutes a day

Become a master in one year

Scan the QR code to follow