Smart Spectrum announces the launch of GLM-4-Voice, an end-to-end emotional voice model. According to the official announcement, the model can understand emotions, express emotion, and empathize with users; it adjusts itself accordingly, supports multiple languages and dialects, offers lower latency, and can be interrupted at any time. It is now available to users in the "Zhipu Qingyan" app.
The GLM-4-Voice is described as having the following features:
- Emotional expression and emotional resonance: its voice carries different emotions and subtle nuances, such as happiness, sadness, anger, and fear.
- Adjustable speaking rate: within the same round of conversation, you can ask it to speak faster or slower.
- Interruption at any time with flexible commands: it adjusts the content and style of its voice output based on real-time user commands, supporting more flexible dialog interactions.
- Multi-language and multi-dialect support: GLM-4-Voice currently supports Chinese and English speech as well as dialects from across China, and is especially strong in Cantonese and the Chongqing and Beijing dialects.
- Combined with video calling, it can see as well as talk: a video calling feature will be available soon.
In addition, AutoGLM is equipped with phone-use capability, allowing it to simulate human operation of a phone from simple text or voice commands. It is not limited to simple task scenarios or API calls, does not require users to manually build complex and cumbersome workflows, and its operating logic resembles that of a human.
GLM-4-Voice has been open-sourced at the same time, and is officially described as Smart Spectrum's first open-source end-to-end multimodal model. IT Home attaches the address below.
Code Repository:
- https://github.com/THUDM/GLM-4-Voice