The first end-to-end voice model in China, Lingo, was officially launched at the Bund Conference

On September 5, at the Bund Conference forum "The Creativity Boundary and Application Imagination of Big Models," the large-model startup Xihu Xinchen (Westlake Xinchen) officially launched "Xinchen Lingo," China's first end-to-end large voice model.

Xinchen Lingo is built on end-to-end voice technology. When handling a conversation, it understands speech directly, captures tone, rhythm, and emotion, and replies in speech. This reduces the information lost during processing and helps the machine understand people better. As the first end-to-end large voice model in China, it opens up a new mode of human-computer interaction.

(Xihu Xinchen's CEO unveils China's first end-to-end voice model, Xinchen Lingo)

IDC, a global market research and consulting firm, predicts that the global intelligent voice service market will reach approximately US$73.16 billion by 2030, with a compound annual growth rate of 27%. Technology companies around the world have recognized the growth potential of this field and are investing in the development of intelligent voice technology, setting off a new wave of change in human-computer interaction.

"Xinchen Lingo can capture subtle changes in voice. It understands not only what you said but also what you meant to express. It truly gives AI 'high emotional intelligence' so that it can accurately perceive implied meaning. This is another important technological breakthrough for Xihu Xinchen as it continues to deepen the emotional intelligence capabilities of its large models," Xihu Xinchen CEO Xingchen said at the launch event.

According to the company, Xinchen Lingo's capabilities have been strengthened across multiple domains and specifically for Chinese, such that its Chinese speech quality is reported to surpass that of GPT-4o. The model has three main technical features.

The first is native speech understanding. As an end-to-end model, Xinchen Lingo not only recognizes the textual content of speech but also captures other important features such as emotion, tone, pitch, and even ambient sound. This gives it a more complete understanding of what is being said and makes the interaction more natural and vivid.

The second is expression in multiple voice styles. Xinchen Lingo can adjust the speed, pitch, and volume of its voice according to the context and the user's instructions, and can generate spoken responses in styles such as conversation, singing, and crosstalk (xiangsheng). This improves the model's flexibility and adaptability across application scenarios.

The third is extreme compression of the speech modality. Xinchen Lingo uses a speech codec with a compression ratio in the hundreds, which compresses speech into very short sequences. This significantly reduces computing and storage costs while helping the model generate high-quality speech.

Less than 10 days after Xinchen Lingo opened for internal beta testing, more than a thousand corporate users had signed up to test it. They span eight major industries, including education, finance, healthcare, government and public services, media and entertainment, retail and commercial services, and manufacturing and engineering, and cover dozens of real-world usage scenarios.

The market has responded positively to Xinchen Lingo's potential applications. In mental health consultation, a hospital plans to use Xinchen Lingo's voice technology to provide psychological counseling and intervention, offering patients emotional support through an AI dialogue system to help them cope with the psychological pressure caused by illness. In customer service and support, a well-known property insurance company hopes to use the technology for customer service and outbound calls, applying intelligent voice systems to automated outbound scenarios such as retaining customers who want to cancel or have not renewed their policies, in order to improve service efficiency and customer satisfaction. In addition, a range of personalized needs has emerged in the companionship field, such as in-game voice companions, social assistants, and voice-based maternal and infant care. The diversity and novelty of these needs give Xinchen Lingo broad application prospects.

At the launch event, Xingchen revealed that in October Xihu Xinchen will release three vertical voice models, for child companionship, psychological counseling, and sales services, each built through in-depth domain training on top of Xinchen Lingo. The company will work with more industry leaders to promote the innovation and application of AI technology and open a new chapter in intelligent services.

Westlake Xinchen (Xihu Xinchen) is a company dedicated to the research and industrial application of multimodal large-model AI technology, backed by Westlake University, one of China's new-type research universities. During the Bund Conference, Westlake Xinchen founder Lan Zhenzhong also won the first Ant InTech Technology Award. To date, Westlake Xinchen has received tens of millions of US dollars in investment from well-known institutions such as Tomcat, BlueRun Ventures, Kaitai Capital, Baidu Ventures, Westlake Science and Technology Ventures, and the Westlake Education Foundation Sustainable Development Platform.
