China Telecom's TeleAI Star Speech Large Model Upgraded to Support Chinese, English and 40 Dialects

November 3 news: In May of this year, the China Telecom Artificial Intelligence Research Institute (TeleAI) released the Star Super Multi-Dialect Speech Recognition Large Model, the industry's first large speech recognition model to support free mixing of 30 dialects.

Less than half a year later, the multi-dialect capability of the TeleAI Star speech large model has been upgraded again. It now covers dialects such as those of Zhanjiang, Yibin, Luoyang and Yantai, raising the number of supported dialects from 30 to 40, and it adds English recognition.

Unlike traditional training methods that rely on labeled data, TeleAI first pre-trains the speech recognition model on massive amounts of unlabeled data and then fine-tunes it on a small amount of labeled data.

Since dialect speech data typically includes far more unlabeled than labeled data, this "pre-training + fine-tuning" approach is well suited to dialect scenarios.

TeleAI has also innovated in model structure and cost optimization, reducing the amount of manually annotated data required by a factor of about 50 while keeping results comparable to those of dialect models trained with full supervision.
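To give a sense of scale, the roughly 50-fold reduction in annotation can be worked through as simple arithmetic. The sketch below is illustrative only: the function name and the 1,000-hour baseline are hypothetical and do not come from TeleAI.

```python
def labeled_hours_needed(supervised_hours: float, reduction_factor: float = 50.0) -> float:
    """Estimate labeled hours left after an N-fold cut in annotation needs."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return supervised_hours / reduction_factor


# Hypothetical baseline (an assumption, not a TeleAI figure):
# 1,000 labeled hours per dialect for fully supervised training.
baseline_hours = 1000.0
print(labeled_hours_needed(baseline_hours))  # 20.0
```

Under that assumed baseline, a dialect would need about 20 hours of manually annotated speech instead of 1,000.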

The open-source GitHub repository: https://github.com/Tele-AI/TeleSpeech-ASR
