November 3 news. In May of this year, the China Telecom Artificial Intelligence Research Institute (TeleAI) released the Star Super Multi-Dialect Speech Recognition Large Model, the industry's first large speech recognition model to support free mixing of 30 dialects.
Less than half a year later, TeleAI has upgraded the multi-dialect capability of the Star speech model again, adding dialects such as those of Zhanjiang, Yibin, Luoyang, and Yantai. The upgrade raises the number of supported dialects from 30 to 40 and introduces English recognition.
In contrast to traditional training on labeled data, TeleAI first pre-trains the speech recognition model on massive amounts of unlabeled data and then fine-tunes it with a small amount of labeled data.
Because dialect speech data typically includes plenty of unlabeled data but little labeled data, this "pre-training + fine-tuning" modeling scheme is well suited to the needs of dialect scenarios.
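The two-stage data flow can be illustrated with a deliberately tiny sketch. It is not TeleAI's method: plain k-means stands in for self-supervised pre-training, a majority vote over a handful of labeled samples stands in for fine-tuning, and the one-dimensional "acoustic features", the labels `dialect_A` and `dialect_B`, and all function names are hypothetical. The point is only that the first stage uses unlabeled data alone, so very few labels are needed afterward.

```python
import random
from collections import Counter


def nearest(centroids, x):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda i: abs(centroids[i] - x))


def pretrain(unlabeled, k=2, iters=10):
    """Stage 1: learn k >= 2 centroids from unlabeled 1-D features (k-means)."""
    lo, hi = min(unlabeled), max(unlabeled)
    # Deterministic init: spread the centroids evenly between min and max.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for x in unlabeled:
            buckets[nearest(centroids, x)].append(x)
        # Keep the old centroid if a bucket ends up empty.
        centroids = [sum(b) / len(b) if b else c
                     for b, c in zip(buckets, centroids)]
    return centroids


def fine_tune(centroids, labeled):
    """Stage 2: name each centroid by majority vote over a few labeled samples."""
    votes = {}
    for x, label in labeled:
        votes.setdefault(nearest(centroids, x), Counter())[label] += 1
    return {idx: c.most_common(1)[0][0] for idx, c in votes.items()}


def predict(centroids, names, x):
    """Label of the centroid nearest to x, or None if that centroid is unnamed."""
    return names.get(nearest(centroids, x))


rng = random.Random(0)
# Hypothetical unlabeled features: two dialect groups centered near 0 and 10.
unlabeled = ([rng.gauss(0, 1) for _ in range(200)]
             + [rng.gauss(10, 1) for _ in range(200)])
# Hypothetical labeled set: just one sample per dialect.
labeled = [(0.5, "dialect_A"), (9.5, "dialect_B")]

centroids = pretrain(unlabeled)
names = fine_tune(centroids, labeled)
print(predict(centroids, names, 1.2), predict(centroids, names, 8.7))
```

In this toy setup, 400 unlabeled samples shape the model and only two labeled samples name its outputs, which mirrors the data balance described for dialect speech.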
TeleAI also innovated in model structure and cost optimization, reducing the amount of manually annotated data required by a factor of about 50 while keeping model results comparable to those of dialect models built with supervised training.
The model is open source on GitHub: https://github.com/Tele-AI/TeleSpeech-ASR