OpenAI Upgrades Whisper Speech Transcription AI Model to Be 8x Faster Without Sacrificing Quality

OpenAI During the DevDay event day held on October 1, it was announced that the launch of the Whisper large-v3-turbo phonetic transcription modelThe total number of parameters is 809 million, with almost no decrease in quality.8x faster than large-v3.

The Whisper large-v3-turbo speech transcription model is an optimized version of large-v3 and has only 4 Decoder Layers, in contrast to large-v3 which has 32 layers.

The Whisper large-v3-turbo speech transcription model has a total of 809 million parameters, which is slightly larger than the medium model with 769 million parameters, but much smaller than the large model with 1.55 billion parameters.

OpenAI says Whisper large-v3-turbo is 8x faster than large modelsThe VRAM required for the large model is 6GB, while the large model requires 10GB.

OpenAI Upgrades Whisper Speech Transcription AI Model to Be 8x Faster Without Sacrificing Quality

The Whisper large-v3-turbo speech transcription model is 1.6GB in size, and OpenAI continues to make Whisper (including code and model weights) available under the MIT license.

GitHub: https://github.com/openai/whisper/discussions/2363

Model Download: https://huggingface.co/openai/whisper-large-v3-turbo

Online experience: https://huggingface.co/spaces/hf-audio/whisper-large-v3-turbo

statement:The content is collected from various media platforms such as public websites. If the included content infringes on your rights, please contact us by email and we will deal with it as soon as possible.
Information

Microsoft Copilot Becomes a News Anchor: 4 Minutes, 4 Voices, AI Gathers Hot Topics and User Preferences

2024-10-2 11:01:41

Information

Google DeepMind Partners with BioNTech to Create AI Science Assistant: Planning Experiments, Predicting Results, and Helping to Transform Science and Technology

2024-10-3 15:49:46

Search