Groq has launched the latest Whisper Large-V3 model. Users can call it from the Playground or through the API in their own projects to add speech transcription and translation. The model transcribes multiple languages at high speed and translates other languages into English.
Playground link: https://console.groq.com/playground
Users can currently try this feature for free in the Playground, where transcribing a 4 minute 30 second video takes only about 3 seconds. Groq also provides an API so that users can integrate the model into their own projects.
The Whisper API is designed to be compatible with OpenAI's API and exposes two core functions: speech-to-text and speech translation. Developers can integrate these functions into their own applications, whether they are building an intelligent assistant or an automated translation system.
In terms of performance, the Whisper API uses the "whisper-large-v3" model, which delivers strong results in both speech-to-text and translation tasks.
In addition, the API has clear requirements for audio file formats and sizes. It accepts mp3, mp4, wav, and other common formats, and the file size must not exceed 25MB. Of particular note, for files containing multiple audio tracks the Whisper API processes only the first track, so users should preprocess the audio accordingly before uploading.
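The limits above can be checked locally before uploading. The following is a minimal sketch, not Groq's own validation logic. The function names are hypothetical, the extension set contains only the three formats named above (the API accepts others), and "25MB" is read as 25 × 1024 × 1024 bytes, which is an assumption.

```python
from pathlib import Path

# Assumption: "25MB" is interpreted as 25 MiB; the exact cutoff may differ.
MAX_BYTES = 25 * 1024 * 1024
# Only the formats named in the text; the API accepts other common formats too.
KNOWN_EXTENSIONS = {".mp3", ".mp4", ".wav"}


def upload_problems(file_name, size_bytes):
    """Return a list of reasons the file may be rejected (empty if none found)."""
    problems = []
    ext = Path(file_name).suffix.lower()
    if ext not in KNOWN_EXTENSIONS:
        problems.append(f"extension {ext!r} is not in the known list")
    if size_bytes > MAX_BYTES:
        problems.append(f"{size_bytes} bytes exceeds the {MAX_BYTES}-byte limit")
    return problems


def check_file(path):
    """Run the checks against a file on disk."""
    p = Path(path)
    return upload_problems(p.name, p.stat().st_size)
```

An empty list from check_file("talk.mp3") means neither check found a problem. It does not guarantee that the API will accept the file.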
To improve the quality and efficiency of transcription, the Whisper API downsamples audio on the server side to 16,000 Hz mono. Groq recommends that users perform this preprocessing step on the client side, which reduces the file size and therefore allows longer audio files to be uploaded and processed within the size limit.
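One way to do this preprocessing on the client is with the ffmpeg command-line tool. The sketch below assumes ffmpeg is installed and on the PATH. The helper names are hypothetical, and the choice of ffmpeg is an illustration, not something this article prescribes. The flags used are standard ffmpeg options: -ar sets the sample rate, -ac sets the channel count, and -map 0:a:0 keeps only the first audio track.

```python
import subprocess


def ffmpeg_downsample_cmd(src, dst, sample_rate=16000):
    """Build an ffmpeg command that writes dst as mono audio at sample_rate Hz."""
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", src,                # input file
        "-map", "0:a:0",          # keep only the first audio track
        "-ac", "1",               # one channel (mono)
        "-ar", str(sample_rate),  # target sample rate in Hz
        dst,
    ]


def downsample(src, dst):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(ffmpeg_downsample_cmd(src, dst), check=True)
```

For example, downsample("meeting.mp4", "meeting.wav") would drop the video stream and any additional audio tracks, leaving a single 16,000 Hz mono track to upload.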
API endpoints:
Speech to text: https://api.groq.com/openai/v1/audio/transcriptions
Speech translation: https://api.groq.com/openai/v1/audio/translations
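
A minimal sketch of calling these two endpoints from Python using only the standard library. Because the API is described as OpenAI-compatible, the sketch assumes OpenAI's conventions: a Bearer token in the Authorization header, multipart form fields named "file" and "model", and a JSON response containing a "text" field. None of these details, nor the User-Agent value or the helper names, is stated in this article, so verify them against Groq's documentation before relying on them.

```python
import json
import urllib.request
import uuid

TRANSCRIPTIONS_URL = "https://api.groq.com/openai/v1/audio/transcriptions"
TRANSLATIONS_URL = "https://api.groq.com/openai/v1/audio/translations"


def build_multipart(fields, file_name, file_bytes, boundary=None):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        text_part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(text_part.encode())
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode())
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(url, api_key, file_name, audio_bytes, model="whisper-large-v3"):
    """Prepare (but do not send) a POST request for either endpoint."""
    body, content_type = build_multipart({"model": model}, file_name, audio_bytes)
    headers = {
        "Authorization": f"Bearer {api_key}",    # assumed OpenAI-style auth
        "Content-Type": content_type,
        "User-Agent": "whisper-example/0.1",     # hypothetical client name
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send(request):
    """Send the request and return the "text" field of the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["text"]
```

To transcribe, read the audio file as bytes, pass them to build_request with TRANSCRIPTIONS_URL and an API key, and hand the result to send. Passing TRANSLATIONS_URL instead requests an English translation of the speech.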