Fish Speech is an open-source text-to-speech (TTS) model developed by Fish Audio that aims to provide high-quality multilingual speech synthesis. It supports multiple languages and was trained on 150,000 hours of audio data in languages such as English, Chinese, and Japanese, with the goal of natural, fluent, and accurate synthesized speech.
Local deployment is currently supported only on Windows and Linux. If you would rather not deploy it yourself, Fish Audio also provides a website where the model can be used directly, with no setup.
Fish Speech Features
- Multi-language support: Supports English, Chinese, and Japanese, and generates high-quality speech thanks to training on a large amount of data.
- Fast generation: The only open-source TTS model that can generate speech at 20 tokens per second.
- High-quality speech synthesis: A larger model and more training data improve the stability and fluency of the generated speech.
- Open source and customizable: Supports local deployment and lets users fine-tune and experiment with their own data.
Fish Speech is suitable for
- Content creation: Creators who need voice content, such as video bloggers and podcast producers, can use speech generated by Fish Speech for dubbing, narration, and similar work.
- Education: Teachers and educational content developers can use Fish Speech to generate teaching audio that helps students understand and master the material.
- Customer service: Companies can use Fish Speech to give their customer service systems natural-sounding voice responses, improving customer satisfaction.
- Accessibility tools: For people who are visually impaired or have dyslexia, Fish Speech can convert written content into speech, making information easier to access.
Official website: https://speech.fish.audio/
GitHub: https://github.com/fishaudio/fish-speech
Online demo: https://fish.audio/zh-CN