GPT-SoVITS is a powerful AI voice cloning tool. Given a 5-second voice sample, users can try text-to-speech in that voice right away. With as little as 1 minute of training data, the model can also be fine-tuned to improve the similarity and realism of the generated speech.
Project address: https://www.1ai.net/2975.html
The product also supports cross-language inference, currently in English, Japanese, and Chinese. It integrates tools for separating vocals from accompaniment, automatically segmenting training sets, Chinese ASR (automatic speech recognition), and text annotation, which help beginners build training datasets and GPT/SoVITS models.
It runs on Windows and has been tested with Python 3.9, PyTorch 2.0.1, and CUDA 11. A quick installation guide is provided.
Core product functions:
- Text-to-speech from a 5-second voice sample.
- Model fine-tuning with only 1 minute of training data.
- Cross-language support, including English, Japanese, and Chinese.
- Integrated auxiliary tools: vocal and accompaniment separation, automatic training set segmentation, Chinese ASR, and text annotation.
- Runs on Windows; tested with Python 3.9, PyTorch 2.0.1, and CUDA 11.
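
Before installing, it can help to compare a local setup against the tested versions listed above. The following is a minimal sketch, not part of GPT-SoVITS itself: the names `TESTED`, `matches`, and `check_environment` are hypothetical helpers introduced here for illustration, and the only PyTorch attributes used are `torch.__version__` and `torch.version.cuda`. A mismatch does not necessarily mean the tool will fail, only that the combination differs from the one reported as tested.

```python
import importlib
import sys

# Versions the article reports as tested. This is a reference point,
# not a hard requirement (assumption: other versions may also work).
TESTED = {"python": "3.9", "torch": "2.0.1", "cuda": "11"}


def matches(found, expected):
    """Return True if `found` equals `expected` or extends it with '.' or '+'.

    Examples: "11.8" matches "11"; "2.0.1+cu118" matches "2.0.1".
    """
    if found is None:
        return False
    return (
        found == expected
        or found.startswith(expected + ".")
        or found.startswith(expected + "+")
    )


def check_environment():
    """Return {name: (found_version_or_None, matches_tested)} for this machine."""
    found = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "torch": None,
        "cuda": None,
    }
    try:
        torch = importlib.import_module("torch")
        found["torch"] = torch.__version__
        # torch.version.cuda is None on CPU-only builds of PyTorch.
        found["cuda"] = torch.version.cuda
    except ImportError:
        # PyTorch is not installed; leave both entries as None.
        pass
    return {name: (found[name], matches(found[name], TESTED[name])) for name in TESTED}


if __name__ == "__main__":
    for name, (version, ok) in check_environment().items():
        status = "matches tested version" if ok else "differs from tested version"
        print(f"{name}: found {version}, tested {TESTED[name]} ({status})")
```

Running the script prints one line per component. If PyTorch is missing, the `torch` and `cuda` entries are reported as `None` rather than raising an error.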