GPT-SoVITS4
GPT-SoVITS4 is a leading multilingual speech generation tool that supports multiple languages and dialects, including Mandarin, English, and Japanese. It offers high-quality audio, instant response, rich expressiveness, and easy deployment.
Main Interface
None. The service has no UI and is used through API calls only.
If you see the output below, it means the service has started successfully.
json
{ "code": 0, "message": "install success, only api call" }
Launch Methods
- Launch via URL
http://127.0.0.1:8000/launcher?project=gpt_sovits
- Launch via Command
bash
uv run cli.py install -n gpt_sovits -p 8015 --start
- Native Launch (API Only)
bash
uv run api_v2_.py --bind_addr 0.0.0.0 --port 8015
Output Logs
log
Python 3.11.11
Using extensions/gpt_sovits/.venv/scripts/python.exe
Checked 179 packages in 6ms
All installed packages are compatible
Audited 42 packages in 14ms
...
extensions/gpt_sovits/.venv/scripts/python.exe api_v2_.py --bind_addr 0.0.0.0 --port 8015
2025-12-24 00:09:50.152 | INFO cbinstaller.py - ✅ GPT-SoVITS started (PID: 11300)
2025-12-24 00:09:50.153 | WARNING cbinstaller.py - No UI, API calls only
...
--------------------------------------------- TTS Config ---------------------------------------------
device : cpu
is_half : False
version : v2Pro
t2s_weights_path : GPT_SoVITS/pretrained_models/s1v3.ckpt
vits_weights_path : GPT_SoVITS/pretrained_models/v2Pro/s2Gv2Pro.pth
bert_base_path : GPT_SoVITS/pretrained_models/chinese-roberta-wwm-ext-large
cnhuhbert_base_path : GPT_SoVITS/pretrained_models/chinese-hubert-base
----------------------------------------------------------------------------------------------------
Loading Text2Semantic weights from GPT_SoVITS/pretrained_models/s1v3.ckpt
Loading VITS weights from GPT_SoVITS/pretrained_models/v2Pro/s2Gv2Pro.pth (all keys matched successfully)
Loading BERT weights from GPT_SoVITS/pretrained_models/chinese-roberta-wwm-ext-large
Loading CNHuBERT weights from GPT_SoVITS/pretrained_models/chinese-hubert-base
INFO: Started server process [12392]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8015 (Press CTRL+C to quit)
Configuration Parameters
json
{
  "request": {
    "top_k": 5,
    "top_p": 1.0,
    "temperature": 1.0,
    "text_split_method": "cut5",
    "batch_size": 1,
    "batch_threshold": 0.75,
    "split_bucket": true,
    "fragment_interval": 0.3,
    "media_type": "wav",
    "streaming_mode": false,
    "parallel_infer": true,
    "repetition_penalty": 1.35,
    "sample_steps": 32,
    "super_sampling": false,
    "overlap_length": 2,
    "min_chunk_length": 16
  },
  "models": {
    "test": {
      "gpt": "webapp/extensions/gpt_sovits/train/weights/GPT_weights_v2Pro/test-e4.ckpt",
      "sovits": "webapp/extensions/gpt_sovits/train/weights/SoVITS_weights_v2Pro/test_e4_s80.pth"
    }
  }
}
Model Training
See the "Model Training" section in the documentation.
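Trained weights are referenced by name in the `models` block of the configuration. Before pointing the service at a custom model, it can help to confirm that both weight files exist. The following is a minimal sketch, not part of the tool: the helper `missing_weights` and its `root` parameter are hypothetical, and it assumes the configured paths are relative to a directory you supply, which the documentation does not state.

```python
import json
from pathlib import Path

# Trimmed copy of the "models" block from the configuration shown earlier.
CONFIG_TEXT = """
{
  "models": {
    "test": {
      "gpt": "webapp/extensions/gpt_sovits/train/weights/GPT_weights_v2Pro/test-e4.ckpt",
      "sovits": "webapp/extensions/gpt_sovits/train/weights/SoVITS_weights_v2Pro/test_e4_s80.pth"
    }
  }
}
"""


def missing_weights(config, root="."):
    """Map each model name to the weight kinds whose files are absent under root.

    `root` is a hypothetical parameter: the base directory that the relative
    paths in the config are resolved against.
    """
    missing = {}
    for name, entry in config.get("models", {}).items():
        absent = [
            kind
            for kind in ("gpt", "sovits")
            if not (Path(root) / entry[kind]).is_file()
        ]
        if absent:
            missing[name] = absent
    return missing


config = json.loads(CONFIG_TEXT)
# A deliberately nonexistent root, so every file is reported as missing.
report = missing_weights(config, root="/nonexistent-root-for-demo")
print(report)
```

An empty result means every configured model has both its GPT and SoVITS weight files in place.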
Notes
- request: inference request parameters
- models: custom model configuration

In the dubbing module, you can fill in local or remote addresses to enable batch speech synthesis, for example:
- Local access: http://127.0.0.1:8015
- Remote access: https://xxx.gradio.live, https://xxx.ngrok-free.app, https://xxx.loca.lt
- Your own public IP or domain
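To illustrate how a service address and the `request` parameters fit together, here is a rough sketch of building a synthesis request by hand. It rests on assumptions that this page does not confirm: that the service exposes a `/tts` POST endpoint and the `text`, `text_lang`, `ref_audio_path`, and `prompt_lang` fields in the style of upstream GPT-SoVITS `api_v2.py`. The helper names and the `ref.wav` path are hypothetical.

```python
import json
from urllib import request as urlrequest

# Defaults copied (trimmed) from the "request" block of the configuration.
REQUEST_DEFAULTS = {
    "top_k": 5,
    "top_p": 1.0,
    "temperature": 1.0,
    "text_split_method": "cut5",
    "media_type": "wav",
    "streaming_mode": False,
}


def build_tts_request(base_url, text, text_lang, ref_audio_path, prompt_lang, **overrides):
    """Return (url, payload) for one synthesis call.

    Assumption: the "/tts" route and the field names follow upstream
    GPT-SoVITS api_v2.py; this build may differ.
    """
    payload = {
        **REQUEST_DEFAULTS,
        **overrides,  # per-call overrides win over the configured defaults
        "text": text,
        "text_lang": text_lang,
        "ref_audio_path": ref_audio_path,
        "prompt_lang": prompt_lang,
    }
    url = base_url.rstrip("/") + "/tts"
    return url, payload


def synthesize(url, payload, timeout=60):
    """POST the payload and return the raw response bytes (needs a running service)."""
    req = urlrequest.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlrequest.urlopen(req, timeout=timeout) as resp:
        return resp.read()


# "ref.wav" is a placeholder reference audio path, not a file shipped with the tool.
url, payload = build_tts_request(
    "http://127.0.0.1:8015/", "Hello there.", "en", "ref.wav", "en", top_k=10
)
print(url)
print(json.dumps(payload, indent=2))
```

Swapping the first argument for a remote address is the only change needed to target a tunnelled or public deployment. `synthesize` is defined but not called here, since it requires the service to be running.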