CosyVoice3
CosyVoice3 is a leading multilingual speech generation tool that supports Mandarin Chinese, English, Japanese, and many other languages and dialects. It offers high audio quality, fast response, rich expressiveness, and easy deployment. CosyVoice3 also supports voice cloning and cross-lingual speech synthesis.
Main Interface

Launch Methods
- Launch via URL
http://127.0.0.1:8000/launcher?project=cosyvoice
- Launch via Command
bash
uv run cli.py install -n cosyvoice -p 8013 --start
- Native Launch
bash
uv run webui.py --port 8013
Output Logs
log
Python 3.10.19
Using extensions/cosyvoice/.venv/scripts/python.exe
Checked 159 packages in 3ms
All installed packages are compatible
Audited 40 packages in 15ms
git 2.51.0
Using D:/Develop Files/Git/cmd/git.EXE
[ModelScope ASCII-art banner]
Downloading Model from https://www.modelscope.cn to directory:
D:\Program Files\CreatorBox\creatorbox\extensions\cosyvoice\pretrained_models\CosyVoice3-0.5B
Successfully downloaded model: FunAudioLLM/Fun-CosyVoice3-0.5B-2512.
extensions/cosyvoice/.venv/scripts/python.exe webui3.py --port 8013
2025-12-23 23:58:44.483 | INFO cbinstaller.py - ✅ CosyVoice started
failed to import ttsfrd, using wetext instead
* Running on local URL: http://0.0.0.0:8013
Configuration Parameters
None
Notes
- By default, use webui3.py.
- For the native version, use webui.py.
- In the dubbing module, you can enter a local or remote address to enable batch speech synthesis, for example:
  - Local access: http://127.0.0.1:8013
  - Remote access: https://xxx.gradio.live, https://xxx.ngrok-free.app, https://xxx.loca.lt
  - Your own public IP or domain name
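Before pasting an address into the dubbing module, it can help to tidy it up and confirm that something is listening there. The following is a minimal sketch using only the Python standard library; the helper names `normalize_address` and `is_reachable` are hypothetical and are not part of CosyVoice or the launcher. It assumes that the `0.0.0.0` shown in the startup log is a bind address, so the local client should connect through `127.0.0.1` instead.

```python
import socket
from urllib.parse import urlsplit

# Default ports used when the address does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_address(address: str) -> str:
    """Return scheme://host[:port] without a trailing slash or path.

    Hypothetical helper: defaults to http when no scheme is given and maps
    the bind address 0.0.0.0 to 127.0.0.1 for local access.
    """
    address = address.strip()
    if "://" not in address:
        address = "http://" + address
    parts = urlsplit(address)
    if parts.scheme not in DEFAULT_PORTS or not parts.hostname:
        raise ValueError(f"not a usable http(s) address: {address!r}")
    host = "127.0.0.1" if parts.hostname == "0.0.0.0" else parts.hostname
    port = f":{parts.port}" if parts.port else ""
    return f"{parts.scheme}://{host}{port}"


def is_reachable(address: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the address can be opened.

    This only checks that the port accepts connections; it does not verify
    that the service behind it is CosyVoice.
    """
    parts = urlsplit(normalize_address(address))
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    local = normalize_address("0.0.0.0:8013")
    print(local, "reachable" if is_reachable(local) else "not reachable")
```

For example, `normalize_address("0.0.0.0:8013")` returns `http://127.0.0.1:8013`, which matches the local access address listed above. Remote tunnel addresses such as `https://xxx.gradio.live` pass through unchanged apart from having any trailing slash removed.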