In an era of rapid technological progress, AI speech synthesis is gradually changing how we live. This article introduces CosyVoice, an excellent speech synthesis tool.
1. Installation steps

Clone and install:

Clone the repository:

    git clone --recursive https://github.com/FunAudioLLM/CosyVoice.git

If cloning the submodules fails, run:

    cd CosyVoice; git submodule update --init --recursive

Install Conda: see https://docs.conda.io/en/latest/miniconda.html.

Create the Conda environment:

    conda create -n cosyvoice python=3.8
    conda activate cosyvoice
    conda install -y -c conda-forge pynini==2.1.5
    pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host=mirrors.aliyun.com

Fix sox compatibility issues:

    Ubuntu: sudo apt-get install sox libsox-dev
    CentOS: sudo yum install sox sox-devel
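After the steps above, a quick check of the active environment can catch a wrong Python version or a missing package before any model is loaded. The following is a minimal sketch, not part of CosyVoice: `missing_modules` and `check_environment` are hypothetical helper names, and the module list is an assumption (pynini comes from the conda step, torchaudio is imported by the example code in this article, and torch is its dependency).

```python
import importlib.util
import sys

# Assumption: these import names correspond to packages installed by the
# steps above (pynini via conda, the others via requirements.txt).
REQUIRED_MODULES = ["pynini", "torch", "torchaudio"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment(names=REQUIRED_MODULES, expected_python=(3, 8)):
    """Collect human-readable problems instead of stopping at the first one."""
    problems = []
    if tuple(sys.version_info[:2]) != tuple(expected_python):
        problems.append(
            "Python {}.{} found, expected {}.{}".format(
                sys.version_info[0], sys.version_info[1], *expected_python
            )
        )
    for name in missing_modules(names):
        problems.append("module not importable: " + name)
    return problems


if __name__ == "__main__":
    # Prints nothing when the environment matches the assumptions above.
    for problem in check_environment():
        print(problem)
```

Run it inside the activated `cosyvoice` environment; an empty output means the assumed prerequisites were found.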
2. Model download

We strongly recommend downloading the pretrained CosyVoice-300M, CosyVoice-300M-SFT, and CosyVoice-300M-Instruct models, along with the CosyVoice-ttsfrd resource.
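The four recommended resources share one naming pattern, so the download targets can be generated rather than typed out one by one. This is a minimal sketch, assuming the `iic` namespace and the `pretrained_models` directory used in this article; `download_targets` and `download_all` are hypothetical helper names, and only `snapshot_download(model_id, local_dir=...)` comes from the source.

```python
# Resource names taken from the recommendation above.
MODEL_NAMES = [
    "CosyVoice-300M",
    "CosyVoice-300M-SFT",
    "CosyVoice-300M-Instruct",
    "CosyVoice-ttsfrd",
]


def download_targets(names=MODEL_NAMES, namespace="iic", root="pretrained_models"):
    """Return (model_id, local_dir) pairs, one per resource name."""
    return [
        ("{}/{}".format(namespace, name), "{}/{}".format(root, name))
        for name in names
    ]


def download_all(targets):
    """Download every target; requires the modelscope package and network access."""
    # Imported lazily so download_targets() works without modelscope installed.
    from modelscope import snapshot_download

    for model_id, local_dir in targets:
        snapshot_download(model_id, local_dir=local_dir)
```

Calling `download_all(download_targets())` would fetch all four resources; passing a shorter list of names limits the download to the models you actually need.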
SDK model download:

    from modelscope import snapshot_download
    snapshot_download('iic/CosyVoice-300M', local_dir='pretrained_models/CosyVoice-300M')
    snapshot_download('iic/CosyVoice-300M-SFT', local_dir='pretrained_models/CosyVoice-300M-SFT')
    snapshot_download('iic/CosyVoice-300M-Instruct', local_dir='pretrained_models/CosyVoice-300M-Instruct')
    snapshot_download('iic/CosyVoice-ttsfrd', local_dir='pretrained_models/CosyVoice-ttsfrd')

Git model download (make sure git lfs is installed):

    mkdir -p pretrained_models
    git clone https://www.modelscope.cn/iic/CosyVoice-300M.git pretrained_models/CosyVoice-300M
    git clone https://www.modelscope.cn/iic/CosyVoice-300M-SFT.git pretrained_models/CosyVoice-300M-SFT
    git clone https://www.modelscope.cn/iic/CosyVoice-300M-Instruct.git pretrained_models/CosyVoice-300M-Instruct
    git clone https://www.modelscope.cn/iic/CosyVoice-ttsfrd.git pretrained_models/CosyVoice-ttsfrd

Optional step: unzip the ttsfrd resource and install the ttsfrd package for better text normalization performance. This is not required; if you skip it, WeTextProcessing is used by default.

    cd pretrained_models/CosyVoice-ttsfrd/
    unzip resource.zip -d .
    pip install ttsfrd-0.3.6-cp38-cp38-linux_x86_64.whl

3. Basic usage
Choose the model according to the inference task:

For zero-shot or cross-lingual inference, use the CosyVoice-300M model.
For SFT inference, use the CosyVoice-300M-SFT model.
For instruct inference, use the CosyVoice-300M-Instruct model.

First, add third_party/Matcha-TTS to PYTHONPATH:

    export PYTHONPATH=third_party/Matcha-TTS

Example code:

    from cosyvoice.cli.cosyvoice import CosyVoice
    from cosyvoice.utils.file_utils import load_wav
    import torchaudio

    cosyvoice = CosyVoice('pretrained_models/CosyVoice-300M-SFT')
    # sft usage
    print(cosyvoice.list_avaliable_spks())
    # change stream=True for chunk stream inference
    # '中文女' means "Chinese female"; the first argument is the text to synthesize
    for i, j in enumerate(cosyvoice.inference_sft('你好，我是通义生成式语音大模型，请问有什么可以帮您的吗？', '中文女', stream=False)):
        torchaudio.save('sft_{}.wav'.format(i), j['tts_speech'], 22050)

    cosyvoice = CosyVoice('pretrained_models/CosyVoice-300M')
    # zero_shot usage, <|zh|><|en|><|jp|><|yue|><|ko|> for Chinese/English/Japanese/Cantonese/Korean
    prompt_speech_16k = load_wav('zero_shot_prompt.wav', 16000)
    # arguments: text to synthesize, transcript of the prompt audio, prompt audio
    for i, j in enumerate(cosyvoice.inference_zero_shot('收到好友从远方寄来的生日礼物，那份意外的惊喜与深深的祝福让我心中充满了甜蜜的快乐，笑容如花儿般绽放。', '希望你以后能够做的比我还好呦。', prompt_speech_16k, stream=False)):
        torchaudio.save('zero_shot_{}.wav'.format(i), j['tts_speech'], 22050)
    # cross_lingual usage
    prompt_speech_16k = load_wav('cross_lingual_prompt.wav', 16000)
    for i, j in enumerate(cosyvoice.inference_cross_lingual('<|en|>And then later on, fully acquiring that company. So keeping management in line, interest in line with the asset that\'s coming into the family is a reason why sometimes we don\'t buy the whole thing.', prompt_speech_16k, stream=False)):
        torchaudio.save('cross_lingual_{}.wav'.format(i), j['tts_speech'], 22050)

    cosyvoice = CosyVoice('pretrained_models/CosyVoice-300M-Instruct')
    # instruct usage, support <laughter></laughter><strong></strong>[laughter][breath]
    # '中文男' means "Chinese male"; the last string is the instruction describing the speaker
    for i, j in enumerate(cosyvoice.inference_instruct('在面对挑战时，他展现了非凡的<strong>勇气</strong>与<strong>智慧</strong>。', '中文男', 'Theo \'Crimson\', is a fiery, passionate rebel leader. Fights with fervor for justice, but struggles with impulsiveness.', stream=False)):
        torchaudio.save('instruct_{}.wav'.format(i), j['tts_speech'], 22050)

4. Launch the web demo
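The web demo page is a quick way to get familiar with CosyVoice; it supports sft, zero-shot, cross-lingual, and instruct inference. See the demo website for details.

Example command (change the model as needed):

    python3 webui.py --port 50000 --model_dir pretrained_models/CosyVoice-300M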
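Because the right model depends on the inference mode, the choice described under basic usage can be written as a lookup table and used to build the web demo command. This is a minimal sketch: `MODEL_FOR_MODE` and `webui_command` are hypothetical names, the mode strings are borrowed from the `--mode` option shown in the deployment section, and only the `--port` and `--model_dir` flags come from the command above.

```python
import shlex

# Mode -> model directory name, following the guidance in "Basic usage".
MODEL_FOR_MODE = {
    "sft": "CosyVoice-300M-SFT",
    "zero_shot": "CosyVoice-300M",
    "cross_lingual": "CosyVoice-300M",
    "instruct": "CosyVoice-300M-Instruct",
}


def webui_command(mode, port=50000, root="pretrained_models"):
    """Build the webui.py argument list for the given inference mode."""
    if mode not in MODEL_FOR_MODE:
        raise ValueError("unknown mode: {!r}".format(mode))
    model_dir = "{}/{}".format(root, MODEL_FOR_MODE[mode])
    return ["python3", "webui.py", "--port", str(port), "--model_dir", model_dir]


if __name__ == "__main__":
    # Print a copy-pasteable shell command; nothing is executed here.
    print(shlex.join(webui_command("instruct")))
```

Returning a list rather than a single string keeps the result usable with `subprocess.run` as well as for printing.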
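5. Advanced usage

For advanced users, training and inference scripts are provided in examples/libritts/cosyvoice/run.sh; working through this example is a good way to get familiar with CosyVoice.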
6. Build for deployment

To deploy CosyVoice as a service using grpc, follow the steps below; otherwise, skip this section.
Build the docker image:

    cd runtime/python
    docker build -t cosyvoice:v1.0 .

Run the docker container, choosing the inference mode you need.

grpc usage:

    docker run -d --runtime=nvidia -p 50000:50000 cosyvoice:v1.0 /bin/bash -c "cd /opt/CosyVoice/CosyVoice/runtime/python/grpc && python3 server.py --port 50000 --max_conc 4 --model_dir iic/CosyVoice-300M && sleep infinity"
    cd grpc && python3 client.py --port 50000 --mode <sft|zero_shot|cross_lingual|instruct>

fastapi usage:

    docker run -d --runtime=nvidia -p 50000:50000 cosyvoice:v1.0 /bin/bash -c "cd /opt/CosyVoice/CosyVoice/runtime/python/fastapi && python3 server.py --port 50000 --model_dir iic/CosyVoice-300M && sleep infinity"
    cd fastapi && python3 client.py --port 50000 --mode <sft|zero_shot|cross_lingual|instruct>

With its powerful features and flexible usage, CosyVoice offers a fresh speech synthesis experience. Give it a try!