Evaluation

Install packages for evaluation:

pip install -e .[eval]

Generating Samples for Evaluation

Prepare Test Datasets

Seed-TTS testset: Download from seed-tts-eval.
LibriSpeech test-clean: Download from OpenSLR.
Unzip the downloaded datasets and place them in the data/ directory.
Our filtered LibriSpeech-PC 4-10s subset: data/librispeech_pc_test_clean_cross_sentence.lst
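Before running batch inference, it can help to confirm that the list file is in place. The following is a minimal sketch, not part of the repository: count_entries is a hypothetical helper, and it only counts non-empty lines without assuming anything about the column layout of the .lst file.

```python
from pathlib import Path


def count_entries(lst_path):
    """Return the number of non-empty lines in a list file.

    Hypothetical helper: it treats each non-empty line as one entry and
    does not parse the line contents.
    """
    with Path(lst_path).open(encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


if __name__ == "__main__":
    # Path taken from the dataset list above; adjust if your layout differs.
    lst = Path("data/librispeech_pc_test_clean_cross_sentence.lst")
    if lst.is_file():
        print(f"{lst}: {count_entries(lst)} entries")
    else:
        print(f"{lst} not found; check that the datasets were placed in data/")
```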

Batch Inference for Test Set

To run batch inference for evaluations, execute the following commands:

# if accelerate has not been configured yet
accelerate config

# to run inference only
bash src/f5_tts/eval/eval_infer_batch.sh --infer-only

# to run inference followed by the corresponding evaluation (set up the evaluation tools described below first)
bash src/f5_tts/eval/eval_infer_batch.sh

Objective Evaluation on Generated Results

Download Evaluation Model Checkpoints

Chinese ASR Model: Paraformer-zh
English ASR Model: Faster-Whisper
WavLM Model: Download from Google Drive.

Note

The ASR model is downloaded automatically if --local is not set for the evaluation scripts.
Otherwise, update the asr_ckpt_dir path in eval_librispeech_test_clean.py or eval_seedtts_testset.py.

The WavLM model must be downloaded manually, and the wavlm_ckpt_dir path must be updated in both eval_librispeech_test_clean.py and eval_seedtts_testset.py.
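A missing checkpoint path is easier to catch before launching a multi-GPU evaluation run. The sketch below is an illustration only: missing_checkpoints is a hypothetical helper, and the path values in the usage section are placeholders that you would replace with the values set for wavlm_ckpt_dir and asr_ckpt_dir in the evaluation scripts.

```python
from pathlib import Path


def missing_checkpoints(paths):
    """Return the names whose configured path does not exist on disk.

    `paths` maps a setting name (for example "wavlm_ckpt_dir") to the
    path configured for it. Hypothetical helper, not part of the repo.
    """
    return [
        name
        for name, path in paths.items()
        if not Path(path).expanduser().exists()
    ]


if __name__ == "__main__":
    # Placeholder values: substitute the paths you set in the eval scripts.
    configured = {
        "wavlm_ckpt_dir": "<WAVLM_CKPT_PATH>",
        "asr_ckpt_dir": "<ASR_CKPT_PATH>",  # only needed when --local is set
    }
    for name in missing_checkpoints(configured):
        print(f"{name} does not point to an existing path: {configured[name]}")
```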

Objective Evaluation Examples

Update the paths to point to your batch inference results, then run the WER / SIM / UTMOS evaluations:

# Evaluation [WER] for Seed-TTS test [ZH] set
python src/f5_tts/eval/eval_seedtts_testset.py --eval_task wer --lang zh --gen_wav_dir <GEN_WAV_DIR> --gpu_nums 8

# Evaluation [SIM] for LibriSpeech-PC test-clean (cross-sentence)
python src/f5_tts/eval/eval_librispeech_test_clean.py --eval_task sim --gen_wav_dir <GEN_WAV_DIR> --librispeech_test_clean_path <TEST_CLEAN_PATH>

# Evaluation [UTMOS]. --ext: Audio extension
python src/f5_tts/eval/eval_utmos.py --audio_dir <WAV_DIR> --ext wav

Note

Evaluation results are also saved as _*_results.jsonl files in <GEN_WAV_DIR> or <WAV_DIR>.
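To inspect the saved results programmatically, the files can be read as JSON Lines. This is a minimal sketch under stated assumptions: load_results is a hypothetical helper, it assumes each non-empty line holds one JSON object, and it makes no assumption about which fields the evaluation scripts write.

```python
import glob
import json
import os


def load_results(result_dir):
    """Load every _*_results.jsonl file in result_dir.

    Returns a dict mapping file name to the list of parsed records.
    The record schema is not assumed; each line is parsed as-is.
    """
    records = {}
    pattern = os.path.join(result_dir, "_*_results.jsonl")
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as f:
            records[os.path.basename(path)] = [
                json.loads(line) for line in f if line.strip()
            ]
    return records


if __name__ == "__main__":
    # Replace with your own <GEN_WAV_DIR> or <WAV_DIR>.
    for name, rows in load_results("<GEN_WAV_DIR>").items():
        print(f"{name}: {len(rows)} records")
```

Printing one record first is a reasonable way to see which keys are available before computing any aggregate.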
