diff --git a/src/f5_tts/infer/README.md b/src/f5_tts/infer/README.md
index cba4eb2..fe48a78 100644
--- a/src/f5_tts/infer/README.md
+++ b/src/f5_tts/infer/README.md
@@ -12,6 +12,8 @@ To avoid possible inference failures, make sure you have seen through the follow
 - Uppercased letters will be uttered letter by letter, so use lowercased letters for normal words.
 - Add some spaces (blank: " ") or punctuations (e.g. "," ".") to explicitly introduce some pauses.
 - Preprocess numbers to Chinese letters if you want to have them read in Chinese, otherwise in English.
+- If the generation output is blank (pure silence), check that ffmpeg is installed (various tutorials are available online: blogs, videos, etc.).
+- Try turning off use_ema if using an early-stage finetuned checkpoint (one that has gone through only a few updates).
 
 ## Gradio App
 
diff --git a/src/f5_tts/train/README.md b/src/f5_tts/train/README.md
index 90d7c85..d269449 100644
--- a/src/f5_tts/train/README.md
+++ b/src/f5_tts/train/README.md
@@ -48,6 +48,8 @@ Discussion board for Finetuning [#57](https://github.com/SWivid/F5-TTS/discussio
 
 Gradio UI training/finetuning with `src/f5_tts/train/finetune_gradio.py` see [#143](https://github.com/SWivid/F5-TTS/discussions/143).
 
+`use_ema = True` is harmful for early-stage finetuned checkpoints (which have gone through only a few updates, so the EMA weights are still dominated by the pretrained ones). Try turning it off and see if it provides better results.
+
 ### 3. Wandb Logging
 
 The `wandb/` dir will be created under path you run training/finetuning scripts.
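
The `use_ema` note in the train README relies on a simple property of exponential moving averages: after only a few updates, the averaged weights are almost entirely the starting (pretrained) weights. The sketch below illustrates that arithmetic for a single scalar weight. It assumes a constant decay and the plain update rule `ema = decay * ema + (1 - decay) * current`; the function names and the decay value are hypothetical, chosen for illustration, and are not taken from the F5-TTS code, whose EMA schedule may differ.

```python
def pretrained_share(decay: float, num_updates: int) -> float:
    """Fraction of the EMA value still contributed by the initial weights.

    Assumes a constant decay and one EMA update per training step
    (an illustrative simplification, not the project's actual schedule).
    """
    return decay ** num_updates


def ema_after(initial: float, current: float, decay: float, num_updates: int) -> float:
    """Run the plain EMA update num_updates times toward a fixed `current` value."""
    ema = initial
    for _ in range(num_updates):
        ema = decay * ema + (1 - decay) * current
    return ema


if __name__ == "__main__":
    decay = 0.9999  # hypothetical decay, for illustration only
    for steps in (100, 1_000, 10_000, 100_000):
        share = pretrained_share(decay, steps)
        print(f"{steps:>7} updates: pretrained share of EMA weights = {share:.4f}")
```

With a decay this close to 1, roughly 99% of the EMA value still comes from the pretrained weights after 100 updates, which is why sampling from the raw (non-EMA) weights can reflect the finetuning data better early on.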