Commit Graph

170 Commits

Author SHA1 Message Date
SWivid
254e5e6d30 update finetune-cli -gradio 2024-10-24 15:23:55 +08:00
SWivid
b4abb3cbd6 update infer_gradio 2024-10-24 13:51:06 +08:00
Yushen CHEN
ff690b7ffb Add back /data for local editable use case 2024-10-24 01:26:14 +08:00
SWivid
c1ba121dce tweak 2024-10-24 01:23:53 +08:00
SWivid
d3951b93a7 tweak 2024-10-24 01:19:01 +08:00
SWivid
8e0edfcf8f final structure. prepared to solve dependencies 2024-10-24 00:55:41 +08:00
SWivid
8ed1beac1e make a structure first 2024-10-24 00:07:14 +08:00
Yushen CHEN
213adf4e6f Delete tests directory 2024-10-23 23:12:22 +08:00
SWivid
d8638a6c32 . 2024-10-23 23:05:25 +08:00
Yushen CHEN
c4eee0f96b convert to pkg, reorganize repo (#228)
* group files in f5_tts directory

* add setup.py

* use global imports

* simplify demo

* add install directions for library mode

* fix old huggingface_hub version constraint

* move finetune to package

* change imports to f5_tts.model

* bump version

* fix bad merge

* Update inference-cli.py

* fix HF space

* reformat

* fix utils.py vocab.txt import

* fix format

* adapt README for f5_tts package structure

* simplify app.py

* add gradio.Dockerfile and workflow

* refactored for pyproject.toml

* refactored for pyproject.toml

* added in reference to packaged files

* use fork for testing docker image

* added in reference to packaged files

* minor tweaks

* fixed inference-cli.toml path

* fixed inference-cli.toml path

* fixed inference-cli.toml path

* fixed inference-cli.toml path

* refactor eval_infer_batch.py

* fix typo

* added eval_infer_batch to scripts

---------

Co-authored-by: Roberts Slisans <rsxdalv@gmail.com>
Co-authored-by: Adam Kessel <adam@rosi-kessel.org>
Co-authored-by: Roberts Slisans <roberts.slisans@gmail.com>
2024-10-23 21:07:59 +08:00
lpscr
32c3ee7701 fix tts_api after clear in memory (#216) 2024-10-22 19:56:16 +08:00
SWivid
608e224f9d minor fix for api.py device 2024-10-22 18:53:38 +08:00
SWivid
8d18494d07 Update README.md. onnx version by DakeQQ 2024-10-22 18:33:13 +08:00
SWivid
198d44db65 minor fix 2024-10-22 17:54:54 +08:00
SWivid
752f6f5ea8 minor fix 2024-10-22 17:50:47 +08:00
lpscr
cd3c4afa69 fix. #213 correct device initialization 2024-10-22 17:48:48 +08:00
SWivid
f8eb8ab740 Update README.md 2024-10-22 13:15:19 +08:00
SWivid
92e5f55c46 Merge branch 'main' of github.com:SWivid/F5-TTS into main 2024-10-22 01:16:32 +08:00
SWivid
992dbb5c24 fix save last ckpt. make sure work paid off 2024-10-22 01:15:45 +08:00
lpscr
99190a03cb add model test in finetune and update some stuff (#207)
* add test model tab and some updates
* small updates label audios
* small updates label text
* small updates and to or in model load
2024-10-22 00:28:28 +08:00
SWivid
256f3f1320 Update. change asr pipeline back to whisper-large-v3-turbo 2024-10-21 22:17:44 +08:00
lpscr
a79199e54d small fix in api remove_silence (#201)
* fix remove_silence
* change remove_silence false
* add seed value
2024-10-21 18:43:11 +08:00
SWivid
e80addf1e8 fix. utils_infer.py ref_audio_len misplace 2024-10-21 18:36:07 +08:00
SWivid
d15ef3679a fix address #191 2024-10-21 17:55:58 +08:00
SWivid
b899a35b88 load asr pipeline only if needed 2024-10-21 17:45:06 +08:00
Haitao
795cb19e4f allow for passing in custom mel spec module (#200) 2024-10-21 17:00:48 +08:00
lpscr
25cdc5182f add api for easy use (#186)
* add api
* update infer limits
2024-10-21 16:57:24 +08:00
Yushen CHEN
0f9f878be1 Merge pull request #196 from thunn/add_pre_commit_tooling
add and run pre-commit with ruff
2024-10-21 13:19:15 +08:00
Tom Hunn
a4ca14b5f6 add and run pre-commit with ruff 2024-10-21 14:46:45 +10:00
SWivid
77e00db01b Use main voice if can't find voice tag or specified voice. 2024-10-21 11:53:44 +08:00
SWivid
2c0924378d add sanity check ensuring mono audio input for training 2024-10-21 04:14:52 +08:00
SWivid
5600d9079a minor fix. 2024-10-21 03:40:20 +08:00
SWivid
d3badb95cf fp16 inference only for cuda devices now 2024-10-21 03:34:28 +08:00
SWivid
bd16a8c281 minor fix for hf space 2024-10-21 02:52:11 +08:00
SWivid
03a20e0258 reorganize inference scripts with shared funcs 2024-10-21 02:21:13 +08:00
SWivid
b4f81425f3 disable fp16 for cpu device 2024-10-20 22:45:54 +08:00
SWivid
073092d0d3 Merge branch 'main' of github.com:SWivid/F5-TTS into main 2024-10-20 20:43:25 +08:00
SWivid
28b46d32d3 rewrite without einx & einops; clean up 2024-10-20 20:41:24 +08:00
chigkim
765a2ae390 Load model once in the beginning. 2024-10-20 08:01:22 -04:00
SWivid
554f3189e1 Use default fp16 inference 2024-10-20 16:32:18 +08:00
SWivid
69850fa236 fix. address #179 2024-10-20 12:43:01 +08:00
SWivid
aaf1fa7efa Update README.md 2024-10-20 12:32:58 +08:00
Zhikang Niu
f618db7290 Update README.md 2024-10-20 10:27:03 +08:00
Zhikang Niu
532fbe8f02 Merge pull request #166 from cocktailpeanut/wandb_usability
User-friendly wandb support
2024-10-20 10:25:17 +08:00
chigkim
8831701897 Reorganized CLI output to be less verbose. 2024-10-19 18:28:31 -04:00
SWivid
a016d6f89c fix address #178 2024-10-19 21:59:52 +08:00
Yushen CHEN
84cb6e5f00 Merge pull request #173 from lpscr/main
add new args in interface-cli.py for pass model and vocab
2024-10-19 17:03:10 +08:00
unknown
60f1b31446 update README for new arg 2024-10-19 11:29:59 +03:00
unknown
5663bac2a8 add new arg for vocab_file and ckpt_file to easily load any model 2024-10-19 11:24:31 +03:00
Zhikang Niu
925ce4b0dd Update README.md 2024-10-19 12:31:54 +08:00