F5-TTS

mirror of https://github.com/SWivid/F5-TTS.git synced 2025-12-29 14:15:18 -08:00

Author	SHA1	Message	Date
SWivid	254e5e6d30	update finetune-cli -gradio	2024-10-24 15:23:55 +08:00
SWivid	b4abb3cbd6	update infer_gradio	2024-10-24 13:51:06 +08:00
Yushen CHEN	ff690b7ffb	Add back /data for local editable use case	2024-10-24 01:26:14 +08:00
SWivid	c1ba121dce	tweak	2024-10-24 01:23:53 +08:00
SWivid	d3951b93a7	tweak	2024-10-24 01:19:01 +08:00
SWivid	8e0edfcf8f	final structure. prepared to solve dependencies	2024-10-24 00:55:41 +08:00
SWivid	8ed1beac1e	make a structure first	2024-10-24 00:07:14 +08:00
Yushen CHEN	213adf4e6f	Delete tests directory	2024-10-23 23:12:22 +08:00
SWivid	d8638a6c32	.	2024-10-23 23:05:25 +08:00
Yushen CHEN	c4eee0f96b	convert to pkg, reorganize repo (#228) * group files in f5_tts directory * add setup.py * use global imports * simplify demo * add install directions for library mode * fix old huggingface_hub version constraint * move finetune to package * change imports to f5_tts.model * bump version * fix bad merge * Update inference-cli.py * fix HF space * reformat * fix utils.py vocab.txt import * fix format * adapt README for f5_tts package structure * simplify app.py * add gradio.Dockerfile and workflow * refactored for pyproject.toml * refactored for pyproject.toml * added in reference to packaged files * use fork for testing docker image * added in reference to packaged files * minor tweaks * fixed inference-cli.toml path * fixed inference-cli.toml path * fixed inference-cli.toml path * fixed inference-cli.toml path * refactor eval_infer_batch.py * fix typo * added eval_infer_batch to scripts --------- Co-authored-by: Roberts Slisans <rsxdalv@gmail.com> Co-authored-by: Adam Kessel <adam@rosi-kessel.org> Co-authored-by: Roberts Slisans <roberts.slisans@gmail.com>	2024-10-23 21:07:59 +08:00
lpscr	32c3ee7701	fix tts_api after clear in memory (#216)	2024-10-22 19:56:16 +08:00
SWivid	608e224f9d	minor fix for api.py device	2024-10-22 18:53:38 +08:00
SWivid	8d18494d07	Update README.md. onnx version by DakeQQ	2024-10-22 18:33:13 +08:00
SWivid	198d44db65	minor fix	2024-10-22 17:54:54 +08:00
SWivid	752f6f5ea8	minor fix	2024-10-22 17:50:47 +08:00
lpscr	cd3c4afa69	fix. #213 correct device initialization	2024-10-22 17:48:48 +08:00
SWivid	f8eb8ab740	Update README.md	2024-10-22 13:15:19 +08:00
SWivid	92e5f55c46	Merge branch 'main' of github.com:SWivid/F5-TTS into main	2024-10-22 01:16:32 +08:00
SWivid	992dbb5c24	fix save last ckpt. make sure work paid off	2024-10-22 01:15:45 +08:00
lpscr	99190a03cb	add model test in finetune and update some stuff (#207) * add test model tab and some updates * small updates label audios * small updates label text * small updates and to or in model load	2024-10-22 00:28:28 +08:00
SWivid	256f3f1320	Update. change asr pipeline back to whisper-large-v3-turbo	2024-10-21 22:17:44 +08:00
lpscr	a79199e54d	small fix in api remove_silence (#201) * fix remove_silence * change remove_silence false * add seed vaule	2024-10-21 18:43:11 +08:00
SWivid	e80addf1e8	fix. utils_infer.py ref_audio_len misplace	2024-10-21 18:36:07 +08:00
SWivid	d15ef3679a	fix address #191	2024-10-21 17:55:58 +08:00
SWivid	b899a35b88	load asr pipeline only if needed	2024-10-21 17:45:06 +08:00
Haitao	795cb19e4f	allow for passing in custom mel spec module (#200)	2024-10-21 17:00:48 +08:00
lpscr	25cdc5182f	add api for easy use (#186) * add api * update infer limits	2024-10-21 16:57:24 +08:00
Yushen CHEN	0f9f878be1	Merge pull request #196 from thunn/add_pre_commit_tooling add and run pre-commit with ruff	2024-10-21 13:19:15 +08:00
Tom Hunn	a4ca14b5f6	add and run pre-commit with ruff	2024-10-21 14:46:45 +10:00
SWivid	77e00db01b	Use main voice if can't find voice tag or specified voice.	2024-10-21 11:53:44 +08:00
SWivid	2c0924378d	add sanity check ensuring mono audio input for training	2024-10-21 04:14:52 +08:00
SWivid	5600d9079a	minor fix.	2024-10-21 03:40:20 +08:00
SWivid	d3badb95cf	fp16 inference only for cuda devices now	2024-10-21 03:34:28 +08:00
SWivid	bd16a8c281	minor fix for hf space	2024-10-21 02:52:11 +08:00
SWivid	03a20e0258	reorganize inference scripts with shared funcs	2024-10-21 02:21:13 +08:00
SWivid	b4f81425f3	disable fp16 for cpu device	2024-10-20 22:45:54 +08:00
SWivid	073092d0d3	Merge branch 'main' of github.com:SWivid/F5-TTS into main	2024-10-20 20:43:25 +08:00
SWivid	28b46d32d3	rewrite without einx & einops; clean up	2024-10-20 20:41:24 +08:00
chigkim	765a2ae390	Load model once in the beginning.	2024-10-20 08:01:22 -04:00
SWivid	554f3189e1	Use default fp16 inference	2024-10-20 16:32:18 +08:00
SWivid	69850fa236	fix. address #179	2024-10-20 12:43:01 +08:00
SWivid	aaf1fa7efa	Update README.md	2024-10-20 12:32:58 +08:00
Zhikang Niu	f618db7290	Update README.md	2024-10-20 10:27:03 +08:00
Zhikang Niu	532fbe8f02	Merge pull request #166 from cocktailpeanut/wandb_usability User-friendly wandb support	2024-10-20 10:25:17 +08:00
chigkim	8831701897	REorganized cli output to be less verbose.	2024-10-19 18:28:31 -04:00
SWivid	a016d6f89c	fix address #178	2024-10-19 21:59:52 +08:00
Yushen CHEN	84cb6e5f00	Merge pull request #173 from lpscr/main add new args in interface-cli.py for pass model and vocab	2024-10-19 17:03:10 +08:00
unknown	60f1b31446	update read me for new arg	2024-10-19 11:29:59 +03:00
unknown	5663bac2a8	add new arg for vocab_file and ckpt_file to easy load any model	2024-10-19 11:24:31 +03:00
Zhikang Niu	925ce4b0dd	Update README.md	2024-10-19 12:31:54 +08:00


170 Commits