F5-TTS

mirror of https://github.com/SWivid/F5-TTS.git synced 2026-01-19 08:11:29 -08:00

Author	SHA1	Message	Date
Zhikang Niu	2d26fba7bf	Update requirements.txt add tomli	2024-10-15 16:00:25 +08:00
SWivid	49b465f5d8	add credit of gradio multiple speech-type gen to jpgallegoar	2024-10-15 12:32:04 +08:00
Yushen CHEN	524eaba3a8	Merge pull request #90 from SWivid/fakerybakery-patch-1 Fix unexpected indent issue	2024-10-15 12:20:58 +08:00
mrfakename	923e95cadb	Fix unexpected indent issue	2024-10-14 21:07:39 -07:00
Yushen CHEN	3acf3e2a9b	Update dataset.py, fix typo	2024-10-15 12:05:15 +08:00
Yushen CHEN	658cfa5f31	Merge pull request #88 from JarodMica/main Update to make passing in custom paths easier for finetuning/training	2024-10-15 12:03:39 +08:00
Yushen CHEN	4cdcccf7a3	Update dataset.py	2024-10-15 12:03:16 +08:00
Yushen CHEN	03048d6142	Merge pull request #86 from chigkim/cli Added -f option to read text from a file.	2024-10-15 11:58:14 +08:00
Yushen CHEN	a8b455f44c	Merge pull request #80 from fakerybakery/sync-hf Sync to Hugging Face Space	2024-10-15 11:51:03 +08:00
Jarod Mica	6fda7e5f6f	Update to make passing in custom paths easier for finetuning/training	2024-10-14 20:13:07 -07:00
Chi Kim	ca8b596976	Added -f to read text from a file.	2024-10-14 22:57:00 -04:00
mrfakename	0297be2541	Add GPU decorator to add compatibility for ZeroGPU	2024-10-14 15:00:19 -07:00
mrfakename	f534c93e9b	Sync to Hugging Face space	2024-10-14 14:38:58 -07:00
Yushen CHEN	3287ba5f18	Merge pull request #79 from fakerybakery/patch-1 Reorganize Gradio app	2024-10-15 04:34:33 +08:00
mrfakename	f6b1de2251	Reorganize Gradio app	2024-10-14 13:30:25 -07:00
SWivid	53c772584e	Update README.md	2024-10-15 04:08:58 +08:00
SWivid	12ef9d23f3	address #76 #29	2024-10-15 03:45:14 +08:00
SWivid	2cd03c9134	Update README.md. #76	2024-10-15 03:27:25 +08:00
Yushen CHEN	2b37f5d4ff	Merge pull request #77 from AWAS666/main Speed slider in gradio app	2024-10-15 03:24:41 +08:00
AWAS666	664533a0b3	Merge branch 'main' of https://github.com/SWivid/F5-TTS	2024-10-14 21:11:51 +02:00
AWAS666	ff4e797aab	feat: speed slider in gradio	2024-10-14 21:11:50 +02:00
SWivid	f2b892a61b	address #75	2024-10-15 02:37:39 +08:00
SWivid	e54fee3b7f	redirect to split hf ckpt repos	2024-10-15 02:12:20 +08:00
SWivid	372f6ab44e	address #74	2024-10-15 01:53:35 +08:00
Yushen CHEN	40687a54a6	Merge pull request #73 from jpgallegoar/main Added back parse_emotional_text	2024-10-15 01:39:51 +08:00
jpgallegoar	894acd3c43	Added back parse_emotional_text	2024-10-14 19:33:04 +02:00
SWivid	1cec6ddf34	minor fix	2024-10-15 01:28:11 +08:00
Yushen CHEN	b648e8b04a	Merge pull request #71 from jpgallegoar/main Added multiple speech types generation	2024-10-15 00:20:23 +08:00
Yushen CHEN	18736b7de3	Merge pull request #72 from chigkim/cli Fixed not saving output file when remove_silence is false.	2024-10-15 00:16:10 +08:00
jpgallegoar	3d2e8fd2d1	Added multiple speech types generation	2024-10-14 18:04:51 +02:00
SWivid	ac672f363d	Update README.md	2024-10-15 00:01:10 +08:00
Chi Kim	0f5fd5e13d	Fixed not saving output file when remove_silence is false.	2024-10-14 11:55:16 -04:00
SWivid	9d2b8cb3da	fix inference-cli; clean-up	2024-10-14 23:40:31 +08:00
Yushen CHEN	9ec24868a9	Merge pull request #67 from chigkim/cli Command Line Interface for Inference	2024-10-14 22:37:48 +08:00
Chi Kim	d393827475	Inference commandline interface.	2024-10-14 10:16:36 -04:00
SWivid	408075fa58	Update README.md; add python version, numpy<2.x instruct.	2024-10-14 12:37:22 +08:00
SWivid	ac36558bfd	Update README.md	2024-10-14 10:42:56 +08:00
SWivid	d3e15e3fd4	Update README.md	2024-10-14 10:35:19 +08:00
SWivid	e938b40bee	add more detailed instruct. on inference. address #49 #50	2024-10-14 10:15:40 +08:00
SWivid	ddb68eea89	minor fix	2024-10-14 01:18:46 +08:00
SWivid	49706f2ebc	address #43 #45	2024-10-14 01:10:54 +08:00
SWivid	615d183a0d	add code-switch friendly synth. and a smoother silence remover	2024-10-14 00:29:30 +08:00
Yushen CHEN	56222196b7	Merge pull request #38 from RootingInLoad/Batch-Inference&Podcast-Generation Batch Inference & Podcast Generation	2024-10-13 23:20:39 +08:00
RootingInLoad	30d2f0be16	Batch Inference & Podcast Generation	2024-10-13 16:35:27 +02:00
    Here's what the Batch Inference part does:
    - Tries to put as many characters as possible into one batch (200 max)
    - If that's not possible, it tries to cut at a semicolon
    - If that's not possible, it tries to cut at a comma
    - If that's not possible, it tries to cut after the most logical word (thus, therefore, etc.). There's a list at the top of the Gradio script, and it can be modified in Advanced Settings
    - If nothing above worked, it just goes past the 200-character limit (realistically, if your text isn't gibberish, this shouldn't happen :D)
    The Podcast Generation feature has these features built in:
    - Takes two reference speeches and two reference texts (or left empty and then transcribed automatically)
    - You have to give a name to each of the two speakers
    - You can then paste the podcast script, with one speaker's name followed by a semicolon and then their text; you can do the same with the other speaker, for as long as you want (because it uses the same batch inference as before)
    All in all, the batch inference feature allows for a little bit more than real-time inference. (I might do another pull request with real-time streaming.)
    Immense thanks to all of those who worked on this project, it's really great. There's of course still room for improvement, but I think this is a step forward in terms of OSS TTS, so thanks!
SWivid	46d391a876	fix replacement of ckpt keys when do finetune training	2024-10-13 17:20:18 +08:00
SWivid	0d7b47bc3b	enable correct ckpt loading for finetune	2024-10-13 14:41:08 +08:00
SWivid	83fbd34dc8	convert all input audio to mono	2024-10-13 13:39:16 +08:00
SWivid	68b4ce0f2b	minor fix	2024-10-13 12:58:42 +08:00
SWivid	9395289d7a	add ckpt load opt. for .safetensor	2024-10-13 10:55:18 +08:00
Zhikang Niu	edc189fa96	Update trainer.py	2024-10-13 10:04:13 +08:00

66 Commits
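
The batch-splitting rules described in commit 30d2f0be16 can be sketched roughly as follows. This is an illustrative reading of the commit message only, not the code from the repository: the function names, the `SPLIT_WORDS` list, and the choice to cut at the last matching separator inside the limit are all assumptions.

```python
import re

MAX_CHARS = 200                       # limit mentioned in the commit message
SPLIT_WORDS = ("thus", "therefore")   # assumed list; the real one lives in the Gradio script


def find_cut(text, max_chars=MAX_CHARS, split_words=SPLIT_WORDS):
    """Return the index at which to cut the next batch off the front of text."""
    if len(text) <= max_chars:
        return len(text)              # everything fits in one batch
    window = text[:max_chars]
    # Prefer a semicolon, then a comma, taking the last one inside the limit.
    for sep in (";", ","):
        pos = window.rfind(sep)
        if pos != -1:
            return pos + 1
    # Otherwise cut after the last connective word that ends inside the limit.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in split_words) + r")\b",
        re.IGNORECASE,
    )
    best = -1
    for match in pattern.finditer(text):
        if match.end() > max_chars:
            break
        best = match.end()
    if best > 0:
        return best
    # Nothing worked: go past the limit up to the next space (or the end).
    nxt = text.find(" ", max_chars)
    return len(text) if nxt == -1 else nxt


def split_into_batches(text, max_chars=MAX_CHARS, split_words=SPLIT_WORDS):
    """Greedily split text into batches following the rules above."""
    batches = []
    rest = text.strip()
    while rest:
        cut = find_cut(rest, max_chars, split_words)
        batches.append(rest[:cut].strip())
        rest = rest[cut:].strip()
    return batches
```

Calling `split_into_batches(script_text)` returns a list of strings that could each be synthesized separately and concatenated afterwards; a batch only exceeds the limit when no semicolon, comma, or listed word is found inside it.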