F5-TTS

mirror of https://github.com/SWivid/F5-TTS.git synced 2026-03-12 21:02:50 -07:00

Author	SHA1	Message	Date
Yushen CHEN	c279a2b7d5	Merge pull request #1256 from ZhikangNiu/main change prepare_csv_wavs from relative path to absolute path and get duration info with soundfile and torchaudio	2026-01-22 20:00:54 +08:00
ZhikangNiu	5d473e980c	add tqdm in convert text to pinyin	2026-01-22 16:34:39 +08:00
ZhikangNiu	2aefa7c5f7	fix many tensorboard writer and only log in main_process	2026-01-22 13:36:09 +08:00
ZhikangNiu	97fdc7fbb4	change prepare_csv_wavs from relative path to absolute path and get duration info with soundfile and torchaudio	2026-01-22 12:27:23 +08:00
SWivid	1d2f7c5389	Formatting	2026-01-21 13:42:19 +00:00
Yushen CHEN	37a2633d35	Fix epoch updates count logic as drop_last	2026-01-21 21:36:57 +08:00
Raivis Dejus	bc1df7a4fa	Adding support for hf:// links on CLI (#1252)	2026-01-21 16:38:30 +08:00
Raivis Dejus	eca786ee0c	Merge pull request #1250 from raivisdejus/add-latvian-community-model Adding Latvian model to shared community models list	2026-01-17 19:29:23 +08:00
Yushen CHEN	27e20fcf39	Merge pull request #1242 from acadarmeria/fix-speech-edit-mel-domain Fix speech editing boundary artifacts by working in mel domain	2025-12-26 17:35:57 +08:00
acadarmeria	dff57ebd2a	Fix speech editing boundary artifacts by working in mel domain. Previously, speech_edit.py worked in the wav domain (inserting zeros into the waveform before computing the mel spectrogram), which caused boundary artifacts when mel spectrogram windows straddled zeros and real audio. This commit refactors the approach to work in the mel domain: (1) compute the mel spectrogram on the clean original audio first; (2) insert zero frames in the mel domain instead of zero samples in the wav domain; (3) use frame-level granularity throughout for consistency. Benefits: eliminates boundary artifacts; more consistent behavior regardless of small float variations in input times; cleaner edit boundaries. Changes to speech_edit.py (lines 148-220): convert audio to mel using model.mel_spec() before editing; build mel_cond by concatenating original mel frames + zero frames; calculate all time-based values at frame level first, then convert to samples; pass mel_cond directly to model.sample() instead of raw audio.	2025-12-26 08:49:57 +00:00
SWivid	46ccc575c5	v1.1.15 workaround for gr.Accordion default open=False bug (#1239) 1.1.15	2025-12-21 15:06:44 +08:00
SWivid	39617fcf7a	v1.1.12 bump gradio from 5.0 to 6.0, several fixes to ensure compatibility with new gradio version 1.1.12	2025-12-20 18:44:43 +08:00
Yushen Chen	5b82f97c26	fix #1239, use gradio>=6.0; add clearer instructions for ffmpeg installation (#1234)	2025-12-20 16:08:13 +08:00
SWivid	9ae46c8360	Replace jieba pkg with rjieba - a jieba-rs Python binding 1.1.10	2025-11-28 13:08:07 +00:00
SWivid	3eecd94baa	support back avg upsampling for batch, cover up non-mask case	2025-11-09 11:56:03 +00:00
SWivid	d9a69452ce	formatting	2025-11-09 18:25:30 +08:00
Yushen CHEN	bc15df2b57	Merge pull request #1212 from QingyuLiu0521/fix/AverageUpsampling Fix Average Upsampling conflict logic, introduced from the previous batch inference fix.	2025-11-09 18:23:38 +08:00
QingyuLiu0521	9b2357a1b9	Fix Average Upsampling	2025-11-08 18:39:06 -05:00
Yushen CHEN	1dcb4e10f7	Add torchcodec dependency to pyproject.toml	2025-11-03 16:44:11 +08:00
SWivid	529d856133	clean-up eval scripts	2025-10-27 14:38:57 +00:00
SWivid	7abadc4c72	fix typo in eval scripts	2025-10-26 14:28:17 +00:00
SWivid	e67d50841e	runtime trtllm: fix batch inference skipping last words in shorter sentences #1039 #1179	2025-10-24 09:12:08 +00:00
SWivid	6b07fb03b2	clean-up ruff lint	2025-10-24 08:30:55 +00:00
SWivid	a051a68552	pytorch imple. fix batch inference skipping last words in shorter sentences issue #1039 #1179	2025-10-24 05:50:25 +00:00
Yushen CHEN	f2a4f8581f	Update runtime README	2025-10-22 08:37:32 +08:00
SWivid	a17c5ae435	pytorch imple.: fix batch 1 inference from last commit	2025-10-22 00:31:56 +00:00
SWivid	a0b8fb5df2	runtime trtllm: minor fixes. pytorch: update text_embedding logic to correct v0 batching.	2025-10-22 00:19:45 +00:00
SWivid	c8bfc3aa3d	runtime trtllm: support v1 and custom	2025-10-21 22:02:25 +00:00
SWivid	8d3ec72159	runtime trtllm: clean-up v0 code, several fixes.	2025-10-20 10:30:58 +00:00
SWivid	65ada48a62	set attn related default value for unet-t backbone: #1192	2025-10-09 06:51:25 +00:00
SWivid	77d3ec623b	v1.1.9 1.1.9	2025-09-13 13:42:33 +08:00
SWivid	186799d6dc	remove numpy<=1.26.4 for python_version>=3.11 #1162; update links	2025-09-13 13:40:55 +08:00
Yushen CHEN	31bb78f2ab	Update badge links	2025-09-03 15:12:24 +08:00
SWivid	e61824009a	v1.1.8 1.1.8	2025-08-28 12:33:37 +00:00
SWivid	06a74910bd	add option for text embedding late average upsampling	2025-08-28 11:46:11 +00:00
Yushen CHEN	ac3c43595c	delete .github/workflows/sync-hf.yaml for online space stability	2025-08-27 06:52:18 +08:00
Jim	605fa13b42	Fix raw.arrow missing rows (#1145). Co-authored-by: SWivid <swivid@qq.com>	2025-07-22 19:38:44 +08:00
Yushen CHEN	5f35f27230	update pyproject.toml	2025-07-15 17:28:41 +08:00
Yushen CHEN	c96c3aeed8	Update pyproject.toml 1.1.7	2025-07-14 14:36:26 +08:00
Yushen CHEN	9b60fe6a34	update pyproject.toml, set gradio<=5.35.0 until fix #1126	2025-07-14 14:29:19 +08:00
SWivid	a275798a2f	last fix patch-1	2025-07-08 18:44:47 +08:00
SWivid	efc7a7498b	fix #1111 #1037 remove redundant unwrap_model for AcceleratedOptimizer; which has no attribute '_modules' thus conflict with has_compiled_regions check introduced in accelerate v1.7.0	2025-07-08 18:39:43 +08:00
SWivid	9842314127	update slicer in finetune_gradio, legacy min_length 2s changed to 20s	2025-07-08 16:59:46 +08:00
SWivid	69b0e0110e	v1.1.6 fla support, several changes for finetune and infer-cli 1.1.6	2025-07-03 00:08:42 +08:00
SWivid	52c84776e5	fine-grained speed control for infer-cli. #1112	2025-07-02 23:41:55 +08:00
Danh Tran	ebbd7bd91f	Update WAV File Naming and Dependencies 📝🔊 (#1091). Update infer_cli.py; update pyproject.toml; formalized. Co-authored-by: SWivid <swivid@qq.com>	2025-06-24 23:23:00 +08:00
Yushen CHEN	ac42286d04	update finetune_gradio.py, not to force lower case Not to force lower case, otherwise train infer mismatch with main infer code	2025-06-23 16:37:51 +08:00
Yushen CHEN	d937efa6f3	fix finetune_gradio.py, not to force lower case	2025-06-23 16:22:33 +08:00
Yushen CHEN	8975fca803	Merge pull request #1084 from starkwj/main Speedup inference by batching CFG in DiT	2025-06-12 03:54:04 +08:00
SWivid	8b0053ad0c	backward compatibility	2025-06-12 03:52:12 +08:00
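
Note on commit dff57ebd2a: its message describes building the conditioning mel by keeping the original frames outside each edited region and inserting zero frames for the region to regenerate, with all times converted to frame indices first. The following is only a rough sketch of that idea, not the code in speech_edit.py. The helper names (`seconds_to_frames`, `build_mel_cond`), the sample rate of 24000, the hop length of 256, and the mask convention (True means "keep original frame") are all assumptions made for illustration, and plain Python lists stand in for tensors.

```python
def seconds_to_frames(t, sample_rate=24000, hop_length=256):
    """Convert a time in seconds to a mel frame index (assumed rate and hop)."""
    return round(t * sample_rate / hop_length)


def build_mel_cond(mel, parts_to_edit, fix_durations,
                   sample_rate=24000, hop_length=256):
    """Hypothetical helper: splice zero frames into a mel spectrogram.

    mel            list of frames, each a list of n_mels floats
    parts_to_edit  list of (start_s, end_s) regions to regenerate, in order
    fix_durations  target duration in seconds for each regenerated region
    Returns (mel_cond, edit_mask); edit_mask[i] is True where the original
    frame was kept and False where a zero frame was inserted.
    """
    n_mels = len(mel[0]) if mel else 0
    mel_cond, edit_mask = [], []
    cursor = 0  # next original frame not yet copied

    for (start_s, end_s), fix_s in zip(parts_to_edit, fix_durations):
        # Work at frame level from the start, clamped to the clip length.
        start_f = min(seconds_to_frames(start_s, sample_rate, hop_length), len(mel))
        end_f = min(seconds_to_frames(end_s, sample_rate, hop_length), len(mel))
        gap_f = seconds_to_frames(fix_s, sample_rate, hop_length)

        # Keep the untouched original frames before the edited region.
        kept = mel[cursor:start_f]
        mel_cond.extend(kept)
        edit_mask.extend([True] * len(kept))

        # Insert zero frames in the mel domain for the region to regenerate.
        mel_cond.extend([[0.0] * n_mels for _ in range(gap_f)])
        edit_mask.extend([False] * gap_f)

        cursor = max(cursor, end_f)

    # Keep whatever original audio follows the last edited region.
    tail = mel[cursor:]
    mel_cond.extend(tail)
    edit_mask.extend([True] * len(tail))
    return mel_cond, edit_mask
```

Because the zeros are inserted after the mel spectrogram is computed, no analysis window ever spans both real audio and inserted silence, which is the boundary effect the commit message says it removes.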


656 Commits