Commit Graph

  • 2ae2c9bd9b Merge pull request #1298 from Kaihui-AMD/docs/update-amd-rocm-install main Yushen CHEN 2026-05-18 20:46:11 +08:00
  • 0556b2c1a5 update AMD ROCm install to 7.2 for RDNA 3.5/4 support Kaihui-AMD 2026-05-18 16:39:43 +08:00
  • 2f53ded68e Merge pull request #1294 from AAtomical/fix/path-traversal-finetune-gradio Yushen CHEN 2026-05-13 10:06:36 +08:00
  • 25dc4e8668 fix: path traversal in finetune Gradio handlers (closes #1293) sysy 2026-05-12 16:04:28 -04:00
  • 6f91022519 v1.1.20: refactor cache handling in DiT, MMDiT, and UNetT classes (lazyinit), to fix training bug (EMA deepcopy failure) 1.1.20 SWivid 2026-04-20 15:28:10 +08:00
  • 650c177b14 Merge pull request #1290 from will422-l/fix/gradio-version-cap Yushen CHEN 2026-04-17 12:01:39 +08:00
  • 761643b1ff fix: cap Gradio version to <6.11 to prevent UI freeze will422-l 2026-04-17 09:11:15 +08:00
  • 25874ca255 Bump version from 1.1.18 to 1.1.19 1.1.19 Yushen CHEN 2026-04-16 10:56:40 +08:00
  • 428050aa80 Fixes #1287 #1288 SWivid 2026-04-16 10:49:31 +08:00
  • 22299b38f7 Merge pull request #1285 from ZhikangNiu/main Yushen CHEN 2026-04-04 16:50:43 +08:00
  • 5486a158d4 reuse resamplers and cache vocos MelSpectrogram instances, it will reduce some training cost ZhikangNiu 2026-04-04 14:31:00 +08:00
  • 82fc4fe622 Bump version from 1.1.17 to 1.1.18 1.1.18 Yushen CHEN 2026-03-24 20:20:15 +08:00
  • 1a63dda3df Several fixes for utils_infer.py; separate streaming and non-streaming func and add back parallelism SWivid 2026-03-24 20:03:01 +08:00
  • 2414e3d492 Merge pull request #1281 from zhuxiaoxuhit/fix/remove-ineffective-threadpoolexecutor Yushen CHEN 2026-03-24 19:31:29 +08:00
  • 543fe4facf remove ineffective ThreadPoolExecutor in infer_batch_process zhuxiaoxu 2026-03-24 16:10:46 +08:00
  • deb5540edb Merge pull request #1280 from ZhikangNiu/main Yushen CHEN 2026-03-23 20:14:40 +08:00
  • a25de67cbd F5TTS v1 Small + LibriTTS training config ZhikangNiu 2026-03-23 16:14:33 +08:00
  • 623c96c294 Add Arabic model details to SHARED.md (#1279) Karim Ouda 2026-03-16 09:13:45 +01:00
  • 5005714c4c Remove pydantic<=2.10.6 restriction to fit latest gradio version SWivid 2026-03-07 19:37:59 +08:00
  • 4533426c72 Bump version from 1.1.16 to 1.1.17 1.1.17 Yushen CHEN 2026-03-04 19:34:03 +08:00
  • b5ab1afa16 Merge pull request #1270 from ZhikangNiu/main Zhikang Niu-SII 2026-03-04 19:31:52 +08:00
  • ab75dc2837 Merge pull request #1271 from mlxu995/patch-1 Yushen CHEN 2026-03-04 18:56:46 +08:00
  • 4361b0b94f Add show_info parameter to preprocess_ref_audio_text Menglong Xu 2026-03-04 17:04:20 +08:00
  • 097772c917 Merge pull request #1269 from ZhikangNiu/main Yushen CHEN 2026-02-27 01:20:09 +08:00
  • 76c00b127e when use flash_attn, log_sample should under autocast context ZhikangNiu 2026-02-25 08:10:00 +08:00
  • d7c7a117fa feat:add mmdit flash attn support ZhikangNiu 2026-02-23 20:39:35 +08:00
  • 54c50eb8f6 Bump version from 1.1.15 to 1.1.16 1.1.16 Yushen CHEN 2026-02-16 12:37:19 +08:00
  • 65250152da Merge pull request #1267 from QingyuLiu0521/qyl/pr-dit-only Yushen CHEN 2026-02-16 12:28:17 +08:00
  • c817d6a21d Unify seq_len naming in DiT get_input_embed QingyuLiu0521 2026-02-15 23:24:11 -05:00
  • 04459f71e6 Merge pull request #1266 from ZhikangNiu/main Yushen CHEN 2026-02-16 11:10:17 +08:00
  • 57dc698c16 Apply ruff formatting QingyuLiu0521 2026-02-15 21:41:17 -05:00
  • 6b6ce47d2e Optimize DiT text embedding with batched per-sample seq handling QingyuLiu0521 2026-02-15 21:31:19 -05:00
  • 6768b1bcff Make wandb project/run_name/resume_id configurable via Hydra yaml, backward compatible with defaults ZhikangNiu 2026-02-16 10:02:33 +08:00
  • ecfdccb890 Merge pull request #1265 from ZhikangNiu/main Yushen CHEN 2026-02-14 11:14:57 +08:00
  • bb5526fc5b Use torch.utils.checkpoint in mmdit forward loop when enabled to reduce memory usage. ZhikangNiu 2026-02-14 11:05:08 +08:00
  • 655fbca552 Update run_asr_wer method in utils_eval.py for compat with jiwer>=4.0.0 SWivid 2026-02-02 17:48:16 +08:00
  • fc0fa67a03 Update eval README with ctranslate2 installation instructions Yushen CHEN 2026-01-28 19:15:57 +08:00
  • a3c2ea9784 Merge pull request #1261 from ZhikangNiu/main Yushen CHEN 2026-01-26 18:58:47 +08:00
  • d71a69d528 Ignore padding at the end of the ground truth mel spectrogram when training sample ZhikangNiu 2026-01-26 09:40:37 +08:00
  • b9d923088c Increase default max_duration from 4096 to 65536 in cfm.py (#1260) Yushen CHEN 2026-01-25 23:58:21 +08:00
  • c279a2b7d5 Merge pull request #1256 from ZhikangNiu/main Yushen CHEN 2026-01-22 20:00:54 +08:00
  • 5d473e980c add tqdm in convert text to pinyin ZhikangNiu 2026-01-22 16:34:39 +08:00
  • 2aefa7c5f7 fix many tensorboard writer and only log in main_process ZhikangNiu 2026-01-22 13:36:09 +08:00
  • 97fdc7fbb4 change prepare_csv_wavs from relative path to absolute path and get duration info with soundfile and torchaudio ZhikangNiu 2026-01-22 12:27:23 +08:00
  • 1d2f7c5389 Formatting SWivid 2026-01-21 13:42:19 +00:00
  • 37a2633d35 Fix epoch updates count logic as drop_last Yushen CHEN 2026-01-21 21:36:57 +08:00
  • bc1df7a4fa Adding support for hf:// links on CLI (#1252) Raivis Dejus 2026-01-21 10:38:30 +02:00
  • eca786ee0c Merge pull request #1250 from raivisdejus/add-latvian-community-model Raivis Dejus 2026-01-17 13:29:23 +02:00
  • 27e20fcf39 Merge pull request #1242 from acadarmeria/fix-speech-edit-mel-domain Yushen CHEN 2025-12-26 17:35:57 +08:00
  • dff57ebd2a Fix speech editing boundary artifacts by working in mel domain acadarmeria 2025-12-26 08:38:07 +00:00
  • 46ccc575c5 v1.1.15 workaround for gr.Accordion default open=False bug (#1239) 1.1.15 SWivid 2025-12-21 15:06:44 +08:00
  • 39617fcf7a v1.1.12 bump gradio from 5.0 to 6.0, several fixes to ensure compatibility with new gradio version 1.1.12 SWivid 2025-12-20 18:44:43 +08:00
  • 5b82f97c26 fix #1239, use gradio>=6.0; add more clear instruction for ffmpeg installation (#1234) Yushen Chen 2025-12-20 16:08:13 +08:00
  • 9ae46c8360 Replace jieba pkg with rjieba - a jieba-rs Python binding 1.1.10 SWivid 2025-11-28 13:08:07 +00:00
  • 3eecd94baa support back avg upsampling for batch, cover up non-mask case SWivid 2025-11-09 11:56:03 +00:00
  • d9a69452ce formatting SWivid 2025-11-09 18:25:30 +08:00
  • bc15df2b57 Merge pull request #1212 from QingyuLiu0521/fix/AverageUpsampling Yushen CHEN 2025-11-09 18:23:38 +08:00
  • 9b2357a1b9 Fix Average Upsampling QingyuLiu0521 2025-11-08 18:39:06 -05:00
  • 1dcb4e10f7 Add torchcodec dependency to pyproject.toml Yushen CHEN 2025-11-03 16:44:11 +08:00
  • 529d856133 clean-up eval scripts SWivid 2025-10-27 14:38:57 +00:00
  • 7abadc4c72 fix typo in eval scripts SWivid 2025-10-26 14:28:17 +00:00
  • e67d50841e runtime trtllm: fix batch inference skipping last words in shorter sentences #1039 #1179 SWivid 2025-10-24 09:12:08 +00:00
  • 6b07fb03b2 clean-up ruff lint SWivid 2025-10-24 08:30:55 +00:00
  • a051a68552 pytorch imple. fix batch inference skipping last words in shorter sentences issue #1039 #1179 SWivid 2025-10-24 05:50:25 +00:00
  • f2a4f8581f Update runtime README Yushen CHEN 2025-10-22 08:37:32 +08:00
  • a17c5ae435 pytorch imple.: fix batch 1 inference from last commit SWivid 2025-10-22 00:31:56 +00:00
  • a0b8fb5df2 runtime trtllm: minor fixes. pytorch: update text_embedding logic to correct v0 batching. SWivid 2025-10-22 00:19:45 +00:00
  • c8bfc3aa3d runtime trtllm: support v1 and custom SWivid 2025-10-21 22:02:25 +00:00
  • 8d3ec72159 runtime trtllm: clean-up v0 code, several fixes. SWivid 2025-10-20 10:30:58 +00:00
  • 65ada48a62 set attn related default value for unet-t backbone: #1192 SWivid 2025-10-09 06:51:25 +00:00
  • 77d3ec623b v1.1.9 1.1.9 SWivid 2025-09-13 13:42:33 +08:00
  • 186799d6dc remove numpy<=1.26.4 for python_version>=3.11 #1162; update links SWivid 2025-09-13 13:40:55 +08:00
  • 31bb78f2ab Update badge links Yushen CHEN 2025-09-03 15:12:24 +08:00
  • e61824009a v1.1.8 1.1.8 SWivid 2025-08-28 12:33:37 +00:00
  • 06a74910bd add option for text embedding late average upsampling SWivid 2025-08-28 11:46:11 +00:00
  • ac3c43595c delete .github/workflows/sync-hf.yaml for online space stability Yushen CHEN 2025-08-27 06:52:18 +08:00
  • 605fa13b42 Fix raw.arrow missing rows (#1145) Jim 2025-07-22 13:38:44 +02:00
  • 5f35f27230 update pyproject.toml Yushen CHEN 2025-07-15 17:28:41 +08:00
  • c96c3aeed8 Update pyproject.toml 1.1.7 Yushen CHEN 2025-07-14 14:36:26 +08:00
  • 9b60fe6a34 update pyproject.toml, set gradio<=5.35.0 until fix #1126 Yushen CHEN 2025-07-14 14:29:19 +08:00
  • a275798a2f last fix patch-1 SWivid 2025-07-08 18:44:47 +08:00
  • efc7a7498b fix #1111 #1037 remove redundant unwrap_model for AcceleratedOptimizer; which has no attribute '_modules' thus conflict with has_compiled_regions check introduced in accelerate v1.7.0 SWivid 2025-07-08 18:39:43 +08:00
  • 9842314127 update slicer in finetune_gradio, legacy min_length 2s changed to 20s SWivid 2025-07-08 16:59:46 +08:00
  • 69b0e0110e v1.1.6 fla support, several changes for finetune and infer-cli 1.1.6 SWivid 2025-07-03 00:08:42 +08:00
  • 52c84776e5 fine-grained speed control for infer-cli. #1112 SWivid 2025-07-02 23:41:55 +08:00
  • ebbd7bd91f Update WAV File Naming and Dependencies 📝🔊 (#1091) Danh Tran 2025-06-24 22:23:00 +07:00
  • ac42286d04 update finetune_gradio.py, not to force lower case Yushen CHEN 2025-06-23 16:37:51 +08:00
  • d937efa6f3 fix finetune_gradio.py, not to force lower case Yushen CHEN 2025-06-23 16:22:33 +08:00
  • 8975fca803 Merge pull request #1084 from starkwj/main Yushen CHEN 2025-06-12 03:54:04 +08:00
  • 8b0053ad0c backward compatibility SWivid 2025-06-12 03:52:12 +08:00
  • b3ef4ed1d7 correct imple., minor fixes SWivid 2025-06-12 03:32:19 +08:00
  • b1a9438496 Batch cfg DiT forward starkwj 2025-06-11 09:03:30 +00:00
  • 0914170e98 Add flash_attn2 support attn_mask, minor fixes (#1066) Zhikang Niu 2025-06-11 12:14:32 +08:00
  • c6ebad0220 switch sync-hf workflow logic on release, avoid hidden space error with pypi/local_editable mismatch SWivid 2025-06-06 07:23:54 +08:00
  • cfaba6387f refresh hf-space first SWivid 2025-06-06 07:22:02 +08:00
  • 646f34b20f v1.1.5 pypi 1.1.5 SWivid 2025-06-06 07:08:59 +08:00
  • 2e2acc6ea2 Update: Empirically Pruned Step Sampling (#1077) Jerrister Zheng 2025-06-04 22:59:30 +08:00
  • 6fbe7592f5 rebase default sample_rate to 24khz for runtime SWivid 2025-06-04 11:22:31 +08:00
  • 7e37bc5d9a Fix the duration computation in triton_trtllm/client_grpc.py (#1071) Alice Yanagi 2025-06-04 11:18:00 +08:00
  • 35f130ee85 minor update for infer-gradio SWivid 2025-06-04 06:11:49 +08:00