696 Commits

Author SHA1 Message Date
AWAS666 ff4e797aab feat: speed slider in gradio 2024-10-14 21:11:50 +02:00
SWivid f2b892a61b address #75 2024-10-15 02:37:39 +08:00
SWivid e54fee3b7f redirect to split hf ckpt repos 2024-10-15 02:12:20 +08:00
SWivid 372f6ab44e address #74 2024-10-15 01:53:35 +08:00
Yushen CHEN 40687a54a6 Merge pull request #73 from jpgallegoar/main
Added back parse_emotional_text
2024-10-15 01:39:51 +08:00
jpgallegoar 894acd3c43 Added back parse_emotional_text 2024-10-14 19:33:04 +02:00
SWivid 1cec6ddf34 minor fix 2024-10-15 01:28:11 +08:00
Yushen CHEN b648e8b04a Merge pull request #71 from jpgallegoar/main
Added multiple speech types generation
2024-10-15 00:20:23 +08:00
Yushen CHEN 18736b7de3 Merge pull request #72 from chigkim/cli
Fixed not saving output file when remove_silence is false.
2024-10-15 00:16:10 +08:00
jpgallegoar 3d2e8fd2d1 Added multiple speech types generation 2024-10-14 18:04:51 +02:00
SWivid ac672f363d Update README.md 2024-10-15 00:01:10 +08:00
Chi Kim 0f5fd5e13d Fixed not saving output file when remove_silence is false. 2024-10-14 11:55:16 -04:00
SWivid 9d2b8cb3da fix inference-cli; clean-up 2024-10-14 23:40:31 +08:00
Yushen CHEN 9ec24868a9 Merge pull request #67 from chigkim/cli
Command Line Interface for Inference
2024-10-14 22:37:48 +08:00
Chi Kim d393827475 Inference commandline interface. 2024-10-14 10:16:36 -04:00
SWivid 408075fa58 Update README.md; add python version, numpy<2.x instruct. 2024-10-14 12:37:22 +08:00
SWivid ac36558bfd Update README.md 2024-10-14 10:42:56 +08:00
SWivid d3e15e3fd4 Update README.md 2024-10-14 10:35:19 +08:00
SWivid e938b40bee add more detailed instruct. on inference. address #49 #50 2024-10-14 10:15:40 +08:00
SWivid ddb68eea89 minor fix 2024-10-14 01:18:46 +08:00
SWivid 49706f2ebc address #43 #45 2024-10-14 01:10:54 +08:00
SWivid 615d183a0d add code-switch friendly synth. and a smoother silence remover 2024-10-14 00:29:30 +08:00
Yushen CHEN 56222196b7 Merge pull request #38 from RootingInLoad/Batch-Inference&Podcast-Generation
Batch Inference & Podcast Generation
2024-10-13 23:20:39 +08:00
RootingInLoad 30d2f0be16 Batch Inference & Podcast Generation
Here's what the Batch Inference part does:

- Try to fit as many characters as possible into one batch (200 max)
- If that's not possible, it tries to cut at a semicolon
- If that's not possible, it tries to cut at a comma
- If that's not possible, it tries to cut after the most logical word (thus, therefore, etc.). There's a list of these words at the top of the Gradio script, and it can be modified in Advanced Settings
- If none of the above works, it simply goes past the 200-character limit (realistically, if your text isn't gibberish, this shouldn't happen :D)
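
The fallback order above can be sketched roughly as follows. This is a minimal illustration, not the code from the Gradio script: the function name `split_into_batches`, the helper `_last_match_end`, the default `CONNECTORS` list, and the initial attempt to cut at a sentence end are all assumptions made for the example.

```python
import re

# Hypothetical defaults; the real list lives at the top of the Gradio script.
CONNECTORS = ("thus", "therefore", "however", "because")


def _last_match_end(pattern, window):
    """Return the end index of the last match of pattern in window, or -1."""
    end = -1
    for match in re.finditer(pattern, window, flags=re.IGNORECASE):
        end = match.end()
    return end


def split_into_batches(text, max_chars=200, connectors=CONNECTORS):
    """Greedily split text into batches of at most max_chars where possible."""
    # Cut preferences, strongest first. The sentence-end step is an assumption;
    # the semicolon, comma and connector-word steps follow the description above.
    patterns = [r"[.!?]", r";", r","]
    if connectors:
        patterns.append(r"\b(?:" + "|".join(map(re.escape, connectors)) + r")\b")

    batches = []
    rest = text.strip()
    while len(rest) > max_chars:
        window = rest[:max_chars]
        cut = -1
        for pattern in patterns:
            cut = _last_match_end(pattern, window)
            if cut > 0:
                break
        if cut <= 0:
            # Nothing matched: go past the limit, up to the next space (or the end).
            space = rest.find(" ", max_chars)
            cut = len(rest) if space == -1 else space
        batches.append(rest[:cut].strip())
        rest = rest[cut:].strip()
    if rest:
        batches.append(rest)
    return batches
```

Each batch ends at the last preferred cut point found inside the first `max_chars` characters, so a batch only exceeds the limit when the window contains no punctuation and no connector word at all.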

The Podcast Generation feature works as follows:
- Takes two reference speeches and two reference texts (a reference text left empty is transcribed automatically)
- You have to give a name to each of the two speakers
- You can then paste the podcast script: each turn is one speaker's name followed by a semicolon and then their text. The script can be as long as you want, because it uses the same batch inference as above
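
A script in this format could be split into speaker turns along these lines. This is only a sketch under stated assumptions: the function name `parse_podcast_script` is hypothetical, each turn is assumed to start on its own line, and a line that does not start with a known speaker name is treated as a continuation of the previous turn. The separator is a parameter and defaults to the semicolon mentioned above.

```python
def parse_podcast_script(script, speakers, separator=";"):
    """Split a podcast script into a list of (speaker, text) turns."""
    turns = []
    for raw_line in script.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        # Split at the first separator only, so the spoken text may contain it too.
        name, sep, spoken = line.partition(separator)
        if sep and name.strip() in speakers:
            turns.append((name.strip(), spoken.strip()))
        elif turns:
            # Assumed behaviour: unlabelled lines continue the previous turn.
            prev_name, prev_text = turns[-1]
            turns[-1] = (prev_name, f"{prev_text} {line}")
        else:
            raise ValueError(f"Script must start with a speaker name: {line!r}")
    return turns


example = """Alice; Hello there.
Bob; Hi, Alice; how are you?
Still Bob speaking.
"""
turns = parse_podcast_script(example, speakers={"Alice", "Bob"})
```

Each resulting turn can then be synthesized with the matching speaker's reference audio, with long turns going through the same batching step as any other text.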

All in all, the batch inference feature allows for slightly faster than real-time inference. (I might open another pull request with real-time streaming.)

Immense thanks to everyone who worked on this project; it's really great. There's of course still room for improvement, but I think this is a step forward for OSS TTS, so thanks!
2024-10-13 16:35:27 +02:00
SWivid 46d391a876 fix replacement of ckpt keys when do finetune training 2024-10-13 17:20:18 +08:00
SWivid 0d7b47bc3b enable correct ckpt loading for finetune 2024-10-13 14:41:08 +08:00
SWivid 83fbd34dc8 convert all input audio to mono 2024-10-13 13:39:16 +08:00
SWivid 68b4ce0f2b minor fix 2024-10-13 12:58:42 +08:00
SWivid 9395289d7a add ckpt load opt. for .safetensor 2024-10-13 10:55:18 +08:00
Zhikang Niu edc189fa96 Update trainer.py 2024-10-13 10:04:13 +08:00
SWivid 77abf4e98a separate torch pkg install 2024-10-13 09:46:03 +08:00
Yushen CHEN 0e2d4e866b Merge pull request #19 from fakerybakery/main
Add Gradio app, MPS support
2024-10-13 09:19:22 +08:00
Yushen CHEN 3180d25452 Merge pull request #17 from Mateleo/patch-1
Use https instead of ssh
2024-10-13 09:19:09 +08:00
mrfakename 3365e96075 Add Gradio app, MPS support 2024-10-12 14:36:15 -07:00
Mateleo 93af1f939c use https instead of ssh
ssh git cloning will trigger an error
2024-10-12 18:24:30 +02:00
SWivid ed0b71aa70 Update README.md 2024-10-11 22:08:53 +08:00
Zhikang Niu 09e398ff4e Update README.md 2024-10-11 21:32:18 +08:00
Zhikang Niu c01b988360 Update README.md 2024-10-11 21:26:51 +08:00
SWivid a621c223ec add speech edit test script 2024-10-11 00:41:23 +08:00
SWivid 39ce201c4e disable mask for single infer to save mem; add custom trans for vocab to address oov 2024-10-10 17:05:39 +08:00
Yushen CHEN f6e3b782c4 Update README.md, add paper link 2024-10-10 12:11:14 +08:00
Yushen CHEN 2fae7c0b13 Update README.md for some instruct. on single inference 2024-10-10 11:29:58 +08:00
Yushen CHEN b22fe71ef1 Update LICENSE, switch to MIT 2024-10-10 09:45:55 +08:00
SWivid a6938d56c6 add demo page link 2024-10-08 22:39:18 +08:00
Yushen CHEN 406a7923d9 add suppl. 2024-10-08 22:07:39 +08:00
SWivid 074881635d basic 2024-10-08 21:56:51 +08:00