TACOTRON2_GRIFFINLIM_PHONE_LJSPEECH

torchaudio.pipelines.TACOTRON2_GRIFFINLIM_PHONE_LJSPEECH

Phoneme-based TTS pipeline that pairs a Tacotron2 model, trained on the LJSpeech dataset [Ito and Johnson, 2017] for 1,500 epochs, with GriffinLim as the vocoder.

The text processor encodes the input text as phonemes. It uses DeepPhonemizer to convert graphemes to phonemes. The DeepPhonemizer model (en_us_cmudict_forward) was trained on CMUDict.
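The phoneme encoding step can be sketched as follows. This is a minimal illustration, not part of the original page: the helper name `encode_to_phonemes` is hypothetical, and it assumes a torchaudio release that ships `torchaudio.pipelines`, the `deep_phonemizer` package, and network access for the first download of the phonemizer checkpoint.

```python
def encode_to_phonemes(text, processor=None):
    """Return (token_ids, phoneme_symbols) for a single input string.

    `encode_to_phonemes` is a hypothetical helper written for illustration.
    """
    if processor is None:
        # Imported lazily so that defining the helper does not require torchaudio.
        import torchaudio

        bundle = torchaudio.pipelines.TACOTRON2_GRIFFINLIM_PHONE_LJSPEECH
        # Needs the deep_phonemizer package; downloads en_us_cmudict_forward on first use.
        processor = bundle.get_text_processor()

    # The processor returns a batch of padded token IDs and the valid length of each row.
    token_ids, lengths = processor(text)
    n = int(lengths[0])
    ids = [int(i) for i in token_ids[0][:n]]

    # processor.tokens maps each token ID back to its phoneme symbol.
    symbols = [processor.tokens[i] for i in ids]
    return ids, symbols
```

Calling `encode_to_phonemes("Hello world!")` should return the integer IDs passed to Tacotron2 together with the corresponding phoneme symbols, which is a convenient way to inspect how a sentence was phonemized.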

The training script is available in the torchaudio repository. The text processor is set to “english_phonemes”.
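Putting the pieces together, a hedged end-to-end sketch of running this bundle might look like the following. The function name `synthesize` is hypothetical, and the code assumes torch, a torchaudio release that ships `torchaudio.pipelines`, the `deep_phonemizer` package, and network access to download the pretrained weights on first use.

```python
def synthesize(text):
    """Synthesize speech for `text`; returns (waveform, sample_rate).

    `synthesize` is a hypothetical helper written for illustration.
    """
    # Imported lazily so that defining the helper does not require torch/torchaudio.
    import torch
    import torchaudio

    bundle = torchaudio.pipelines.TACOTRON2_GRIFFINLIM_PHONE_LJSPEECH

    processor = bundle.get_text_processor()  # text -> phoneme token IDs
    tacotron2 = bundle.get_tacotron2()       # token IDs -> mel spectrogram
    vocoder = bundle.get_vocoder()           # mel spectrogram -> waveform (GriffinLim)
    tacotron2.eval()

    with torch.inference_mode():
        tokens, lengths = processor(text)
        spec, spec_lengths, _ = tacotron2.infer(tokens, lengths)
        waveforms, _ = vocoder(spec, spec_lengths)

    # Keep the batch dimension so the result has shape (1, num_samples).
    return waveforms[0:1], vocoder.sample_rate
```

For example, `waveform, sr = synthesize("Hello world! T T S stands for Text to Speech!")` should yield a tensor that can be written to disk with `torchaudio.save("output.wav", waveform, sr)`, provided a suitable audio backend is installed.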

Example - “Hello world! T T S stands for Text to Speech!”

Example - “The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired,”
