A neural phonetic aligner for automatic phone segmentation

Charsiu is a phonetic alignment tool that can:

  • recognize phonemes in a given audio file;
  • perform forced alignment using phone transcriptions created in the previous step or provided by the user;
  • directly predict the phone-to-audio alignment from audio (text-independent alignment).
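To make the difference between forced alignment and text-independent alignment concrete, here is a minimal pure-Python sketch. It is not Charsiu's implementation or API: the function names (`runs`, `text_independent_align`, `forced_align`), the 10 ms frame length, and the toy posterior values are all assumptions made for illustration. It assumes some model has already produced a per-frame log-probability for each phone.

```python
from math import inf, log

def runs(values):
    """Yield (start, end, value) for each run of equal values; end is exclusive."""
    start = 0
    for i in range(1, len(values) + 1):
        if i == len(values) or values[i] != values[start]:
            yield start, i, values[start]
            start = i

def text_independent_align(log_probs, phones, frame_sec=0.01):
    """Label each frame with its most likely phone, then merge identical neighbours."""
    best = [max(range(len(row)), key=row.__getitem__) for row in log_probs]
    return [(round(s * frame_sec, 3), round(e * frame_sec, 3), phones[p])
            for s, e, p in runs(best)]

def forced_align(log_probs, phones, transcript, frame_sec=0.01):
    """Monotonic Viterbi search: frames are assigned to the transcript phones
    in order, and every transcript phone receives at least one frame."""
    idx = [phones.index(p) for p in transcript]
    T, N = len(log_probs), len(idx)
    if N == 0 or T < N:
        raise ValueError("need a non-empty transcript and at least one frame per phone")
    score = [[-inf] * N for _ in range(T)]
    back = [[0] * N for _ in range(T)]
    score[0][0] = log_probs[0][idx[0]]
    for t in range(1, T):
        for j in range(N):
            stay = score[t - 1][j]                        # remain in the same phone
            move = score[t - 1][j - 1] if j > 0 else -inf  # advance to the next phone
            if move > stay:
                best, back[t][j] = move, j - 1
            else:
                best, back[t][j] = stay, j
            if best > -inf:
                score[t][j] = best + log_probs[t][idx[j]]
    # Trace back from the last phone at the last frame.
    path, j = [0] * T, N - 1
    for t in range(T - 1, -1, -1):
        path[t] = j
        j = back[t][j]
    return [(round(s * frame_sec, 3), round(e * frame_sec, 3), transcript[k])
            for s, e, k in runs(path)]

# Toy input: made-up posteriors for 6 frames over 4 phones (illustrative only).
PHONES = ["sil", "k", "ae", "t"]
peaks = [0, 1, 2, 2, 3, 0]  # index of the dominant phone in each frame
TOY = [[log(0.7) if p == peak else log(0.1) for p in range(4)] for peak in peaks]

print(text_independent_align(TOY, PHONES))
print(forced_align(TOY, PHONES, ["k", "ae", "t"]))
```

In this toy case the text-independent path keeps the leading and trailing `sil` frames as their own segments, while the forced path has to spread all six frames over the three transcript phones, so the silence frames are absorbed into `k` and `t`.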

The Charsiu project is publicly available on GitHub.

Pretrained models are available on the 🤗 Hugging Face model hub.