Jian Zhu | Charsiu

Charsiu is a phonetic alignment tool that can:

recognize phonemes in a given audio file;
perform forced alignment using phone transcriptions created in the previous step or provided by the user;
directly predict the phone-to-audio alignment from audio (text-independent alignment).
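
To make "phone-to-audio alignment" concrete, the sketch below shows one common way frame-level phone predictions can be collapsed into timed segments. It does not use Charsiu's own API: the function name `frames_to_segments`, the 10 ms frame duration, and the example labels are all assumptions chosen for illustration.

```python
from itertools import groupby


def frames_to_segments(frame_labels, frame_duration=0.01):
    """Collapse per-frame phone labels into (start, end, phone) intervals.

    frame_duration is in seconds; 10 ms is an assumed value, not one
    taken from Charsiu.
    """
    segments = []
    index = 0  # index of the first frame of the current run
    for phone, run in groupby(frame_labels):
        length = len(list(run))
        start = round(index * frame_duration, 3)
        end = round((index + length) * frame_duration, 3)
        segments.append((start, end, phone))
        index += length
    return segments


# Hypothetical frame-level predictions for a short clip.
frames = ["[SIL]", "[SIL]", "K", "K", "K", "AE", "AE", "T", "[SIL]"]
segments = frames_to_segments(frames)

for start, end, phone in segments:
    print(f"{start:.2f}\t{end:.2f}\t{phone}")
```

Each output row gives the start time, end time, and phone label of one segment, which is the general shape of the result an aligner produces whether or not a transcription was supplied.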

The Charsiu project is publicly available on GitHub.

Pretrained models are available on the 🤗 Hugging Face model hub.