Charsiu is a phonetic alignment tool, which can:
- recognize phonemes in a given audio file;
- perform forced alignment using phone transcriptions created in the previous step or provided by the user;
- directly predict the phone-to-audio alignment from audio (text-independent alignment).
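As a rough illustration of what a phone-to-audio alignment looks like, the sketch below collapses a sequence of per-frame phone labels into timed segments. It does not use Charsiu's API: the function name `frames_to_segments`, the 10 ms frame step, and the example labels are all assumptions made for illustration only.

```python
from itertools import groupby


def frames_to_segments(frame_labels, frame_ms=10):
    """Collapse per-frame phone labels into (start_ms, end_ms, phone) segments.

    frame_ms is an assumed frame step, not a value taken from Charsiu.
    """
    segments = []
    index = 0  # index of the first frame in the current run
    for phone, run in groupby(frame_labels):
        length = len(list(run))  # number of consecutive frames with this label
        segments.append((index * frame_ms, (index + length) * frame_ms, phone))
        index += length
    return segments


# Hypothetical frame-level predictions for a short utterance.
frames = ["sil", "sil", "HH", "HH", "HH", "AY", "AY", "sil"]
for start_ms, end_ms, phone in frames_to_segments(frames):
    print(f"{start_ms}\t{end_ms}\t{phone}")
```

Each output row gives a start time, an end time, and a phone label, which is the general shape of an alignment regardless of whether the labels come from forced alignment or from text-independent prediction.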
The Charsiu project is publicly available on GitHub.
Pretrained models are available on the 🤗 Hugging Face model hub.