Publication details

Towards Lower Error Rates in Phoneme Recognition

Original title

Towards Lower Error Rates in Phoneme Recognition

Language

en

Original abstract

We investigate techniques for acoustic modeling in automatic recognition of context-independent phoneme strings from the TIMIT database. The baseline phoneme recognizer is based on TempoRAl Patterns (TRAP). This recognizer is simplified to shorten processing times and reduce computational requirements. More states per phoneme and bi-gram language models are incorporated into the system and evaluated. The question of insufficient amount of training data is discussed and the system is improved. All modifications lead to a faster system with about 23.6% relative improvement over the baseline in phoneme error rate.

BibTeX


@inproceedings{BUT17585,
  author="Petr {Schwarz} and Pavel {Matějka} and Jan {Černocký}",
  title="Towards Lower Error Rates in Phoneme Recognition",
  annote="We investigate techniques for acoustic modeling in automatic
recognition of context-independent phoneme strings from the TIMIT
database. The baseline phoneme recognizer is based on TempoRAl Patterns
(TRAP). This recognizer is simplified to shorten processing times and
reduce computational requirements. More states per phoneme and bi-gram
language models are incorporated into the system and evaluated. The
question of insufficient amount of training data is discussed and the
system is improved. All modifications lead to a faster system with
about 23.6% relative improvement over the baseline in phoneme error
rate.",
  address="Springer Verlag",
  booktitle="Proceedings of 7th International Conference Text, Speech and Dialogue 2004",
  chapter="17585",
  institution="Springer Verlag",
  year="2004",
  month="september",
  pages="465",
  publisher="Springer Verlag",
  type="conference paper"
}