Publication detail
Automatic Language Identification using Phoneme and Automatically Derived Unit Strings
MATĚJKA, P., SZŐKE, I., SCHWARZ, P., ČERNOCKÝ, J.
Original Title
Automatic Language Identification using Phoneme and Automatically Derived Unit Strings
English Title
Automatic Language Identification using Phoneme and Automatically Derived Unit Strings
Type
conference paper
Language
en
Original Abstract
Language identification (LID) based on phono-tactic modeling is presented in this paper. Approaches using phoneme strings and strings of units automatically derived by an Ergodic HMM (EHMM) are compared. The phoneme recognizers were trained on 6 languages from OGI multi-language-corpus and Czech SpeechDat-E. The LID results are obtained on 4 languages. The results show superiority of Czech phoneme recognizer while used in LID and promising trends using the EHMM-derived units.
English abstract
Language identification (LID) based on phonotactic modeling is presented in this paper. Approaches using phoneme strings and strings of units automatically derived by an Ergodic HMM (EHMM) are compared. The phoneme recognizers were trained on 6 languages from the OGI multi-language corpus and on Czech SpeechDat-E. The LID results are obtained on 4 languages. The results show the superiority of the Czech phoneme recognizer when used for LID and promising trends for the EHMM-derived units.
Keywords
language identification, phoneme recognizer, speech processing, ergodic hidden Markov model
RIV year
2004
Released
08.09.2004
Publisher
Springer
Location
Brno
ISBN
3-540-23049-1
Book
Proceedings of 7th International Conference Text, Speech and Dialogue 2004
Pages from
147
Pages to
154
Pages count
8
BibTeX
@inproceedings{BUT11955,
author="Pavel {Matějka} and Igor {Szőke} and Petr {Schwarz} and Jan {Černocký}",
title="Automatic Language Identification using Phoneme and Automatically Derived Unit Strings",
annote="Language identification (LID) based on phono-tactic modeling is presented in this paper.
Approaches using phoneme strings and strings of units automatically derived by an Ergodic HMM
(EHMM) are compared. The phoneme recognizers were trained on 6 languages from OGI
multi-language-corpus and Czech SpeechDat-E. The LID results are obtained on 4 languages. The
results show superiority of Czech phoneme recognizer while used in LID and promising trends using the EHMM-derived units.",
address="Brno",
booktitle="Proceedings of 7th International Conference Text, Speech and Dialogue 2004",
chapter="11955",
institution="Springer",
year="2004",
month="september",
pages="147--154",
publisher="Springer",
type="conference paper"
}