Publication detail

Automatic Language Identification using Phoneme and Automatically Derived Unit Strings

MATĚJKA, P., SZŐKE, I., SCHWARZ, P., ČERNOCKÝ, J.

Original Title

Automatic Language Identification using Phoneme and Automatically Derived Unit Strings

Type

journal article - other

Language

English

Original Abstract

This paper presents language identification (LID) based on phonotactic modeling. Approaches using phoneme strings and strings of units automatically derived by an Ergodic HMM (EHMM) are compared. The phoneme recognizers were trained on 6 languages from the OGI multi-language corpus and on Czech SpeechDat-E. The LID results are obtained on 4 languages. The results show the superiority of the Czech phoneme recognizer when used for LID and promising trends for the EHMM-derived units.

Keywords

language identification, phoneme recognizer, speech processing, ergodic hidden Markov model

Authors

MATĚJKA, P., SZŐKE, I., SCHWARZ, P., ČERNOCKÝ, J.

RIV year

2004

Released

8. 9. 2004

ISSN

0302-9743

Periodical

Lecture Notes in Computer Science

Volume

2004

Number

3206

Country

Federal Republic of Germany

Pages from

147

Pages to

154

Pages count

8

URL

http://www.springerlink.com/index/CUFLYEGQA8W1LNBE

BibTex

@article{BUT45738,
  author="Pavel {Matějka} and Igor {Szőke} and Petr {Schwarz} and Jan {Černocký}",
  title="Automatic Language Identification using Phoneme and Automatically Derived Unit Strings",
  journal="Lecture Notes in Computer Science",
  year="2004",
  volume="2004",
  number="3206",
  pages="147--154",
  issn="0302-9743",
  url="http://www.springerlink.com/index/CUFLYEGQA8W1LNBE"
}
