Publication detail

Automatic Vocal Effort Detection for Reliable Speech Recognition

Original title

Automatic Vocal Effort Detection for Reliable Speech Recognition

Language

en

Original abstract

This paper describes an approach for enhancing the robustness of an isolated-word recognizer by extending its flexibility with respect to the speaker's variable vocal effort level. An analysis of the spectral properties of spoken vowels in four speaking modes (whispering, soft, normal, and loud) confirms consistent spectral tilt changes. The severe impact of vocal effort variability on the accuracy of a speaker-dependent word recognizer is presented, and an efficient remedial measure using a multiple-model framework paired with an accurate speech mode detector is proposed.

BibTeX


@inproceedings{BUT35916,
  author="Petr {Zelinka} and Milan {Sigmund}",
  title="Automatic Vocal Effort Detection for Reliable Speech Recognition",
  annote="This paper describes an approach for enhancing the robustness of an isolated-word recognizer by extending its flexibility with respect to the speaker's variable vocal effort level. An analysis of the spectral properties of spoken vowels in four speaking modes (whispering, soft, normal, and loud) confirms consistent spectral tilt changes. The severe impact of vocal effort variability on the accuracy of a speaker-dependent word recognizer is presented, and an efficient remedial measure using a multiple-model framework paired with an accurate speech mode detector is proposed.",
  booktitle="Proceedings of the 2010 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2010)",
  chapter="35916",
  howpublished="print",
  year="2010",
  month="september",
  pages="349--354",
  type="conference paper"
}