Publication detail

Automatic Vocal Effort Detection for Reliable Speech Recognition

ZELINKA, P.; SIGMUND, M.

Original Title

Automatic Vocal Effort Detection for Reliable Speech Recognition

Type

conference paper

Language

en

Original Abstract

This paper describes an approach for enhancing the robustness of an isolated-word recognizer by extending its flexibility with respect to the speaker's variable vocal effort level. An analysis of the spectral properties of spoken vowels in four speaking modes (whispering, soft, normal, and loud) confirms consistent spectral tilt changes. The severe impact of vocal effort variability on the accuracy of a speaker-dependent word recognizer is presented, and an efficient remedial measure using a multiple-model framework paired with an accurate speech mode detector is proposed.

Keywords

vocal effort detection, speech recognition, hidden Markov models

RIV year

2010

Released

01.09.2010

Location

Kittilä

ISBN

978-1-4244-7876-7

Book

Proceedings of the 2010 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2010)

Pages from

349

Pages to

354

Pages count

6

BibTeX

@inproceedings{BUT35916,
  author="Petr {Zelinka} and Milan {Sigmund}",
  title="Automatic Vocal Effort Detection for Reliable Speech Recognition",
  annote="This paper describes an approach for enhancing the robustness of an isolated-word recognizer by extending its flexibility with respect to the speaker's variable vocal effort level. An analysis of the spectral properties of spoken vowels in four speaking modes (whispering, soft, normal, and loud) confirms consistent spectral tilt changes. The severe impact of vocal effort variability on the accuracy of a speaker-dependent word recognizer is presented, and an efficient remedial measure using a multiple-model framework paired with an accurate speech mode detector is proposed.",
  booktitle="Proceedings of the 2010 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2010)",
  chapter="35916",
  howpublished="print",
  year="2010",
  month=sep,
  pages="349--354",
  type="conference paper"
}