Publication detail
Automatic Vocal Effort Detection for Reliable Speech Recognition
ZELINKA, P.; SIGMUND, M.
Original Title
Automatic Vocal Effort Detection for Reliable Speech Recognition
English Title
Automatic Vocal Effort Detection for Reliable Speech Recognition
Type
conference paper
Language
en
Original Abstract
This paper describes an approach for enhancing the robustness of an isolated-word recognizer by extending its flexibility with respect to the speaker's variable vocal effort level. An analysis of the spectral properties of spoken vowels in four speaking modes (whispering, soft, normal, and loud) confirms consistent spectral tilt changes. The severe impact of vocal effort variability on the accuracy of a speaker-dependent word recognizer is presented, and an efficient remedial measure using a multiple-model framework paired with an accurate speech mode detector is proposed.
English abstract
This paper describes an approach for enhancing the robustness of an isolated-word recognizer by extending its flexibility with respect to the speaker's variable vocal effort level. An analysis of the spectral properties of spoken vowels in four speaking modes (whispering, soft, normal, and loud) confirms consistent spectral tilt changes. The severe impact of vocal effort variability on the accuracy of a speaker-dependent word recognizer is presented, and an efficient remedial measure using a multiple-model framework paired with an accurate speech mode detector is proposed.
Keywords
vocal effort detection, speech recognition, hidden Markov models
RIV year
2010
Released
01.09.2010
Location
Kittilä
ISBN
978-1-4244-7876-7
Book
Proceedings of the 2010 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2010)
Pages from
349
Pages to
354
Pages count
6
BibTeX
@inproceedings{BUT35916,
author="Petr {Zelinka} and Milan {Sigmund}",
title="Automatic Vocal Effort Detection for Reliable Speech Recognition",
annote="This paper describes an approach for enhancing the robustness of an isolated-word recognizer by extending its flexibility with respect to the speaker's variable vocal effort level. An analysis of the spectral properties of spoken vowels in four speaking modes (whispering, soft, normal, and loud) confirms consistent spectral tilt changes. The severe impact of vocal effort variability on the accuracy of a speaker-dependent word recognizer is presented, and an efficient remedial measure using a multiple-model framework paired with an accurate speech mode detector is proposed.",
booktitle="Proceedings of the 2010 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2010)",
chapter="35916",
howpublished="print",
year="2010",
month="september",
pages="349--354",
type="conference paper"
}