Publication detail
Automatic Vocal Effort Detection for Reliable Speech Recognition
ZELINKA, P.; SIGMUND, M.
Original title
Automatic Vocal Effort Detection for Reliable Speech Recognition
Language
en
Original abstract
This paper describes an approach for enhancing the robustness of an isolated-word recognizer by extending its flexibility with respect to the speaker's variable vocal effort level. An analysis of the spectral properties of spoken vowels in four speaking modes (whispering, soft, normal, and loud) confirms consistent spectral tilt changes. The severe impact of vocal effort variability on the accuracy of a speaker-dependent word recognizer is presented, and an efficient remedial measure using a multiple-model framework paired with an accurate speech mode detector is proposed.
Documents
BibTeX
@inproceedings{BUT35916,
author="Petr {Zelinka} and Milan {Sigmund}",
title="Automatic Vocal Effort Detection for Reliable Speech Recognition",
annote="This paper describes an approach for enhancing the robustness of an isolated-word recognizer by extending its flexibility with respect to the speaker's variable vocal effort level. An analysis of the spectral properties of spoken vowels in four speaking modes (whispering, soft, normal, and loud) confirms consistent spectral tilt changes. The severe impact of vocal effort variability on the accuracy of a speaker-dependent word recognizer is presented, and an efficient remedial measure using a multiple-model framework paired with an accurate speech mode detector is proposed.",
booktitle="Proceedings of the 2010 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2010)",
chapter="35916",
howpublished="print",
year="2010",
month="september",
pages="349--354",
type="conference paper"
}