Publication detail

Improving the computational complexity and word recognition rate for dysarthria speech using robust frame selection algorithm

VYAS, G.; DUTTA, M.; PŘINOSIL, J.

Original Title

Improving the computational complexity and word recognition rate for dysarthria speech using robust frame selection algorithm

English Title

Improving the computational complexity and word recognition rate for dysarthria speech using robust frame selection algorithm

Type

journal article

Language

en

Original Abstract

Dysarthria is a speech syndrome caused by neurological damage in motor speech glands. In this paper, a robust frame selection (RFS) algorithm is employed to recognise dysarthric speech in less time. The algorithm identifies the more informative frames, which in turn reduces the size of the feature matrix used for recognising the speech. This method yields a significant reduction in computational complexity without compromising the word recognition rate (WRR), which may support real-time applications. A combination of four prosodic features, namely Mel frequency cepstral coefficients (MFCCs), log energy per frame, differential MFCCs and double differential MFCCs, is used for training and testing the hidden Markov models (HMMs) for speech recognition. Several experiments were performed on high, medium and low intelligibility audio clips with a vocabulary of 29 isolated words. The time complexity of the whole system is reduced by up to 54.8% relative to the time taken by the system without RFS. The proposed scheme is gender, speaker and age independent.

English abstract

Dysarthria is a speech syndrome caused by neurological damage in motor speech glands. In this paper, a robust frame selection (RFS) algorithm is employed to recognise dysarthric speech in less time. The algorithm identifies the more informative frames, which in turn reduces the size of the feature matrix used for recognising the speech. This method yields a significant reduction in computational complexity without compromising the word recognition rate (WRR), which may support real-time applications. A combination of four prosodic features, namely Mel frequency cepstral coefficients (MFCCs), log energy per frame, differential MFCCs and double differential MFCCs, is used for training and testing the hidden Markov models (HMMs) for speech recognition. Several experiments were performed on high, medium and low intelligibility audio clips with a vocabulary of 29 isolated words. The time complexity of the whole system is reduced by up to 54.8% relative to the time taken by the system without RFS. The proposed scheme is gender, speaker and age independent.

Keywords

Dysarthria; Hidden Markov models; MFCCs; Robust frame selection; Speech intelligibility; Speech recognition

Released

02.08.2017

Pages from

136

Pages to

145

Pages count

10

BibTeX


@article{BUT139450,
  author="Garima {Vyas} and Malay Kishore {Dutta} and Jiří {Přinosil}",
  title="Improving the computational complexity and word recognition rate for dysarthria speech using robust frame selection algorithm",
  annote="Dysarthria is a speech syndrome caused by neurological damage in motor speech glands. In this paper, a robust frame selection (RFS) algorithm is employed to recognise dysarthric speech in less time. The algorithm identifies the more informative frames, which in turn reduces the size of the feature matrix used for recognising the speech. This method yields a significant reduction in computational complexity without compromising the word recognition rate (WRR), which may support real-time applications. A combination of four prosodic features, namely Mel frequency cepstral coefficients (MFCCs), log energy per frame, differential MFCCs and double differential MFCCs, is used for training and testing the hidden Markov models (HMMs) for speech recognition. Several experiments were performed on high, medium and low intelligibility audio clips with a vocabulary of 29 isolated words. The time complexity of the whole system is reduced by up to 54.8\% relative to the time taken by the system without RFS. The proposed scheme is gender, speaker and age independent.",
  chapter="139450",
  doi="10.1504/IJSISE.2017.10006783",
  howpublished="online",
  number="3",
  volume="10",
  year="2017",
  month="august",
  pages="136--145",
  type="journal article"
}