Publication detail

Improved MLP Structures for Data-Driven Feature Extraction for ASR

ZHU, Q., CHEN, B., GRÉZL, F., MORGAN, N.

Original title

Improved MLP Structures for Data-Driven Feature Extraction for ASR

Language

en

Original abstract

In this paper, we present our recent progress on multi-layer perceptron (MLP) based data-driven feature extraction using improved MLP structures. Four-layer MLPs are used in this study. Different signal processing methods are applied before the input layer of the MLP. We show that the first hidden layer of a four-layer MLP is able to detect some basic patterns from the time-frequency plane. KLT-based dimension reduction along time is applied as a modulation frequency filter. The new feature extraction was tested on a large vocabulary continuous speech recognition (LVCSR) task using the NIST 2001 evaluation set. We achieved 11.6% relative word error rate (WER) reduction compared to the traditional PLP-based baseline feature. This is also a significant improvement compared to our previously published results on the same task using MLP-based features with three-layer MLPs.
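
The abstract mentions KLT-based dimension reduction along time. As a rough illustration only, the sketch below shows one common way to compute a KLT (equivalent to PCA) over temporal trajectories of a single frequency band with NumPy. The function names, the window length of 51 frames, and the choice of 8 retained components are hypothetical and are not taken from the paper; the paper's actual configuration is not given in this record.

```python
import numpy as np


def fit_klt(windows: np.ndarray, k: int):
    """Estimate a KLT (PCA) basis from temporal windows.

    windows: array of shape (n_windows, window_len); each row is the
             temporal trajectory of one frequency band (assumed layout).
    k:       number of leading components to keep (assumed value).
    Returns (mean, basis) with basis of shape (window_len, k).
    """
    mean = windows.mean(axis=0)
    centered = windows - mean
    # Sample covariance across the time axis of the window.
    cov = centered.T @ centered / (windows.shape[0] - 1)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:k]
    return mean, eigvecs[:, order]


def apply_klt(windows: np.ndarray, mean: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Project temporal windows onto the retained KLT components."""
    return (windows - mean) @ basis


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in data: 500 windows of 51 frames (hypothetical sizes).
    demo_windows = rng.standard_normal((500, 51))
    demo_mean, demo_basis = fit_klt(demo_windows, k=8)
    reduced = apply_klt(demo_windows, demo_mean, demo_basis)
    print(reduced.shape)
```

Each retained basis vector acts as a data-derived filter over the time axis of the window, which is one way to read the abstract's description of the KLT step as a modulation frequency filter.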

BibTeX


@inproceedings{BUT18257,
  author="Qifeng {Zhu} and Barry {Chen} and František {Grézl} and Nelson {Morgan}",
  title="Improved MLP Structures for Data-Driven Feature Extraction for ASR",
  annote="In this paper, we present our recent progress on multi-layer perceptron (MLP) based data-driven feature extraction using improved MLP structures. Four-layer MLPs are used in this study. Different signal processing methods are applied before the input layer of the MLP. We show that the first hidden layer of a four-layer MLP is able to detect some basic patterns from the time-frequency plane. KLT-based dimension reduction along time is applied as a modulation frequency filter. The new feature extraction was tested on a large vocabulary continuous speech recognition (LVCSR) task using the NIST 2001 evaluation set. We achieved 11.6% relative word error rate (WER) reduction compared to the traditional PLP-based baseline feature. This is also a significant improvement compared to our previously published results on the same task using MLP-based features with three-layer MLPs.",
  booktitle="Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology",
  chapter="18257",
  journal="5th European Conference EUROSPEECH 97",
  year="2005",
  month="september",
  pages="2129",
  type="conference paper"
}