Publication detail

Complementarity of Speech Recognition Systems and System Combination

BURGET, L.

Original title

Complementarity of Speech Recognition Systems and System Combination

English title

Complementarity of Speech Recognition Systems and System Combination

Language

en

Original abstract

In the past, many speech recognition systems differing in feature extraction method, classification method, training algorithm, etc. have been developed. A powerful technique to obtain better recognition results is the combination of such systems at different levels (feature-level combination, ROVER combination of recognition outputs, and others). The choice of systems for combination was, however, done ad hoc or by an exhaustive search over all possible combinations.

This thesis primarily addresses the problem of choosing systems suitable for combination. We assume that systems good for combination must produce complementary outputs. To evaluate this complementarity, we first define complementarity measures for pairs of recognition systems based on simultaneous and dependent errors the two systems make. ROVER-like alignment of their text outputs is used to count the errors and to derive the measures. These measures are then extended to a definition of complementarity measures of a set of recognition systems.

To verify experimentally the coherence of the proposed measures with the actual performance of combined systems, a small yet representative data set based on the AURORA database is defined. The coherence of the measures with the recognition results is confirmed for the following three combination methods: ROVER-like combination of text outputs of recognition systems (a technique similar to that used for the derivation of the complementarity measures), feature-based combination of recognition systems, and combination of likelihoods in a multi-stream HMM.

For feature-level combination, this thesis also addresses the suitability of linear transforms for de-correlation and dimensionality reduction of the feature space. PCA, LDA, and HLDA are studied, and plausible reasons for the failure of these approaches are discussed. Based on this analysis, two robust modifications of HLDA, Smoothed HLDA (SHLDA) and Clustered HLDA (CHLDA), which balance the advantages of HLDA and LDA, are suggested. Experiments have shown their superiority over PCA, LDA, and HLDA.
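
As an illustration of counting simultaneous errors for a pair of systems, the following is a minimal sketch and not the thesis's actual measure. It assumes the two outputs have already been aligned to the reference, so that each position carries a correct/incorrect flag per system. The function names and the ratio itself are hypothetical, chosen only to show the idea that fewer shared errors suggest more complementary systems.

```python
def pairwise_error_counts(correct_a, correct_b):
    """Count shared and individual errors of two aligned systems.

    correct_a, correct_b: sequences of booleans, one per aligned
    position (True = the system got that position right).
    The alignment itself is assumed to be done beforehand.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("aligned sequences must have equal length")
    both = only_a = only_b = 0
    for a_ok, b_ok in zip(correct_a, correct_b):
        if not a_ok and not b_ok:
            both += 1      # simultaneous error
        elif not a_ok:
            only_a += 1    # only system A is wrong
        elif not b_ok:
            only_b += 1    # only system B is wrong
    return {"both_wrong": both, "only_a_wrong": only_a, "only_b_wrong": only_b}


def simultaneous_error_ratio(correct_a, correct_b):
    """Hypothetical measure: share of error positions where both systems fail.

    Lower values mean the systems fail in different places.
    """
    counts = pairwise_error_counts(correct_a, correct_b)
    any_wrong = sum(counts.values())
    return counts["both_wrong"] / any_wrong if any_wrong else 0.0


system_a = [True, False, False, True, True, False]
system_b = [True, False, True, False, True, False]
counts = pairwise_error_counts(system_a, system_b)
ratio = simultaneous_error_ratio(system_a, system_b)
```

A ratio near 1 would mean the two systems mostly fail together, which leaves little for a combination to gain; a ratio near 0 would mean their errors rarely coincide.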

English abstract

In the past, many speech recognition systems differing in feature extraction method, classification method, training algorithm, etc. have been developed. A powerful technique to obtain better recognition results is the combination of such systems at different levels (feature-level combination, ROVER combination of recognition outputs, and others). The choice of systems for combination was, however, done ad hoc or by an exhaustive search over all possible combinations.

This thesis primarily addresses the problem of choosing systems suitable for combination. We assume that systems good for combination must produce complementary outputs. To evaluate this complementarity, we first define complementarity measures for pairs of recognition systems based on simultaneous and dependent errors the two systems make. ROVER-like alignment of their text outputs is used to count the errors and to derive the measures. These measures are then extended to a definition of complementarity measures of a set of recognition systems.

To verify experimentally the coherence of the proposed measures with the actual performance of combined systems, a small yet representative data set based on the AURORA database is defined. The coherence of the measures with the recognition results is confirmed for the following three combination methods: ROVER-like combination of text outputs of recognition systems (a technique similar to that used for the derivation of the complementarity measures), feature-based combination of recognition systems, and combination of likelihoods in a multi-stream HMM.

For feature-level combination, this thesis also addresses the suitability of linear transforms for de-correlation and dimensionality reduction of the feature space. PCA, LDA, and HLDA are studied, and plausible reasons for the failure of these approaches are discussed. Based on this analysis, two robust modifications of HLDA, Smoothed HLDA (SHLDA) and Clustered HLDA (CHLDA), which balance the advantages of HLDA and LDA, are suggested. Experiments have shown their superiority over PCA, LDA, and HLDA.
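
The abstract does not give the SHLDA formula. One way to read "balancing the advantages of HLDA and LDA" is as an interpolation between each class covariance (as HLDA would use) and the pooled within-class covariance (as LDA would use). The sketch below assumes exactly that; the interpolation weight `alpha` and both function names are hypothetical, and the code covers only the covariance smoothing step, not the estimation of the transform.

```python
def pooled_covariance(class_covs, class_counts):
    """Count-weighted average of per-class covariance matrices.

    class_covs: list of d x d matrices given as nested lists.
    class_counts: number of training frames per class.
    """
    total = float(sum(class_counts))
    dim = len(class_covs[0])
    return [
        [
            sum(n * cov[r][c] for cov, n in zip(class_covs, class_counts)) / total
            for c in range(dim)
        ]
        for r in range(dim)
    ]


def smooth_class_covariances(class_covs, class_counts, alpha):
    """Assumed smoothing: alpha * class covariance + (1 - alpha) * pooled.

    alpha = 1 keeps the per-class estimates (HLDA-like),
    alpha = 0 replaces every class with the pooled estimate (LDA-like).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    pooled = pooled_covariance(class_covs, class_counts)
    dim = len(pooled)
    return [
        [
            [alpha * cov[r][c] + (1.0 - alpha) * pooled[r][c] for c in range(dim)]
            for r in range(dim)
        ]
        for cov in class_covs
    ]


covs = [[[1.0, 0.0], [0.0, 1.0]], [[3.0, 0.0], [0.0, 3.0]]]
counts = [10, 10]
smoothed = smooth_class_covariances(covs, counts, alpha=0.5)
```

Pulling the class estimates toward the pooled one trades some class-specific detail for more stable estimates when a class has little training data.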

Documents

BibTeX


@phdthesis{BUT66735,
  author="Lukáš {Burget}",
  title="Complementarity of Speech Recognition Systems and System Combination",
  annote="In the past, many speech recognition systems differing in feature extraction method, classification method, training algorithm, etc. have been developed. A powerful technique to obtain better recognition results is the combination of such systems at different levels (feature-level combination, ROVER combination of recognition outputs, and others). The choice of systems for combination was, however, done ad hoc or by an exhaustive search over all possible combinations. This thesis primarily addresses the problem of choosing systems suitable for combination. We assume that systems good for combination must produce complementary outputs. To evaluate this complementarity, we first define complementarity measures for pairs of recognition systems based on simultaneous and dependent errors the two systems make. ROVER-like alignment of their text outputs is used to count the errors and to derive the measures. These measures are then extended to a definition of complementarity measures of a set of recognition systems. To verify experimentally the coherence of the proposed measures with the actual performance of combined systems, a small yet representative data set based on the AURORA database is defined. The coherence of the measures with the recognition results is confirmed for the following three combination methods: ROVER-like combination of text outputs of recognition systems (a technique similar to that used for the derivation of the complementarity measures), feature-based combination of recognition systems, and combination of likelihoods in a multi-stream HMM. For feature-level combination, this thesis also addresses the suitability of linear transforms for de-correlation and dimensionality reduction of the feature space. PCA, LDA, and HLDA are studied, and plausible reasons for the failure of these approaches are discussed. Based on this analysis, two robust modifications of HLDA, Smoothed HLDA (SHLDA) and Clustered HLDA (CHLDA), which balance the advantages of HLDA and LDA, are suggested. Experiments have shown their superiority over PCA, LDA, and HLDA.",
  address="Faculty of Information Technology BUT",
  chapter="66735",
  institution="Faculty of Information Technology BUT",
  year="2004",
  month="september",
  publisher="Faculty of Information Technology BUT",
  type="dissertation"
}