Publication detail

Semi-Supervised Bootstrapping Approach For Neural Network Feature Extractor Training

GRÉZL, F.; KARAFIÁT, M.

Original title

Semi-Supervised Bootstrapping Approach For Neural Network Feature Extractor Training

Language

en

Original abstract

This paper presents a bootstrapping approach for training the Bottle-Neck neural network feature extractor, which provides features for a subsequent GMM-HMM recognizer. This recognizer can be used to automatically transcribe the unsupervised data and to assign a confidence to the transcription. Based on the confidence, segments are selected and mixed with the supervised data, and new NNs are trained. Automatic transcription can recover 40-55% of the gain achieved with manually transcribed data. This is a 3 to 5% absolute improvement over an NN trained on supervised data only. Using the 70-85% of automatically transcribed segments with the highest confidence was found optimal to achieve this result. Dropping the rest of the data prevents training on low-quality transcripts.
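
The confidence-based selection step described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the `Segment` class, the function names, and the default `keep_fraction` of 0.8 (picked from within the reported 70-85% range) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One speech segment with its transcript (hypothetical structure)."""
    seg_id: str
    transcript: str
    confidence: float = 1.0  # assumption: manual transcripts are fully trusted


def select_confident(auto_segments, keep_fraction=0.8):
    """Keep the given fraction of segments with the highest confidence."""
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must lie in [0, 1]")
    # Rank automatically transcribed segments from most to least confident.
    ranked = sorted(auto_segments, key=lambda s: s.confidence, reverse=True)
    n_keep = round(keep_fraction * len(ranked))
    # Everything past n_keep is dropped so that low-quality transcripts
    # are not used for training.
    return ranked[:n_keep]


def build_training_set(supervised, auto_segments, keep_fraction=0.8):
    """Mix manually transcribed data with the selected automatic transcripts."""
    return list(supervised) + select_confident(auto_segments, keep_fraction)
```

In a bootstrapping loop, the returned set would be used to train a new feature extractor, after which the unsupervised data could be transcribed and scored again.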

BibTeX


@inproceedings{BUT105972,
  author="František {Grézl} and Martin {Karafiát}",
  title="Semi-Supervised Bootstrapping Approach For Neural Network Feature Extractor Training",
  annote="This paper presents a bootstrapping approach for training the Bottle-Neck
neural network feature extractor, which provides features for a subsequent GMM-HMM
recognizer. This recognizer can be used to automatically transcribe the
unsupervised data and to assign a confidence to the transcription. Based on the
confidence, segments are selected and mixed with the supervised data, and new NNs
are trained. Automatic transcription can recover 40-55% of the gain achieved with
manually transcribed data. This is a 3 to 5% absolute improvement over an NN
trained on supervised data only. Using the 70-85% of automatically transcribed
segments with the highest confidence was found optimal to achieve this result.
Dropping the rest of the data prevents training on low-quality transcripts.",
  address="IEEE Signal Processing Society",
  booktitle="Proceedings of ASRU 2013",
  chapter="105972",
  howpublished="print",
  institution="IEEE Signal Processing Society",
  year="2013",
  month="december",
  pages="470--475",
  publisher="IEEE Signal Processing Society",
  type="conference paper"
}