Publication detail

Audio Enhancing With DNN Autoencoder For Speaker Recognition

PLCHOT, O.; BURGET, L.; ARONOWITZ, H.; MATĚJKA, P.

Original title

Audio Enhancing With DNN Autoencoder For Speaker Recognition

Language

en

Original abstract

In this paper, we present the design of a DNN-based autoencoder for speech enhancement and its use in speaker recognition systems for distant-microphone and noisy data. We started by augmenting the Fisher database with artificially noised and reverberated data and trained the autoencoder to map noisy and reverberated speech to its clean version. We use the autoencoder as a preprocessing step in the later stage of modelling in state-of-the-art text-dependent and text-independent speaker recognition systems. We report relative improvements of up to 50% for the text-dependent system and up to 48% for the text-independent one. For the text-independent system, we present a more detailed analysis on various conditions of NIST SRE 2010 and PRISM, suggesting that the proposed preprocessing is a promising and efficient way to build a robust speaker recognition system for distant-microphone and noisy data.
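
The augmentation step mentioned in the abstract (artificially adding noise and reverberation to clean speech so that an autoencoder can be trained on degraded/clean pairs) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state the noise sources, room impulse responses, SNR levels, or feature domain that were used. The sketch assumes raw waveforms held in NumPy arrays, and the function names `add_noise_at_snr`, `reverberate`, and `make_training_pair` are hypothetical.

```python
import numpy as np


def add_noise_at_snr(clean, noise, snr_db):
    """Mix `noise` into `clean` so the mixture has the requested SNR in dB."""
    # Repeat or truncate the noise so it matches the speech length (assumption).
    noise = np.resize(noise, clean.shape)
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose the scale so that clean_power / scaled_noise_power == 10 ** (snr_db / 10).
    scale = np.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return clean + scale * noise


def reverberate(clean, impulse_response):
    """Convolve speech with a room impulse response, keeping the original length."""
    return np.convolve(clean, impulse_response)[: len(clean)]


def make_training_pair(clean, noise, impulse_response, snr_db):
    """Return (degraded input, clean target) for one utterance."""
    reverberated = reverberate(clean, impulse_response)
    degraded = add_noise_at_snr(reverberated, noise, snr_db)
    return degraded, clean
```

An autoencoder for enhancement would then be trained to predict the second element of each pair from the first, typically on features extracted from both signals rather than on raw samples.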

BibTeX


@inproceedings{BUT130961,
  author="Oldřich {Plchot} and Lukáš {Burget} and Hagai {Aronowitz} and Pavel {Matějka}",
  title="Audio Enhancing With DNN Autoencoder For Speaker Recognition",
  annote="In this paper we present a design of a DNN-based autoencoder for speech
enhancement and its use for speaker recognition systems for distant microphones
and noisy data. We started with augmenting the Fisher database with artificially
noised and reverberated data and trained the autoencoder to map noisy and
reverberated speech to its clean version. We use the autoencoder as
a preprocessing step in the later stage of modelling in state-of-the-art
text-dependent and text-independent speaker recognition systems. We report
relative improvements of up to 50% for the text-dependent system and up to 48%
for the text-independent one. For the text-independent system, we present a more
detailed analysis on various conditions of NIST SRE 2010 and PRISM, suggesting
that the proposed preprocessing is a promising and efficient way to build a robust
speaker recognition system for distant microphone and noisy data.",
  address="IEEE Signal Processing Society",
  booktitle="Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016",
  chapter="130961",
  doi="10.1109/ICASSP.2016.7472647",
  howpublished="electronic, physical medium",
  institution="IEEE Signal Processing Society",
  year="2016",
  month="march",
  pages="5090--5094",
  publisher="IEEE Signal Processing Society",
  type="conference paper"
}