Publication detail

Analysis of Speaker Recognition Systems in Realistic Scenarios of the SITW 2016 Challenge

NOVOTNÝ, O.; MATĚJKA, P.; PLCHOT, O.; GLEMBEK, O.; BURGET, L.; ČERNOCKÝ, J.

Original title

Analysis of Speaker Recognition Systems in Realistic Scenarios of the SITW 2016 Challenge

Language

en

Original abstract

In this paper, we summarize our efforts for the Speakers In The Wild (SITW) challenge and present our findings on this new dataset for speaker recognition. Apart from the standard comparison of different SRE systems, we analyze the use of diarization for dealing with audio segments containing multiple speakers, since diarization is a necessary system component in part of the newly introduced enrollment and test protocols. Our state-of-the-art systems used in this work utilize both cepstral and DNN-based bottleneck features and are based on i-vectors followed by a Probabilistic Linear Discriminant Analysis (PLDA) classifier and logistic regression calibration/fusion. We present both narrow-band (8 kHz) and wide-band (16 kHz) systems together with their fusions.

BibTeX

@inproceedings{BUT132599,
  author="Ondřej {Novotný} and Pavel {Matějka} and Oldřich {Plchot} and Ondřej {Glembek} and Lukáš {Burget} and Jan {Černocký}",
  title="Analysis of Speaker Recognition Systems in Realistic Scenarios of the SITW 2016 Challenge",
  annote="In this paper, we summarize our efforts for the Speakers In The Wild (SITW)
challenge, and we present our findings with this new dataset for speaker
recognition. Apart from the standard comparison of different SRE systems, we
analyze the use of diarization for dealing with audio segments containing
multiple speakers, as in part of the newly introduced enrollment and test
protocols, diarization is a necessary system component. Our state-of-the-art
systems used in this work utilize both cepstral and DNN-based bottleneck features
and are based on i-vectors followed by Probabilistic Linear Discriminant Analysis
(PLDA) classifier and logistic regression calibration/fusion. We present both
narrow-band (8 kHz) and wide-band (16 kHz) systems together with their fusions.",
  address="International Speech Communication Association",
  booktitle="Proceedings of Interspeech 2016",
  chapter="132599",
  doi="10.21437/Interspeech.2016-981",
  edition="not stated",
  howpublished="online",
  institution="International Speech Communication Association",
  year="2016",
  month="september",
  pages="828--832",
  publisher="International Speech Communication Association",
  type="conference paper"
}