Publication detail

Analysis of BUT Submission in Far-Field Scenarios of VOiCES 2019 Challenge

MATĚJKA, P.; PLCHOT, O.; ZEINALI, H.; MOŠNER, L.; SILNOVA, A.; BURGET, L.; NOVOTNÝ, O.; GLEMBEK, O.

Original title

Analysis of BUT Submission in Far-Field Scenarios of VOiCES 2019 Challenge

Language

en

Original abstract

This paper is a post-evaluation analysis of our efforts in the VOiCES 2019 Speaker Recognition Challenge. All systems in the fixed condition are based on x-vectors with different features and DNN topologies. The single best system reaches a minDCF of 0.38 (5.25% EER), and a fusion of three systems yields a minDCF of 0.34 (4.87% EER). We also analyze how speaker verification (SV) systems have evolved over the last few years and report results on the SITW 2016 Challenge as well. EER on the core-core condition of the SITW 2016 Challenge dropped from 5.85% to 1.65% for the system fusions submitted to SITW 2016 and VOiCES 2019, respectively. The less restrictive open condition allowed us to use external data for PLDA adaptation and achieve a small additional performance improvement. In our submission to the open condition, we used three x-vector systems and one system based on i-vectors.
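
The two metrics quoted in the abstract, EER and minDCF, can be illustrated with a minimal Python sketch. It is not the evaluation code used for the challenge: the function names, the toy scores, and the cost parameters (P_target = 0.01, C_miss = C_fa = 1) are assumptions chosen for illustration only.

```python
def error_rates(target_scores, nontarget_scores):
    """Return (threshold, P_miss, P_fa) for every candidate threshold.

    A trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    thresholds.append(float("inf"))  # operating point that rejects every trial
    points = []
    for t in thresholds:
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        points.append((t, p_miss, p_fa))
    return points


def eer(target_scores, nontarget_scores):
    """Approximate equal error rate: the operating point where the miss and
    false-alarm rates are closest, reported as their average."""
    points = error_rates(target_scores, nontarget_scores)
    _, p_miss, p_fa = min(points, key=lambda p: abs(p[1] - p[2]))
    return (p_miss + p_fa) / 2


def min_dcf(target_scores, nontarget_scores,
            p_target=0.01, c_miss=1.0, c_fa=1.0):
    """Minimum normalized detection cost over all thresholds.

    The default cost parameters are assumed values, not taken from the record.
    """
    points = error_rates(target_scores, nontarget_scores)
    best = min(c_miss * p_target * p_miss + c_fa * (1 - p_target) * p_fa
               for _, p_miss, p_fa in points)
    # Normalize by the cost of the better trivial system
    # (always reject or always accept).
    trivial = min(c_miss * p_target, c_fa * (1 - p_target))
    return best / trivial


# Toy scores, hypothetical and unrelated to the paper's results.
targets = [2.0, 1.5, 0.2, 3.0]
nontargets = [-1.0, 0.5, -0.3, 0.1]
print("EER:", eer(targets, nontargets))
print("minDCF:", min_dcf(targets, nontargets))
```

The brute-force threshold sweep is quadratic in the number of trials, which is fine for a toy example; real evaluations sort the scores once and accumulate the error counts.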

BibTeX


@inproceedings{BUT159997,
  author="Pavel {Matějka} and Oldřich {Plchot} and Hossein {Zeinali} and Ladislav {Mošner} and Anna {Silnova} and Lukáš {Burget} and Ondřej {Novotný} and Ondřej {Glembek}",
  title="Analysis of BUT Submission in Far-Field Scenarios of VOiCES 2019 Challenge",
  annote="This paper is a post-evaluation analysis of our efforts in the VOiCES 2019
Speaker Recognition Challenge. All systems in the fixed condition are based on
x-vectors with different features and DNN topologies. The single best system
reaches a minDCF of 0.38 (5.25% EER), and a fusion of three systems yields a
minDCF of 0.34 (4.87% EER). We also analyze how speaker verification (SV) systems
have evolved over the last few years and report results on the SITW 2016
Challenge as well. EER on the core-core condition of the SITW 2016 Challenge
dropped from 5.85% to 1.65% for the system fusions submitted to SITW 2016 and
VOiCES 2019, respectively. The less restrictive open condition allowed us to use
external data for PLDA adaptation and achieve a small additional performance
improvement. In our submission to the open condition, we used three x-vector
systems and one system based on i-vectors.",
  address="International Speech Communication Association",
  booktitle="Proceedings of Interspeech",
  chapter="159997",
  doi="10.21437/Interspeech.2019-2471",
  howpublished="online",
  institution="International Speech Communication Association",
  number="9",
  year="2019",
  month="september",
  pages="2448--2452",
  publisher="International Speech Communication Association",
  type="conference paper"
}