Publication detail

DNN-based SRE Systems in Multi-Language Conditions

NOVOTNÝ, O.; MATĚJKA, P.; GLEMBEK, O.; PLCHOT, O.; GRÉZL, F.; BURGET, L.; ČERNOCKÝ, J.

Original title

DNN-based SRE Systems in Multi-Language Conditions

Language

en

Original abstract

This work studies the use of (currently state-of-the-art) Deep Neural Network (DNN) based i-vector/PLDA speaker recognition systems in multi-language (especially non-English) conditions. On the "Language Pack" of the PRISM set, we evaluate the systems' performance using NIST's standard metrics. We study the use of a multi-lingual DNN in place of the original English DNN under these multi-language conditions. We show that not only does the gain from using DNNs vanish, but the DNN-based systems also tend to produce de-calibrated scores under the studied conditions. This work offers suggestions for directions of future research rather than any particular solutions.

BibTeX


@techreport{BUT168427,
  author="Ondřej {Novotný} and Pavel {Matějka} and Ondřej {Glembek} and Oldřich {Plchot} and František {Grézl} and Lukáš {Burget} and Jan {Černocký}",
  title="DNN-based SRE Systems in Multi-Language Conditions",
  annote="This work studies the usage of the (currently state-of-the-art) Deep Neural
Networks (DNN) i-vector/PLDA-based speaker recognition systems in multi-language
(especially non-English) conditions. On the ``Language Pack'' of the PRISM set,
we evaluate the systems' performance using NIST's standard metrics. We study the
use of multi-lingual DNN in place of the original English DNN on these
multi-language conditions. We show that not only the gain from using DNNs
vanishes, but also the DNN-based systems tend to produce de-calibrated scores
under the studied conditions. This work gives suggestions for directions of
future research rather than any particular solutions.",
  address="Faculty of Information Technology BUT",
  chapter="168427",
  edition="NEUVEDEN",
  howpublished="print",
  institution="Faculty of Information Technology BUT",
  year="2016",
  month="july",
  pages="0--0",
  publisher="Faculty of Information Technology BUT",
  type="report"
}