Publication details

BAT System Description for NIST LRE 2015

PLCHOT, O.; MATĚJKA, P.; FÉR, R.; GLEMBEK, O.; NOVOTNÝ, O.; PEŠÁN, J.; VESELÝ, K.; ONDEL, L.; KARAFIÁT, M.; GRÉZL, F.; KESIRAJU, S.; BURGET, L.; BRUMMER, N.; SWART, A.; CUMANI, S.; MALLIDI, S.; LI, R.

Original title

BAT System Description for NIST LRE 2015

Language

en

Original abstract

In this paper, we summarize our efforts in the NIST Language Recognition Evaluation (LRE) 2015, which resulted in systems providing very competitive performance. We provide both descriptions and analysis of the systems included in our submission. We start with a detailed description of the datasets used for training and development, and then describe the models and methods used to produce the final scores. These include the front-end (i.e., voice activity detection and feature extraction), the back-end (i.e., the final classifier), and the calibration and fusion stages. Apart from techniques commonly used in the field (such as i-vectors, DNN bottleneck features, NN classifiers, and Gaussian back-ends), we present less common methods, such as Sequence Summarizing Neural Networks (SSNN) and Automatic Unit Discovery. We present the performance of the systems on both the Fixed condition (where participants are required to use predefined datasets only) and the Open condition (where participants are allowed to use any publicly available resource) of NIST LRE 2015.

BibTeX


@inproceedings{BUT131004,
  author="Oldřich {Plchot} and Pavel {Matějka} and Radek {Fér} and Ondřej {Glembek} and Ondřej {Novotný} and Jan {Pešán} and Karel {Veselý} and Lucas Antoine Francois {Ondel} and Martin {Karafiát} and František {Grézl} and Santosh {Kesiraju} and Lukáš {Burget} and Niko {Brummer} and Albert du Preez {Swart} and Sandro {Cumani} and Sri Harish {Mallidi} and Ruizhi {Li}",
  title="BAT System Description for NIST LRE 2015",
  annote="In this paper, we summarize our efforts in the NIST Language Recognition
Evaluation (LRE) 2015, which resulted in systems providing very competitive
performance. We provide both descriptions and analysis of the systems included
in our submission. We start with a detailed description of the datasets used for
training and development, and then describe the models and methods used to
produce the final scores. These include the front-end (i.e., voice activity
detection and feature extraction), the back-end (i.e., the final classifier),
and the calibration and fusion stages. Apart from techniques commonly used in
the field (such as i-vectors, DNN bottleneck features, NN classifiers, and
Gaussian back-ends), we present less common methods, such as Sequence
Summarizing Neural Networks (SSNN) and Automatic Unit Discovery. We present the
performance of the systems on both the Fixed condition (where participants are
required to use predefined datasets only) and the Open condition (where
participants are allowed to use any publicly available resource) of NIST LRE
2015.",
  address="International Speech Communication Association",
  booktitle="Proceedings of Odyssey 2016, The Speaker and Language Recognition Workshop",
  chapter="131004",
  doi="10.21437/Odyssey.2016-24",
  howpublished="online",
  institution="International Speech Communication Association",
  number="06",
  year="2016",
  month="june",
  pages="166--173",
  publisher="International Speech Communication Association",
  type="conference paper"
}