Publication detail

iVector Fusion of Prosodic and Cepstral Features for Speaker Verification

KOCKMANN, M.; FERRER, L.; BURGET, L.; ČERNOCKÝ, J.

Original Title

iVector Fusion of Prosodic and Cepstral Features for Speaker Verification

English Title

iVector Fusion of Prosodic and Cepstral Features for Speaker Verification

Type

conference paper

Language

en

Original Abstract

This paper presents the first results on total variability modeling of the mean supervector space for a set of prosodic features. We show that this iVector approach outperforms the standard joint factor analysis (JFA) approach originally proposed for these features. We note that this improvement over JFA is observed only when the iVectors are modeled using a probabilistic linear discriminant analysis (PLDA) back end.

English abstract

This paper presents the first results on total variability modeling of the mean supervector space for a set of prosodic features. We show that this iVector approach outperforms the standard joint factor analysis (JFA) approach originally proposed for these features. We note that this improvement over JFA is observed only when the iVectors are modeled using a probabilistic linear discriminant analysis (PLDA) back end.

Keywords

speaker verification, prosody, JFA, iVector, SMM, fusion

RIV year

2011

Released

27.08.2011

Publisher

International Speech Communication Association

Location

Florence

ISBN

978-1-61839-270-1

Book

Proceedings of Interspeech 2011

Edition

not specified

Edition number

not specified

Pages from

265

Pages to

268

Pages count

4

BibTeX


@inproceedings{BUT76448,
  author="Marcel {Kockmann} and Luciana {Ferrer} and Lukáš {Burget} and Jan {Černocký}",
  title="iVector Fusion of Prosodic and Cepstral Features for Speaker Verification",
  annote="This paper presents the first results on total variability modeling
of the mean supervector space for a set of prosodic features. We show that this
iVector approach outperforms the standard joint factor analysis (JFA) approach
originally proposed for these features. We note that this improvement over JFA
is observed only when the iVectors are modeled using a probabilistic linear
discriminant analysis (PLDA) back end.",
  address="Florence",
  booktitle="Proceedings of Interspeech 2011",
  chapter="76448",
  howpublished="print",
  institution="International Speech Communication Association",
  number="8",
  year="2011",
  month="august",
  pages="265--268",
  publisher="International Speech Communication Association",
  type="conference paper"
}