Publication detail

Score Fusion in Text-Dependent Speaker Recognition Systems

Original title

Score Fusion in Text-Dependent Speaker Recognition Systems

Language

en

Original abstract

Owing to several significant advantages, text-dependent speaker recognition is still widely used in biometric systems. Compared with text-independent systems, text-dependent systems are more accurate and more resistant to replay attacks. There are many approaches to text-dependent recognition. This paper introduces a combination of classifiers based on fractional distances, the biometric dispersion matcher, and dynamic time warping. The first two classifiers are based on a voice imprint. They have low memory requirements and a fast recognition procedure, which is especially advantageous in low-cost, battery-powered biometric systems. It is shown that, using trained score fusion, it is possible to reach a successful detection rate of 98.98 %, and 92.19 % in the case of microphone mismatch. During verification, the system reached an equal error rate of 2.55 %, and 6.77 % under microphone mismatch. The system was tested on a Catalan database consisting of 48 speakers (three 3 s training samples per speaker).
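
The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the voice imprint and the biometric dispersion matcher are not reproduced, the function names are hypothetical, and the fusion shown is a plain weighted sum of min-max normalized distances with hand-set weights, whereas the paper uses a trained fusion whose details are not given here.

```python
import math


def fractional_distance(x, y, p=0.5):
    """Minkowski-style distance with a fractional exponent 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must lie in (0, 1) for a fractional distance")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = cheapest alignment of the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def min_max_normalize(scores):
    """Rescale scores to [0, 1] so different classifiers become comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(score_lists, weights):
    """Weighted sum of normalized distances, one fused value per enrolled speaker.

    Assumption: the weights are set by hand here; a trained fusion would
    estimate them from development data.
    """
    normalized = [min_max_normalize(scores) for scores in score_lists]
    n_speakers = len(score_lists[0])
    return [
        sum(w * scores[k] for w, scores in zip(weights, normalized))
        for k in range(n_speakers)
    ]
```

In an identification setting, each classifier would produce one distance per enrolled speaker, `fuse_scores` would combine them, and the speaker with the smallest fused distance would be selected.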

BibTeX


@article{BUT75059,
  author="Jiří {Mekyska} and Marcos {Faúndez Zanuy} and Zdeněk {Smékal} and Joan {Fabregas}",
  title="Score Fusion in Text-Dependent Speaker Recognition Systems",
  annote="Owing to several significant advantages, text-dependent speaker recognition is still widely used in biometric systems. Compared with text-independent systems, text-dependent systems are more accurate and more resistant to replay attacks. There are many approaches to text-dependent recognition. This paper introduces a combination of classifiers based on fractional distances, the biometric dispersion matcher, and dynamic time warping. The first two classifiers are based on a voice imprint. They have low memory requirements and a fast recognition procedure, which is especially advantageous in low-cost, battery-powered biometric systems. It is shown that, using trained score fusion, it is possible to reach a successful detection rate of 98.98 %, and 92.19 % in the case of microphone mismatch. During verification, the system reached an equal error rate of 2.55 %, and 6.77 % under microphone mismatch. The system was tested on a Catalan database consisting of 48 speakers (three 3 s training samples per speaker).",
  address="Springer",
  chapter="75059",
  institution="Springer",
  number="12",
  volume="6800",
  year="2011",
  month="november",
  pages="120--132",
  publisher="Springer",
  type="journal article"
}