Publication detail

Non-speech Activity Pause Detection in Noisy and Clean Speech Conditions

Original title

Non-speech Activity Pause Detection in Noisy and Clean Speech Conditions

Language

en

Original abstract

Nowadays, successful pause detection plays the most important role in the process of speech recognition. There is a need for a robust detection algorithm if we consider that most speech recordings are taken under very adverse conditions. This paper presents a comparison of several algorithms for empty pause detection on spontaneous speech recordings made in noisy environments. The input signal is transformed into log spectral energy and divided into specific frequency bands. Each band is smoothed and tracked by a dynamically adjusted threshold based on pause (noise) energy estimation. Then the correction of post-processing edges follows. All the proposed algorithms are capable of processing a real-time input.
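
The processing chain outlined in the abstract (log spectral energy per frequency band, smoothing, and a threshold that follows a running noise estimate) can be illustrated with a minimal sketch. This is not the authors' implementation: the band edges, frame sizes, smoothing constants, threshold margin, and the rule that a frame counts as a pause only when every band is below its threshold are all assumptions chosen for illustration, and the edge-correction post-processing step is omitted.

```python
import numpy as np

def band_log_energy(signal, sample_rate, frame_len=256, hop=128,
                    band_edges_hz=(0, 1000, 2000, 4000)):
    """Return an (n_frames, n_bands) array of log spectral energy per band.

    Frame sizes and band edges are illustrative assumptions.
    """
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    n_frames = max(0, 1 + (len(signal) - frame_len) // hop)
    n_bands = len(band_edges_hz) - 1
    out = np.empty((n_frames, n_bands))
    for i in range(n_frames):
        frame = signal[i * hop:i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        for b in range(n_bands):
            mask = (freqs >= band_edges_hz[b]) & (freqs < band_edges_hz[b + 1])
            # Small constant avoids log(0) on silent frames.
            out[i, b] = np.log10(power[mask].sum() + 1e-12)
    return out

def detect_pauses(log_energy, smooth=0.7, noise_rate=0.05, margin=1.0,
                  init_frames=5):
    """Causal per-band thresholding; returns one boolean per frame.

    Assumes the recording starts with a pause, so the first frames can
    seed the noise estimate. All constants are hypothetical defaults.
    """
    n_frames, _ = log_energy.shape
    is_pause = np.zeros(n_frames, dtype=bool)
    if n_frames == 0:
        return is_pause
    noise = log_energy[:init_frames].mean(axis=0)
    smoothed = log_energy[0].copy()
    for i in range(n_frames):
        # Exponential smoothing of each band's log energy.
        smoothed = smooth * smoothed + (1 - smooth) * log_energy[i]
        below = smoothed < noise + margin
        is_pause[i] = below.all()
        # Update the noise estimate only in bands currently judged as pause.
        updated = (1 - noise_rate) * noise + noise_rate * smoothed
        noise = np.where(below, updated, noise)
    return is_pause
```

Because each decision uses only the current and past frames, a loop of this shape can run on streaming input, which matches the abstract's statement that the compared algorithms can process real-time input. A typical call would be `detect_pauses(band_log_energy(samples, 8000))`, where `samples` is a one-dimensional NumPy array of audio samples.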

BibTeX


@article{BUT43294,
  author="Vojtěch {Stejskal} and Anna {Esposito} and Zdeněk {Smékal}",
  title="Non-speech Activity Pause Detection in Noisy and Clean Speech Conditions",
  annote="Nowadays, successful pause detection plays the most important role in the process of speech recognition. There is a need for a robust detection algorithm if we consider that most speech recordings are taken under very adverse conditions. This paper presents a comparison of several algorithms for empty pause detection on spontaneous speech recordings made in noisy environments. The input signal is transformed into log spectral energy and divided into specific frequency bands. Each band is smoothed and tracked by a dynamically adjusted threshold based on pause (noise) energy estimation. Then the correction of post-processing edges follows. All the proposed algorithms are capable of processing a real-time input.",
  address="IOS Press BV",
  chapter="43294",
  institution="IOS Press BV",
  journal="NATO Security through Science Series, Sub-Series E: Human and Societal Dynamics",
  number="4",
  volume="18",
  year="2007",
  month="january",
  pages="170--178",
  publisher="IOS Press BV",
  type="journal article - other"
}