Course detail

Speech Signal Processing

FIT-ZRE

Acad. year: 2018/2019

Applications of speech processing, digital processing of speech signals, production and perception of speech, introduction to phonetics, pre-processing and basic parameters of speech, linear-predictive model, cepstrum, fundamental frequency estimation, coding - time domain and vocoders, recognition - DTW and HMM, synthesis. Software and libraries for speech processing.

Learning outcomes of the course unit

Students will become familiar with the basic characteristics of the speech signal in relation to human speech production and hearing. They will understand the basic algorithms of speech analysis common to many applications. They will be given an overview of applications (recognition, synthesis, coding) and will learn about practical aspects of implementing speech algorithms. Students will be able to design a simple speech processing system (a speech activity detector, a recognizer of a limited number of isolated words) and implement it in application programs.
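
As a rough illustration of the kind of simple speech activity detector mentioned above, the following Python sketch marks frames whose short-time energy exceeds a fixed threshold. It is not taken from the course materials: the function names, the 8 kHz sampling rate implied by 160-sample (20 ms) frames, the 80-sample hop and the -30 dB threshold are all assumptions chosen for the example.

```python
import math


def frame_energies(samples, frame_len=160, hop=80):
    """Return the short-time log energy (in dB) of each frame.

    frame_len=160 and hop=80 are assumed values (20 ms / 10 ms at 8 kHz).
    """
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # A small constant avoids log(0) on digital silence.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def detect_speech(samples, frame_len=160, hop=80, threshold_db=-30.0):
    """Flag each frame as active (True) when its energy exceeds threshold_db.

    The fixed threshold is a hypothetical choice; a practical detector
    would usually adapt it to the background noise level.
    """
    return [e > threshold_db for e in frame_energies(samples, frame_len, hop)]


# Usage: 40 ms of silence followed by 40 ms of a 200 Hz tone.
silence = [0.0] * 320
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(320)]
flags = detect_speech(silence + tone)
```

A fixed energy threshold is the simplest possible decision rule; it is used here only to keep the sketch short.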

Prerequisites

Not applicable.

Co-requisites

Not applicable.

Recommended optional programme components

Not applicable.

Recommended or required reading

  • Psutka, J.: Komunikace s počítačem mluvenou řečí. Academia, Praha, 1995, ISBN 80-200-0203-0
  • Gold, B., Morgan, N.: Speech and Audio Signal Processing. John Wiley & Sons, 2000, ISBN 0-471-35154-7
  • Krčmová, N.: Fonetika a fonologie: zvuková stavba současné češtiny. Masarykova univerzita, Brno, 1990, ISBN 80-210-0137-2
  • Rabiner, L., Juang, B.H.: Fundamentals of Speech Recognition. Signal Processing Series, Prentice Hall, Englewood Cliffs, NJ, 1993, ISBN 0-13-015157-2

Planned learning activities and teaching methods

Not applicable.

Assessment methods and criteria linked to learning outcomes


  • mid-term test 14 pts
  • project 29 pts
  • presentation of results in computer labs 6 pts

Language of instruction

Czech, English

Work placements

Not applicable.

Course curriculum

    Syllabus of lectures:
    • Introduction, applications of speech processing, sciences relevant to speech processing, information content of speech.
    • Digital processing of speech signals.
    • Speech production and perception, basic notions from psycho-acoustics, applications in speech processing. 
    • Introduction to phonetics, international standards for phoneme notation.
    • Pre-processing and basic parameters of speech.
    • Linear-predictive model, spectrum using LP, applications of LP. 
    • Cepstral analysis, Mel-frequency cepstrum.
    • Determination of fundamental frequency.
    • Speech coding.
    • Speech recognition - dynamic programming (DTW), hidden Markov models (HMM).
    • Speech synthesis.
    • Software and libraries for speech processing.

    Syllabus of numerical exercises:
    • Parameterization, DTW, HMM.
    • Presentation of projects.

    Syllabus of computer exercises:
      Matlab is used in all labs except the last one.
    • Frames, windows, spectrum, pre-processing.
    • Linear prediction (LPC).
    • Fundamental frequency estimation.
    • Coding.
    • Recognition - dynamic time warping (DTW).
    • Recognition - hidden Markov models (Hidden Markov Model Toolkit - HTK).
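
For orientation, the dynamic time warping used in the recognition lab can be sketched in a few lines of Python. This is an illustrative sketch, not the lab code (the labs themselves use Matlab): the function names, the Euclidean local distance and the symmetric step pattern (horizontal, vertical, diagonal, all with weight 1) are assumptions.

```python
import math


def dtw_distance(ref, test):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumes a Euclidean local distance and unweighted steps
    (i-1, j), (i, j-1), (i-1, j-1); other step patterns are common.
    """
    n, m = len(ref), len(test)
    # cost[i][j] = best accumulated cost of aligning ref[:i] with test[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(ref[i - 1], test[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


def recognize(templates, test):
    """Return the label of the template with the lowest DTW cost.

    `templates` is a hypothetical dict mapping a word label to one
    reference sequence of feature vectors.
    """
    return min(templates, key=lambda label: dtw_distance(templates[label], test))


# Usage with toy one-dimensional "features".
templates = {
    "yes": [[0.0], [1.0], [2.0]],
    "no": [[2.0], [1.0], [0.0]],
}
word = recognize(templates, [[0.0], [1.0], [1.0], [2.0]])
```

In an isolated-word recognizer the feature vectors would typically be per-frame parameters such as cepstral coefficients rather than the toy values used here.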

Aims

To provide students with knowledge of the basic characteristics of the speech signal in relation to human speech production and hearing. To describe the basic algorithms of speech analysis common to many applications. To give an overview of applications (recognition, synthesis, coding) and to cover practical aspects of implementing speech algorithms.

Classification of course in study plans

  • Programme IT-MGR-2 Master's

    branch MBI, any year of study, summer semester, 5 credits, compulsory-optional
    branch MPV, any year of study, summer semester, 5 credits, compulsory-optional
    branch MIS, any year of study, summer semester, 5 credits, optional
    branch MBS, any year of study, summer semester, 5 credits, optional
    branch MIN, any year of study, summer semester, 5 credits, compulsory-optional
    branch MMM, any year of study, summer semester, 5 credits, optional
    branch MGM, 1st year of study, summer semester, 5 credits, compulsory
    branch MSK, 2nd year of study, summer semester, 5 credits, compulsory-optional

Type of course unit

Lecture

26 hours, optional

Syllabus

  1. Introduction, applications of speech processing. 
  2. Digital processing of speech signals.
  3. Speech production and its signal processing model. 
  4. Pre-processing and basic parameters of speech, cepstrum.
  5. Linear-predictive model. 
  6. Fundamental frequency estimation.
  7. Speech coding - basics.
  8. CELP speech coding.
  9. Speech recognition - basics, DTW.
  10. Hidden Markov models (HMM).
  11. Large vocabulary continuous speech recognition (LVCSR) systems.
  12. Speaker and language recognition. Neural networks in speech processing.
  13. Text-to-speech synthesis.

Fundamentals seminar

2 hours, compulsory

Syllabus

  1. Parameterization, DTW, HMM.

Exercise in computer lab

12 hours, compulsory

Syllabus

    Matlab is used in all labs except the last one.
  1. Introduction. 
  2. Linear prediction and vector quantization. 
  3. Fundamental frequency estimation and speech coding. 
  4. Basics of classification. 
  5. Recognition - dynamic time warping (DTW).
  6. Recognition - hidden Markov models (HTK).

Project

12 hours, compulsory
