Universität des Saarlandes
 

Spoken Language Systems

   
 

Pattern and Speech Recognition

(speech technology; Echtzeitdatenverarbeitung II / Real-Time Data Processing II)

Suitable for: CoLi, CuK, Mechatronics, VC, CS

Lecturer: Dietrich Klakow
Location: Building C7 2, seminar room
Time: Tuesday 14:15-15:45
Starts: 20.10

Exercises
Tutors:

Grzegorz Chrupala <gchrupala@lsv.uni-saarland.de>
Munir Georges <Munir.Georges@lsv.uni-saarland.de>
Location: Building C7 2, seminar room
Time: Thursday 12:15 or Friday 14:15

Content

The core of any speech recognizer is a pattern classification system. In this course, we will cover the basic principles of pattern recognition and machine learning and see how they are applied to speech recognition.

Specific topics will be:

  • Bayes Classifier
  • Normal Distribution
  • Parameter Estimation
  • Nearest Neighbor Classifier
  • Gaussian Mixture Models
  • Decision Trees
  • Hidden Markov Models
  • Conditional Random Fields
  • Acoustic Modeling

Slides
Chapter 1: Introduction pdf
Chapter 2: Basic Task of Pattern Classification pdf
Chapter 3: Feature Extraction pdf 
Chapter 4: Probability Theory, Distributions and all that pdf 
Chapter 5: Bayesian Decision Theory pdf 
Chapter 6: Non-parametric Methods pdf
Chapter 7: Gaussian Mixture Models pdf
Chapter 8: Speaker Recognition pdf
Chapter 9: Decision Trees pdf
Chapter 10: Hidden Markov Models pdf
Chapter 11: Conditional Random Fields pdf

Examples and scripts used in the lecture
Maple script for the discrete Fourier transform
Maple script for a one-dimensional Gaussian distribution
Maple script for linear algebra
Maple script for a two-dimensional Gaussian distribution
Maple script to calculate the decision boundary resulting from two normal distributions in two dimensions
Spoken digits, clean signal: ASCII, wav
Spoken digits, with noise: ASCII, wav

Please submit your solutions to:

  • gchrupala@lsv.uni-saarland.de if attending the Thursday 12:15 session
  • Munir.Georges@lsv.uni-saarland.de if attending the Friday 14:15 session

Exercises:
Exercise 1
Exercise 2
Exercise 3
Exercise 4
Exercise 5
Exercise 6
Exercise 7
Exercise 8
Exercise 9

Solutions:

Exam: to be discussed

Credit Points
CoLi, CS, VC: 6 CP; CuK: 4.5 CP; Mechatronics: 4 or 4.5 CP (BaMa/Diplom)

Literature

Christopher M Bishop
Pattern Recognition and Machine Learning
Springer 2006
ISBN 0-387-31073-8

Richard O. Duda, Peter E. Hart, David G. Stork
Pattern Classification
Wiley-Interscience
November 2000
ISBN 0471056693

Xuedong Huang, Alex Acero, Hsiao-Wuen Hon
Spoken Language Processing
Prentice Hall
ISBN: 0130226

Ernst Günter Schukat-Talamazzini
Automatische Spracherkennung
Vieweg Verlag
Braunschweig/Wiesbaden 1995
ISBN 3-528-05492-1
Postscript