Pattern and Speech Recognition
(speech technology; Echtzeitdatenverarbeitung II)
Suitable for: CoLi, CuK, Mechatronics, VC, CS
Lecturer: Dietrich Klakow
Location: Building C7 2, seminar room
Time: Tuesday 14:15-15:45
Starts: 20.10
Exercises
Tutors: Grzegorz Chrupala <gchrupala@lsv.uni-saarland.de>, Munir Georges <Munir.Georges@lsv.uni-saarland.de>
Location: Building C7 2, seminar room
Time: Thursday 12:15 or Friday 14:15
Content
The core of any speech recognizer is a pattern
classification system. In this course, we will cover the basic
principles of pattern recognition and machine learning and see how they
are applied to speech recognition.
Specific topics will be:
- Bayes Classifier
- Normal Distribution
- Parameter Estimation
- Nearest Neighbor Classifier
- Gaussian Mixture Models
- Decision Trees
- Hidden Markov Models
- Conditional Random Fields
- Acoustic Modeling
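To give a feel for how the first three topics fit together, here is a minimal sketch of a Bayes classifier with one-dimensional Gaussian class-conditional densities whose parameters are estimated by maximum likelihood. It is not taken from the lecture material; the function names (`fit_gaussian`, `train`, `classify`) and the toy data are assumptions made purely for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional normal distribution N(mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gaussian(samples):
    """Maximum-likelihood estimates of mean and variance."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, var


def train(data):
    """data maps a class label to a list of 1-D feature values.

    Returns a dict: label -> (prior, mean, variance).
    """
    total = sum(len(samples) for samples in data.values())
    model = {}
    for label, samples in data.items():
        mean, var = fit_gaussian(samples)
        model[label] = (len(samples) / total, mean, var)
    return model


def classify(model, x):
    """Bayes decision rule: pick the class maximizing prior * likelihood."""
    return max(
        model,
        key=lambda c: model[c][0] * gaussian_pdf(x, model[c][1], model[c][2]),
    )


# Hypothetical toy data: two classes with well-separated feature values.
toy_data = {"a": [1.0, 1.2, 0.8, 1.1], "b": [3.0, 2.9, 3.3, 3.1]}
model = train(toy_data)
print(classify(model, 1.0), classify(model, 3.0))
```

In a real recognizer the features would be multi-dimensional acoustic vectors and the class-conditional densities would typically be mixtures or sequence models, but the decision rule keeps the same form.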
Slides
Chapter 1: Introduction pdf
Chapter 2: Basic Task of Pattern Classification pdf
Chapter 3: Feature Extraction pdf
Chapter 4: Probability Theory, Distributions and all that pdf
Chapter 5: Bayesian Decision Theory pdf
Chapter 6: Non-parametric Methods pdf
Chapter 7: Gaussian Mixture Models pdf
Chapter 8: Speaker Recognition pdf
Chapter 9: Decision Trees pdf
Chapter 10: Hidden Markov Models pdf
Chapter 11: Conditional Random Fields pdf
Examples and scripts used in the lecture
Maple script for the discrete Fourier transform
Maple script for a one-dimensional Gaussian distribution
Maple script for linear algebra
Maple script for a two-dimensional Gaussian distribution
Maple script to calculate the decision boundary resulting from two normal distributions in two dimensions
Spoken digits clean signal ASCII wav
Spoken digits with noise ASCII wav
Please submit your solutions to:
- gchrupala@lsv.uni-saarland.de if attending the Thursday 12:15 session
- Munir.Georges@lsv.uni-saarland.de if attending the Friday 14:15 session
Exercises:
Exercise 1
Exercise 2
Exercise 3
Exercise 4
Exercise 5
Exercise 6
Exercise 7
Exercise 8
Exercise 9
Solutions:
Exam:
to be discussed
Credit Points
CoLi, CS, VC: 6 CP; CuK: 4.5 CP; Mechatronics: 4 or 4.5 CP (BaMa/Diplom)
Literature
Christopher M Bishop
Pattern Recognition and Machine Learning
Springer
2006
ISBN 0-38-731073-8
Richard O. Duda, Peter E. Hart, David G. Stork
Pattern Classification
Wiley-Interscience
November 2000
ISBN 0471056693
Xuedong Huang, Alex Acero, Hsiao-Wuen Hon
Spoken Language Processing
Prentice Hall
ISBN: 0130226
Ernst Günter Schukat-Talamazzini
Automatische Spracherkennung
Vieweg Verlag
Braunschweig/Wiesbaden 1995
ISBN 3-528-05492-1
Postscript