COSC6328.3 Speech and Language Processing

(Winter 2008)

Instructor:   Prof. Hui Jiang

Time:          TR 10:00-11:30

Place:          BC325

Office Hour:  TBA @ CSE3014 (or by appointment)

 

Reference Books (optional):

[1] Spoken Language Processing: a guide to theory, algorithm, and system development  by X.D. Huang, A. Acero, H.W. Hon.  (Prentice Hall PTR, ISBN 0-13-022616-5)

[2] Pattern Recognition and Machine Learning by C. M. Bishop. (Springer, ISBN 0-387-31073-8)

[3] Spoken Dialogues with Computers, edited by R. De Mori.   (Academic Press, ISBN 0-12-209055-1)

[4] Foundations of Statistical Natural Language Processing, by  C. D. Manning and H. Schutze. (The MIT Press, ISBN 0-262-13360-1)

[5] Pattern Classification (2nd Edition) by R. O. Duda, P. Hart and D. Stork. (John Wiley & Sons, Inc., ISBN 0-471-05669-3)

(The above reference books are optional. The course is mainly based on the lecture notes and reading assignments. The lecture notes are usually posted before class.)

 

ANNOUNCEMENTS:  (Frequent updates during the course, REFRESH the browser)

 

·        Apr 13 – Project 2 has been extended one more week to April 23 and the presentation will be held on April 25 (10am-1pm, Friday, CSEB3033).

·        Apr 1 – Project 2 has been extended to April 23 and the presentation will be held on April 25 (10:00am-1:00pm, Friday, CSEB3033). Each person will have 10-15 minutes to present the project. The presentation order is: Nariman, Kmiec, Nassim, Hashmat, Pourya, Li, Damon, Feng, Hu, Vlad, Zhenyu, Talieh.

·        Mar 13 – Lecture notes for week 9 posted. This part is closely related to option B of project 2.

·        Mar 11 – Important announcement regarding research and project presentations:

1. Research Presentation will be held:

                           i) March 27th (10:00am-11:30am, Thursday, in class): Vlad & Pourya: Graphical Models; Pan & Li:  WFST optimization.

                           ii) March 28th (10:00am-12:30pm, Friday, CSEB3033): Feng & Hu: MAP estimation of HMM; Damon & Nariman: Speaker verification;  Nassim & Hashmat: Speech Understanding; Kmiec (15 mins): Latent Semantic Analysis; Talieh (15 mins): SVM.

2. Project 2 has been extended to April 23 and the Project 2 presentation will be held on April 25th (10:00am-1:00pm, Friday, CSEB3033). Each person will have 10-15 minutes to present the project. Option A: Nariman, Kmiec, Nassim, Hashmat, Pourya, Li, Damon. Option B: Feng, Hu, Vlad, Zhenyu, Talieh.

3. Meanwhile, classes on March 20, 25 and April 1 are cancelled.

·        Mar 6 – I already have enough people (6+) for option A of Project two. If you have not emailed me a request yet, you will have to do option B for Project two.

·        Mar 6 – Project two has been posted.  Project two has two options; please let me know your preference ASAP. Lecture notes for week eight posted as well. Please read the HTK tutorial, which is the required reading material for this week.

·        Feb 27 – Lecture notes for week seven have been posted.

·        Feb 26 – Reading material for week six (a tutorial article on HMM) has been posted. The topics for advanced study are also posted. Please identify your group partner and topic as soon as you can.

·        Feb 20 – Lecture notes for week six have been posted.

·        Feb 19 – Lecture notes for week five have been updated (with WFST part added). The reading material regarding WFST is also posted.

·        Jan 31 – Project one is posted.  It is due on March 6th.

·        Jan 31 – Lecture notes for week five posted. Reading materials for week five posted as well.

·        Jan 23 – Lecture notes for week four posted.

·        Jan 17 – Reading notes for week three posted. Reading materials for week three are also posted; please download and read them. Assignment questions have been posted (download here); please hand them in before Feb 19.

·        Jan 10 – Reading materials for weeks one and two posted.  Please catch up.

·        Jan 8 – Lecture notes for week two posted.

·        Jan 3 – Lecture notes for week one posted.

·        Jan 2 – This web page was created. Classes start on Jan 3rd.

 

Lecture Notes:

 

Week 1, Week 2, Week 3, Week 4, Week 5, Week 6, Week 7, Week 8, Week 9, Week 10.

 

Evaluation:         

(1)   (10%) Assignment: basic concepts, principles, algorithms.

(2)   (10%) Class participation.

(3)   (55%) Two lab projects: project one (20%) and project two (35%).

(4)   (25%) Assigned article reading and class presentation (5% out of the 25% is based on your participation in evaluating others’ presentations).

 

Assignments:  (10%)   download here. (New deadline is Feb 19 in class)


Lab Projects:

   Project  I :  (20%) download datasets: Train.set and Test.set.

   Project  II:  (35%)

 

Reading Lists:  click here.

 

All topics for advanced study: check here.

 

 

 

                                    Course Outline (subject to change)

 

 

 

Week 1
Content: Introduction: application background; a big picture; speech sounds; spoken language
Reading Assignment: [A1] [A2] in reading list

Week 2
Content: Math Background: probabilities; Bayes theorem; statistics; estimation; regression; hypothesis testing; entropy; mutual information; decision tree
Reading Assignment: Math background; A manual for matrix calculus

Week 3
Content: Pattern Classification (I): pattern classification & pattern verification; Bayesian decision theory
Reading Assignment: A web tutorial on Bayesian decision rule

Week 4
Content: Pattern Classification (II): Model estimation: maximum likelihood, Bayesian learning, EM algorithm; Simple models: single Gaussian, Gaussian mixture model, etc.
Reading Assignment: [B1] in reading list, pp. 47-54 & summary

Week 5
Content: Pattern Classification (III) & Pattern Verification: alternative model estimation: discriminative training & Bayesian learning;
    Linear discriminant functions; support vector machine (SVM); large margin classifiers;
    Pattern verification as statistical hypothesis testing; speaker verification; outlier rejection
Reading Assignment: Tutorial on verification [C1] (sec. 6 optional); Finite-State-Machine (FSM) Toolkit

Week 6
Content: Hidden Markov Model (HMM): HMM vs. Markov chains; HMM concepts; Three algorithms: forward-backward; Viterbi decoding; Baum-Welch learning.
Reading Assignment: A Tutorial on HMM

Week 7
Content: Automatic Speech Recognition (ASR) (I): Introduction & Acoustic modeling
    how to use HMM for ASR; ASR as an example of pattern classification;
    Acoustic modeling: HMM learning (ML, MAP);
    Parameter tying (decision tree based state tying).
Reading Assignment: HTK User guide [E1] (ONLY ch.1,2,3 pp.2-43)

Week 8
Content: Automatic Speech Recognition (ASR) (II): Language modeling
    n-grams, smoothing, learning, perplexity, class-based

Week 9
Content: Automatic Speech Recognition (ASR) (III): Search
    Why search; Viterbi decoding in a large HMM;
    Beam search; Tree-based lexicon; dynamic decoding

Week 10
Content: Spoken Language Processing (I): text categorization
    classify text documents: call/email routing, topic detection, etc.
    vector-based approach, Naïve Bayes classifier; Bayesian networks, etc.
    (2) HMM applications: Statistical Part-of-Speech (POS) tagging;
    Language understanding: hidden concept models.

Week 11
Content: Spoken Language Processing (II): statistical machine translation
    IBM’s models for machine translation: lexicon model, alignment model, language model
    training process, generation & search

Week 12
Content: Students’ Presentations