Schedule of Topics


PHILIP IS WORKING ON THIS AT THE MOMENT. DO NOT USE. This is the schedule of topics for Computational Linguistics II, Spring 2009.

Readings are from Christopher D. Manning and Hinrich Schuetze, Foundations of Statistical Natural Language Processing, unless otherwise specified. The "other" column has optional links pointing either to material you should already know (but might want to review) or to related material you might be interested in.

THIS SCHEDULE IS A WORK IN PROGRESS!
Some topic areas may take longer than expected, so keep an eye on the class mailing list or e-mail me for "official" dates.

Class | Topic | Readings* | Assignments | Other
Jan 28 Course administrivia, semester plan; some statistical NLP fundamentals
Ch 1, 2.1.[1-9] (for review)
Probability spaces; finite-state and Markov models; expected values; Bayes' Rule
Assignment 1 Corpus Colossal (The Economist, 20 Jan 2005); Language Log; Resnik and Elkiss (DRAFT); Linguist's Search Engine
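As a quick illustration of the Bayes' Rule topic above (not part of the course materials; all numbers are invented), here is a toy "noisy word" example in Python:

```python
def posterior(likelihood, prior, evidence):
    """Bayes' Rule: P(h | d) = P(d | h) * P(h) / P(d)."""
    return likelihood * prior / evidence

# Invented toy numbers: P(spam) = 0.2, P("free" | spam) = 0.5, P("free" | ham) = 0.05
p_spam, p_ham = 0.2, 0.8
p_free_spam, p_free_ham = 0.5, 0.05

# Marginal P("free") by summing over the two hypotheses: 0.1 + 0.04 = 0.14
p_free = p_free_spam * p_spam + p_free_ham * p_ham

# P(spam | "free") = 0.5 * 0.2 / 0.14 ~= 0.714
p_spam_free = posterior(p_free_spam, p_spam, p_free)
```

The point of the sketch is just the mechanics: a posterior is a likelihood reweighted by a prior and renormalized by the marginal (the expectation of the likelihood over hypotheses).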
Feb 4 Words and lexical association
Ch 5
Zipf's law; collocations; mutual information; hypothesis testing
Assignment 2 Dunning (1993); Kilgarriff (2005); Gries (2005); Bland and Altman (1995)
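To make the collocation/mutual-information topic concrete, here is a small sketch (not from the readings; the corpus is invented, and using the bigram total as the denominator for unigram probabilities is a simplification):

```python
import math
from collections import Counter

def pmi(pair_count, x_count, y_count, n):
    """Pointwise mutual information: log2( P(x,y) / (P(x) * P(y)) )."""
    return math.log2((pair_count / n) / ((x_count / n) * (y_count / n)))

tokens = "new york is a new city in new york state".split()
bigrams = list(zip(tokens, tokens[1:]))
uni, bi, n = Counter(tokens), Counter(bigrams), len(bigrams)

# "new york" occurs together more often than chance predicts, so PMI > 0
score = pmi(bi[("new", "york")], uni["new"], uni["york"], n)
```

On real corpora, PMI notoriously overvalues rare pairs, which is one motivation for the hypothesis-testing alternatives (e.g. Dunning's log-likelihood ratio) in this week's readings.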
Feb 11 Information theory
Ch 2.2, Ch 6
Information theory essentials; entropy, relative entropy, mutual information; noisy channel model; cross entropy and perplexity
Assignment 3, due Wednesday Feb 25 at 1:30pm
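A minimal numerical sketch of the entropy/perplexity relationship covered this week (toy distributions, not course code):

```python
import math

def entropy(dist):
    """Shannon entropy in bits: H(p) = -sum_x p(x) log2 p(x)."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def cross_entropy(p, q):
    """Cross entropy H(p, q) = -sum_x p(x) log2 q(x); always >= H(p)."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

uniform = [0.25, 0.25, 0.25, 0.25]
h = entropy(uniform)   # 2.0 bits: a uniform 4-way choice
ppl = 2 ** h           # perplexity 4.0: "effectively 4 equally likely choices"

# A model q that mismatches the true distribution pays a cross-entropy penalty
skewed = [0.7, 0.1, 0.1, 0.1]
xh = cross_entropy(uniform, skewed)
```

The gap `xh - h` is exactly the relative entropy (KL divergence) between the two distributions, which ties together the three quantities in the topic list.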
Feb 18 Maximum likelihood estimation and Expectation Maximization
Skim Ch 9-10
Maximum likelihood estimation overview; quick review of smoothing; HMM review; deriving forward-backward algorithm as an instance of EM; Viterbi algorithm.
An empirical study of smoothing techniques for language modeling (Stanley Chen and Joshua Goodman, Technical report TR-10-98, Harvard University, August 1998);
Revised Chapter 4 from the updated Jurafsky and Martin textbook.
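As a companion to the HMM review, here is a compact Viterbi sketch on the standard toy weather example (the HMM parameters are invented illustration values, not from the readings):

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable HMM state sequence, by dynamic programming."""
    # V[t][s] = (best probability of any path ending in s at time t, backpointer)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prev = max(states, key=lambda r: V[t - 1][r][0] * trans_p[r][s])
            V[t][s] = (V[t - 1][prev][0] * trans_p[prev][s] * emit_p[s][obs[t]], prev)
    # Follow backpointers from the best final state
    state = max(states, key=lambda s: V[-1][s][0])
    path = [state]
    for t in range(len(obs) - 1, 0, -1):
        state = V[t][state][1]
        path.append(state)
    return path[::-1]

obs = ("walk", "shop", "clean")
states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
path = viterbi(obs, states, start_p, trans_p, emit_p)
```

Forward-backward has the same table structure but sums over predecessor states instead of maximizing, which is the connection to EM drawn in lecture.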
Feb 25 Probabilistic grammar
Ch 11-12, Abney (1996)
Memoization and dynamic programming; review of CKY; defining PCFG; PCKY (inside probabilities); Viterbi CKY; revisiting EM: the inside-outside algorithm
Assignment 4, due at start of class Wednesday March 11. (Please send initial time estimates now.)
Jason Eisner's great parsing song; Pereira (2000); Detlef Prescher, A Tutorial on the Expectation-Maximization Algorithm Including Maximum-Likelihood Estimation and EM Training of Probabilistic Context-Free Grammars; McClosky, Charniak, and Johnson (2006), Effective Self-Training for Parsing
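A bare-bones Viterbi CKY sketch for a PCFG in Chomsky normal form, matching the "PCKY / Viterbi CKY" topics above (the grammar and probabilities are invented toys):

```python
from collections import defaultdict

def viterbi_cky(words, lexicon, grammar):
    """Best-parse probability for a CNF PCFG.
    lexicon: {(A, word): prob} for A -> word
    grammar: {(A, B, C): prob} for A -> B C"""
    n = len(words)
    best = defaultdict(float)  # (i, j, A) -> best probability of A over words[i:j]
    for i, w in enumerate(words):
        for (A, word), p in lexicon.items():
            if word == w:
                best[(i, i + 1, A)] = p
    for span in range(2, n + 1):            # widths, smallest first (dynamic programming)
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):       # split point
                for (A, B, C), p in grammar.items():
                    cand = p * best[(i, k, B)] * best[(k, j, C)]
                    if cand > best[(i, j, A)]:
                        best[(i, j, A)] = cand
    return best[(0, n, "S")]

words = "John sees Mary".split()
lexicon = {("NP", "John"): 0.5, ("NP", "Mary"): 0.5, ("V", "sees"): 1.0}
grammar = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0}
prob = viterbi_cky(words, lexicon, grammar)  # 1.0 * 0.5 * (1.0 * 1.0 * 0.5) = 0.25
```

Replacing `max` with a sum over split points and rules gives the inside probabilities; combining those with outside probabilities yields the inside-outside (EM) updates discussed in class.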
Mar 4 Probabilistic parsing
Parsing as inference; distinction between logic and control; Viterbi CKY; CFG extensions (parent and grandparent node annotation, lexicalization)
Mar 11 Watch and discuss Dan Klein talk on grammar induction (first 30 minutes of this online talk) Take-home midterm handed out Petrov and Klein, Learning and Inference for Hierarchically Split PCFGs
Mar 18 Spring Break
Have fun!
Mar 25 Supervised classification
Ch 16
Supervised learning: k-nearest neighbor classification; naive Bayes; decision lists; decision trees; transformation-based learning (Sec 10.4); linear classifiers; the kernel trick; perceptrons; SVM basics.
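As an illustration of the naive Bayes item in the topic list (toy documents, invented labels; not part of the assignments), here is a minimal add-one-smoothed classifier:

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """Naive Bayes with add-one smoothing; docs is a list of (label, tokens)."""
    label_counts = Counter(lbl for lbl, _ in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for lbl, toks in docs:
        word_counts[lbl].update(toks)
        vocab.update(toks)
    priors = {l: math.log(c / len(docs)) for l, c in label_counts.items()}
    loglik = {l: {w: math.log((word_counts[l][w] + 1) /
                              (sum(word_counts[l].values()) + len(vocab)))
                  for w in vocab}
              for l in label_counts}
    return priors, loglik, vocab

def classify(tokens, priors, loglik, vocab):
    """argmax over labels of log P(label) + sum_w log P(w | label)."""
    scores = {l: priors[l] + sum(loglik[l][w] for w in tokens if w in vocab)
              for l in priors}
    return max(scores, key=scores.get)

docs = [("pos", ["good", "great"]), ("pos", ["good", "fun"]),
        ("neg", ["bad", "awful"])]
priors, loglik, vocab = train_nb(docs)
pred = classify(["good"], priors, loglik, vocab)
```

Working in log space avoids underflow, and the "naive" conditional-independence assumption is exactly what makes the per-word sum valid.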
Apr 1 Evaluation in NLP
Evaluation paradigms for NLP; parser evaluation in particular
Apr 8 Maxent; supervised approaches to word sense disambiguation
Ch 7; Adwait Ratnaparkhi, A Maximum Entropy Model for Part-Of-Speech Tagging (EMNLP 1996); Resnik, "WSD in NLP Applications" (Ch 11 in Edmonds and Agirre (2006))
The maximum entropy principle and maxent models; feature selection.
Team project handed out. Other useful readings include Adwait Ratnaparkhi's A Simple Introduction to Maximum Entropy Models for Natural Language Processing (1997) and Adam Berger's maxent tutorial; and Noah Smith's notes on loglinear models.
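A small sketch of the maxent idea for the binary case, where a conditional maxent model reduces to logistic regression fit by gradient ascent (the feature names and data are hypothetical, and this is far simpler than the GIS/IIS training in the readings):

```python
import math

def train_maxent(data, n_feats, lr=0.5, epochs=1000):
    """Binary conditional maxent (logistic regression), gradient ascent
    on the log-likelihood; data is a list of (feature_vector, label)."""
    w = [0.0] * n_feats
    for _ in range(epochs):
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
            for i, xi in enumerate(x):
                w[i] += lr * (y - p) * xi  # (observed - expected) feature value

    return w

# Hypothetical features: [bias, token-is-capitalized]; label 1 = proper noun
data = [([1, 1], 1), ([1, 1], 1), ([1, 0], 0), ([1, 0], 0)]
w = train_maxent(data, 2)
p_cap = 1.0 / (1.0 + math.exp(-(w[0] + w[1])))  # P(label=1 | capitalized)
p_plain = 1.0 / (1.0 + math.exp(-w[0]))         # P(label=1 | not capitalized)
```

The `(observed - expected)` update is the maxent signature: at the optimum, model-expected feature counts match the empirical counts, which is the constraint the maximum entropy principle imposes.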
Apr 15 Unsupervised and semi-supervised WSD
Ch 8.5, 15.{1,2,4}
Characterizing the WSD problem; WSD as a supervised classification problem; unsupervised methods and Lesk's algorithm; semi-supervised learning and Yarowsky's algorithm; WSD in applications; WSD evaluation; IR basics.
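The simplified Lesk idea can be sketched in a few lines (the mini-dictionary below is invented for illustration; real implementations use WordNet glosses and better overlap measures):

```python
def simplified_lesk(context, sense_glosses):
    """Pick the sense whose gloss shares the most words with the context."""
    context_words = set(context)
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_glosses.items():
        overlap = len(context_words & set(gloss.split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

# Hypothetical mini-dictionary for "bank":
glosses = {
    "bank/finance": "institution that accepts deposits and lends money",
    "bank/river": "sloping land beside a body of water",
}
sense = simplified_lesk("he deposits money in the bank".split(), glosses)
```

Yarowsky's algorithm starts from a handful of such seed decisions and bootstraps, exploiting the one-sense-per-collocation and one-sense-per-discourse tendencies.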
Apr 22 Machine translation
Ch 13 and Adam Lopez, Statistical Machine Translation, ACM Computing Surveys 40(3), Article 8, 49 pages, August 2008.

Historical view of MT approaches; noisy channel for SMT; IBM models 1 and 4; HMM distortion model; going beyond word-level models

Also potentially useful or of interest: Kevin Knight, A Statistical MT Tutorial Workbook;
Mihalcea and Pedersen (2003);
Philip Resnik, Exploiting Hidden Meanings: Using Bilingual Text for Monolingual Annotation. In Alexander Gelbukh (ed.), Lecture Notes in Computer Science 2945: Computational Linguistics and Intelligent Text Processing, Springer, 2004, pp. 283-299.
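The EM training of IBM Model 1 mentioned above fits in a short sketch (the two-sentence German-English corpus is the standard toy example; this ignores the NULL word and everything Model 4 adds):

```python
from collections import defaultdict

def ibm_model1(corpus, iterations=10):
    """EM for IBM Model 1 translation probabilities t(f | e).
    corpus: list of (foreign_tokens, english_tokens) sentence pairs."""
    f_vocab = {f for fs, _ in corpus for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))  # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for fs, es in corpus:
            for f in fs:
                z = sum(t[(f, e)] for e in es)   # normalize over alignments
                for e in es:
                    c = t[(f, e)] / z            # E-step: expected count
                    count[(f, e)] += c
                    total[e] += c
        for (f, e), c in count.items():          # M-step: renormalize
            t[(f, e)] = c / total[e]
    return t

corpus = [(["das", "haus"], ["the", "house"]),
          (["das", "buch"], ["the", "book"])]
t = ibm_model1(corpus)
```

Even with no alignments observed, EM uses the co-occurrence pattern ("das" appears with "the" in both pairs) to pull the right translation probabilities upward, which is the core intuition behind the noisy-channel SMT pipeline.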
Apr 29 Phrase-based statistical MT
Papineni, Roukos, Ward and Zhu. 2001. BLEU: A Method for Automatic Evaluation of Machine Translation

Components of a phrase-based system: language modeling, translation modeling; sentence alignment, word alignment, phrase extraction, parameter tuning, decoding, rescoring, evaluation.

Koehn, PHARAOH: A Beam Search Decoder for Phrase-Based Statistical Machine Translation; Koehn (2004) presentation on PHARAOH decoder
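A stripped-down, sentence-level BLEU sketch to accompany the Papineni et al. reading (the real metric uses up to 4-grams, multiple references, and corpus-level statistics; this toy uses bigrams and one reference):

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=2):
    """Simplified BLEU: geometric mean of modified n-gram precisions
    against a single reference, times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(zip(*[candidate[i:] for i in range(n)]))
        ref = Counter(zip(*[reference[i:] for i in range(n)]))
        clipped = sum(min(c, ref[g]) for g, c in cand.items())  # clip by reference counts
        precisions.append(clipped / max(sum(cand.values()), 1))
    if min(precisions) == 0:
        return 0.0
    # Brevity penalty punishes candidates shorter than the reference
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

ref = "the cat is on the mat".split()
score_exact = bleu(ref, ref)                           # identical output scores 1.0
score_partial = bleu("the cat sat on the mat".split(), ref)
```

Count clipping is what stops a degenerate candidate like "the the the the" from earning full unigram credit, and the brevity penalty plays the recall-like role that pure precision lacks.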
May 6 TBD
Take-home final handed out

*Readings are from Manning and Schuetze unless otherwise specified. Do the reading before the class where it is listed!
