Schedule of Topics


This is the schedule of topics for Computational Linguistics II, Spring 2008.

Readings are from Christopher D. Manning and Hinrich Schuetze, Foundations of Statistical Natural Language Processing, unless otherwise specified. The "Other" entries are optional links pointing either to material you should already know (but might want to review) or to related material you might be interested in.

THIS SCHEDULE IS A WORK IN PROGRESS!
Some topic areas may take longer than expected, so keep an eye on the class mailing list or e-mail me for "official" dates.

Each entry below lists the class date and topic, followed by the topics covered, the readings*, any assignments, and other links.
Jan 30: Course administrivia, semester plan; corpus-driven and computational linguistics
Topics: Word counts; tokenization; frequency and Zipf's law; concordances
Readings: Ch 1; Ch 2.1.1-2.1.9 (for review)
Other: Corpus Colossal (The Economist, 20 Jan 2005); Language Log; Resnik and Elkiss (draft); Linguist's Search Engine
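
For the curious, here is a minimal Python sketch of the word-count and Zipf's law topic above; the toy text and the naive whitespace tokenizer are stand-ins for illustration, not course code:

    # Token counts and a quick check of Zipf's law, which predicts that
    # frequency is roughly proportional to 1/rank.
    from collections import Counter

    text = "the cat sat on the mat and the dog sat on the log"  # toy corpus
    tokens = text.lower().split()  # naive whitespace tokenization
    counts = Counter(tokens)

    for rank, (word, freq) in enumerate(counts.most_common(), start=1):
        # Under Zipf's law, freq * rank should be roughly constant.
        print(f"{rank:>2} {word:<5} freq={freq} freq*rank={freq * rank}")

On a real corpus, plotting log frequency against log rank gives a roughly straight line.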
Feb 6: Words and lexical association
Topics: Collocations; mutual information; hypothesis testing
Readings: Ch 5
Assignments: Assignment 1a, Assignment 1b
Other: Dunning (1993); Bland and Altman (1995)
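
To make the lexical association topic concrete, a minimal pointwise mutual information sketch over toy bigram counts (a made-up example, not part of the assignment):

    # PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ); high PMI suggests a collocation.
    import math
    from collections import Counter

    tokens = "new york is a new city in a new state york is new".split()  # toy data
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    N = len(tokens)

    def pmi(x, y):
        p_xy = bigrams[(x, y)] / (N - 1)              # bigram probability
        p_x, p_y = unigrams[x] / N, unigrams[y] / N   # unigram probabilities
        return math.log2(p_xy / (p_x * p_y))

    print(pmi("new", "york"))  # positive: "new york" co-occurs more than chance predicts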
Feb 13: Information theory, n-gram models
Topics: Information theory essentials; entropy, relative entropy, mutual information; noisy channel model; cross entropy and perplexity
Readings: Ch 2.2; Ch 6
Assignments: Assignment 2
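
A small illustration of entropy and perplexity, using a made-up unigram distribution:

    # Entropy H(p) = -sum_x p(x) log2 p(x), in bits; perplexity is 2**H,
    # often read as an "effective branching factor".
    import math

    p = {"the": 0.5, "cat": 0.25, "sat": 0.25}  # toy unigram distribution

    entropy = -sum(prob * math.log2(prob) for prob in p.values())
    perplexity = 2 ** entropy

    print(entropy)     # 1.5 bits
    print(perplexity)  # ~2.83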
Feb 20: Review of smoothing and HMMs
Topics: Maximum likelihood estimation; basics of smoothing; interpolated estimation; Katz backoff. HMM review; forward and Viterbi algorithms; deriving forward-backward as an instance of EM; the HMM as a noisy channel model.
Readings: Skim Ch 9-10
Other: Stanley Chen and Joshua Goodman, An Empirical Study of Smoothing Techniques for Language Modeling, Technical Report TR-10-98, Harvard University, August 1998; revised Chapter 4 of the updated Jurafsky and Martin textbook
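
As a refresher on the Viterbi algorithm, a minimal sketch over a toy two-state HMM; every probability below is invented for illustration:

    # Viterbi: best state sequence for an observation sequence under an HMM.
    def viterbi(obs, states, start_p, trans_p, emit_p):
        # V[t][s] = probability of the best path ending in state s at time t
        V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
        back = [{}]
        for t in range(1, len(obs)):
            V.append({})
            back.append({})
            for s in states:
                prev, prob = max(
                    ((r, V[t - 1][r] * trans_p[r][s] * emit_p[s][obs[t]]) for r in states),
                    key=lambda pair: pair[1],
                )
                V[t][s], back[t][s] = prob, prev
        # Follow back-pointers from the best final state.
        last = max(states, key=lambda s: V[-1][s])
        path = [last]
        for t in range(len(obs) - 1, 0, -1):
            path.append(back[t][path[-1]])
        return list(reversed(path))

    states = ("N", "V")
    start_p = {"N": 0.7, "V": 0.3}
    trans_p = {"N": {"N": 0.4, "V": 0.6}, "V": {"N": 0.8, "V": 0.2}}
    emit_p = {"N": {"fish": 0.6, "sleep": 0.4}, "V": {"fish": 0.3, "sleep": 0.7}}
    print(viterbi(("fish", "sleep"), states, start_p, trans_p, emit_p))  # ['N', 'V']

The forward algorithm is the same recurrence with max replaced by sum.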
Feb 27: Probabilistic grammar
Topics: Memoization and dynamic programming; review of CKY; defining PCFGs; probabilistic CKY (inside probabilities); Viterbi CKY; revisiting EM: the inside-outside algorithm; parsing as inference
Readings: Ch 11-12; Abney (1996)
Assignments: Assignment 3

Extra credit (10%): Download and install the Dyna programming language, run the example in demos/cky, and briefly explain in plain English each of the three lines in the core program cky.dyna (i.e., the two constit lines and the goal). You will probably find the CKY example and the documentation of inference rules useful.

Other: Jason Eisner's great parsing song; Pereira (2000); Detlef Prescher, A Tutorial on the Expectation-Maximization Algorithm Including Maximum-Likelihood Estimation and EM Training of Probabilistic Context-Free Grammars; McClosky, Charniak, and Johnson (2006), Effective Self-Training for Parsing
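
Before tackling the Dyna version, a minimal Viterbi-CKY sketch may help; the grammar, probabilities, and sentence below are toy inventions, not Assignment 3 materials:

    # Viterbi CKY for a toy PCFG in Chomsky normal form.
    # chart[i, j] maps a nonterminal to the best probability of deriving words i..j-1.
    from collections import defaultdict

    lexical = {("N", "fish"): 0.6, ("V", "fish"): 0.4,
               ("N", "people"): 0.4, ("V", "people"): 0.1}
    binary = {("S", "N", "V"): 1.0, ("V", "V", "N"): 0.3}

    def viterbi_cky(words):
        n = len(words)
        chart = defaultdict(dict)
        for i, w in enumerate(words):  # fill in preterminals
            for (A, word), p in lexical.items():
                if word == w:
                    chart[i, i + 1][A] = max(chart[i, i + 1].get(A, 0.0), p)
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span
                for k in range(i + 1, j):  # split point
                    for (A, B, C), p in binary.items():
                        if B in chart[i, k] and C in chart[k, j]:
                            cand = p * chart[i, k][B] * chart[k, j][C]
                            chart[i, j][A] = max(chart[i, j].get(A, 0.0), cand)
        return chart[0, n]

    print(viterbi_cky("people fish".split()))  # best probability per root label

Replacing max with sum over split points and rules turns this into the inside algorithm.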
Mar 5: Beyond CFG
Topics: History-based grammars; lexicalized and dependency-based models
Mar 12: Parser evaluation and NLP evaluation in general
Topics: Evaluation paradigms for NLP; parser evaluation in particular
Assignments: Take-home midterm handed out
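
For a concrete taste of parser evaluation, a minimal PARSEVAL-style sketch scoring labeled brackets (label, start, end); the bracket sets are toy data:

    # Labeled bracket precision, recall, and F1 against a gold-standard parse.
    gold = {("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)}
    test = {("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)}

    correct = len(gold & test)
    precision = correct / len(test)
    recall = correct / len(gold)
    f1 = 2 * precision * recall / (precision + recall)
    print(precision, recall, f1)  # 0.75 0.75 0.75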
Mar 19: Spring Break. Have fun!
Mar 26: Supervised classification
Topics: Supervised learning: k-nearest neighbor classification; naive Bayes; decision lists; decision trees; transformation-based learning (Sec 10.4); linear classifiers; the kernel trick; perceptrons; SVM basics
Readings: Ch 16
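
As one concrete example from this list, a minimal naive Bayes sketch; the documents are toy data, and the add-one smoothing is my choice for the sketch rather than a prescription:

    # Naive Bayes text classification: argmax_c P(c) * prod_w P(w | c),
    # computed in log space with add-one smoothing.
    import math
    from collections import Counter

    train = [("spam", "buy cheap pills"), ("spam", "cheap pills online"),
             ("ham", "meeting at noon"), ("ham", "lunch at noon")]

    class_counts = Counter(label for label, _ in train)
    word_counts = {c: Counter() for c in class_counts}
    for label, doc in train:
        word_counts[label].update(doc.split())
    vocab = {w for c in word_counts for w in word_counts[c]}

    def classify(doc):
        scores = {}
        for c in class_counts:
            score = math.log(class_counts[c] / len(train))  # log prior
            total = sum(word_counts[c].values())
            for w in doc.split():  # log likelihood, add-one smoothed
                score += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
            scores[c] = score
        return max(scores, key=scores.get)

    print(classify("cheap pills at noon"))  # toy prediction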
Apr 9: NO CLASS
Apr 16: Maxent; supervised approaches to word sense disambiguation
Topics: The maximum entropy principle and maxent models; feature selection. Characterizing the WSD problem; WSD as a supervised classification problem.
Readings: Ch 7; Adwait Ratnaparkhi, A Maximum Entropy Model for Part-of-Speech Tagging (EMNLP 1996); Resnik, "WSD in NLP Applications" (Ch 11 in Agirre and Edmonds (2006))
Assignments: Assignment 4
Other: Other useful readings include Adwait Ratnaparkhi, A Simple Introduction to Maximum Entropy Models for Natural Language Processing (1997); Adam Berger's maxent tutorial; and Noah Smith's notes on log-linear models.
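
A minimal log-linear sketch of the maxent idea, with invented features and weights; real models learn the weights by maximizing conditional likelihood:

    # Log-linear model: p(y | x) = exp(sum_i w_i * f_i(x, y)) / Z(x).
    import math

    def features(word, tag):  # hypothetical features for POS tagging
        return {f"word={word},tag={tag}": 1.0, f"suffix={word[-2:]},tag={tag}": 1.0}

    weights = {"word=running,tag=VBG": 1.2, "suffix=ng,tag=VBG": 0.8,
               "word=running,tag=NN": 0.3}  # made-up weights

    def p(tag, word, tags=("NN", "VBG")):
        def score(t):
            return sum(weights.get(f, 0.0) * v for f, v in features(word, t).items())
        z = sum(math.exp(score(t)) for t in tags)  # normalizer Z(x)
        return math.exp(score(tag)) / z

    print(p("VBG", "running"))  # ~0.85 of the probability mass goes to VBG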
Apr 23: Unsupervised and semi-supervised WSD
Topics: Unsupervised methods; Lesk's algorithm; semi-supervised learning and Yarowsky's algorithm; WSD in applications; WSD evaluation; IR basics
Readings: Ch 8.5; Ch 15.1, 15.2, 15.4
Assignments: Team project
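
To illustrate Lesk's algorithm, a minimal simplified-Lesk sketch; the glosses and the sentence are toy examples:

    # Simplified Lesk: pick the sense whose dictionary gloss overlaps most
    # with the words in the surrounding context.
    glosses = {
        "bank/finance": set("institution that accepts deposits and makes loans".split()),
        "bank/river": set("sloping land beside a body of water".split()),
    }

    def lesk(context_words):
        context = set(context_words)
        # Score each sense by gloss/context overlap; ties break arbitrarily.
        return max(glosses, key=lambda sense: len(glosses[sense] & context))

    print(lesk("he sat on the land beside the water".split()))  # bank/river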
Apr 30: Information retrieval; guest lecture (Smaranda Muresan) on graph-based methods in NLP
Topics: PageRank and variants; HITS; min-cuts
Readings: (a) Rada Mihalcea and Paul Tarau, TextRank: Bringing Order into Texts, in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2004), Barcelona, Spain, July 2004; (b) Rada Mihalcea, Graph-based Ranking Algorithms for Sentence Extraction, Applied to Text Summarization, in Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, companion volume (ACL 2004), Barcelona, Spain, July 2004; (c) paper and data of Pang and Lee on sentiment analysis with min-cuts
Other: IR lecture slides; graph-methods lecture slides. Optional readings of interest: (a) Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schutze, Introduction to Information Retrieval, Cambridge University Press, Chapter 21, "Link Analysis"; (b) Page et al., The PageRank Citation Ranking: Bringing Order to the Web; (c) Jon Kleinberg, Authoritative Sources in a Hyperlinked Environment, in Proceedings of SODA 1998; (d) Kurt Bryan and Tanya Leise, The $25,000,000,000 Eigenvector: The Linear Algebra Behind Google, SIAM Review 48(3), 2006, pp. 569-581
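
A minimal PageRank sketch by power iteration, on a made-up three-page web graph (damping factor 0.85, as in the Page et al. paper):

    # PageRank: iterate pr(n) = (1 - d)/N + d * sum over in-links of pr(m)/outdegree(m).
    def pagerank(links, d=0.85, iters=50):
        nodes = list(links)
        pr = {n: 1.0 / len(nodes) for n in nodes}
        for _ in range(iters):
            new = {}
            for n in nodes:
                # Rank flowing in from every page that links to n.
                incoming = sum(pr[m] / len(links[m]) for m in nodes if n in links[m])
                new[n] = (1 - d) / len(nodes) + d * incoming
            pr = new
        return pr

    links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}  # toy web graph
    print(pagerank(links))  # "c" ends up with the most rank here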
May 7: Machine translation
Topics: Historical view of MT approaches; noisy channel for SMT; IBM Models 1 and 4; HMM distortion model; going beyond word-level models
Readings: Ch 13; Adam Lopez, A Survey of Statistical Machine Translation, Technical Report LAMP-TR-135/CS-TR-4831/UMIACS-TR-2006-47, University of Maryland, College Park, April 2007
Other: Also potentially useful or of interest: Kevin Knight, A Statistical MT Tutorial Workbook; Mihalcea and Pedersen (2003); Philip Resnik, Exploiting Hidden Meanings: Using Bilingual Text for Monolingual Annotation, in Alexander Gelbukh (ed.), Lecture Notes in Computer Science 2945: Computational Linguistics and Intelligent Text Processing, Springer, 2004, pp. 283-299
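
To preview IBM Model 1, a minimal EM sketch for word-translation probabilities t(f | e) on a two-sentence toy parallel corpus, starting from a uniform table:

    # EM for IBM Model 1: E-step collects expected alignment counts,
    # M-step renormalizes them into new translation probabilities.
    from collections import defaultdict

    corpus = [("the house".split(), "la maison".split()),
              ("the book".split(), "le livre".split())]

    e_vocab = {e for es, _ in corpus for e in es}
    f_vocab = {f for _, fs in corpus for f in fs}
    t = {(f, e): 1.0 / len(f_vocab) for f in f_vocab for e in e_vocab}

    for _ in range(10):  # EM iterations
        count = defaultdict(float)  # expected counts c(f, e)
        total = defaultdict(float)  # expected counts summed over f
        for es, fs in corpus:
            for f in fs:
                z = sum(t[f, e] for e in es)  # normalize over possible alignments
                for e in es:
                    count[f, e] += t[f, e] / z
                    total[e] += t[f, e] / z
        t = {(f, e): count[f, e] / total[e] for f in f_vocab for e in e_vocab}

    print(t["maison", "house"])  # rises well above the uniform 0.25

The shared word "the" lets EM disambiguate: mass shifts toward t(maison | house) and t(la | the).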
May 7: Phrase-based statistical MT
Topics: Components of a phrase-based system: language modeling, translation modeling; sentence alignment, word alignment, phrase extraction, parameter tuning, decoding, rescoring, evaluation
Readings: Papineni, Roukos, Ward, and Zhu (2001), BLEU: A Method for Automatic Evaluation of Machine Translation
Other: Koehn, PHARAOH: A Beam Search Decoder for Phrase-Based Statistical Machine Translation; Koehn (2004) presentation on the PHARAOH decoder
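
A minimal sketch of the modified n-gram precision at the heart of BLEU, shown for unigrams on a single toy sentence pair (real BLEU combines n = 1 through 4 with a brevity penalty over a whole test set):

    # Clip each candidate n-gram count by its count in the reference, so that
    # repeating a word ("the the the ...") cannot inflate precision.
    from collections import Counter

    candidate = "the the the cat".split()
    reference = "the cat sat on the mat".split()

    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)

    clipped = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    print(clipped / len(candidate))  # 3/4: "the" clips to 2, "cat" matches once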

*Readings are from Manning and Schuetze unless otherwise specified. Do the reading before the class where it is listed!



This page last updated 30 Jan 2008.


Many thanks to David Chiang, Bonnie Dorr, Christof Monz, and Amy Weinberg for discussions about the syllabus. Responsibility for the outcome is, of course, completely indeterminate. :-)