EM and the Inside-Outside Algorithm


In class I went through a derivation of the Inside-Outside algorithm that differs from the Manning and Schuetze presentation in two ways. First, I began by considering what the maximum likelihood estimate of rule probabilities would look like if derivations were observable; i.e. I began with the M-step as a normalized count, and then broke the numerator down progressively as an expectation in terms of currently-guessed rule probabilities. Second, I used my own notation, because the Manning and Schuetze notation is a bit cumbersome.

The assignment is to take the logical presentation that I made in class, and write up lecture notes converting the notation from mine into Manning and Schuetze's. If you can recapitulate my discussion in their notation, then we can be pretty confident you understand how and why the algorithm works.

You do not need to go beyond what I covered in class; e.g. I did not cover the probabilities of the unary preterminal-to-terminal rules and you don't need to either.