EM and the Inside-Outside Algorithm
In class I went through a derivation of the Inside-Outside algorithm
that differs from the Manning and Schuetze presentation in two ways.
First, I began by considering what the maximum likelihood estimate of
rule probabilities would look like if derivations were observable;
i.e. I began with the M-step as a normalized count, and then broke the
numerator down progressively as an expectation in terms of
currently-guessed rule probabilities. Second, I used my own notation,
because the Manning and Schuetze notation is a bit cumbersome.
The assignment is to take the logical presentation that I made in
class, and write up lecture notes converting the notation from mine
into Manning and Schuetze's. If you can recapitulate my discussion in
their notation, then we can be pretty confident you understand how and
why the algorithm works.
You do not need to go beyond what I covered in class; e.g. I did not
cover the probabilities of the unary preterminal-to-terminal rules and
you don't need to either.