Implementing a Left-Corner Chart Parser


This is a double assignment, i.e. worth the same credit as two assignments. It is due Friday, December 1, 2006 at 5pm.

In this assignment, you will implement a left-corner chart parser. You'll do this by taking what you learned about chart parsing from Jurafsky and Martin (esp. their coverage of the Earley algorithm in Chapter 12), and integrating that with the parsing schema for left-corner chart parsers described by Robert Moore, "Improved Left-Corner Chart Parsing for Large Context-Free Grammars" [local link] (in Bunt, Carroll, and Satta, eds., New Developments in Parsing Technology, Kluwer, 2004, pp. 185-201; interestingly, it looks like the innovations in this paper are patented).

Steps in this Assignment

  1. Required. Read the Moore paper up through the discussion of steps 4a and 5a on the 5th page. (Read the rest if you're interested, but here we're focused on the basic version rather than the improvements that are the main topic of the paper.)

  2. Required. Answer: Which steps in the Earley algorithm do Moore's steps 1-3 correspond to?

  3. Optional. Implement the Earley algorithm, and test it using the same grammar and input that Jurafsky and Martin use in their "A Complete Example" (Figure 12.14). That way you can make sure you're building the same chart they are in order to verify the correctness of your implementation. (This optional step can form the basis for significant partial credit. Turn in your code electronically, including a README describing how to run it, and in the hardcopy you turn in, include a printout of the chart contents corresponding to Figure 12.14.)

  4. Required. Implement a recognizer that operates according to the parsing scheme specified by Moore in steps {1,2,3,4a,5a}. Turn in the code of the parser (submitted electronically), including a README describing how to run it, as well as a printout showing the contents of the chart for the sentence "book that flight" (following the format of Jurafsky and Martin's Figure 12.14).

  5. Required. In your hardcopy, provide a brief, clear description of the difference (or differences) between Moore's left-corner parsing scheme and Earley's algorithm. Illustrate using one or more differences between chart states (edges) introduced by Earley's parser and those introduced by the left-corner chart parser, given the same input -- i.e. an edge or edges introduced by one parser but not the other.

Thoughts and Guidance

Extra Credit (10%)

Add the bookkeeping needed to read off all the parses from the chart at the end. In your hardcopy, include a brief, clear description of the necessary modifications. Run your parser on a globally ambiguous sentence, and include a printout of the parse trees in your hardcopy -- automatic plain text output, not something fancy created using a tree drawing program, though "prettyprinting" indentation would be nice. Here's an example of a prettyprinted parse tree (from Marcus et al., 1993):
( (S
   (NP Battle-tested industrial managers
       here)
    always
   (VP buck
       up
       (NP nervous newcomers)
       (PP with
           (NP the tale
               (PP of
                   (NP (NP the
                           (ADJP first
                                 (PP of
                                     (NP their countrymen)))
                           (S (NP *)
                              to
                              (VP visit
                                  (NP Mexico))))
                       ,
                       (NP (NP a boatload
                               (PP of
                                   (NP (NP warriors)
                                       (VP-1 blown
                                           ashore
                                           (ADVP (NP 375 years)
                                                 ago)))))
                           (VP-1 *pseudo-attach*))))))))
  .)
Please truncate the parse tree output at a maximum of a few printed pages to save paper -- I don't need to see every single tree, just a few!