Algorithms for NLP
The Earley Parsing Algorithm

Reading: Jay Earley, "An Efficient Context-Free Parsing Algorithm", Communications of the ACM, vol. 13(2), pp. 94-102, 1970
The Earley Parsing Algorithm

General Principles:
- A clever hybrid Bottom-Up and Top-Down approach
- Bottom-Up parsing completely guided by Top-Down predictions
- Maintains sets of dotted grammar rules that:
  - Reflect what the parser has seen so far
  - Explicitly predict the rules and constituents that will combine into a complete parse
- Similar to Chart Parsing: partial analyses can be shared
- Time complexity O(n^3) in the general case, but better on particular sub-classes
- Developed prior to Chart Parsing; the first efficient parsing algorithm for general context-free grammars
The Earley Parsing Method

Main Data Structure: The state (or item)
- A state is a dotted rule and a starting position: [A → α • β, j]
- The algorithm maintains sets of states, one set for each position in the input string (starting from 0)
- We denote the set for position i by S_i
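As a concrete illustration, a dotted state can be held in a small record. This is a sketch, not from the slides; the class and field names (`State`, `dot`, `start`) are illustrative choices.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    """A dotted rule [A -> alpha . beta, j]; names are illustrative."""
    lhs: str      # A, the rule's left-hand side
    rhs: tuple    # the full right-hand side of the rule
    dot: int      # position of the dot within rhs
    start: int    # j, the input position where this rule's match began

    def next_symbol(self):
        """The symbol right after the dot, or None if the rule is complete."""
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def is_complete(self):
        return self.dot == len(self.rhs)

# Example: [NP -> art . adj n, 0], i.e. an NP after scanning one article
s = State("NP", ("art", "adj", "n"), 1, 0)
print(s.next_symbol())   # "adj"
print(s.is_complete())   # False
```

Making the state frozen (immutable and hashable) lets state sets be ordinary Python sets, which matches the duplicate-free S_i sets the algorithm maintains.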
The Earley Parsing Algorithm

Three Main Operations:
- Predictor: If state [A → α • B β, j] is in S_i, then for every rule of the form B → γ, add to S_i the state [B → • γ, i]
- Completer: If state [B → γ •, j] is in S_i, then for every state of the form [A → α • B β, k] in S_j, add to S_i the state [A → α B • β, k]
- Scanner: If state [A → α • a β, j] is in S_i and the next input word is x_{i+1} = a, then add to S_{i+1} the state [A → α a • β, j]
The Earley Recognition Algorithm

- Simplified version with no lookaheads and for grammars without epsilon-rules
- Assumes the input is a string of grammar terminal symbols x_1 x_2 … x_n
- We extend the grammar with a new rule S' → S $
- The algorithm sequentially constructs the sets S_i for i = 0, 1, …, n+1
- We initialize the set S_0 = { [S' → • S $, 0] }
The Earley Recognition Algorithm

The Main Algorithm: parsing input x_1 x_2 … x_n
1. S_0 = { [S' → • S $, 0] }
2. For i = 0, 1, …, n+1 do:
   Process each item in S_i in order by applying to it the single applicable operation among:
   (a) Predictor (adds new items to S_i)
   (b) Completer (adds new items to S_i)
   (c) Scanner (adds new items to S_{i+1})
3. If S_{i+1} is empty, Reject the input
4. If i = n+1 and [S' → S $ •, 0] ∈ S_{n+1}, Accept the input
Earley Recognition - Example

The Grammar:
(1) S → NP VP
(2) NP → art adj n
(3) NP → art n
(4) NP → adj n
(5) VP → aux VP
(6) VP → v NP

The original input: The large can can hold the water
POS-assigned input: art adj n aux v art n
Parser input: art adj n aux v art n $
Earley Recognition - Example

S_0: [S' → • S $, 0]
     [S → • NP VP, 0]
     [NP → • art adj n, 0]
     [NP → • art n, 0]
     [NP → • adj n, 0]

S_1: [NP → art • adj n, 0]
     [NP → art • n, 0]

S_2: [NP → art adj • n, 0]

S_3: [NP → art adj n •, 0]
     [S → NP • VP, 0]
     [VP → • aux VP, 3]
     [VP → • v NP, 3]

S_4: [VP → aux • VP, 3]
     [VP → • aux VP, 4]
     [VP → • v NP, 4]

S_5: [VP → v • NP, 4]
     [NP → • art adj n, 5]
     [NP → • art n, 5]
     [NP → • adj n, 5]

S_6: [NP → art • adj n, 5]
     [NP → art • n, 5]

S_7: [NP → art n •, 5]
     [VP → v NP •, 4]
     [VP → aux VP •, 3]
     [S → NP VP •, 0]
     [S' → S • $, 0]

S_8: [S' → S $ •, 0] → Accept
Time Complexity of Earley Algorithm

- The algorithm iterates for each word of the input (i.e. n+1 iterations)
- How many items can be created and processed in S_i? Each item in S_i has the form [A → α • β, j], with 0 ≤ j ≤ i, thus S_i contains O(i) items
- The Scanner and Predictor operations on an item each require constant time
- The Completer operation on an item [B → γ •, j] adds items of the form [A → α B • β, k] to S_i, with k ≤ j ≤ i, so it may require up to O(i) time for each processed item
- Time required for each iteration (constructing S_i) is thus O(i^2)
- Time bound on the entire algorithm is therefore O(n^3)
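Written out, the counting argument above combines per-set item counts and per-item work, then sums over the iterations:

```latex
\underbrace{O(i)}_{\text{items in } S_i} \;\times\; \underbrace{O(i)}_{\text{Completer work per item}}
  \;=\; O(i^2) \text{ per iteration},
\qquad
\sum_{i=0}^{n+1} O(i^2) \;=\; O(n^3).
```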
Time Complexity of Earley Algorithm

Special Cases:
- The Completer is the operation that may require O(i^2) time in iteration i
- For unambiguous grammars, Earley shows that the Completer operation will require at most O(i) time
- Thus the time complexity for unambiguous grammars is O(n^2)
- For some grammars, the number of items in each S_i is bounded by a constant
- These are called bounded-state grammars and include even some ambiguous grammars
- For bounded-state grammars, the time complexity of the algorithm is linear, O(n)
Parsing with an Earley Parser

- As usual, we need to keep back-pointers to the constituents that we combine together when we complete a rule
- Each item must be extended to have the form [A → α • β, j, (p_1 … p_k)], where p_1 … p_k are pointers to the already-found RHS sub-constituents
- At the end, reconstruct the parse from the back-pointers
- To maintain efficiency, we must do ambiguity packing
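A sketch of how back-pointers extend the recognizer items: each item carries a tuple of the sub-constituents already matched, and the Completer and Scanner extend that tuple as the dot advances. The tree encoding (nested `(lhs, kids)` pairs), the function name, and dropping the S' → S $ wrapper are illustrative simplifications; ambiguity packing is omitted, so this sketch can blow up on highly ambiguous grammars.

```python
def earley_parse(grammar, start, words):
    """Earley parser sketch with back-pointers; one parse tree is rebuilt
    at the end.  Names are illustrative; no ambiguity packing."""
    words = list(words)
    n = len(words)
    # An item is (lhs, rhs, dot, origin, kids): kids holds, for each RHS
    # symbol behind the dot, a terminal word or a nested (lhs, kids) pair.
    sets = [[] for _ in range(n + 1)]
    for prod in grammar[start]:
        sets[0].append((start, prod, 0, 0, ()))

    def add(i, item):
        if item not in sets[i]:
            sets[i].append(item)

    for i in range(n + 1):
        for lhs, rhs, dot, origin, kids in sets[i]:
            nxt = rhs[dot] if dot < len(rhs) else None
            if nxt in grammar:                  # Predictor
                for prod in grammar[nxt]:
                    add(i, (nxt, prod, 0, i, ()))
            elif nxt is None:                   # Completer: attach back-pointer
                for l2, r2, d2, o2, k2 in sets[origin]:
                    if d2 < len(r2) and r2[d2] == lhs:
                        add(i, (l2, r2, d2 + 1, o2, k2 + ((lhs, kids),)))
            elif i < n and words[i] == nxt:     # Scanner: record the word itself
                add(i + 1, (lhs, rhs, dot + 1, origin, kids + (nxt,)))

    for lhs, rhs, dot, origin, kids in sets[n]: # first complete parse of the input
        if lhs == start and dot == len(rhs) and origin == 0:
            return (start, kids)
    return None
```

On the slides' example grammar and input, `earley_parse` returns the nested tree `("S", (NP-over-"art adj n", VP-over-"aux v art n"))`, which is exactly what reconstructing the parse from the back-pointers yields.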
Earley Parsing - Example

Items are numbered; the parenthesized list at the end of each item gives its back-pointers (item numbers of completed sub-constituents).

S_0:  1. [S' → • S $, 0] ()
      2. [S → • NP VP, 0] ()
      3. [NP → • art adj n, 0] ()
      4. [NP → • art n, 0] ()
      5. [NP → • adj n, 0] ()

S_1:  6. [NP → art • adj n, 0] ()
      7. [NP → art • n, 0] ()

S_2:  8. [NP → art adj • n, 0] ()

S_3:  9. [NP → art adj n •, 0] ()
     10. [S → NP • VP, 0] (9)
     11. [VP → • aux VP, 3] ()
     12. [VP → • v NP, 3] ()

S_4: 13. [VP → aux • VP, 3] ()
     14. [VP → • aux VP, 4] ()
     15. [VP → • v NP, 4] ()

S_5: 16. [VP → v • NP, 4] ()
     17. [NP → • art adj n, 5] ()
     18. [NP → • art n, 5] ()
     19. [NP → • adj n, 5] ()

S_6: 20. [NP → art • adj n, 5] ()
     21. [NP → art • n, 5] ()

S_7: 22. [NP → art n •, 5] ()
     23. [VP → v NP •, 4] (22)
     24. [VP → aux VP •, 3] (23)
     25. [S → NP VP •, 0] (9, 24)
     26. [S' → S • $, 0] (25)

S_8: 27. [S' → S $ •, 0] (25) → Accept