Bottom-Up Parsing
Reductions
sentence: | w | * | x | + | y | * | z | <end> |
---|---|---|---|---|---|---|---|---|
reductions | ||||||||
w | * | x | + | y | * | z | <end> | |
1 | Factor | |||||||
2 | Term | |||||||
3 | Factor | |||||||
4 | Term | |||||||
5 | Exp | |||||||
6 | Factor | |||||||
7 | Term | |||||||
8 | Factor | |||||||
9 | Term | |||||||
10 | Exp |
- The reductions are equivalent to performing the following right-most (rm) derivations
-
- <Exp>
rm=> <Exp> + <Term> rm=> <Exp> + <Term>*<Factor> rm=> <Exp> + <Term>*z rm=> <Exp> + <Factor>*z rm=> <Exp> + y*z rm=> <Term> + y*z rm=> <Term>*<Factor> + y*z rm=> <Term>*x + y*z rm=> <Factor>*x + y*z rm=> w*x+y*z - <Exp>
- but performing them in reverse.
Parse tree
-
+
*
*
w x y z
Table-driven, bottom-up parsing
- Grammar
(E ~ <Exp>, T ~ <Term>, F ~ <Factor>, id ~ x | y | ... , for brevity) - E -> E + T
- E -> E - T
- E -> T
- T -> T * F
- T -> T / F
- T -> F
- F -> id
- F -> ( E )
- F -> - F
- E -> E - T
Parsing, w*x+y*z, i.e., id*id+id*id, general bottom-up shift-reduce scheme | ||
---|---|---|
>stack> | <input< | action |
w * x + y * z $ | shift | |
w | * x + y * z $ | reduce |
F | * x + y * z $ | reduce |
T | * x + y * z $ | shift |
T * | x + y * z $ | shift |
T * x | + y * z $ | reduce |
T * F | + y * z $ | reduce |
T | + y * z $ | reduce |
E | + y * z $ | shift |
E + | y * z $ | shift |
E + y | * z $ | reduce |
E + F | * z $ | reduce |
E + T | * z $ | shift |
E + T * | z $ | shift |
E + T * z | $ | reduce |
E + T * F | $ | reduce |
E + T | $ | reduce |
E | $ | accept |
handles for reduction are underlined |
SLR Parsing
- To produce an SLR (simple LR) parsing table, first augment the grammar with new start symbol E' and production...
- E' -> E
- then use this algorithm
- C := { closure({E' -> . E}) };
- for each set, I, in C do
- for each symbol, X, s.t. goto(I, X) is not empty and is not in C do
- C := C + goto(I, X)
- end_for
- end_for
- for each set, I, in C do
- where
- function goto(I, X)
- begin --X is a terminal or a non-terminal symbol
- I' := { };
- for each member of I of the form A -> α . X β do
- I' := I' + { A -> α X . β } --jump the X
- end_for;
- return closure(I')
- end --goto
- begin --X is a terminal or a non-terminal symbol
- and
- function closure(I)
- begin
- I' := I;
- for each member of I' of the form C -> γ . D φ do
- for each production D -> ψ do
- I' := I' + {D -> . ψ}
- end_for
- end_for
- end --closure
- begin
- This produces the canonical collection of LR(0) items, e.g.,
- I0 = { E' -> . E,
- closure: E -> . E + T,
E -> . E - T, E -> . T, T -> . T * F, T -> . T / F, T -> . F, F -> . id, F -> . ( E ), F -> . - F } - goto(I0, E) =
- I1 = { E' -> E . ,
E -> E . + T, E -> E . - T } - goto(I0, T) =
- I2 = { E -> T . ,
T -> T . * F, T -> T . / F } - goto(I0, F) =
- I3 = { T -> F . }
- goto(I0, id) =
- I4 = { F -> id . }
- goto(I0, ( ) =
- I5 = { F -> ( . E ),
- closure: E -> . E + T,
E -> . E - T, E -> . T, T -> . T * F, T -> . T / F, T -> . F, F -> . id, F -> . ( E ), F -> . - F } - goto(I0, - ) =
- I6 = { F -> - . F,
- closure: F -> . id,
F -> . ( E ) F -> . - F } - goto(I1, +) =
- I7 = { E -> E + . T,
- closure: T -> . T * F,
T -> . T / F, T -> . F, F -> . id, F -> . ( E ), F -> . - F } - goto(I1, - ) =
- I8 = { E -> E - . T,
- closure: T -> . T * F,
T -> . T / F, T -> . F, F -> . id, F -> . ( E ), F -> . - F } - goto(I2, *) =
- I9 = { T -> T * . F,
- closure: F -> . id,
F -> . ( E ), F -> . - F } - goto (I2, /) =
- I10 = { T -> T / . F,
- closure: F -> .id,
F -> . ( E ), F -> . - F } - goto(I5, E) =
- I11 = { F -> ( E . ),
E -> E . + T, E -> E . - T } - goto(I5, T) = I2
- goto(I5, F) = I3
- goto(I5, id) = I4
- goto(I5, ( ) = I5 (!)
- goto(I6, F) =
- goto(I5, F) = I3
- I12 = { F -> - F . }
- goto(I6, id) = I4
- goto(I6, - ) = I6
- goto(I6, ( ) = I5
- goto(I7, T) =
- goto(I6, - ) = I6
- I13 = { E -> E + T . ,
T -> T . * F, T -> T . / F } - goto(I7, F) = I3
- goto(I7, id) = I4
- goto(I7, ( ) = I5
- goto(I7, -) = I6
- goto(I8, T) =
- goto(I7, id) = I4
- I14 = { E -> E - T . ,
T -> T . * F, T -> T . / F } - goto(I8, F) = I3
- goto(I8, id) = I4
- goto(I8, ( ) = I5
- goto(I8, -) = I6
- goto(I9, F) =
- goto(I8, id) = I4
- I15 = { T -> T * F . }
- goto(I9, id) = I4
- goto(I9, ( ) = I5
- goto(I9, -) = I6
- goto(I10, F) =
- goto(I9, ( ) = I5
- I16 = { T -> T / F . }
- goto(I10, id) = I4
- goto(I10, ( ) = I5
- goto(I10, -) = I6
- goto(I11, +) = I7
- goto(I11, -) = I8
- goto(I11, ) ) =
- goto(I10, ( ) = I5
- I17 = { F -> ( E ) . }
-
- goto(I13, *) = I9
- goto(I13, /) = I10
- goto(I14, *) = I9
- goto(I14, /) = I10
- goto(I13, *) = I9
- To fill in the entries of the SLR parsing table where statei corresponds to Ii
-
- initialise all action[,] and goto[,] entries to blank (indicating parse error);
- for each state, i, do
- for each terminal symbol, a, do
- if A -> α . a β is in Ii, and goto(Ii,a)=Ij then
- action[i, a] := shift j --[*]
- end_if
- end_for;
- if S' -> S . is in Ii then
- action[i, $] := accept --[*]
else- end_if;
- for each non-terminal, A, other than S' do
- if A -> α . is in Ii then
- for each a in FOLLOW(A) do
- action[i, a] := reduce by A -> α --[*]
- end_for
- end_if
- end_for;
end_if;- for each non-terminal, A, do
- if goto(Ii, A) = Ij then
- goto[i, A] := j
- end_if
- end_for
- end_for --state i
- initialise all action[,] and goto[,] entries to blank (indicating parse error);
- [*] If any conflicting actions are produced here then fail,
the grammar is not SLR(1).
- Blank entries indicate parsing errors.
SLR(1) table | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
state | Action | Goto | |||||||||
id | + | - | * | / | ( | ) | $ | E | T | F | |
0 | s4 | s6 | s5 | 1 | 2 | 3 | |||||
1 | s7 | s8 | accept | ||||||||
2 | r(E->T) | r(E->T) | s9 | s10 | r(E->T) | r(E->T) | |||||
3 | r(T->F) | r(T->F) | r(T->F) | r(T->F) | r(T->F) | r(T->F) | |||||
4 | r(F->id) | r(F->id) | r(F->id) | r(F->id) | r(F->id) | r(F->id) | |||||
5 | s4 | s5 | 11 | 2 | 3 | ||||||
6 | s4 | s6 | s5 | 12 | |||||||
7 | s4 | s6 | s5 | 13 | 3 | ||||||
8 | s4 | s6 | s5 | 14 | 3 | ||||||
9 | s4 | s6 | s5 | 15 | |||||||
10 | s4 | s6 | s5 | 16 | |||||||
11 | s7 | s8 | s17 | ||||||||
12 | r(F->-F) | r(F->-F) | r(F->-F) | r(F->-F) | r(F->-F) | r(F->-F) | |||||
13 | r(E->E+T) | r(E->E+T) | s9 | s10 | r(E->E+T) | r(E->E+T) | |||||
14 | r(E->E-T) | r(E->E-T) | s9 | s10 | r(E->E-T) | r(E->E-T) | |||||
15 | r(T->T*F) | r(T->T*F) | r(T->T*F) | r(T->T*F) | r(T->T*F) | r(T->T*F) | |||||
16 | r(T->T/F) | r(T->T/F) | r(T->T/F) | r(T->T/F) | r(T->T/F) | r(T->T/F) | |||||
17 | r(F->(E)) | r(F->(E)) | r(F->(E)) | r(F->(E)) | r(F->(E)) | r(F->(E)) |
- To parse using an SLR(1) parsing table
-
- initialise stack;
- push start_state;
- do
- s := top of stack; --a state
- a := next input symbol;
- case action[s, a] of
- shift s' =>
- push a; push s';
- advance to next input symbol;
- | reduce by A -> β =>
- pop 2 × |β| items from stack;
- output A -> β;
- s' := top of stack;
- push A; push(goto[s', A]);
- | accept => ... success ... ;
- | _ => ... error ...
- end_case
- end_do
- initialise stack;
Parsing w*x+y*z, i.e., id*id+id*id, using the SLR(1) table | ||
---|---|---|
>stack> | <input< | action |
0 | w * x + y * z $ | s4 |
0 w 4 | * x + y * z $ | r |
0 F 3 (=goto(0,F)) | * x + y * z $ | r |
0 T 2 | * x + y * z $ | s9 |
0 T 2 * 9 | x + y * z $ | s4 |
0 T 2 * 9 x 4 | + y * z $ | r |
0 T 2 * 9 F 15 | + y * z $ | r |
0 T 2 | + y * z $ | r |
0 E 1 | + y * z $ | s7 |
0 E 1 + 7 | y * z $ | s4 |
0 E 1 + 7 y 4 | * z $ | r |
0 E 1 + 7 F 3 | * z $ | r |
0 E 1 + 7 T 13 | * z $ | s9 |
0 E 1 + 7 T 13 * 9 | z $ | s4 |
0 E 1 + 7 T 13 * 9 z 4 | $ | r |
0 E 1 + 7 T 13 * 9 F 15 | $ | r |
0 E 1 + 7 T 13 | $ | r |
0 E 1 | $ | accept |
References
- The dragon book,
- R. Bornat, Understanding and Writing Compilers, Macmillan 1979.
— 2002 L.A.