Compiler Design
Top-Down Parsing & LL(1)
Top-Down Parsing Concept
Top-Down Parsing is a parsing strategy that attempts to find a leftmost derivation for an input string by building the parse tree from the root (start symbol) down to the leaves (input terminals).
Top-down parsers are categorized into:
- Recursive Descent Parser: A parser that uses recursive procedures to parse the input, which may require backtracking if multiple choices match the current lookahead.
- Predictive LL(1) Parser: A non-recursive, table-driven parser that reads input from Left to right, constructs a Leftmost derivation, and uses exactly 1 lookahead symbol to decide which production to apply without backtracking.
Computing FIRST and FOLLOW Sets
To construct an LL(1) predictive parsing table, the parser must compute two collections of terminal symbols: FIRST and FOLLOW.
Rules to compute FIRST for any grammar symbol X:
- If
Xis a terminal, thenFIRST(X) = { X }. - If there is a production
X → ε, then addεtoFIRST(X). - If
Xis a non-terminal andX → Y_1 Y_2 ... Y_kis a production:- Add all terminals of
FIRST(Y_1)toFIRST(X). - If
Y_1can deriveε, add all terminals ofFIRST(Y_2)toFIRST(X), and so on. - If all
Y_1 ... Y_kcan deriveε, then addεtoFIRST(X).
- Add all terminals of
Rules to compute FOLLOW for all non-terminals A:
- If
Sis the start symbol, add$(end of input marker) toFOLLOW(S). - If there is a production
A → α B β, then add everything inFIRST(β)(exceptε) toFOLLOW(B). - If there is a production
A → α B, or a productionA → α B βwhereε ∈ FIRST(β), then add everything inFOLLOW(A)toFOLLOW(B).
LL(1) Grammars and Table Construction
A grammar is LL(1) if and only if, for any two distinct productions A → α and A → β:
FIRST(α)andFIRST(β)are disjoint sets.- If
ε ∈ FIRST(α), thenFIRST(β)andFOLLOW(A)are disjoint.
These rules ensure that the parser table has no multi-entry cells (conflicts). In the next section, we will walk through a complete, step-by-step solved problem constructing the LL(1) table.