Compiler Design

Top-Down Parsing & LL(1)


Top-Down Parsing Concept

Top-Down Parsing is a parsing strategy that attempts to find a leftmost derivation for an input string by building the parse tree from the root (start symbol) down to the leaves (input terminals).

Top-down parsers are categorized into:

  • Recursive Descent Parser: A parser that uses recursive procedures to parse the input, which may require backtracking if multiple choices match the current lookahead.
  • Predictive LL(1) Parser: A non-recursive, table-driven parser that reads input from Left to right, constructs a Leftmost derivation, and uses exactly 1 lookahead symbol to decide which production to apply without backtracking.

Computing FIRST and FOLLOW Sets

To construct an LL(1) predictive parsing table, the parser must compute two collections of terminal symbols: FIRST and FOLLOW.

Rules to compute FIRST for any grammar symbol X:

  1. If X is a terminal, then FIRST(X) = { X }.
  2. If there is a production X → ε, then add ε to FIRST(X).
  3. If X is a non-terminal and X → Y_1 Y_2 ... Y_k is a production:
    • Add all terminals of FIRST(Y_1) to FIRST(X).
    • If Y_1 can derive ε, add all terminals of FIRST(Y_2) to FIRST(X), and so on.
    • If all Y_1 ... Y_k can derive ε, then add ε to FIRST(X).

Rules to compute FOLLOW for all non-terminals A:

  1. If S is the start symbol, add $ (end of input marker) to FOLLOW(S).
  2. If there is a production A → α B β, then add everything in FIRST(β) (except ε) to FOLLOW(B).
  3. If there is a production A → α B, or a production A → α B β where ε ∈ FIRST(β), then add everything in FOLLOW(A) to FOLLOW(B).

LL(1) Grammars and Table Construction

A grammar is LL(1) if and only if, for any two distinct productions A → α and A → β:

  • FIRST(α) and FIRST(β) are disjoint sets.
  • If ε ∈ FIRST(α), then FIRST(β) and FOLLOW(A) are disjoint.

These rules ensure that the parser table has no multi-entry cells (conflicts). In the next section, we will walk through a complete, step-by-step solved problem constructing the LL(1) table.