Compiler Design

Finite Automata & DFA Construction / Subset Construction


Since computers cannot simulate non-deterministic path choices efficiently, every NFA must be converted into an equivalent DFA. The **Subset Construction Algorithm** does this by grouping NFA states into subsets that the machine could be in simultaneously.

Epsilon-Closure (ε-closure)

The ε-closure of a state is the set of all NFA states reachable from that state by traversing ONLY epsilon (ε) transitions (including the state itself).

To find the ε-closure of a set of NFA states T:

Push all states in T onto stack;
Initialize ε-closure(T) = T;

while (stack is not empty) {
    Pop u from stack;
    for (each state v with an ε-transition from u) {
        if (v is not in ε-closure(T)) {
            Add v to ε-closure(T);
            Push v onto stack;
        }
    }
}

Subset Construction Algorithm

Let N be an NFA. The subset construction constructs a DFA D with states Dstates and transition table Dtran:

1. DstartState = ε-closure(N.startState)
2. Add DstartState as unmarked to Dstates
3. while (there is an unmarked state U in Dstates) {
       Mark U;
       for (each input symbol 'a') {
           T = states reached from elements in U on input 'a'
           U_new = ε-closure(T)
           if (U_new is not empty) {
               if (U_new is not in Dstates) {
                   Add U_new as unmarked to Dstates
               }
               Dtran[U, 'a'] = U_new
           }
       }
   }
4. A state in Dstates is an ACCEPT state of the DFA 
   if it contains at least one accept state of the NFA.