Compiler Design

Intermediate Code Generation


Intermediate Code Generation

Intermediate code is a representation between the high-level source language and low-level target machine code. The compiler translates source code into this form before generating final machine code.

  • Machine Independence: Enforces a clean separation between the front-end (source-dependent) and back-end (machine-dependent) of the compiler.
  • Code Optimization: Facilitates machine-independent code optimization before final code generation.
  • Retargeting: Makes it easier to retarget the compiler to new machines by just changing the back-end.
  • Simplicity: Intermediate forms are simpler to optimize than complex high-level source code or rigid machine code.

Forms of Intermediate Code

In Three-Address Code, each instruction has at most ONE operator on the right-hand side, and it references at most three addresses (two operands and one result).

Common Representations:

  • Quadruples: A table with 4 fields: Operator, Operand 1, Operand 2, Result.
  • Triples: A table with 3 fields: Operator, Operand 1, Operand 2. Results are implicitly referred to by the line number where they are computed.
  • Indirect Triples: Uses a separate table of pointers to triples, making optimizations (like moving instructions) easier.

In postfix notation (or Reverse Polish Notation), operators follow their operands. The main advantage is that no parentheses are needed to dictate the order of operations, as the order is unambiguous.

Example:

Infix:   a + b * c
Postfix: a b c * +

Abstract Syntax Tree (AST): A tree representation where internal nodes are operators and leaf nodes are operands.

Directed Acyclic Graph (DAG): A more compact version of an AST where common subexpressions are shared. Each subexpression has only ONE node, even if it appears multiple times in the expression.

FeatureASTDAG
Node duplicationDuplicates nodes for repeated subexpressions.Shares nodes for repeated subexpressions.
Memory efficiencyLess efficient (larger size).More efficient (compact representation).