
List of Images


Introduction

Image 1.1: Phases of a compiler, and interfaces between them.
Image 1.4: Tree representation of a straight-line program.

Lexical Analysis

Image 2.1: Regular expression notation.
Image 2.2: Regular expressions for some tokens.
Image 2.3: Finite automata for lexical tokens. The states are indicated by circles; final states are indicated by double circles. The start state has an arrow coming in from nowhere. An edge labeled with several characters is shorthand for many parallel edges.
Image 2.4: Combined finite automaton.
Image 2.5: The automaton of Image 2.4 recognizes several tokens. The symbol | indicates the input position at each successive call to the lexical analyzer, the symbol ⊥ indicates the current position of the automaton, and ⊤ indicates the most recent position in which the recognizer was in a final state.
Image 2.6: Translation of regular expressions to NFAs.
Image 2.7: Four regular expressions translated to an NFA.
Image 2.8: NFA converted to DFA.


Parsing

Image 3.3: Parse tree.
Image 3.4: Two parse trees for the same sentence using Grammar 3.1.
Image 3.6: Two parse trees for the sentence 1-2-3 in Grammar 3.5.
Image 3.7: Two parse trees for the sentence 1+2*3 in Grammar 3.5.
Image 3.9: Parse trees that Grammar 3.8 will never produce.
Image 3.14: Predictive parsing table for Grammar 3.12.
Image 3.18: Shift-reduce parse of a sentence. Numeric subscripts in the Stack are DFA state numbers; see Table 3.19.
Image 3.21: LR(0) states for Grammar 3.20.
Image 3.24: LR(0) states and parsing table for Grammar 3.23.
Image 3.25: SLR parsing table for Grammar 3.23.
Image 3.27: LR(1) states for Grammar 3.26.
Image 3.29: A hierarchy of grammar classes.
Image 3.33: SableCC shift-reduce error message for Grammar 3.32.
Image 3.39: Burke-Fisher parsing, with an error-repair queue. Image 3.18 shows the complete parse of this string according to Table 3.19.

Abstract Syntax

Image 4.6: Orthogonal directions of modularity.
Image 4.9: Abstract syntax for the MiniJava language.

Semantic Analysis

Image 5.1: Several active environments at once.
Image 5.3: Hash tables.
Image 5.4: Binary search trees.
Image 5.7: A MiniJava program and its symbol table.

Activation Records

Image 6.2: A stack frame.

Translation to Intermediate Code

Image 7.1: Compilers for five languages and four target machines: (a) without an IR, (b) with an IR.
Image 7.2: Intermediate representation trees.
Image 7.4: Object initialization.

Basic Blocks and Traces

Image 8.1: Identities on trees (see also Exercise 8.1).
Image 8.3: Different trace coverings for the same program.

Instruction Selection

Image 9.1: Arithmetic and memory instructions. The notation M[x] denotes the memory word at address x.
Image 9.2: A tree tiled in two ways.
Image 9.4: The Schizo-Jouette architecture.

Liveness Analysis

Image 10.2: Liveness of variables a, b, c.
Image 10.9: Representations of interference.

Register Allocation

Image 11.3: Simplification stack, and a possible coloring.
Image 11.4: Graph coloring with coalescing.
Image 11.6: A coloring, with coalescing, for Graph 11.1.
Image 11.7: Moving a callee-save register to a fresh temporary.

Garbage Collection

Image 13.1: A heap to be garbage collected. Class descriptors are not shown in the diagram.
Image 13.4: Mark-and-sweep collection.
Image 13.7: Copying collection.
Image 13.10: Breadth-first copying collection.
Image 13.12: Generational collection. The bold arrow is one of the rare pointers from an older generation to a newer one.

Object-Oriented Languages

Image 14.2: Single inheritance of data fields.
Image 14.3: Class descriptors for dynamic method lookup.
Image 14.4: Multiple inheritance of data fields.
Image 14.5: Field offsets in descriptors for multiple inheritance.

Functional Programming Languages

Image 15.2: Closures for execution of twice(add(5)). SL=static link; RV=return value; EP=escaping-variables-pointer or environment-pointer.
Image 15.13: printTable as compiled.

Dataflow Analysis

Image 17.8: An illustration of value numbering. (a) A basic block; (b) the table created by the value-numbering algorithm, with hidden bindings shown crossed out; (c) a view of the table as a DAG.

Loop Optimizations

Image 18.1: Some loops; in each case, 1 is the header node.
Image 18.2: None of these contains a loop. Dotted lines indicate reduction of graph (c) by deleting edges and collapsing nodes.
Image 18.3: (a) A flow graph; (b) its dominator tree.
Image 18.4: The loop-nest tree for Image 18.3a. Each loop header is shown in the top half of each oval (nodes 1, 2, 5, 8); a loop comprises a header node (e.g., node 5), all the other nodes shown in the same oval (e.g., node 10), and all the nodes shown in subtrees of the loop-nest-tree node (e.g., 8, 9).
Image 18.5: (a) A loop; (b) the same loop with a preheader.
Image 18.6: Some good and bad candidates for hoisting t ← a ⊕ b.
Image 18.7: A while loop (a), transformed into a repeat loop (b).
Image 18.9: The basic induction variable i is incremented by different amounts in different iterations; the derived induction variable j is not changed in every iteration.

Static Single-Assignment Form

Image 19.1: (a) A straight-line program. (b) The program in single-assignment form.
Image 19.2: (a) A program with a control-flow join; (b) the program transformed to single-assignment form; (c) edge-split SSA form.
Image 19.3: (a) A program with a loop; (b) the program transformed to edge-split single-assignment form. a0, b0, c0 are initial values of the variables before block 1.
Image 19.4: Conversion of a program to static single-assignment form. Node 7 is a postbody node, inserted to make sure there is only one loop edge (see Exercise 18.6); such nodes are not strictly necessary but are sometimes helpful.
Image 19.5: Node 5 dominates all the nodes in the grey area. (a) The dominance frontier of node 5 includes the nodes (4, 5, 12, 13) that are targets of edges crossing from the region dominated by 5 (the grey area, including node 5 itself) to the region not strictly dominated by 5 (the white area, plus node 5 itself, since no node strictly dominates itself). (b) Any node in the dominance frontier of n is also a point of convergence of nonintersecting paths, one from n and one from the root node. (c) Another example of converging paths P1,5 and P5,5.
Image 19.8: A control-flow graph and trees derived from it. The numeric labels in part (b) are the dfnums of the nodes.
Image 19.11: Path compression. (a) Ancestor links in a spanning tree; AncestorWithLowestSemi(v) traverses three links. (b) New nodes a2, a3 are linked into the tree. Now AncestorWithLowestSemi(w) would traverse 6 links. (c) AncestorWithLowestSemi(v) with path compression redirects ancestor links, but best[v] remembers the best intervening node on the compressed path between v and a1. (d) Now, after a2 and a3 are linked, AncestorWithLowestSemi(w) traverses only 4 links.
Image 19.13: Conditional constant propagation.
Image 19.14: This transformation does not preserve the dominance property of SSA form, and should be avoided.
Image 19.15: Construction of the control-dependence graph.
Image 19.16: Aggressive dead-code elimination.
Image 19.18: Functional intermediate representation. Binding occurrences of variables are underlined.

Pipelining and Scheduling

Image 20.1: Functional unit requirements of instructions (on the MIPS R4000 processor). This machine's floating-point ADD instruction uses the instruction-fetch unit for one cycle; reads registers for one cycle; unpacks exponent and mantissa; then for the next cycle uses a shifter and an adder; then uses both the adder and a rounding unit; then the rounding unit and a shifter; then writes a result back to the register file. The MULT and CONV instructions use functional units in a different order.
Image 20.2: If there is only one functional unit of each kind, then an ADD cannot be started at the same time as a MULT (because of numerous resource hazards shown in boldface); nor three cycles after the MULT (because of Add, Round, and Write hazards); nor four cycles later (because of Add and Round hazards). But if there were two adders and two rounding units, then an ADD could be started four cycles after a MULT. Or with dual fetch units, multiple-access register file, and dual unpackers, the MULT and ADD could be started simultaneously.
Image 20.3: Data dependence. (Above) If the MULT produces a result that is an operand to ADD, the MULT must write its result to the register file before the ADD can read it. (Below) Special bypassing circuitry can route the result of MULT directly to the Shift and Add units, skipping the Write, Read, and Unpack stages.
Image 20.7: Pipelined schedule. Assignments in each row happen simultaneously; each right-hand side refers to the value before the assignment. The loop exit test i < N + 1 has been "moved past" three increments of i, so appears as i < N − 2.
Image 20.8: Pipelined schedule, with move instructions.
Image 20.10: Iterative modulo scheduling applied to Program 20.4b. Graph 20.5a is the data-dependence graph; Δmin = 5 (see page 451); H = [c, d, e, a, b, f, j, g, h].
Image 20.11: Dependence of ADD's instruction-fetch on result of BRANCH.

The Memory Hierarchy

Image 21.1: Organization of a direct-mapped cache. The key field of the address is used to index the tags array and the data blocks; if tags[key] matches the tag field of the address, then the data is valid (cache hit). The word index is used to select a word from the cache block.
Image 21.2: The memory hierarchy.
Image 21.3: Alignment of data objects (or basic blocks) to avoid crossing cache-block boundaries is often worthwhile, even at the cost of empty space between objects.
Image 21.4: If x is rarely true, basic-block placement (a) will occupy three in-cache blocks, while (b) will usually occupy only two.
Image 21.5: Execution of a dot-product loop, with 4-word cache blocks.
(a) Without prefetching, on a machine with dynamic instruction reordering, the number of outstanding instructions (reserved registers) grows proportionally to the cache-miss latency.
(b) With prefetching, the hardware reservation table never grows large. (Steady-state behavior is shown here, not the initial transient.)
Image 21.7: Matrix multiplication. Each element of C is computed from a row of A and a column of B. With blocking, a c × c block of the C matrix is computed from a c × N block of A and an N × c block of B.
