SOLUTION OF DATAFLOW EQUATIONS

Liveness of variables "flows" around the edges of the control-flow graph; determining the live range of each variable is an example of a dataflow problem. will discuss several other kinds of dataflow problems. Flow-graph terminology A flow-graph node has out-edges that lead to successor nodes, and in-edges that come from predecessor nodes. The set pred[n] is all the predecessors of node n, and succ[n] is the set of successors. In Graph 10.1 the out-edges of node 5 are 5 → 6 and 5 → 2, and succ[5] = {2, 6}. The in-edges of 2 are 5 → 2 and 1 → 2, and pred[2] = {1, 5}. Uses and defs An assignment to a variable or temporary defines that variable. An occurrence of a variable on the right-hand side of an assignment (or in other expressions) uses the variable. We can speak of the def of a variable as the set of graph nodes that define it; or the def of a graph node as the set of variables that it defines; and similarly for the use of a variable or graph node. In Graph 10.1, def(3) = {c}, use(3) = {b, c}. Liveness A variable is live on an edge if there is a directed path from that edge to a use of the variable that does not go through any def. A variable is live-in at a node if it is live on any of the in-edges of that node; it is live-out at a node if it is live on any of the out-edges of the node.

CALCULATION OF LIVENESS

Liveness information (live-in and live-out) can be calculated from use and def as follows:

If a variable is in use[n], then it is live-in at node n. That is, if a statement uses a variable, the variable is live on entry to that statement.
If a variable is live-in at a node n, then it is live-out at all nodes m in pred[n].
If a variable is live-out at node n, and not in def [n], then the variable is also live-in at n. That is, if someone needs the value of a at the end of statement n, and n does not provide that value, then a's value is needed even on entry to n.

These three statements can be written as Equations 10.3 on sets of variables. The live-in sets are an array in[n] indexed by node, and the live-out sets are an array out[n]. That is, in[n] is all the variables in use[n], plus all the variables in out[n] and not in def [n]. And out[n] is the union of the live-in sets of all successors of n.EQUATIONS 10.3: Dataflow equations for liveness analysis.
Java ScreenShot

Algorithm 10.4 finds a solution to these equations by iteration. As usual, we initialize in[n] and out[n] to the the empty set {}, forall n, then repeatedly treat the equations as assignment statements until a fixed point is reached.ALGORITHM 10.4: Computation of liveness by iteration.

for each n
 in[n] {}; out[n] {}
repeat for each n in′[n] → in[n]; out′[n] ← out[n]
 in[n] ← use[n] ∪ (out[n] − def[n])
 out[n] ← ∪_s∈succ[n]in[s]
until in′[n] = in[n] and out′[n] = out[n] for all n

Table 10.5 shows the results of running the algorithm on Graph 10.1. The columns 1st, 2nd, etc., are the values of in and out on successive iterations of the repeat loop. Since the 7th column is the same as the 6th, the algorithm terminates.

Table 10.5: Liveness calculation following forward control-flow edges.
			1st		2nd		3rd		4th		5th		6th		7th
	use	def	in	out	in	out	in	out	in	out	in	out	in	out	in	out

1		a				a		a		ac	c	ac	c	ac	c	ac
2	a	b	a		a	bc	ac	bc	ac	bc	ac	bc	ac	bc	ac	bc
3	bc	c	bc		bc	b	bc	b	bc	b	bc	b	bc	bc	bc	bc
4	b	a	b		b	a	b	a	b	ac	bc	ac	bc	ac	bc	ac
5	a		a	a	a	ac	ac	ac	ac	ac	ac	ac	ac	ac	ac	ac
6	c		c		c		c		c		c		c		c

We can speed the convergence of this algorithm significantly by ordering the nodes properly. Suppose there is an edge 3 → 4 in the graph. Since in[4] is computed from out[4], and out[3] is computed from in[4], and so on, we should compute the in and out sets in the order out[4] → in[4] → out[3] → in[3]. But in Table 10.5, just the opposite order is used in each iteration! We have waited as long as possible (in each iteration) to make use of information gained from the previous iteration. Table 10.6 shows the computation, in which each for loop iterates from 6 to 1 (approximately following the reversed direction of the flow-graph arrows), and in each iteration the out sets are computed before the in sets. By the end of the second iteration, the fixed point has been found; the third iteration just confirms this.

Table 10.6: Liveness calculation following reverse control-flow edges.
			1st		2nd		3rd
	use	def	out	in	out	in	out	in

6	c			c		c		c
5	a		c	ac	ac	ac	ac	ac
4	b	a	ac	bc	ac	bc	ac	bc
3	bc	c	bc	bc	bc	bc	bc	bc
2	a	b	bc	ac	bc	ac	bc	ac
1		a	ac	c	ac	c	ac	c

When solving dataflow equations by iteration, the order of computation should follow the "flow." Since liveness flows backward along control-flow arrows, and from "out" to "in", so should the computation. Ordering the nodes can be done easily by depth-first search, as shown in . Basic blocks Flow-graph nodes that have only one predecessor and one successor are not very interesting. Such nodes can be merged with their predecessors and successors; what results is a graph with many fewer nodes, where each node represents a basic block. The algorithms that operate on flow graphs, such as liveness analysis, go much faster on the smaller graphs. explains how to adjust the dataflow equations to use basic blocks. In this chapter we keep things simple. One variable at a time Instead of doing dataflow "in parallel" using set equations, it can be just as practical to compute dataflow for one variable at a time as information about that variable is needed. For liveness, this would mean repeating the dataflow traversal once for each temporary. Starting from each use site of a temporary t, and tracing backward (following predecessor edges of the flow graph) using depth-first search, we note the liveness of t at each flow-graph node. The search stops at any definition of the temporary. Although this might seem expensive, many temporaries have very short live ranges, so the searches terminate quickly and do not traverse the entire flow graph for most variables.

REPRESENTATION OF SETS

There are at least two good ways to represent sets for dataflow equations: as arrays of bits or as sorted lists of variables. If there are N variables in the program, the bit-array representation uses N bits for each set. Calculating the union of two sets is done by or-ing the corresponding bits at each position. Since computers can represent K bits per word (with K = 32 typical), one set-union operation takes N/K operations. A set can also be represented as a linked list of its members, sorted by any totally ordered key (such as variable name). Calculating the union is done by merging the lists (discarding duplicates). This takes time proportional to the size of the sets being unioned.

Clearly, when the sets are sparse (fewer than N/K elements, on the average), the sorted-list representation is asymptotically faster; when the sets are dense, the bit-array representation is better.

TIME COMPLEXITY

How fast is iterative dataflow analysis? A program of size N has at most N nodes in the flow graph, and at most N variables. Thus, each live-in set (or live-out set) has at most N elements; each set-union operation to compute live-in (or live-out) takes O(N) time. The for loop computes a constant number of set operations per flow-graph node; there are O(N) nodes; thus, the for loop takes O(N²) time. Each iteration of the repeat loop can only make each in or out set larger, never smaller. This is because the in and out sets are monotonic with respect to each other. That is, in the equation in[n] = use[n]∪(out[n]−def[n]), a larger out[n] can only make in[n] larger. Similarly, in out[n] = ∪_s∈succ[n]in[s], a larger in[s] can only make out[n] larger. Each iteration must add something to the sets; but the sets cannot keep growing infinitely; at most every set can contain every variable. Thus, the sum of the sizes of all in and out sets is 2N², which is the most that the repeat loop can iterate. Thus, the worst-case run time of this algorithm is O(N⁴). Ordering the nodes using depth-first search (Algorithm 17.5, page 363) usually brings the number of repeat-loop iterations to two or three, and the live sets are often sparse, so the algorithm runs between O(N) and O(N²) in practice.

discusses more sophisticated ways of solving dataflow equations quickly.

LEAST FIXED POINTS

Table 10.7 illustrates two solutions (and a nonsolution!) to the Equations 10.3; assume there is another program variable d not used in this fragment of the program.

Table 10.7: X and Y are solutions to the liveness equations; Z is not a solution.
			X		Y		Z
	use	def	in	out	in	out	in	out
1		a	c	ac	cd	acd	c	ac
2	a	b	ac	bc	acd	bcd	ac	b
3	bc	c	bc	bc	bcd	bcd	b	b
4	b	a	bc	ac	bcd	acd	b	ac
5	a		ac	ac	acd	acd	ac	ac
6	c		c		c		c

In solution Y, the variable d is carried uselessly around the loop. But in fact, Y satisfies Equations 10.3 just as X does. What does this mean? Is d live or not? The answer is that any solution to the dataflow equations is a conservative approximation. If the value of variable a will truly be needed in some execution of the program when execution reaches node n of the flow graph, then we can be assured that a is live-out at node n in any solution of the equations. But the converse is not true; we might calculate that d is live-out, but that doesn't mean that its value will really be used. Is this acceptable? We can answer that question by asking what use will be made of the dataflow information. In the case of liveness analysis, if a variable is thought to be live, then we will make sure to have its value in a register. A conservative approximation of liveness is one that may erroneously believe a variable is live, but will never erroneously believe it is dead. The consequence of a conservative approximation is that the compiled code might use more registers than it really needs; but it will compute the right answer. Consider instead the live-in sets Z, which fail to satisfy the dataflow equations. Using this Z we think that b and c are never live at the same time, and we would assign them to the same register. The resulting program would use an optimal number of registers but compute the wrong answer. A dataflow equation used for compiler optimization should be set up so that any solution to it provides conservative information to the optimizer; imprecise information may lead to suboptimal but never incorrect programs. Theorem Equations 10.3 have more than one solution. Proof X and Y are both solutions. Theorem All solutions to Equations 10.3 contain solution X. That is, if in_X [n] and in_Y [n] are the live-in sets for some node n in solutions X and Y, then in_X [n] ⊆ in_Y [n]. Proof See Exercise 10.2.

We say that X is the least solution to Equations 10.3. Clearly, since a bigger solution will lead to using more registers (producing suboptimal code), we want to use the least solution. Fortunately, Algorithm 10.4 always computes the least fixed point.

STATIC VS. DYNAMIC LIVENESS

A variable is live "if its value will be used in the future." In Graph 10.8, we know that b × b must be nonnegative, so that the test c ≥ b will be true. Thus, node 4 will never be reached, and a's value will not be used after node 2; a is not live-out of node 2.GRAPH 10.8: Standard static dataflow analysis will not take advantage of the fact that node 4 can never be reached.
Java ScreenShot

But Equations 10.3 say that a is live-in to node 4, and therefore live-out of nodes 3 and 2. The equations are ignorant of which way the conditional branch will go. "Smarter" equations would permit a and c to be assigned the same register. Although we can prove here that b*b ≥ 0, and we could have the compiler look for arithmetic identities, no compiler can ever fully understand how all the control flow in every program will work. This is a fundamental mathematical theorem, derivable from the halting problem. Theorem There is no program H that takes as input any program P and input X and (without infinite-looping) returns true if P(X) halts and false if P(X) infinite-loops. Proof Suppose that there were such a program H; then we could arrive at a contradiction as follows. From the program H, construct the function F, Java ScreenShot

By the definition of H, if F(F) halts, then H(F, F) is true; so the then clause is taken; so the while loop executes forever; so F(F) does not halt. But if F(F) loops forever, then H(F, F) is false; so the else clause is taken; so F(F) halts. The program F(F) halts if it doesn't halt, and doesn't halt if it halts: a contradiction. Thus there can be no program H that tests whether another program halts (and always halts itself). Corollary No program H′(X, L) can tell, for any program X and label L within X, whether the label L is ever reached on an execution of X. Proof From H′ we could construct H. In some program that we want to test for halting, just let L be the end of the program, and replace all instances of the halt command with goto L. Conservative approximation This theorem does not mean that we can never tell if a given label is reached or not, just that there is not a general algorithm that can always tell. We could improve our liveness analysis with some special-case algorithms that, in some cases, calculate more information about run-time control flow. But any such algorithm will come up against many cases where it simply cannot tell exactly what will happen at run time. Because of this inherent limitation of program analysis, no compiler can really tell if a variable's value is truly needed - whether the variable is truly live. Instead, we have to make do with a conservative approximation. We assume that any conditional branch goes both ways. Thus, we have a dynamic condition and its static approximation:

Dynamic liveness Avariable a is dynamically live at node n if some execution of the program goes from n to a use of a without going through any definition of a.
Static liveness Avariable a is statically live at node n if there is some path of control-flow edges from n to some use of a that does not go through a definition of a.

Clearly, if a is dynamically live, it is also statically live. An optimizing compiler must allocate registers, and do other optimizations, on the basis of static liveness, because (in general) dynamic liveness cannot be computed.

INTERFERENCE GRAPHS

Liveness information is used for several kinds of optimizations in a compiler. For some optimizations, we need to know exactly which variables are live at each node in the flow graph. One of the most important apps of liveness analysis is for register allocation: We have a set of temporaries a, b, c,… that must be allocated to registers r₁,…, r_k. A condition that prevents a and b from being allocated to the same register is called an interference. The most common kind of interference is caused by overlapping live ranges: When a and b are both live at the same program point, then they cannot be put in the same register. But there are some other causes of interference: for example, when a must be generated by an instruction that cannot address register r₁, then a and r₁ interfere. Interference information can be expressed as a matrix; Image 10.9a has an x marking interferences of the variables in Graph 10.1. The interference matrix can also be expressed as an undirected graph (Image 10.9b), with a node for each variable, and edges connecting variables that interfere.
Image 10.9: Representations of interference.

Special treatment of MOVE instructions In static liveness analysis, we can give MOVE instructions special consideration. It is important not to create artifical interferences between the source and destination of a move. Consider the program: Java ScreenShot

After the copy instruction both s and t are live, and normally we would make an interference edge (s, t) since t is being defined at a point where s is live. But we do not need separate registers for s and t, since they contain the same value. The solution is just not to add an interference edge (t, s) in this case. Of course, if there is a later (nonmove) definition of t while s is still live, that will create the interference edge (t, s). Therefore, the way to add interference edges for each new definition is

At any nonmove instruction that defines avariable a, where the live-out variables are b₁,…, b_j, add interference edges (a, b₁),…,(a, b_j).
At a move instruction a ← c, where variables b₁,…, b_j are live-out, add interference edges (a, b₁),…,(a, b_j) for any b_i that is not the same as c.

			1st		2nd		3rd		4th		5th		6th		7th
	use	def	in	out	in	out	in	out	in	out	in	out	in	out	in	out

1		a				a		a		ac	c	ac	c	ac	c	ac
2	a	b	a		a	bc	ac	bc	ac	bc	ac	bc	ac	bc	ac	bc
3	bc	c	bc		bc	b	bc	b	bc	b	bc	b	bc	bc	bc	bc
4	b	a	b		b	a	b	a	b	ac	bc	ac	bc	ac	bc	ac
5	a		a	a	a	ac	ac	ac	ac	ac	ac	ac	ac	ac	ac	ac
6	c		c		c		c		c		c		c		c

			1st		2nd		3rd
	use	def	out	in	out	in	out	in

6	c			c		c		c
5	a		c	ac	ac	ac	ac	ac
4	b	a	ac	bc	ac	bc	ac	bc
3	bc	c	bc	bc	bc	bc	bc	bc
2	a	b	bc	ac	bc	ac	bc	ac
1		a	ac	c	ac	c	ac	c

			X		Y		Z
	use	def	in	out	in	out	in	out
1		a	c	ac	cd	acd	c	ac
2	a	b	ac	bc	acd	bcd	ac	b
3	bc	c	bc	bc	bcd	bcd	b	b
4	b	a	bc	ac	bcd	acd	b	ac
5	a		ac	ac	acd	acd	ac	ac
6	c		c		c		c

			1st		2nd		3rd		4th		5th		6th		7th
	use	def	in	out	in	out	in	out	in	out	in	out	in	out	in	out

1		a				a		a		ac	c	ac	c	ac	c	ac
2	a	b	a		a	bc	ac	bc	ac	bc	ac	bc	ac	bc	ac	bc
3	bc	c	bc		bc	b	bc	b	bc	b	bc	b	bc	bc	bc	bc
4	b	a	b		b	a	b	a	b	ac	bc	ac	bc	ac	bc	ac
5	a		a	a	a	ac	ac	ac	ac	ac	ac	ac	ac	ac	ac	ac
6	c		c		c		c		c		c		c		c

			1st		2nd		3rd
	use	def	out	in	out	in	out	in

6	c			c		c		c
5	a		c	ac	ac	ac	ac	ac
4	b	a	ac	bc	ac	bc	ac	bc
3	bc	c	bc	bc	bc	bc	bc	bc
2	a	b	bc	ac	bc	ac	bc	ac
1		a	ac	c	ac	c	ac	c

			X		Y		Z
	use	def	in	out	in	out	in	out
1		a	c	ac	cd	acd	c	ac
2	a	b	ac	bc	acd	bcd	ac	b
3	bc	c	bc	bc	bcd	bcd	b	b
4	b	a	bc	ac	bcd	acd	b	ac
5	a		ac	ac	acd	acd	ac	ac
6	c		c		c		c

			1st		2nd		3rd		4th		5th		6th		7th
	use	def	in	out	in	out	in	out	in	out	in	out	in	out	in	out

1		a				a		a		ac	c	ac	c	ac	c	ac
2	a	b	a		a	bc	ac	bc	ac	bc	ac	bc	ac	bc	ac	bc
3	bc	c	bc		bc	b	bc	b	bc	b	bc	b	bc	bc	bc	bc
4	b	a	b		b	a	b	a	b	ac	bc	ac	bc	ac	bc	ac
5	a		a	a	a	ac	ac	ac	ac	ac	ac	ac	ac	ac	ac	ac
6	c		c		c		c		c		c		c		c

			1st		2nd		3rd
	use	def	out	in	out	in	out	in

6	c			c		c		c
5	a		c	ac	ac	ac	ac	ac
4	b	a	ac	bc	ac	bc	ac	bc
3	bc	c	bc	bc	bc	bc	bc	bc
2	a	b	bc	ac	bc	ac	bc	ac
1		a	ac	c	ac	c	ac	c

			X		Y		Z
	use	def	in	out	in	out	in	out
1		a	c	ac	cd	acd	c	ac
2	a	b	ac	bc	acd	bcd	ac	b
3	bc	c	bc	bc	bcd	bcd	b	b
4	b	a	bc	ac	bcd	acd	b	ac
5	a		ac	ac	acd	acd	ac	ac
6	c		c		c		c