Reachability in DAGs - Java Programming Language

To conclude our study of DAGs, we consider the problem of computing the transitive closure of a DAG. Can we develop algorithms for DAGs that are more efficient than the algorithms for general digraphs that we considered in ? Any method for topological sorting can serve as the basis for a transitive-closure algorithm for DAGs, as follows: We proceed through the vertices in reverse topological order, computing the reachability array for each vertex (its row in the transitive-closure matrix) from the rows corresponding to its adjacent vertices. The reverse topological sort ensures that all those rows have already been computed. In total, we check each of the V entries in the array corresponding to the destination vertex of each of the E edges, for a total running time proportional to VE. Although it is simple to implement, this method is no more efficient for DAGs than for general digraphs. When we use a standard DFS for the topological sort (see Program 19.7), we can improve performance for some DAGs, as shown in Program 19.9. Since there are no cycles in a DAG, there are no back edges in any DFS. More important, both cross edges and down edges point to nodes for which the DFS has completed. To take advantage of this fact, we develop a recursive method to compute all vertices reachable from a given start vertex, but (as usual in DFS) we make no recursive calls for vertices for which the reachable set has already been computed. In this case, the reachable vertices are represented by a row in the transitive closure, and the recursive method takes the logical or of all the rows associated with its adjacent edges. For tree edges, we do a recursive call to compute the row; for cross edges, we can skip the recursive call because we know that the row has been computed by a previous recursive call; for down edges, we can skip the whole computation because any reachable nodes that it would add have already been accounted for in the set of reachable nodes for the destination vertex (lower and earlier in the DFS tree). Using this version of DFS might be characterized as using dynamic coding to compute the transitive closure because we make use of results that have already been computed (and saved in the adjacency matrix rows corresponding to previously-processed vertices) to avoid making unnecessary recursive calls. Screenshot illustrates the computation of the transitive closure for the sample DAG in Screenshot.

Screenshot Transitive closure of a DAG

This sequence of row vectors is the transitive closure of the DAG in Screenshot, with rows created in reverse topological order, computed as the last action in a recursive DFS function (see Program 19.9). Each row is the logical or of the rows for adjacent vertices, which appear earlier in the list. For example, to compute the row for 0 we take the logical or of the rows for 5, 2, 1, and 6 (and put a 1 corresponding to 0 itself) because the edges 0-5, 0-2, 0-1, and 0-6 take us from 0 to any vertex that is reachable from any of those vertices. We can ignore down edges because they add no new information. For example, we ignore the edge from 0 to 3 because the vertices reachable from 3 are already accounted for in the row corresponding to 2.

Java graphics 19fig27

Property 19.13 With dynamic coding and DFS, we can support constant query time for the abstract transitive closure of a DAG with space proportional to V² and time proportional to V² + VX for preprocessing (computing the transitive closure), where X is the number of cross edges in the DFS forest. Proof: The proof is immediate by induction from the recursive method in Program 19.9. We visit the vertices in reverse topological order. Every edge points to a vertex for which we have already computed all reachable vertices, and we can therefore compute the set of reachable vertices of any vertex by merging together the sets of reachable vertices associated with the destination vertex of each edge. Taking the logical or of the specified rows in the adjacency matrix accomplishes this merge. We access a row of size V for each tree edge and each cross edge. There are no back edges, and we can ignore down edges because we accounted for any vertices they reach when we processed any ancestors of both nodes earlier in the search. Screenshot

If our DAG has no down edges (see Exercise 19.42), the running time of Program 19.9 is proportional to VE and represents no improvement over the transitive-closure algorithms that we examined for general digraphs in (such as, for example, Program 19.4) or the approach based on topological sorting that is described at the beginning of this section. On the other hand, if the number of down edges is large (or, equivalently, the number of cross edges is small), Program 19.9 will be significantly faster than these methods. The problem of finding an optimal algorithm (one that is guaranteed to finish in time proportional to V²) for computing the transitive closure of dense DAGs is still unsolved. The best known worst-case performance bound is VE. However, we are certainly better off using an algorithm that runs faster for a large class of DAGs, such as Program 19.9, than we are using one that always runs in time proportional to VE, such as Program 19.4. As we see in , this performance improvement for DAGs has direct implications for our ability to compute the transitive closure of general digraphs as well.

Transitive closure of a DAG

The constructor in this class accomplishes the transitive-closure computation with a single DFS if the graph given as parameter is a DAG. As with our other transitive-closure implementations, we maintain a dense-graph representation of the transitive closure as a private data member. The recursive method tcR computes the reachable vertices from w by computing the reachable vertices of its children in the DFS tree. Any vertex i that is reachable from a child t of w is also reachable from w.

class GraphTC
{ private Graph G; private DenseGraph D;
 private int cnt;
 private int[] pre;
 void tcR(int w)
 {
 pre[w] = cnt++;
 AdjList A = D.getAdjList(w);
 for (int t = A.beg(); !A.end(); t = A.nxt())
 {
 D.insert(new Edge(w, t));
 if (pre[t] > pre[w]) continue;
 if (pre[t] == -1) tcR(t);
 for (int i = 0; i < D.V(); i++)
 if (D.edge(t, i)) D.insert(new Edge(w, i));
 }
 }
 GraphTC(Graph G)
 { this.G = G; cnt = 0;
 D = GraphUtilities.densecopy(G);
 pre = new int[G.V()];
 for (int v = 0; v < D.V(); v++) pre[v] = -1;
 for (int v = 0; v < D.V(); v++)
 if (pre[v] == -1) tcR(v);
 }
 boolean reachable(int v, int w)
 { return D.edge(v, w); }
}

Exercises

19.115 Show, in the style of Screenshot, the reachability arrays that result when we use Program 19.9 to compute the transitive closure of the DAG
19.116 Develop a version of Program 19.9 that uses a representation of the transitive closure that does not support edge testing and that runs in time proportional to V² + _e v(e), where the sum is over all edges in the DAG and v(e) is the number of vertices reachable from the destination vertex of edge e. This cost will be significantly less than VE for some sparse DAGs (see Exercise 19.65).
19.117 Implement an abstract-transitive-closure class for DAGs that uses extra space at most proportional to V (and is suitable for huge DAGs). Use topological sorting to provide quick response when the vertices are not connected, and use a source-queue implementation to return the length of the path when they are connected.
19.118 Develop a transitive-closure implementation based on a sink-queue-based reverse topological sort (see Exercise 19.105).
19.119 Does your solution to Exercise 19.118 require that you examine all edges in the DAG, or are there edges that can be ignored, such as the down edges in DFS? Give an example that requires examination of all edges, or characterize the edges that can be skipped.