In this chapter, we concentrate on weighted undirected graphs—the most natural setting for MST problems. Perhaps the simplest way to begin is to extend the basic graph representations from to represent weighted graphs as follows: In the adjacency-matrix representation, the matrix can contain edge weights rather than Boolean values; in the adjacency-lists representation, we can add a field for the weights to the list elements that represent edges. This classic approach is appealing in its simplicity, but we will use a different method that is not much more complicated and will make our programs useful in more general settings. Comparative performance characteristics of the two approaches are briefly treated later in this section. To address the issues raised when we shift from graphs where our only interest is in the presence or absence of edges to graphs where we are interested in information associated with edges, it is useful to imagine scenarios where vertices and edges are objects of unknown complexity, perhaps part of a huge client database that is built and maintained by some other app. For example, we might like to view streets, roads, and highways in a map database as abstract edges with their lengths as weights, but, in the database, a road's entry might carry a substantial amount of other information such as its name and type, a detailed description of its physical properties, how much traffic is currently using it, and so forth. Now, a client program could extract the information that it needs, build a graph, process it, and then interpret the results in the context of the database, but that might be a difficult and costly process. In particular, it amounts to (at least) making a copy of the graph.
A better alternative is for the client to define an Edge data type and for our implementations to manipulate references to edges. Program 20.1 shows the details of the Edge and Graph ADTs that we define to support this approach. The Edge ADT in Program 20.1 has straightforward accessor methods for getting an edge's vertices and weight and two methods that give clients needed information about the orientation of the edge. Either e.from(v) is true, e.v() is v and e.other(v) is e.w(); or e.from(v) is false, e.w() is v and e.other(v) is e.v()). The need for these methods will become plain when we examine client code. The Graph ADT in Program 20.1 uses Edge in a straightforward manner. The AdjList interface is like the one we discussed in the text describing Program 17.1 except the return types of the beg() and nxt() methods are changed to be Edge instead of int. Our algorithms now will process the sequence of edges adjacent to a vertex instead of the sequence of adjacent vertices. Clients can easily meet our minimal requirements for the Edge data type while retaining the flexibility to define app-specific data types for use in other contexts. Our implementations can manipulate edge references and use the interface to extract the information they need from Edges without regard to how they are represented. For testing algorithms and basic apps, we can use an Edge implementation that has two ints and a double as private data members that are initialized from constructor arguments and are the return values of the v(), w(), and wt() member methods, respectively (see Exercise 20.8). To avoid proliferation of simple types, we use edge weights of type double throughout this chapter and . In our examples, we use edge weights that are real numbers between 0 and 1. This decision does not conflict with various alternatives that we might need in apps, because we can explicitly or implicitly rescale the weights to fit this model (see Exercises 20.1 and 20.11). For example, if the weights are positive integers less than a known maximum value, we can divide them all by the maximum value to convert them to real numbers between 0 and 1. If we wanted to do so, we could build a more general ADT interface and use any data type for edge weights that supports addition, subtraction, and comparisons, since we do little more with the weights than to accumulate sums and make decisions based on their values. In , our algorithms are concerned with comparing linear combinations of edge weights, and the running time of some algorithms depends on arithmetic properties of the weights, so we switch to integer weights to allow us to more easily analyze the algorithms.
Programs 20.2 and 20.3 implement the weighted-graph ADT of Program 20.1 with an adjacency-matrix representation. As before, inserting an edge into an undirected graph amounts to storing a reference to it in two places in the matrix—one for each orientation of the edge. As is true of algorithms that use the adjacency-matrix representation for unweighted graphs, the running time of any algorithm that uses this representation is proportional to V2 (for the Java system to initialize the matrix) or higher. With this representation, we test for the existence of an edge v-w by testing whether the reference in row v and column w is null. In some situations, we could avoid such tests by using sentinel values for the weights, but the implementations are more clearly understood with them present.
Program 20.4 gives a Graph implementation for the weighted-graph ADT, using edge references in an adjacency-lists representation, with the iterator left as an exercise. As in , a vertex-indexed array associates each vertex with a linked list of that vertex's incident edges. In this implementation, each list node contains a reference to an edge. As with the adjacency matrix, we could, if desired, save space by putting just the destination vertex and the weight in the list nodes (leaving the source vertex implicit) at the cost of a more complicated iterator (see Exercises 20.12 and 20.15).
Program 20.5 illustrates how client programs use Graph and Edge. It is an implementation of the edges method for our GraphUtilities class. We assume that the other simple methods in GraphUtilities and GraphIO that we have been using are updated in a similar manner. At this point, it is worthwhile to compare these representations with the simple representations that were mentioned at the beginning of this section (see Exercises 20.12 and 20.13). If we are building a graph from scratch, using references certainly requires more space. Not only do we need space for the references, but also we need space for the indices (vertex names), which are implicit in the simple representations. To use edge references in the adjacency-matrix representation, we need extra space for V2 edge references and E pairs of indices. Similarly, to use edge references in the adjacency-list representation we need extra space for E edge references and E indices. On the other hand, the use of edge references is likely to lead to faster code, because the compiled client code will follow one reference to get direct access to an edge's weight, in contrast to the simple representation, where an Edge needs to be constructed and its fields accessed. If the space cost is prohibitive, using the minimal representations (and perhaps streamlining the iterators to save time) certainly is a reasonable alternative; otherwise, the flexibility afforded by the references is probably worth the space. To reduce clutter, we do use the simple representations in all of our figures. That is, rather than showing a matrix of references to edge structures that contain pairs of integers and weights, we simply show a matrix of weights, and rather than showing list nodes that contain references to edge structures, we show nodes that contain destination vertices. The adjacency-matrix and adjacency-lists representations of our sample graph are shown in Screenshot.
Screenshot Weighted-graph representations (undirected)
The two standard representations of weighted undirected graphs include weights with each edge representation, as illustrated in the adjacency-matrix (left) and adjacency-lists (right) representation of the graph depicted in Screenshot. For simplicity in these figures, we show the weights in the matrix entries and list nodes; in our programs we use references to client edges. The adjacency matrix is symmetric and the adjacency lists contain two nodes for each edge, as in unweighted directed graphs. Nonexistent edges are represented by null references in the matrix (indicated by asterisks in the figure) and are not present at all in the lists. Self-loops are absent in both of the representations illustrated here because MST algorithms are simpler without them; other algorithms that process weighted graphs use them (see ).
As with our undirected-graph implementations, we do not explicitly test for parallel edges in either implementation. Depending on the app, we might alter the adjacency-matrix representation to keep the parallel edge of lowest or highest weight, or to effectively coalesce parallel edges to a single edge with the sum of their weights. In the adjacency-lists representation, we allow parallel edges to remain in the data structure, but we could build more powerful data structures to eliminate them using one of the rules just mentioned for adjacency matrices (see Exercise 17.49). How should we represent the MST itself? The MST of a graph G is a subgraph of G that is also a tree, so we have numerous options. Chief among them are
Screenshot MST representations
This figure depicts various representations of the MST in Screenshot. The most straightforward is a list of its edges, in no particular order (left). The MST is also a sparse graph, and might be represented with adjacency lists (center). The most compact is a parent-link representation: We choose one of the vertices as the root and keep two vertex-indexed vectors, one with the parent of each vertex in the tree, the other with the weight of the edge from each vertex to its parent (right). The orientation of the tree (choice of root vertex) is arbitrary, not a property of the MST. We can convert from any one of these representations to any other in linear time.
The same tree might have many different representations in any of the schemes. In what order should the edges be presented in the list-of-edges representation? Which node should be chosen as the root in the parent-link representation (see Exercise 20.22)? Generally speaking, when we run an MST algorithm, the particular MST representation that results is an artifact of the algorithm used and does not reflect any important features of the MST. From an algorithmic point of view, the choice of MST representation is of little consequence because we can convert easily from each of these representations to any of the others. To convert from the graph representation to an array of edges, we can use the edges method of Program 20.5. To convert from the parent-link representation in an array st (with weights in another array wt) to an array of references to edges in mst, we can use the loop
for (k = 1; k < G.V(); k++) mst[k] = new Edge(k, st[k], wt[k]);
This code is for the typical case where the MST is rooted at 0, and it does not put the dummy edge 0-0 onto the MST edge list. These two conversions are trivial, but how do we convert from the array-of-edge-references representation to the parent-link representation? We have the basic tools to accomplish this task easily as well: We convert to the graph representation using a loop like the one just given (changed to invoke insert for each edge), then run a DFS starting at any vertex to compute the parent-link representation of the DFS tree, in linear time. In short, although the choice of MST representation is a matter of convenience, we package all of our algorithms in a graph-processing class GraphMST that computes a private array mst of references to edges. Depending on the needs of apps, we can implement accessor methods for this class to return this array or to give client programs other information about the MST, but we do not specify further details of this interface (see Exercises 20.8 and 20.9).