Minimum Spanning Trees

Graph models where we associate weights or costs with each edge are called for in many apps. In an airline map where edges represent flight routes, these weights might represent distances or fares. In an electric circuit where edges represent wires, the weights might represent the length of the wire, its cost, or the time that it takes a signal to propagate through it. In a job-scheduling problem, weights might represent time or the cost of performing tasks or of waiting for tasks to be performed. Questions that entail cost minimization naturally arise for such situations. We examine algorithms for two such problems: (i) find the lowest-cost way to connect all of the points, and (ii) find the lowest-cost path between two given points. The first type of algorithm, which is useful for undirected graphs that represent objects such as electric circuits, finds a minimum spanning tree; it is the subject of this chapter. The second type of algorithm, which is useful for digraphs that represent objects such as an airline route map, finds the shortest paths; it is the subject of . These algorithms have broad applicability beyond circuit and map apps, extending to a variety of problems that arise on weighted graphs. When we study algorithms that process weighted graphs, our intuition is often supported by thinking of the weights as distances: We speak of "the vertex closest to x," and so forth. Indeed, the term "shortest path" embraces this bias. Despite numerous apps where we actually do work with distance and despite the benefits of geometric intuition in understanding the basic algorithms, it is important to remember that the weights do not need to be proportional to a distance at all; they might represent time or cost or an entirely different variable. Indeed, as we see in , weights in shortest-paths problems can even be negative. To appeal to intuition in describing algorithms and examples while still retaining general applicability, we use ambiguous terminology where we refer interchangeably to edge lengths and weights. When we refer to a "short" edge, we mean a "low-weight" edge, and so forth. For most of the examples in this chapter, we use weights that are proportional to the distances between the vertices, as shown in Screenshot. Such graphs are more convenient for examples, because we do not need to carry the edge labels and can still tell at a glance that longer edges have weights higher than those of shorter edges. When the weights do represent distances, we can consider algorithms that gain efficiency by taking into account geometric properties (s 20.7 and 21.5). With that exception, the algorithms that we consider simply process the edges and do not take advantage of any implied geometric information (see Screenshot).

Screenshot A weighted undirected graph and its MST

A weighted undirected graph is a set of weighted edges. The MST is a set of edges of minimal total weight that connects the vertices (black in the edge list, thick edges in the graph drawing). In this particular graph, the weights are proportional to the distances between the vertices, but the basic algorithms that we consider are appropriate for general graphs and make no assumptions about the weights (see Screenshot).

Java graphics 20fig01.gif

Screenshot Arbitrary weights

In this example, the edge weights are arbitrary and do not relate to the geometry of the drawn graph representation at all. This example also illustrates that the MST is not necessarily unique if edge weights may be equal: we get one MST by including 3-4 (shown) and a different MST by including 0-5 instead (although 7-6, which has the same weight as those two edges, is not in any MST).

Java graphics 20fig02.gif

The problem of finding the minimum spanning tree of an arbitrary weighted undirected graph has numerous important apps, and algorithms to solve it have been known since at least the 1920s; but the efficiency of implementations varies widely, and researchers still seek better methods. In this section, we examine three classical algorithms that are easily understood at a conceptual level; in s 20.3 through 20.5, we examine implementations of each in detail; and in , we consider comparisons of and improvements on these basic approaches. Definition 20.1 A minimum spanning tree (MST) of a weighted graph is a spanning tree whose weight (the sum of the weights of its edges) is no larger than the weight of any other spanning tree. If the edge weights are all positive, it suffices to define the MST as the set of edges with minimal total weight that connects all the vertices, as such a set of edges must form a spanning tree. The spanning-tree condition in the definition is included so that it applies for graphs that may have negative edge weights (see Exercises 20.2 and 20.3). If edges can have equal weights, the minimum spanning tree may not be unique. For example, Screenshot shows a graph that has two different MSTs. The possibility of equal weights also complicates the descriptions and correctness proofs of some of our algorithms. We have to consider equal weights carefully, because they are not unusual in apps and we want to know that our algorithms operate correctly when they are present. Not only might there be more than one MST, but also the nomenclature does not capture precisely the concept that we are minimizing the weight rather than the tree itself. The proper adjective to describe a specific tree is minimal (one having the smallest weight). For these reasons, many authors use more accurate terms like minimal spanning tree or minimum-weight spanning tree. The abbreviation MST, which we shall use most often, is universally understood to capture the basic concept. Still, to avoid confusion when describing algorithms for networks that may have edges with equal weights, we do take care to be precise to use the term "minimal" to refer to "an edge of minimum weight" (among all edges in some specified set) and "maximal" to refer to "an edge of maximum weight." That is, if edge weights are distinct, a minimal edge is the shortest edge (and is the only minimal edge); but if there is more than one edge of minimum weight, any one of them might be a minimal edge. We work exclusively with undirected graphs in this chapter. The problem of finding a minimum-weight directed spanning tree in a digraph is different, and is more difficult. Several classical algorithms have been developed for the MST problem. These methods are among the oldest and most well-known algorithms in this tutorial. As we have seen before, the classical methods provide a general approach, but modern algorithms and data structures can give us compact and efficient implementations. Indeed, these implementations provide a compelling example of the effectiveness of careful ADT design and proper choice of fundamental ADT data structure and algorithm implementations in solving increasingly difficult algorithmic problems.