The next MST algorithm that we consider is also the oldest. Like Kruskal's algorithm, we build the MST by adding edges to a spreading forest of MST subtrees; but we do so in stages, adding several MST edges at each stage. At each stage, we find the shortest edge that connects each MST subtree with a different one, then add all such edges to the MST. Our union-find ADT from Chapter 1 again leads to an efficient implementation. For this problem, it is convenient to extend the interface to make the find operation available to clients. We use this method to associate an index with each subtree so that we can tell quickly to which subtree a given vertex belongs. With this capability, we can implement efficiently each of the operations for Boruvka's algorithm. First, we maintain a vertex-indexed array that identifies, for each MST subtree, the nearest neighbor. Then, we perform the following operations on each edge in the graph:
After this scan of all the graph edges, the nearest-neighbor array has the information that we need to connect the subtrees. For each vertex index, we perform a union operation to connect it with its nearest neighbor. In the next stage, we discard all the longer edges that connect other pairs of vertices in the now-connected MST subtrees. Figures 20.14 and 20.15 illustrate this process on our sample algorithm
Screenshot Boruvka's MST algorithm
The diagram at the top shows a directed edge from each vertex to its closest neighbor. These edges show that 0-2, 1-7, and 3-5 are each the shortest edge incident on both their vertices, 6-7 is 6's shortest edge, and 4-3 is 4's shortest edge. These edges all belong to the MST and comprise a forest of MST subtrees (center), as computed by the first phase of Boruvka's algorithm. In the second phase, the algorithm completes the MST computation (bottom) by adding the edge 0-7, which is the shortest edge incident on any of the vertices in the subtrees it connects, and the edge 4-7, which is the shortest edge incident on any of the vertices in the bottom subtree.
Screenshot Union-find array in Boruvka's algorithm
This figure depicts the contents of the union-find array corresponding to the example depicted in Screenshot. Initially, each entry contains its own index, indicating a forest of singleton vertices. After the first stage, we have three components, represented by the indices 0, 1, and 3 (the union-find trees are all flat for this tiny example). After the second stage, we have just one component, represented by 1.
Program 20.9 is a direct implementation of Boruvka's algorithm. There are three major factors that make this implementation efficient:
It is difficult to quantify precisely all these factors, but the following bound is easy to establish: Property 20.11 The running time of Boruvka's algorithm for computing the MST of a graph is O(E lg V lg* E). Proof: Since the number of trees in the forest is halved at each stage, the number of stages is no larger than lg V. The time for each stage is at most proportional to the cost of E find operations, which is less than E lg* E, or linear for practical purposes.
The running time given in Property 20.11 is a conservative upper bound since it does not take into account the substantial reduction in the number of edges during each stage. The find operations take constant time in the early passes, and there are very few edges in the later passes. Indeed, for many graphs, the number of edges decreases exponentially with the number of vertices, and the total running time is proportional to E. For example, as illustrated in Screenshot, the algorithm finds the MST of our larger sample graph in just four stages.
Screenshot Boruvka's MST algorithm
The MST evolves in just four stages for this example (top to bottom).
It is possible to remove the lg* E factor to lower the theoretical bound on the running time of Boruvka's algorithm to be proportional to E lg V, by representing MST subtrees with doubly linked lists instead of using the union and find operations. This improvement is sufficiently more complicated to implement and the potential performance improvement sufficiently marginal that it may not be worth considering for use in practice (see Exercises 20.67 and 20.68). Boruvka's is the oldest of the algorithms that we consider: It was originally conceived in 1926 for a power-distribution app. The method was rediscovered by Sollin in 1961; it later attracted attention as the basis for MST algorithms with efficient asymptotic performance and as the basis for parallel MST algorithms.