Prim's algorithm builds the MST one edge at a time, finding a new edge to attach to a single growing tree at each step. Kruskal's algorithm also builds the MST one edge at a time; but, by contrast, it finds an edge that connects two trees in a spreading forest of growing MST subtrees. We start with a degenerate forest of V single-vertex trees and perform the operation of combining two trees (using the shortest edge possible) until there is just one tree left: the MST. Screenshot shows a step-by-step example of the operation of Kruskal's algorithm; Screenshot illustrates the algorithm's dynamic characteristics on a larger example. The disconnected forest of MST subtrees evolves gradually into a tree. Edges are added to the MST in order of their length, so the forests comprise vertices that are connected to one another by relatively short edges. At any point during the execution of the algorithm, each vertex is closer to some vertex in its subtree than to any vertex not in its subtree.
Screenshot Kruskal's MST algorithm
Given a list of a graph's edges in arbitrary order (left edge list), the first step in Kruskal's algorithm is to sort them by weight (right edge list). Then we go through the edges on the list in order of their weight, adding edges that do not create cycles to the MST. We add 5-3 (the shortest edge), then 7-1, then 7-6 (left), then 0-2 (right, top) and 0-7 (right, second from top). The edge with the next largest weight, 0-1, creates a cycle and is not included. Edges that we do not add to the MST are shown in gray on the sorted list. Then we add 4-3 (right, third from top). Next, we reject 5-4 because it causes a cycle, then we add 7-4 (right, bottom). Once the MST is complete, any edges with larger weights would cause cycles and be rejected (we stop when we have added V – 1 edges to the MST). These edges are marked with asterisks on the sorted list.
Screenshot Kruskal's MST algorithm
This sequence shows 1/4, 1/2, 3/4, and the full MST as it evolves.
Kruskal's algorithm is simple to implement, given the basic algorithmic tools that we have considered in this tutorial. Indeed, we can use any sort from Part 3 to sort the edges by weight and any of the connectivity algorithms from Chapter 1 to eliminate those that cause cycles! Program 20.8 is an implementation along these lines of an MST class for a graph ADT that is functionally equivalent to the other MST implementations that we consider in this chapter. The implementation does not depend on the graph representation: It uses GraphUtilities.edges to get an array that contains the graph's edges, then computes the MST from that array. Note that there are two ways in which Kruskal's algorithm can terminate. If we find V – 1 edges, then we have a spanning tree and can stop. If we examine all the edges without finding V – 1 tree edges, then we have determined that the graph is not connected, precisely as we did in Chapter 1. Analyzing the running time of Kruskal's algorithm is a simple matter because we know the running time of its constituent ADT operations. Property 20.9 Kruskal's algorithm computes the MST of a graph in time proportional to E lg E. Proof: This property is a consequence of the more general observation that the running time of Program 20.8 is proportional to the cost of sorting E numbers plus the cost of E find and V – 1 union operations. If we use standard ADT implementations such as mergesort and weighted union-find with halving, the cost of sorting dominates.
We consider the comparative performance of Kruskal's and Prim's algorithm in . For the moment, note that a running time proportional to E lg E is not necessarily worse than E lg V, because E is at most V2, so lg E is at most 2lgV. Performance differences for specific graphs are due to what the properties of the implementations are and to whether the actual running time approaches these worst-case bounds. In practice, we might use quicksort or a fast system sort (which is likely to be based on quicksort). Although this approach may give the usual unappealing (in theory) quadratic worst case for the sort, it is likely to lead to the fastest run time. Indeed, on the other hand, we could use a radix sort to do the sort in linear time (under certain conditions on the weights) so that the cost of the E find operations dominates and then adjust Property 20.9 to say that the running time of Kruskal's algorithm is within a constant factor of E lg* E under those conditions on the weights (see Chapter 2). Recall that the function lg* E is the number of iterations of the binary logarithm function before the result is less than 1, which is less than 5 if E is less than 265536. In other words, these adjustments make Kruskal's algorithm effectively linear in most practical circumstances. Typically, the cost of finding the MST with Kruskal's algorithm is even lower than the cost of processing all edges because the MST is complete well before a substantial fraction of the (long) graph edges is ever considered. We can take this fact into account to reduce the running time significantly in many practical situations by keeping edges that are longer than the longest MST edge entirely out of the sort. One easy way to accomplish this objective is to use a priority queue, with an implementation that does the construct operation in linear time and the remove the minimum operation in logarithmic time. For example, we can achieve these performance characteristics with a standard heap implementation, using bottom-up construction (see Section 9.4). Specifically, we make the following changes to Program 20.8: First, we change the sort invocation to a call on pq.construct() to build a heap in time proportional to E. Second, we change the inner loop to take the shortest edge off the priority queue with e = pq.delmin() and to change all references to a[i] to refer to e. Property 20.10 A priority-queue–based version of Kruskal's algorithm computes the MST of a graph in time proportional to E + X lg V, where X is the number of graph edges not longer than the longest edge in the MST. Proof: See the preceding discussion, which shows the cost to be the cost of building a priority queue of size E plus the cost of running the X delete the minimum, X find, and V – 1 union operations. Note that the priority-queue–construction costs dominate (and the algorithm is linear time) unless X is greater than E/lg V.
We can also apply the same idea to reap similar benefits in a quicksort-based implementation. Consider what happens when we use a straight recursive quicksort, where we partition at i, then recursively sort the subfile to the left of i and the subfile to the right of i. We note that, by construction of the algorithm, the first i elements are in sorted order after completion of the first recursive call (see Program 9.2). This obvious fact leads immediately to a fast implementation of Kruskal's algorithm: If we put the check for whether the edge a[i] causes a cycle between the recursive calls, then we have an algorithm that, by construction, has checked the first i edges, in sorted order, after completion of the first recursive call! If we include a test to return when we have found V-1 MST edges, then we have an algorithm that sorts only as many edges as we need to compute the MST, with a few extra partitioning stages involving larger elements (see Exercise 20.58). Like straight sorting implementations, this algorithm could run in quadratic time in the worst case, but we can provide a probabilistic guarantee that the worst-case running time will not come close to this limit. Also, like straight sorting implementations, this program is likely to be faster than a heap-based implementation because of its shorter inner loop. If the graph is not connected, the partial-sort versions of Kruskal's algorithm offer no advantage because all the edges have to be considered. Even for a connected graph, the longest edge in the graph might be in the MST, so any implementation of Kruskal's method would still have to examine all the edges. For example, the graph might consist of tight clusters of vertices all connected together by short edges, with one outlier connected to one of the vertices by a long edge. Despite such anomalies, the partial-sort approach is probably worthwhile because it offers significant gain when it applies and incurs little if any extra cost. Historical perspective is relevant and instructive here as well. Kruskal presented this algorithm in 1956, but, again, the relevant ADT implementations were not carefully studied for many years, so the performance characteristics of implementations such as the priority-queue version of Program 20.8 were not well understood until the 1970s. Other interesting historical notes are that Kruskal's paper mentioned a version of Prim's algorithm (see Exercise 20.60) and that Boruvka mentioned both approaches. Efficient implementations of Kruskal's method for sparse graphs preceded implementations of Prim's method for sparse graphs because union-find (and sort) ADTs came into use before priority-queue ADTs. Generally, as was true of implementations of Prim's algorithm, advances in the state of the art for Kruskal's algorithm are attributed primarily to advances in ADT performance. On the other hand, the applicability of the union-find abstraction to Kruskal's algorithm and the applicability of the priority-queue abstraction to Prim's algorithm have been prime motivations for many researchers to seek better implementations of those ADTs.