Simple, Euler, and Hamilton Paths
Our first nontrivial graph-processing algorithms solve fundamental problems concerning paths in graphs. They introduce the general recursive paradigm that we use throughout the tutorial, and they illustrate that apparently similar graph-processing problems can range widely in difficulty. These problems take us from local properties such as the existence of edges or the degrees of vertices to global properties that tell us about a graph's structure. The most basic such property is whether two vertices are connected. If they are, we are interested in finding a simple path that connects them. Simple path Given two vertices, is there a simple path in the graph that connects them? In some apps, we might be satisfied to know merely whether or not such a path exists, but we are concerned here with the problem of finding a specific path. Program 17.16 is a direct solution that finds a path. It is based on depth-first search, a fundamental graph-processing paradigm that we considered briefly in Chapters 3 and 5 and shall study in detail in . The algorithm is based on a recursive private method that determines whether there is a simple path from v to w by checking, for each edge v-t incident on v, whether there is a simple path from t to w that does not go through v. It uses a vertex-indexed array to mark v so that no path through v will be checked in any recursive invocation. The code in Program 17.16 simply tests for the existence of a path. How can we augment it to print the path's edges? Thinking recursively suggests an easy solution:
Alone, the first change would cause the path from v to w to be printed in reverse order: If the invocation searchR(t, w) finds a path from t to w (and prints that path's edges in reverse order), then printing t-v completes the job for the path from v to w. The second change reverses the order: To print the edges on the path from v to w, we print the edges on the path from w to v in reverse order. (This trick depends on the graph being undirected, since the algorithm returns a path that goes in the opposite direction from the one it traversed.) We could use this same strategy to implement an ADT operation that invokes a client-supplied method for each of the path's edges (see Exercise 17.88).
Screenshot gives an example of the dynamics of the recursion. As with any recursive program (indeed, any program with method invocations at all), such a trace is easy to produce: To modify Program 17.16 to produce one, we can add a variable depth that is incremented on entry and decremented on exit to keep track of the depth of the recursion, then add code at the beginning of the recursive method to print out depth spaces followed by the appropriate information (see Exercises 17.86 and 17.87).
Screenshot Trace for simple path search
This trace shows the operation of the recursive method in Program 17.16 for the call searchR(G, 2, 6) to find a simple path from 2 to 6 in the graph shown at the top. There is a line in the trace for each edge considered, indented one level for each recursive call. To check 2-0, we call searchR(G, 0, 6). This call causes us to check 0-1, 0-2, and 0-5. To check 0-1, we call searchR(G, 1, 6), which causes us to check 1-0 and 1-2, which do not lead to recursive calls because 0 and 2 are marked. For this example, the method discovers the path 2-0-5-4-6.
Property 17.2 We can find a path connecting two given vertices in a graph in linear time. The recursive depth-first search method in Program 17.16 immediately implies a proof by induction that the ADT operation determines whether or not a path exists. Such a proof is easily extended to establish that, in the worst case, Program 17.16 checks all the entries in the adjacency matrix exactly once. Similarly, we can show that the analogous program for adjacency lists checks all of the graph edges exactly twice (once in each direction), in the worst case.
We use the phrase linear in the context of graph algorithms to mean a quantity whose value is within a constant factor of V + E, the size of the graph. As discussed at the end of , such a value is also normally within a constant factor of the size of the graph representation. Property 17.2, is worded so as to allow for the use of the adjacency-lists representation for sparse graphs and the adjacency-matrix representation for dense graphs, our general practice. It is not appropriate to use the term "linear" to describe an algorithm that uses an adjacency matrix and runs in time proportional to V2 (even though it is linear in the size of the graph representation) unless the graph is dense. Indeed, if we use the adjacency-matrix representation for a sparse graph, we cannot have a linear-time algorithm for any graph-processing problem that could require examination of all the edges. We study depth-first search in detail in a more general setting in the next chapter, and we consider several other connectivity algorithms there. For example, a slightly more general version of Program 17.16 gives us a way to pass through all the edges in the graph, building a vertex-indexed array that allows a client to test in constant time whether there exists a path connecting any two vertices. Property 17.2 can substantially overestimate the actual running time of Program 17.16, because it might find a path after examining only a few edges. For the moment, our interest is in knowing a method that is guaranteed to find in linear time a path connecting any two vertices in any graph. By contrast, other problems that appear similar are much more difficult to solve. For example, consider the following problem, where we seek paths connecting pairs of vertices, but add the restriction that they visit all the other vertices in the graph, as well. Hamilton path Given two vertices, is there a simple path connecting them that visits every vertex in the graph exactly once? If the path is from a vertex back to itself, this problem is known as the Hamilton tour problem. Is there a cycle that visits every vertex in the graph exactly once (see Screenshot)?
Screenshot Hamilton tour
The graph at the top has the Hamilton tour 0-6-4-2-1-3-5-0, which visits each vertex exactly once and returns to the start vertex, but the graph at the bottom has no such tour.
At first blush, this problem seems to admit a simple solution: We can use the simple modification to the recursive part of the path-finding class that is shown in Program 17.17. But this program is not likely to be useful for many graphs, because its worst-case running time is exponential in the number of vertices in the graph. Property 17.3 A recursive search for a Hamilton tour could take exponential time. Proof: Consider a graph where vertex V-1 is isolated and the edges, with the other V – 1 vertices, constitute a complete graph. Program 17.17 will never find a Hamilton path, but it is easy to see by induction that it will examine all of the (V – 1)! paths in the complete graph, all of which involve V – 1 recursive invocations. The total number of recursive invocations is therefore about V!, or about (V/e)V, which is higher than any constant to the Vth power.
Our implementations Program 17.16 for finding simple paths and Program 17.17 for finding Hamilton paths are extremely similar. If no path exists, both programs terminate when all the elements of the visited array are set to true. Why are the running times so dramatically different? Program 17.16 is guaranteed to finish quickly because it sets at least one element of the visited array to true each time searchR is invoked. Program 17.17, on the other hand, can set visited elements back to 0, so we cannot guarantee that it will finish quickly (see Screenshot).
Screenshot Hamilton-tour–search trace
This trace shows the edges checked by Program 17.17 when discovering that the graph shown at the top has no Hamilton tour. For brevity, edges to marked vertices are omitted.
When searching for simple paths, in Program 17.16, we know that, if there is a path from v to w, we will find it by taking one of the edges v-t from v, and the same is true for Hamilton paths. But there this similarity ends. If we cannot find a simple path from t to w, then we can conclude that there is no simple path from v to w that goes through t; but the analogous situation for Hamilton paths does not hold. It could be the case that there is no Hamilton path to w that starts with v-t, but there is one that starts with v-x-t for some other vertex x. We have to make a recursive invocation from t corresponding to every path that leads to it from v. In short, we may have to check every path in the graph. It is worthwhile to reflect on just how slow a factorial-time algorithm is. If we could process a graph with 15 vertices in 1 second, it would take 1 day to process a graph with 19 vertices, over 1 year for 21 vertices, and over 6 centuries for 23 vertices. Faster computers do not help much, either. A computer that is 200,000 times faster than our original one would still take more than a day to solve that 23 vertex problem. The cost to process graphs with 100 or 1000 vertices is too high to contemplate, let alone graphs of the size that we might encounter in practice. It would take millions of pages in this tutorial just to write down the number of centuries required to process a graph that contained millions of vertices. In Chapter 5, we examined a number of simple recursive programs that are similar in character to Program 17.17 but that could be drastically improved with top-down dynamic programming. This recursive program is entirely different in character: The number of intermediate results that would have to be saved is exponential. Despite many people doing an extensive amount of work on the problem, no one has been able to find any algorithm that can promise reasonable performance for large (or even medium-sized) graphs. Now, suppose that we change the restriction from having to visit all the vertices to having to visit all the edges. Is this problem easy, like finding a simple path, or hopelessly difficult, like finding a Hamilton path? Euler path Is there a path connecting two given vertices that uses each edge in the graph exactly once? The path need not be simple—vertices may be visited multiple times. If the path is from a vertex back to itself, we have the Euler tour problem. Is there a cyclic path that uses each edge in the graph exactly once? We prove in the corollary to Property 17.4 that the path problem is equivalent to the tour problem in a graph formed by adding an edge connecting the two vertices. Screenshot gives two small examples.
Screenshot Euler tour and path examples
The graph at the top has the Euler tour 0-1-2-0-6-4-3-2-4-5-0, which uses all the edges exactly once. The graph at the bottom no such tour, but it has the Euler path 1-2-0-1-3-4-2-3-5-4-6-0-5.
This classical problem was first studied by L. Euler in 1736. Indeed, some people trace the origin of the study of graphs and graph theory to Euler's work on this problem, starting with a special case known as the bridges of Königsberg problem (see Screenshot). The Swiss town of Königsberg had seven bridges connecting riverbanks and islands, and people in the town found that they could not seem to cross all the bridges without crossing one of them twice. Their problem amounts to the Euler tour problem.
Screenshot Bridges of Königsberg
A well-known problem studied by Euler has to do with the town of Königsberg, in which there is an island at the point where the river Pregel forks. There are seven bridges connecting the island with the two banks of the river and the land between the forks, as shown in the diagram at top. Is there a way to cross the seven bridges in a continuous walk through the town, without recrossing any of them? If we label the island 0, the banks 1 and 2, and the land between the forks 3 and define an edge corresponding to each bridge, we get the multigraph shown at the bottom. The problem is to find a path through this graph that uses each edge exactly once.
These problems are familiar to puzzle enthusiasts. They are commonly seen in the form of puzzles where you are to draw a given figure without lifting your pencil from the paper, perhaps under the constraint that you must start and end at particular points. It is natural for us to consider Euler paths when developing graph-processing algorithms, because an Euler path is an efficient representation of the graph (putting the edges in a particular order) that we might consider as the basis for developing efficient algorithms. Euler showed that it is easy to determine whether or not a path exists, because all that we need to do is to check the degree of each of the vertices. The property is easy to state and apply, but the proof is a tricky exercise in graph theory. Property 17.4 A graph has an Euler tour if and only if it is connected and all its vertices are of even degree. Proof: To simplify the proof, we allow self-loops and parallel edges, though it is not difficult to modify the proof to show that this property also holds for simple graphs (see Exercise 17.94). If a graph has an Euler tour, then it must be connected because the tour defines a path connecting each pair of vertices. Also, any given vertex v must be of even degree because when we traverse the tour (starting anywhere else), we enter v on one edge and leave on a different edge (neither of which appears again on the tour); so the number of edges incident upon v must be twice the number of times we visit v when traversing the tour, an even number. To prove the converse, we use induction on the number of edges. The claim is certainly true for graphs with no edges. Consider any connected graph that has more than one edge, with all vertices of even degree. Suppose that, starting at any vertex v, we follow and remove any edge, and we continue doing so until arriving at a vertex that has no more edges. This process certainly must terminate, since we delete an edge at every step, but what are the possible outcomes? Screenshot illustrates examples. Immediately, we see that we must end up back at v, because we end at a vertex other than v if and only if that vertex had an odd degree when we started.
Screenshot Partial tours
Following edges from any vertex in a graph that has an Euler tour always takes us back to that vertex, as shown in these examples. The cycle does not necessarily use all the edges in the graph.
One possibility is that we trace the full tour; if so, we are done. Otherwise, all the vertices in the graph that remains have even degree, but it may not be connected. Still, each connected component has an Euler tour by the inductive hypothesis. Moreover, the cyclic path just removed connects those tours together into an Euler tour for the original graph: traverse the cyclic path, taking detours to do the Euler tours for the connected components. Each detour is a proper Euler tour that takes us back to the vertex on the cyclic path where it started. Note that a detour may touch the cyclic path multiple times (see Exercise 17.98). In such a case, we take the detour only once (say, when we first encounter it).
Corollary A graph has an Euler path if and only if it is connected and exactly two of its vertices are of odd degree. Proof: This statement is equivalent to Property 17.4 in the graph formed by adding an edge connecting the two vertices of odd degree (the ends of the path).
Therefore, for example, there is no way for anyone to traverse all the bridges of Königsberg in a continuous path without retracing their steps, since all four vertices in the corresponding graph have odd degree (see Screenshot). As discussed in , we can find all the vertex degrees in time proportional to E for the adjacency-lists or set-of-edges representation and in time proportional to V2 for the adjacency-matrix representation, or we can maintain a vertex-indexed array with vertex degrees as part of the graph representation (see Exercise 17.42). Given the array, we can check whether Property 17.4 is satisfied in time proportional to V. Program 17.18 implements this strategy and demonstrates that determining whether a given graph has an Euler path is an easy computational problem. This fact is significant because we have little intuition to suggest that the problem should be easier than determining whether a given graph has a Hamilton path. Now, suppose that we actually wish to find an Euler path. We are treading on thin ice because a direct recursive implementation (find a path by trying an edge and then making a recursive invocation to find a path for the rest of the graph) will have the same kind of factorial-time performance as Program 17.17. We expect not to have to live with such performance because it is so easy to test whether a path exists, so we seek a better algorithm. It is possible to avoid factorial-time blowup with a fixed-cost test for determining whether or not to use an edge (rather than unknown costs from the recursive invocation), but we leave this approach as an exercise (see Exercises 17.96 and 17.97). Another approach is suggested by the proof of Property 17.4. Traverse a cyclic path, deleting the edges encountered and pushing onto a stack the vertices that it encounters, so that (i) we can trace back along that path, printing out its edges, and (ii) we can check each vertex for additional side paths (which can be spliced into the main path). This process is illustrated in Screenshot.
Screenshot Euler tour by removing cycles
This figure shows how Program 17.19 discovers an Euler tour from 0 back to 0 in a sample graph. Thick black edges are those on the tour, the stack contents are listed below each diagram, and adjacency lists for the non-tour edges are shown at left. First, the program adds the edge 0-1 to the tour and removes it from the adjacency lists (in two places) (top left, lists at left). Second, it adds 1-2 to the tour in the same way (left, second from top). Next, it winds up back at 0 but continues to do another cycle 0-5-4-6-0, winding up back at 0 with no more edges incident upon 0 (right, second from top). Then it pops the isolated vertices 0 and 6 from the stack until 4 is at the top and starts a tour from 4 (right, third from from top), which takes it to 3, 2, and back to 4, where upon it pops all the now-isolated vertices 4, 2, 3, and so forth. The sequence of vertices popped from the stack defines the Euler tour 0-6-4-2-3-4-5-0-2-1-0 of the whole graph.
Program 17.19 is an implementation along these lines. It assumes that an Euler path exists, and it destroys its local copy of the graph; thus, it is important that the Graph class that this program uses have a copy constructor that creates a completely separate copy of the graph. The code is tricky—novices may wish to postpone trying to understand it until gaining more experience with graph-processing algorithms in the next few chapters. Our purpose in including it here is to illustrate that good algorithms and clever implementations can be very effective for solving some graph-processing problems.
Property 17.5 We can find an Euler tour in a graph, if one exists, in linear time. We leave a full induction proof as an exercise (see Exercise 17.100). Informally, after the first invocation of path, the stack contains a path from v to w, and the graph that remains (after removal of isolated vertices) consists of the smaller connected components (sharing at least one vertex with the path so far found) that also have Euler tours. We pop isolated vertices from the stack and use path to find Euler tours that contain the nonisolated vertices, in the same manner. Each edge in the graph is pushed onto (and popped from) the stack exactly once, so the total running time is proportional to E.
Despite their appeal as a systematic way to traverse all the edges and vertices, we rarely use Euler tours in practice because few graphs have them. Instead, we typically use depth-first search to explore graphs, as described in detail in . Indeed, as we shall see, doing depth-first search in an undirected graph amounts to computing a two-way Euler tour: a path that traverses each edge exactly twice, once in each direction. In summary, we have seen in this section that it is easy to find simple paths in graphs, that it is even easier to know whether we can tour all the edges of a large graph without revisiting any of them (by just checking that all vertex degrees are even), and that there is even a clever algorithm to find such a tour; but that it is practically impossible to know whether we can tour all the graph's vertices without revisiting any. We have simple recursive solutions to all these problems, but the potential for exponential growth in the running time renders some of the solutions useless in practice. Others provide insights that lead to fast practical algorithms. This range of difficulty among apparently similar problems that is illustrated by these examples is typical in graph processing and is fundamental to the theory of computing. As discussed briefly in and in detail in Part 8, we must acknowledge what seems to be an insurmountable barrier between problems that seem to require exponential time (such as the Hamilton tour problem and many other commonly encountered problems) and problems for which we know algorithms that can guarantee to find a solution in polynomial time (such as the Euler tour problem and many other commonly encountered problems). In this tutorial, our primary objective is to develop efficient algorithms for problems in the latter class.