In apps where networks model maps, our primary interest is often in finding the best route from one place to another. In this section, we examine a strategy for this problem: a fast algorithm for the source–sink shortest-path problem in Euclidean networks, which are networks whose vertices are points in the plane and whose edge weights are defined by the geometric distances between the points. These networks satisfy two important properties that do not necessarily hold for general edge weights. First, the distances satisfy the triangle inequality: The distance from s to d is never greater than the distance from s to x plus the distance from x to d. Second, vertex positions give a lower bound on path length: No path from s to d will be shorter than the distance from s to d. The algorithm for the source–sink shortest-paths problem that we examine in this section takes advantage of these two properties to improve performance. Often, Euclidean networks are also symmetric: Edges run in both directions. As mentioned at the beginning of the chapter, such networks arise immediately if, for example, we interpret the adjacency-matrix or adjacency-lists representation of an undirected weighted Euclidean graph (see ) as a weighted digraph (network). When we draw an undirected Euclidean network, we assume this interpretation to avoid proliferation of arrowheads in the drawings. The basic idea is straightforward: Priority-first search provides us with a general mechanism to search for paths in graphs. With Dijkstra's algorithm, we examine paths in order of their distance from the start vertex. This ordering ensures that, when we reach the sink, we have examined all paths in the graph that are shorter, none of which took us to the sink. But in a Euclidean graph, we have additional information: If we are looking for a path from a source s to a sink d and we encounter a third vertex v, then we know that not only do we have to take the path that we have found from s to v, but also the best that we could possibly do in traveling from v to d is first to take an edge v-w and then to find a path whose length is the straight-line distance from w to d (see Screenshot). With priority-first search, we can easily take into account this extra information to improve performance. We use the standard algorithm, but we use the sum of the following three quantities as the priority of each edge v-w: the length of the known path from s to v, the weight of the edge v-w, and the distance from w to t. If we always pick the edge for which this number is smallest, then, when we reach t, we are still assured that there is no shorter path in the graph from s to t. Furthermore, in typical networks we reach this conclusion after doing far less work than we would were we using Dijkstra's algorithm.
Screenshot Edge relaxation (Euclidean)
In a Euclidean graph, we can take distances to the destination into account in the relaxation operation as we compute shortest paths. In this example, we could deduce that the path depicted from s to v plus v-w cannot lead to a shorter path from s to d than the one already found because the length of any such path must be at least the length of the path from s to v plus the length of v-w plus the straight-line distance from w to d, which is greater than the length of the known path from s to d. Tests like this one can significantly reduce the number of paths that we have to consider.
To implement this approach, we use a standard PFS implementation of Dijkstra's algorithm (Program 21.1, since Euclidean graphs are normally sparse, but also see Exercise 21.73) with two changes: First, instead of initializing wt[s] at the beginning of the search to 0.0, we set it to the quantity dist(s, d), where distance is a method that returns the distance between two vertices. Second, we define the priority P to be the function
(wt[v] + e.wt() + distance(w, d) - distance(v, d))
instead of the function (wt[v] + e.wt()) that we used in Program 21.1 (recall that v and w are local variables that are set to the values e.v() and e.w(), respectively). These changes, to which we refer as the Euclidean heuristic, maintain the invariant that the quantity wt[v] - distance(v, d) is the length of the shortest path through the network from s to v, for every tree vertex v (and therefore wt[v] is a lower bound on the length of the shortest possible path through v from s to d). We compute wt[w] by adding to this quantity the edge weight (the distance to w) plus the distance from w to the sink d. Property 21.11 Priority-first search with the Euclidean heuristic solves the source–sink shortest-paths problem in Euclidean graphs. Proof: The proof of Property 21.2 applies: At the time that we add a vertex x to the tree, the addition of the distance from x to d to the priority does not affect the reasoning that the tree path from s to x is a shortest path in the graph from s to x, since the same quantity is added to the length of all paths to x. When d is added to the tree, we know that no other path from s to d is shorter than the tree path, because any such path must consist of a tree path followed by an edge to some vertex w that is not on the tree, followed by a path from w to d (whose length cannot be shorter than the distance from w to d); and, by construction, we know that the length of the path from s to w plus the distance from w to d is no smaller than the length of the tree path from s to d.
In , we discuss another simple way to implement the Euclidean heuristic. First, we make a pass through the graph to change the weight of each edge: For each edge v-w, we add the quantity distance(w, d) - distance(v, d). Then, we run a standard shortest-path algorithm, starting at s (with wt[s] initialized to distance(s, d)) and stopping when we reach d. This method is computationally equivalent to the method that we have described (which essentially computes the same weights on the fly) and is a specific example of a basic operation known as reweighting a network. Reweighting plays an essential role in solving the shortest-paths problems with negative weights; we discuss it in detail in . The Euclidean heuristic affects the performance but not the correctness of Dijkstra's algorithm for the source–sink shortest-paths computation. As discussed in the proof of Property 21.2, using the standard algorithm to solve the source–sink problem amounts to building an SPT that has all vertices closer to the start than the sink d. With the Euclidean heuristic, the SPT contains just the vertices whose path from s plus distance to d is smaller than the length of the shortest path from s to d. We expect this tree to be substantially smaller for many apps because the heuristic prunes a substantial number of long paths. The precise savings is dependent on the structure of the graph and the geometry of the vertices. Screenshot shows the operation of the Euclidean heuristic on our sample graph, where the savings are substantial. We refer to the method as a heuristic because there is no guarantee that there will be any savings at all: It could always be the case that the only path from source to sink is a long one that wanders arbitrarily far from the source before heading back to the sink (see Exercise 21.80).
Screenshot Shortest path in a Euclidean graph
When we direct the shortest-path search towards the destination vertex, we can restrict the search to vertices within a relatively small ellipse around the path, as illustrated in these three examples, which show SPT subtrees from the examples in Screenshot.
Screenshot illustrates the basic underlying geometry that describes the intuition behind the Euclidean heuristic: If the shortest-path length from s to d is z, then vertices examined by the algorithm fall roughly within the ellipse defined as the locus of points x for which the distance from s to x plus the distance from x to d is equal to z. For typical Euclidean graphs, we expect the number of vertices in this ellipse to be far smaller than the number of vertices in the circle of radius z that is centered at the source (those that would be examined by Dijkstra's algorithm).
Screenshot Euclidean heuristic cost bounds
When we direct the shortest-path search towards the destination vertex, we can restrict the search to vertices within an ellipse around the path, as compared to the circle centered at s that is required by Dijkstra's algorithm. The radius of the circle and the shape of the ellipse are determined by the length of the shortest path.
Precise analysis of the savings is a difficult analytic problem and depends on models of both random point sets and random graphs (see reference section). For typical situations, we expect that, if the standard algorithm examines X vertices in computing a source–sink shortest path, the Euclidean heuristic will cut the cost to be proportional to , which leads to an expected running time proportional to V for dense graphs and proportional to for sparse graphs. This example illustrates that the difficulty of developing an appropriate model or analyzing associated algorithms should not dissuade us from taking advantage of the substantial savings that are available in many apps, particularly when the implementation (add a term to the priority) is trivial. The proof of Property 21.11 applies for any function that gives a lower bound on the distance from each vertex to d. Might there be other functions that will cause the algorithm to examine even fewer vertices than the Euclidean heuristic? This question has been studied in a general setting that applies to a broad class of combinatorial search algorithms. Indeed, the Euclidean heuristic is a specific instance of an algorithm called A * (pronounced "ay-star"). This theory tells us that using the best available lower-bound function is optimal; stated another way, the better the bound function, the more efficient the search. In this case, the optimality of A* tells us that the Euclidean heuristic will certainly examine no more vertices than Dijkstra's algorithm (which is A* with a lower bound of 0). The analytic results just described give more precise information for specific random network models. We can also use properties of Euclidean networks to help build efficient implementations of the abstract–shortest-paths ADT, trading time for space more effectively than we can for general networks (see Exercises 21.48 through 21.50). Such algorithms are important in apps such as map processing, where networks are huge and sparse. For example, suppose that we want to develop a navigation system based on shortest paths for a map with millions of roads. We perhaps can store the map itself in a small onboard computer, but the distances and paths matrices are much too large to be stored (see Exercises 21.39 and 21.40); therefore, the all-paths algorithms of are not effective. Dijkstra's algorithm also may not give sufficiently short response times for huge maps. Exercises 21.77 through 21.78 explore strategies whereby we can invest a reasonable amount of preprocessing and space to provide fast responses to source–sink shortest-paths queries.