Previous   Next

Sublinear-Time Sorts

The primary conclusion that we can draw from the analytic results of is that the running time of radix sorts can be sublinear in the total amount of information in the keys. In this section, we consider practical implications of this fact.

The LSD radix-sort implementation given in makes bytesword passes through the file. By making R large, we get an efficient sorting method, as long as N is also large and we have space for R counters. As mentioned in the proof of Property 10.5, a reasonable choice is to make lg R (the number of bits per byte) about one-quarter of the word size so that the radix sort is four key-indexed counting passes. Each byte of each key is examined, but there are only four digits per key. This example corresponds directly to the architectural organization of many computers: one typical organization has 32-bit words, each consisting of four 8-bit bytes. We extract bytes, rather than bits, from words, which approach is likely to be much more efficient on many computers. Now, each key-indexed–counting pass is linear, and, because there are only four of them, the entire sort is linear—certainly the best performance we could hope for in a sort.

In fact, it turns out that we can get by with only two key-indexed counting passes. We do so by taking advantage of the fact that the file will be almost sorted if only the leading w/2 bits of the w-bit keys are used. As with quicksort, we can complete the sort efficiently by using insertion sort on the whole file afterward. This method is a trivial modification to Program 10.4. To do a right-to-left sort using the leading one-half of the keys, we simply start the outer for loop at bytesword/2-1, rather than bytesword-1. Then, we use a conventional insertion sort on the nearly ordered file that results. Figures 10.3 and 10.18 provide convincing evidence that a file sorted on its leading bits is well ordered. Insertion sort would use only six exchanges to sort the file in the fourth column of Screenshot,and Screenshot shows that a larger file sorted on only the leading one-half of its bits also could be sorted efficiently by insertion sort.

Screenshot Dynamic characteristics of LSD radix sort on MSD bits

When keys are random bits, sorting the file on the leading bits of the keys brings it nearly into order. This diagram compares a six-pass LSD radix sort (left) on a file of random 6-bit keys with a three-pass LSD radix sort, which can be followed by an insertion-sort pass (right). The latter strategy is nearly twice as fast.

Java graphics 10fig18.gif


For some file sizes, it might make sense to use the extra space that would otherwise be used for the auxiliary array to try to get by with just one key-indexed–counting pass, doing the rearrangement in place. For example, sorting 1 million random 32-bit keys could be done with one key-indexed–counting sort on the leading 20 bits, then an insertion sort. To do that, we need space just for the (1 million) counters—significantly less than would be needed for an auxiliary array. Using this method is equivalent to using standard MSD radix sort with R = 220, although it is essential that a small radix be used for small files for such a sort (see the discussion after Property 10.4).

The LSD approach to radix sorting is widely used, because it involves extremely simple control structures and its basic operations are suitable for machine-language implementation, which can directly adapt to special-purpose high-performance hardware. In such an environment, it might be fastest to run a full LSD radix sort. We need to have space for just N extra references (and R counters) to use LSD radix sort, and this investment yields a method that can sort random files with only three or four passes.

Table 10.1. Empirical study of radix sorts (integer keys)

These relative timings for radix sorts on random files of N 32-bit integers (all with a cutoff to insertion sort for N less than 16) indicate that, when used with care, radix sorts can be among the fastest sorts available. If we use a huge radix for tiny files, we ruin the performance of MSD radix sort, but adapting the radix to be less than the file size cures this problem. The fastest method for integer keys is LSD radix sort on the leading one-half of the bits, which we can speed up further by paying careful attention to the inner loop (see Exercise 10.51).

   

4-bit bytes

8-bit bytes

16-bit bytes

N

Q

M

L

M

L

L*

M

L

M*

12500

53

29

35

13

14

18

21

15

2

25000

49

22

40

19

20

16

26

23

5

50000

73

50

84

32

42

24

35

38

12

100000

91

99

170

66

89

51

61

74

64

200000

192

202

373

130

195

119

202

158

82

400000

410

457

784

515

413

274

62506

346

153

800000

892

1003

1587

2452

847

637

588076

726

1039

Key:

Q Quicksort, standard (Program 7.1)

M MSD radix sort, standard (Program 10.2)

L LSD radix sort (Program 10.4)

M* MSD radix sort, radix adapting to file size

L* LSD radix sort on MSD bits

In conventional coding environments, the inner loop of the key-indexed–counting program on which the radix sorts are based contains a substantially higher number of instructions than do the inner loops of quicksort or mergesort. This property of the implementations implies that the sublinear methods that we have been describing may not be as much faster than quicksort (say) as we might expect in many situations.

Table 10.2. Empirical study of radix sorts (string keys)

These relative timings for various sorts on the first N words of Moby Dick (all, except heapsort, with a cutoff to insertion sort for N less than 16) indicate that the MSD-first approach is effective for string data. The cutoff for small subfiles is not effective unless we modify the insertion sort to avoid going through the leading parts of the keys (see Exercise 10.52). Three-way partitioning benefits from the large numbers of equal keys in the file and MSD radix sort benefits from the assumption that all characters are lower-case letters—in more general situations, three-way radix quicksort is likely to outperform the other methods.

N

Q

T

M

F

R

X

X*

12500

75

59

68

74

69

64

43

25000

129

107

145

169

127

115

102

50000

313

257

341

418

237

283

267

100000

819

603

757

986

500

681

649

Key:

Q Quicksort, standard (Program 7.1)

T Quicksort with three-way partitioning (Program 7.5)

M Mergesort (Program 8.3)

F Heapsort with Floyd's improvement (see )

R MSD radix sort (Program 10.2)

X Three-way radix quicksort (Program 10.3)

X* Three-way radix quicksort (with cutoff)

General-purpose algorithms such as quicksort are more widely used than radix sort, because they adapt to a broader variety of apps. The primary reason for this state of affairs is that the key abstraction on which radix sort is built is less general than the one that we used in Chapters 6 through 9. Our use of the ITEM interface to specify that items to be sorted must have a less method (and Java's use of Comparable and compareTo for the same purpose) is to have the client provide the comparison method. This arrangement not only handles situations where the client can use specialized knowledge about complex keys to implement a fast comparison but also allows us to sort using an ordering relation that may not involve keys at all. Radix sorting may not be applicable in such situations.

When any of them could be used, the choice among quicksort and the various radix sort algorithms (and related versions of quicksort!) that we have considered in this chapter will depend not only on features of the app (such as key, record, and file size) but also on features of the coding and machine environment that relate to the efficiency of access and use of individual bits and bytes. Tables 10.1 and 10.2 give empirical results in support of the conclusion that the linear- and sublinear-time performance results that we have discussed for various apps of radix sorts make these sorting methods an attractive choice for a variety of suitable apps.

Exercises

Java graphics icon01.gif 10.50 What is the major drawback of doing LSD radix sorting on the leading bits of the keys, then cleaning up with insertion sort afterward?

Java graphics roundbullet.gif 10.51 Develop an implementation of LSD radix sort for 32-bit keys with as few instructions as possible in the inner loop.

Implement three-way radix quicksort such that the insertion sort for small files does not use leading bytes that are known to be equal in comparisons.

Given 1 million random 32-bit keys, find the choice of byte size that minimizes the total running time when we use the method of using LSD radix sort on the first 2 bytes, then using insertion sort to clean up.

Answer Exercise 10.53 for 1 billion 64-bit keys.

Answer Exercise 10.54 for three-pass LSD radix sort.


Previous   Next
Comments