First-Class ADTs - Algorithms - Java Programming Language

ADTs help us manage the complexity of creating client programs that address the needs of increasingly more complicated apps by building increasingly powerful layers of abstraction. Throughout this process, it is often natural to want to use the data types in our programs in the same way that we use primitive types such as int or float.In this section, we consider the pitfalls that arise when we try to do so.

Definition 4.4 A first-class data type is one which we can use in our programs in the same way as we use primitive data types.

If a first-class data type is accessed only through an interface, it is a first-class ADT.

In general, Java does not support first-class data types because its primitive (built-in) data types are fundamentally different from its class (user-defined) data types. Java also provides direct language support for the String type, making it different from both primitive types and other class types. First, arithmetic operators such as + and * are defined for primitive data types (and + is defined for the String type), but we cannot arrange to write a+b when a and b are objects of a user-defined type. Second, we can define methods for class types and extend them, but we cannot do either for primitive types. Third, the meaning of a = b depends on whether or not a and b are primitive types: if they are primitive, a gets a copy of the value of b; if not, a gets a copy of a reference to b. The same holds true of method parameters and return values.

As with our other definitions related to data types, we cannot be precise in defining the concept of first-class types without straying into deep issues relating to semantics of operations. It is one thing to expect that we be able to write a = b when a and b are objects from a user-defined class, but it is quite another thing to precisely specify what we mean by that statement.

In a perfect world, we might envision all data types having some universal set of well-defined methods. An example is the convention that all Java objects have a toString method. In practice, each data type is characterized by its own set of methods. This difference between data types in itself militates against a precise definition of the concept of first-class data types, because it implies that we should provide definitions for every operation that is defined for built-in types, which we rarely do. Most often, only a few crucial operations are of importance to us, and we try to use those operations for our own data types in the same way as we do for built-in types.

As an illustration, we consider an ADT for the complex-number abstraction. Our goal is to be able to write programs that perform algebraic operations on complex numbers using operations defined in the ADT. We would like to declare and initialize complex numbers and use arithmetic operations on complex numbers to perform various computations involving them. As just mentioned, we will not be able to write clients that use the arithmetic operators * and + on complex numbers; we will have to define and use appropriate methods for these operations. Still, it is natural to want to compute with complex numbers in much the same way as we compute with real numbers or integers.

We now digress to consider briefly a few mathematical properties of complex numbers. In one sense, we are not digressing at all, because it is interesting to contemplate the relationship between complex numbers themselves as a mathematical abstraction and how to represent them in a computer program.

The number Java graphics 04icon02 is an imaginary number. Although Java graphics 04icon03 is meaningless as a real number, we name it i and perform algebraic manipulations with i, replacing i² with -1 whenever it appears. A complex number consists of two parts, real and imaginary-complex numbers can be written in the form a + bi, where a and b are reals. To multiply complex numbers, we apply the usual algebraic rules, replacing i² with -1 whenever it appears. For example,

Java graphics 04icon04

The real or imaginary parts might cancel out (have the value 0) when we perform a complex multiplication. For example,

Java graphics 04icon05

Scaling the preceding equation by dividing through by Java graphics 04icon06 we find that

Java graphics 04icon07

In general, there are many complex numbers that evaluate to 1 when raised to a power. These are the complex roots of unity. Indeed, for each N, there are exactly N complex numbers z with z^N = 1. The numbers

Java graphics 04icon08

for k = 0, 1, ..., N - 1 are easily shown to have this property (see Exercise 4.68). For example, taking k = 1 and N = 8 in this formula gives the particular eighth root of unity that we just discovered.

As an example of a client, consider the task of writing a program that computes each of the Nth roots of unity for any given N and checks the computation by raising each of them to the Nth power. This process should produce the output shown in Screenshot: We expect that each number raised to the Nth power gives the same result: 1, or 1 + 0i. (The real and imaginary parts that we compute may not be exactly one and zero, respectively, because of limitations on the precision of our calculations.)

Screenshot Complex roots of unity

This table gives the output that would be produced by Program 4.19 when invoked with a.out 8, with an implementation of the overloaded toString method that does appropriate output formatting (see Exercise 4.70). The eight complex roots of unity are ±1, ±i, and (left two columns). Each of these eight numbers gives the result 1+ 0i when raised to the eighth power (right two columns).

Java graphics 04icon10

Java graphics 04icon09

Complex numbers driver (roots of unity)

This client program performs a computation on complex numbers using an ADT that allows it to compute directly with the abstraction of interest by using objects of type Complex. This program checks the ADT implementation by computing the powers of the roots of unity. With an appropriate toString method (see Exercise 4.70), it prints the table in Screenshot.

public class RootsOfUnity { public static void main(String[] args) { int N = Integer.parseInt(args[0]); Out.println(N + " roots of unity"); for (int k = 0; k < N; k++) { double x = Math.cos(2.0*Math.PI*k/N), y = Math.sin(2.0*Math.PI*k/N); Complex t = new Complex(x, y); Out.print(k + ": "+ t); Complex z = (Complex) t.clone(); for (int j = 0; j < N-1; j++) z.mult(t); Out.println(" " + z); } } }

How should we arrange to multiply two complex numbers? Ideally, we would want to write expressions like

a = b * c;

where a, b,and c are all of type Complex, but, again, Java does not support this style of programming. One idea is to try to mimic this style by writing a static method that takes two Complex objects as parameters and returns a Complex, so that we can write

a = Complex.mult(b, c);

Another approach is to use a single-parameter class method mult that we use to multiply a Complex object by the given parameter. This approach mimics the use of expressions like a *= b with primitive types. We faced a similar tradeoff in when discussing the implementation of a method for computing the distance between two points. Here there is an additional important performance difference: when we use the static two-parameter method, we have to create a new Complex (for the result) every time we perform an arithmetic operation and, in an extended computation, are likely to leave numerous objects to be gathered by the garbage collector. When we use the class method, we do not pay this price (see Exercise 4.63).

ADT interface for complex numbers

This interface for complex numbers allows implementations to create objects of type Complex (initialized with two double values), to access the real and imaginary parts, and to use the mult method. While not explicitly specified, system default mechanisms that work for all classes allow us to use Complex objects as parameters or return values in methods. The clone() method provides a way for clients to mimic an assignment statement (copy the value of one Complex into another) (see text).

class Complex implements Cloneable // ADT interface { // implementations and private members hidden Complex(double re, double im) double re() double im() Complex mult(Complex rhs) public Object clone() public String toString() }

Suppose that we have a float named t and an int named N, and we wish to compute the value of t raised to the Nth power. If N is not large, a natural way to perform this computation is to use the following code:

float z = t; for (int j = 0; j < N-1; j++) z *= t;

Accordingly, if t is Complex, we would expect to be able to write

Complex z = t; for (int j = 0; j < N-1; j++) z.mult(t);

But we would be mistaken in this expectation, because z and t are references to the same complex object, not different ones. This code actually computes the value of Java graphics 04icon11 not t^N. The problem is that we expect the assignment statement to make a copy of the object, but it actually makes a copy of a reference to the object.

ADT implementation for complex numbers

This code implements the ADT defined in Program 4.20 using doubles to represent the real and imaginary parts of each complex number. Like toString, a default implementation of the clone() method (which copies the data fields of this object into a new object) exists in Object and may be redefined by any implementation but must have the given signature.

class Complex implements Cloneable { private double re, im; Complex(double re, double im) { this.re = re; this.im = im; } double re() { return re; } double im() { return im; } void add(Complex rhs) { re = re() + rhs.re(); im = im() + rhs.im(); } void mult(Complex rhs) { double t = re(); re = re() * rhs.re() - im() * rhs.im(); im=t*rhs.im() + im() * rhs.re(); } public String toString() { return re() + " " + im(); } }

Java has a mechanism that specifically addresses this problem: any class can implement the Cloneable interface. In such a class, clients can invoke an object's clone method that returns a copy of the object (a different object with the same data fields). Program 4.19 is a client that uses this capability to print and check the roots of unity, as desired.

Program 4.20 is an ADT for complex numbers based on the discussion above, and Program 4.21 is an implementation that uses a standard data representation (one double for the real part and one double for the imaginary part). Even for this simple example, it is important that the data type be abstract because there is another standard representation that we might wish to consider using: polar coordinates (see Exercise 4.67).

Another place to use cloning is when we use objects as method parameters. For primitive types, we expect to have an object of our own for use within the method; for object types, we are passing a reference to the object and need to clone it if we want the method to have a copy. The same holds true for return values.

This issue of copy semantics is an important one to address in any ADT design. When an object's data fields are just primitive types, as in Program 4.21, the default implementation of clone in Object suffices (since it copies the values of the object's data fields to the corresponding fields in the clone); but if the data fields contain references to other objects, we need to clone those objects; if those objects contain references to other objects, we need to clone them, and so forth. Next, we consider an example that will help us examine this issue in more detail.

Program 4.22 exemplifies a client program that manipulates FIFO queues. It simulates a certain situation involving customers arriving and being served in a set of M queues. Screenshot is an example of the output produced by this program. Our interest in this program is as an example of working with queues as objects-we can imagine writing similar programs to test various methods of organizing queues to serve customers, and so forth. This program prints out the contents of the queues on the last five times through the simulation loop.

Screenshot Random-queue simulation

This listing gives the tail end of the output produced when Program 4.22 is invoked with 80 as the command-line argument. It shows the contents of the queues after the indicated operations, where we choose a queue at random and put the next item in, then choose another queue at random and, if it is nonempty, get an item out.

Java graphics 04fig13

For the present discussion, our interest is in the for loop at the end, which is intended to print out the contents of each queue. Suppose that we use the linked-list queue implementation that we considered in Program 4.16. We know that when we say t = Q[k] (where t and Q[k] are both intQueue objects) we would just be making them refer to the same queue, but what behavior do we want when we make intQueue clonable and say t = (intQueue) Q[k].clone()? The default implementation of clone will simply make a new object t with a copy of Q[k]'s data fields. In this case, the head and tail fields of t would be references to the first and last objects (respectively) on Q[k]. But this would result in unintended behavior (see Exercise 4.71) because we were clearly expecting to have a copy of the whole list. The system cannot know how to do that-we have to provide the code. To do so, we override the implementation of clone, as illustrated in Program 4.24.

Queue client program (queue simulation)

This client program simulates a situation where we assign customers waiting for service at random to one of M service queues, then choose a queue at random (possibly the same one) and, if it is nonempty, perform the service (remove a customer from the queue). To see the effect on the queues, we print out the customer added, the customer served, and the contents of the queues for the last five iterations.

This implementation uses the clonable queue ADT interface of Program 4.23 and requires a clone implementation such as the one given in Program 4.24 to make a copy of the appropriate queue for t each time through the inner for loop.

public class SimulateQueues { private static int M = 4; public static void main(String[] args) { int N = Integer.parseInt(args[0]); intQueue[] Q = new intQueue[M]; for (int i = 0; i < M; i++) Q[i] = new intQueue(N); for (int i = 0; i < N; i++) { int in = (int) (Math.random() * M); int out = (int) (Math.random() * M); Q[in].put(i); if (!Q[out].empty()) Q[out].get(); if(i<N-5)continue; Out.print(in + " in "); Out.println(out + " out"); for (int k = 0;k<M;k++) { intQueue t; t = (intQueue) Q[k].clone(); Out.print(k + ": "); while(!t.empty()) Out.print(t.get() + " "); Out.println(""); } } } }

Clonable queue ADT interface

To make a user-defined class whose data members may contain pointers behave more like a built-in type, we need to include a clone() implementation in its interface, as in this version of the basic FIFO queue interface that we considered in Program 4.15.

class intQueue implements Cloneable // ADT interface { // private members and implementations hidden intQueue(int) public Object clone() boolean empty() void put(int) int get() }

Such methods are generally based on straightforward traversals of our data structures. However, we do not always take these extra steps, because

We often use only a single instance of an object from a class.
If we do have multiple instances, we want to avoid inadvertently copying huge data structures.

In short, while cognizant of our ability to clone objects we remain aware of the tradeoff between convenience and cost when doing so, particularly when huge amounts of data are involved.

As another example, we might envision modifying Program 4.22 to periodically print just the first few items on each queue, so that we can track progress even when the queues become huge. But we might eventually be surprised by poor performance when the queues do become huge, because initialization of the local variable in the for loop invokes the copy constructor, which makes a copy of the entire queue, even if we only want to access a few elements. Eventually that entire queue is collected as garbage, because it is the value of a local variable. For Program 4.22 as it stands, where we access every item on the copy, the extra cost of allocating and garbage collection affects the running time by only a constant factor, but it would be an unreasonable price to pay if we merely wanted to access a few items. In such a situation, we would prefer to use the default pointer-assignment implementation for copy, and modify the ADT to add operations allowing us to access items on the queue without modifying it.

Linked-list implementation of a clonable queue

Adding this method upgrades the FIFO queue class implementation in Program 4.16 to implement the interface in Program 4.23. It makes a copy of the list by traversing it and building a new list from the same items.

public Object clone() { intQueue Q = new intQueue(0); for (Node t = head; t != null; t = t.next) Q.put(t.item); return Q; }

The list of questions that can arise when we consider ADT implementations is long, even for simple ADTs like the ones that we have been considering in this chapter. Do we want to be able to have different types of objects on the same queue? Do we want to use different implementations for queues of the same type in a single client because we know of performance differences? Should information about the efficiency of implementations be included in the interface? What form should that information take? Such questions underscore the importance of understanding the basic characteristics of our algorithms and data structures and how client programs may use them effectively, which is, in a sense, the topic of this tutorial. Though full implementations are often exercises in software engineering instead of algorithms design, we strive to remain cognizant of the relevant issues so that our algorithms and data structures can serve as the basis for software tools in a broad variety of apps (see reference section).

Exercises

4.62 Develop a version of the Complex class in this section (Program 4.21) that uses static methods instead of class methods for add and mult, and write a version of the roots-of-unity client (Program 4.19) that uses your class.
Compare the performance of your solution to Exercise 4.62 with the programs in the text, by removing the println statements and comparing running times for N = 100, 1000, and 10000.
Write a clone method for the equivalence-relations ADT in .
Create an ADT with a clone method for use in programs that process playing cards.
4.66 Write a program to determine empirically the probability that various poker hands are dealt, using your ADT from Exercise 4.65.
Develop an implementation for the complex-number ADT (a program with the same public methods as Program 4.21 that is based on representing complex numbers in polar coordinates (that is, in the form re^iq).
4.68 Use the identity e^iq = cos q + i sin q to prove that e²^pi = 1 and that the N complex Nth roots of unity are

for k = 0, 1, ..., N - 1.
List the Nth roots of unity for N from 2 through 8.
4.70 Provide an implementation of toString for Program 4.21 that produces the output in Screenshot for Program 4.19.
4.71 Describe precisely what happens when you run the queue simulation program Program 4.22 using a clonable version of Program 4.16 or Program 4.17, but with the default clone method.
Develop an implementation of the clonable FIFO queue ADT given in the text (Program 4.23) that uses an array as the underlying data structure.
4.73 Write an interface for a pushdown-stack ADT that includes a clone method.
Develop an implementation of your interface from Exercise 4.73 that uses an array as the underlying data structure.
Develop an implementation of your interface from Exercise 4.73 that uses a linked list as the underlying data structure.
Modify the postfix-evaluation program in to evaluate postfix expressions consisting of complex numbers with integer coefficients, using the complex-numbers ADT in the text (Program 4.21). For simplicity, assume that the complex numbers all have nonnull integer coefficients for both real and imaginary parts and are written with no spaces. For example, your program should print the output 8+4i when given the input
1+1i 0+1i + 1-2i * 3+4i + . 
4.77 Do a mathematical analysis of the queue-simulation process in Program 4.22 to determine, as a function of N and M, the probability that the queue selected for the Nth get is empty and the expected number of items in the queues after N iterations of the for loop.