Collections of Items - Algorithms - Java Programming Language

The data structures that we use in apps often contain a great deal of information of various types, and certain pieces of information may belong to multiple independent data structures. For example, a file of personnel data may contain records with names, addresses, and various other pieces of information about employees; and each record may need to belong to one data structure for searching for particular employees, to another data structure for answering statistical queries, and so forth.

Despite this diversity and complexity, a large class of computing apps involve generic manipulation of data items and need access to the information associated with them for a limited number of specific reasons. ADTs provide a way for us to make explicit any assumptions about the operations that we perform on data items.

Many of the manipulations that are required are concerned with collections of items. We shall see that these manipulations are a natural outgrowth of basic computational procedures and are therefore needed in a broad variety of apps. Several of the fundamental algorithms that we consider in this tutorial can be applied effectively to the task of building a layer of abstraction that can provide client programs with the ability to perform such manipulations efficiently. Thus, we shall consider in detail numerous ADTs that are associated with such manipulations. They define various operations on collections of items that are valid for many different types of items.

In particular, several of the data structures and algorithms that we consider in this tutorial are used to implement fundamental ADTs comprising collections of abstract objects that are built up from just two operations:

insert a new item into the collection.
remove an item from the collection.

We refer to such ADTs as generalized queues. For convenience, we also typically include explicit operations to construct the data structure (constructors) and to count the number of items in the data structure (or just to test whether it is empty). We also might need an operation to copy the data structure (clone it); we shall discuss that in .

When we insert an item, our intent is clear, but which item do we get when we remove an item from the collection? Different ADTs for collections of items are characterized by different criteria for deciding which item to remove for the remove operation and by different conventions associated with the various criteria. Moreover, we shall encounter a number of other natural operations beyond insert and remove. Several of the algorithms and data structures that we consider in this tutorial were designed to support efficient implementation of various subsets of these operations for various different remove criteria and other conventions. These ADTs are conceptually simple, used widely, and lie at the core of a great many computational tasks, so they deserve the careful attention that we pay them.

In this chapter, we consider several of these fundamental data structures, their properties, and examples of their app, while at the same time using them as examples to illustrate the basic mechanisms that we use to develop ADTs. In , we consider the pushdown stack, where the rule for removing an item is to remove the one that was most recently inserted. We consider apps of stacks in and implementations in , including a specific approach to keeping the apps and implementations separate. After stepping back to consider the process of creating a new ADT, in the context of the union-find abstraction for the connectivity problem that we considered in , we return to collections of items, to consider FIFO and generalized queues (which differ from stacks on the abstract level only in that they involve using a different rule to remove items) and generalized queues where we disallow duplicate items.

As we saw in , arrays and linked lists provide basic mechanisms that allow us to insert and remove specified items. Indeed, linked lists and arrays are the underlying data structures for several of the implementations of generalized queues that we consider. As we know, the cost of insertion and deletion is dependent on the specific structure that we use and the specific item being inserted or removed. For a given ADT, our challenge is to choose a data structure that allows us to perform the required operations efficiently. In this chapter, we examine in detail several examples of ADTs for which linked lists and arrays provide appropriate solutions. ADTs that support more powerful operations require more sophisticated implementations, which are the prime impetus for many of the algorithms that we consider in this tutorial.

Data types comprising collections of items (generalized queues) are a central object of study in computer science because they directly support a fundamental paradigm of computation. For a great many computations, we find ourselves in the position of having many objects with which to work, but being able to process only one object at a time. Therefore, we need to save the others while processing that one. This processing might involve examining some of the items already saved away or adding more to the collection, but operations of saving the items away and retrieving them according to some criterion are the basis of the computation. Many classical data structures and algorithms fit this mold, as we shall see.