Persistence

Now that you have a data model that is nearly complete, you can start to consider where to store it. Although many developers automatically think of JDBC when someone says "persistence," that isn't the only way you can store your data. In fact, storing data in a relational database management system (RDBMS) may be the wrong choice.

RDBMS Versus OODBMS

Although RDBMSs are the most common data storage solution used in Java programming, they are often the wrong choice. Many systems would benefit far more from an object-oriented database management system (OODBMS). To understand why, it's important that you know the differences between relational and object databases. Relational databases organize information into tables. These tables have columns for each piece of data and rows that organize the data into records. Each row is one record, and each column is a piece of that record. Connections between records of different types are made with relations. These databases can be easily expressed with an entity-relationship diagram such as the one in Screenshot-6.

Screenshot-6. An entity-relationship diagram

In this case, there are two tables, one with five columns and one with four. Each customer's purchases are stored in the Purchase table and are linked to the Customer table via a foreign key. To find a customer's purchases, execute a SQL statement such as the following:

select PURCHASE_ID from Purchase where Purchase.CUSTOMER_ID = 80543
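
From Java, a statement like this is normally issued through JDBC. The following is a minimal sketch, not a definitive implementation: the class and method names are hypothetical, the Connection is assumed to be obtained elsewhere, and both PURCHASE_ID and CUSTOMER_ID are assumed to be numeric columns.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; only the table and column names come from the text.
final class PurchaseQueries {
    // The same query as above, with the customer ID supplied as a bind parameter.
    static final String PURCHASES_BY_CUSTOMER =
        "select PURCHASE_ID from Purchase where Purchase.CUSTOMER_ID = ?";

    private PurchaseQueries() {}

    // Returns the IDs of all purchases that belong to the given customer.
    // Assumption: both ID columns can be read and written as long values.
    static List<Long> findPurchaseIds(Connection connection, long customerId)
            throws SQLException {
        List<Long> ids = new ArrayList<>();
        try (PreparedStatement statement =
                 connection.prepareStatement(PURCHASES_BY_CUSTOMER)) {
            statement.setLong(1, customerId);
            try (ResultSet rows = statement.executeQuery()) {
                while (rows.next()) {
                    ids.add(rows.getLong("PURCHASE_ID"));
                }
            }
        }
        return ids;
    }
}
```

Calling PurchaseQueries.findPurchaseIds(connection, 80543L) would run the statement shown above for that customer.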


Those of you who have worked with relational databases are already familiar with this concept. However, the above statement has a deeper meaning that many developers fail to consider. In the example, the select statement prompts the database to look through the Purchase table, checking each record for the proper CUSTOMER_ID. Unless an index shortcuts the lookup, the more purchases there are, the longer this process will take. Object databases work differently. Consider the schema in Screenshot-6 rendered in an object-oriented manner, as shown in Screenshot-7.

Screenshot-7. UML diagram of a customer schema

The purchase is an object that is aggregated into the customer object. If you render this model in Java, you will use a list to contain each customer's purchases. Once you have a customer object, you simply navigate to the desired purchases. This is exactly how object databases work. They store entire objects and navigate from one object to another using a device called a smart pointer, which replaces the standard reference with a reference that has more functionality.

With the object database, the individual purchase objects are stored on the disk. In the list that contains the purchases, the database mechanism records their locations using various proprietary methods, such as an object ID (OID). When you navigate to the object, the persistence mechanism finds the OID and goes directly to the object. This mechanism is much faster: since the database doesn't have to search a list of records for a particular OID, but instead goes directly to the record on the disk, a costly step is removed. The advantage grows when navigating deeply, because you simply jump from place to place on the disk instead of searching through thousands of records. Finally, since the object database stores objects as objects, there is no need to flatten or map objects that may be multidimensional. Instead of flattening hierarchies, you simply tell the engine to "store this," and it does.

Object databases provide superior performance in object navigation; however, an object database is not superior in all instances. Since objects are stored as objects and not as rows in a single table, parametric searches with object databases are much slower than those with relational databases. For example, suppose you want to write a report on the purchases in your store. If you want to find every customer that purchased barbecue equipment (SKU starts with BBQ) that cost over $40, you would need a parametric search. In this case, the parameters are "SKU begins with BBQ and price > 40 dollars." With a relational database, the results are found by using a SQL statement. Object databases have something similar; however, the relational database will typically perform faster because its records are already organized for parametric searches. The object database, on the other hand, has to locate each customer and then navigate from each to their purchases, checking each purchase against the search criteria.

So the question of whether to use an object database or a relational database depends on your app. If the vast majority of work will be parametric searches, then your best bet is to use a relational database. On the other hand, if the vast majority of work will be navigation, then your best bet is to use an object database.

But there is another problem with object databases: they are not cheap, and there is no free (or even inexpensive) option. Some of the best, such as Versant and ObjectStore, can cost as much as Oracle. If you can afford Oracle, you can probably afford an object database, and I strongly encourage you to check one out. If you are building a small web app for your company, and you can't afford an object database, you may be stuck with the relational database.
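
The difference between navigation and a parametric search can be sketched in plain Java. This is only an in-memory analogy under stated assumptions, not how any particular object database is implemented: the Customer and Purchase classes, their fields, and the NavigationSketch helper are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model classes mirroring the customer/purchase aggregation.
final class Purchase {
    final String sku;
    final double price;

    Purchase(String sku, double price) {
        this.sku = sku;
        this.price = price;
    }
}

final class Customer {
    final int id;
    final List<Purchase> purchases = new ArrayList<>();

    Customer(int id) {
        this.id = id;
    }
}

final class NavigationSketch {
    private NavigationSketch() {}

    // Navigation: with the customer in hand, the purchases are one
    // reference away, however many other customers exist.
    static List<Purchase> purchasesOf(Customer customer) {
        return customer.purchases;
    }

    // Parametric search over an object graph: every customer and every
    // purchase may have to be visited to evaluate the criteria.
    static List<Customer> customersWhoBought(List<Customer> allCustomers,
                                             String skuPrefix,
                                             double minPrice) {
        List<Customer> matches = new ArrayList<>();
        for (Customer customer : allCustomers) {
            for (Purchase purchase : customer.purchases) {
                if (purchase.sku.startsWith(skuPrefix) && purchase.price > minPrice) {
                    matches.add(customer);
                    break; // one matching purchase is enough for this customer
                }
            }
        }
        return matches;
    }
}
```

The barbecue report from the text corresponds to customersWhoBought(allCustomers, "BBQ", 40.0). Its cost grows with the total number of purchases, while purchasesOf touches only the one customer it is given.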

Java Data Objects to the Rescue

Although it is a relatively new technology, Java Data Objects (JDO) is based on some fairly old ideas, such as providing a single solution to make object persistence transparent. The Object Data Management Group (ODMG) tried to standardize this, but it never really caught on for one reason or another. However, with the failure of container-managed persistence (CMP), the transparent-persistence problem was once again brought to the forefront. A team of experts from the Java Community Process (JCP) took on the issue, and the result was JDO. JDO is designed to persist any object. Whether you want to persist an object you built or one that you bought from another company, JDO allows you to do so. Also, JDO doesn't require you to reengineer the object, as other techniques, such as CMP, do. JDO simply looks at the object, determines its fields, and stores them. See the book Java Data Objects by David Jordan and Craig Russell (O'Reilly) for more information on JDO.


For my JDO work, I use a product called Kodo JDO from SolarMetric (http://www.solarmetric.com/). There are many products out there, but, unfortunately, none are both fully JDO-compliant and free. However, SolarMetric provides a community version and evaluation copies so that you can experiment with the relevant code in this tutorial.


Essentially, using JDO is a three-step process. First, you write, or otherwise acquire, the objects you want to store. Then you run them through a process called enhancement. Finally, you use them.

The first step is relatively simple because you can use JDO with virtually any Java object.

The second step is enhancement, in which methods are added to a compiled class file. This is done with a JDO enhancer, which looks at the class file, analyzes it, and then adds various methods that the PersistenceManager, the boss of JDO, needs to manage the object. The resulting class is then ready to be persisted. Normally, enhancement is done with an IDE such as Eclipse or with Ant. Kodo JDO, for example, provides a number of plug-ins for various IDEs.

After the objects are enhanced, you can proceed with an optional step: object identification. There are two ways to identify objects in JDO. You can either let the JDO engine define its own IDs, called datastore identity, or you can define your own ID classes, called application identity. Which approach you use depends on your app and the data you are trying to persist. If you are trying to persist classes that weren't originally designed to be persisted, you will probably have to use datastore identity. If you have designed the data model classes yourself, then you will probably want to use application identity. Either way, the JDO vendor usually has tools to help you. With Kodo, for example, there is a tool that will generate application identity classes.

Finally, the objects are ready to be used in the app. The beauty of JDO is that the objects are used as objects, stored as objects, and read as objects. Even the query language for JDO, JDOQL, is object-oriented. All of the details of writing to the database, caching, and reading are managed by the PersistenceManager. Furthermore, you can use a relational or object database for the persistence, and the interface will stay the same. If you decide to switch to an object database, you don't need to rewrite the code.

Also, JDO vendors usually offer tools to assist in enterprise development. For example, Kodo has an enterprise cache that will allow several Enterprise JavaBeans (EJBs) to share the same cache. JDO can handle the transactional issues and the overhead of using the data objects. Using JDO will make your life much easier. I use JDO as my persistence mechanism in all my projects.
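
To make the "define your own ID classes" option more concrete, here is a minimal sketch of what a hand-written identity class might look like. It assumes the usual expectations for such classes (serializable, a public field for each key field, a no-argument constructor, a String constructor that accepts the output of toString, and value-based equals and hashCode). The CustomerId name and its single customerId key field are hypothetical, and your vendor's documentation or generator tool remains the authority on the exact requirements.

```java
import java.io.Serializable;

// Hypothetical identity class for a Customer whose key is an int customerId.
// In a real project it would normally be declared public in its own file.
class CustomerId implements Serializable {
    private static final long serialVersionUID = 1L;

    // Public field mirroring the (assumed) key field of the persistent class.
    public int customerId;

    // No-argument constructor.
    public CustomerId() {
    }

    // String constructor: accepts whatever toString() produces.
    public CustomerId(String value) {
        this.customerId = Integer.parseInt(value);
    }

    // Two IDs are equal when their key fields are equal.
    @Override
    public boolean equals(Object other) {
        return other instanceof CustomerId
            && ((CustomerId) other).customerId == customerId;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(customerId);
    }

    @Override
    public String toString() {
        return Integer.toString(customerId);
    }
}
```

The String constructor and toString are written as inverses, so an ID can be turned into text and rebuilt into an equal ID.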

      