Data Access Technology Choices

Let's begin by reviewing some of the leading data-access technologies available to J2EE apps. These technologies can be divided into two major categories: SQL-based data access that works with relational concepts; and data access based on O/R mapping.

SQL-Based Technologies

The following technologies are purely intended to work with relational databases. Thus they use SQL as the means of retrieving and manipulating data. While this requires Java code to work with relational, rather than purely object, concepts, it enables the use of efficient RDBMS constructs. Note that using RDBMS concepts in data-access code doesn't mean that business logic will depend on SQL and RDBMS. We will use the Data-Access Object pattern, discussed in , to decouple business logic from data access implementation.

JDBC

Most communication with relational databases, whether handled by an EJB container, a third-party O/R mapping product or the app developer, is likely to be based on JDBC. Much of the appeal of entity beans—and O/R mapping frameworks—is based on the assumption that using JDBC is error-prone and too complicated for app developers. In fact, this is a dubious contention so long as we use appropriate helper classes. JDBC is based around SQL, which is no bad thing. SQL is not an arcane technology, but a proven, practical language that simplifies many data operations. There may be an "impedance mismatch" between RDBMSs and Java apps, but SQL is a good language for querying and manipulating data. Many data operations can be done with far fewer lines of code in SQL than in Java classes working with mapped objects. The professional J2EE developer needs to have a sound knowledge of SQL and cannot afford to be ignorant of RDBMS fundamentals. JDBC is also an ideal technology when we need to call stored procedures, execute unusual custom queries, or use proprietary RDBMS features. The key is how we use JDBC. A naÏve approach, littering JDBC code and SQL statements through app code, is a recipe for disaster. It ties the entire app to a particular persistence strategy, guarantees problems if the database schema changes, and leaves app code containing an inappropriate mix of abstraction levels. However, JDBC can be used effectively in app code, so long as we follow a few golden rules:

Decouple business logic from JDBC access wherever possible. JDBC code should normally be found only in data-access objects.
Avoid raw JDBC code that works directly with the JDBC API. JDBC error handling is so cumbersome as to seriously reduce productivity (requiring a finally block to achieve anything, for example). Low-level details such as error handling are best left to helper classes that expose a higher-level API to app code. This is possible without sacrificing control over the SQL executed.

The J2EE orthodoxy that it's always better to let the container handle persistence than to write SQL is questionable. For example, while it can be difficult or impossible to tune container-generated statements, it's possible to test and tune SQL queries in a tool such as SQL*Plus, checking performance and behavior when different sessions access the same data. Only where significant object caching in an O/R mapping layer is feasible is coding using an O/R mapping likely to equal or exceed the performance of JDBC.

Important

There's nothing wrong with managing persistence using JDBC. In many cases, if we know we are dealing with an RDBMS, only through JDBC can we leverage the full capability of the RDMBS.

However, don't use JDBC directly from business objects such as a session EJBs or even DAOs. Use an abstraction layer to decouple your business component from the low-level JDBC API. If possible, make this abstract layer's API non-JDBC-specific (for example, try to avoid exposing SQL). We'll consider the implementation and usage of such an abstraction layer later in this chapter.

SQLJ

SQLJ is an alternative to JDBC that offers closer integration between Java code and RDBMS. It offers the same basic model as JDBC, but simplifies app code and may improve efficiency. It was developed by a group of companies including Oracle, IBM, Informix, Sun, and Sybase. SQLJ consists of three parts:

Part 0 - Embedded SQL
This provides the ability to embed static SQL statements in Java code by escaping into SQL. In contrast, JDBC uses dynamic SQL, and is based on API calls rather than an escape mechanism. SQLJ Part 0 has been adopted as an ANSI standard.
Part 1 - SQL Routines
This defines a mechanism for calling Java static methods as stored procedures.
Part 2 - SQL Types
This consists of specifications for using Java classes as SQL user-defined data types.

SQLJ Part 0, Embedded SQL, is comparable in functionality to JDBC. The syntax enables SQL statements to be expressed more concisely than with JDBC and facilitates getting Java variable values to and from the database. A SQLJ precompiler translates the SQLJ syntax (Java code with embedded SQL) into regular Java source code. The concept of embedded SQL is nothing new: Oracle's Pro*C and other products take the same approach to C and C++, and there are even similar solutions for COBOL. SQLJ provides convenient escaping in SQL and binding of Java variables to SQL parameters. The following example illustrates both features, the use of #sql escape syntax and binding to Java variables using the : syntax in the embedded SQL. The example inserts a new row into a table, taking the values from Java variables, and then selects the new data back into the variables:

 int age = 32;
 String forename = "Rod";
 String surname = "Rod";
 String email = "rod.johnson@bugmenot.com";
 #sql {
 INSERT INTO
 people (forename, surname, email)
 VALUES
 (:forename, :surname, :email)
 };

 #sql {
 SELECT
 forename, surname, email
 INTO
 :forename, :surname, :email
 FROM
 people
 WHERE
 email=' rod.johnson@bugmenot.com'
 };

Note that I've omitted error handling: we must catch Java.sql.SQLExceptions in SQLJ as in JDBC, and must ensure that we close the connection in the event of errors. However, we don't need to work with JDBC Statement and PreparedStatement objects, making this code less verbose than the JDBC equivalent. SQLJ Parts 1 and 2 attempt to standardize database-specific functionality such as stored procedures. SQLJ can be used in J2EE architecture in the same way as JDBC: for example, in implementations of data-access interfaces. So choosing between SQLJ and JDBC is an issue of only local significance within an app. The advantages of SQLJ over JDBC include:

SQLJ offers compile-time rather than run-time checking of SQL. While with JDBC we will only learn that a SQL statement is nonsense at run time, SQLJ will pick this up at compile time.
Performance may be enhanced through the use of static, rather than dynamic SQL.
SQLJ code is simpler to write and understand than JDBC code.

The disadvantages of SQLJ include:

The development-deployment cycle is more complex, requiring the use of the SQLJ precompiler.
SQLJ doesn't simplify JDBC's cumbersome error handling requirements.
SQLJ syntax is less elegant than Java syntax. On the other hand, SQLJ may achieve some tasks in far fewer lines of code than Java using JDBC.
Although SQLJ is a standard, a SQLJ solution is less portable than a JDBC solution as not all RDBMS vendors support SQLJ. (This needn't be an objection if SQLJ is used to implement a DAO targeted at a particular database.)

Despite its potential, and despite the fact that it is now several years old, SQLJ doesn't appear to be used widely, and is still strongly identified with Oracle.

Note

For more information on SQLJ see http://www.sqlj.org, and Java Programming with Oracle SQLJ from Oracle () - http://www.oracle.com/catalog/orasqlj/).

O/R Mapping Technologies

O/R mapping frameworks offer a completely different coding model to APIs such as JDBC or SQLJ. They can also be used to implement data access anywhere in a J2EE app. Some products provide a pluggable CMP persistence manager for EJB 2.0, which offers richer functionality than implementations shipped with app servers. The best O/R mapping frameworks have far more real-world experience behind them than entity beans. However, they have traditionally been expensive products (although this seems to be changing in 2002), and have required commitment to proprietary APIs. To my knowledge, no open source product currently offers an enterprise-ready O/R mapping solution (although I'd be very glad to hear that I'm wrong). A new entrant on the scene is Sun's JDO specification. Like J2EE, this is a specification—essentially an API—rather than a product. However, many companies already support or are implementing the JDO specification. Importantly, some of the major existing O/R mapping frameworks now support JDO, using their existing mapping infrastructure. Some ODBMS vendors are also supporting JDO. JDO may bring comparable standardization to use of O/R mapping in Java to that which JDBC delivered for RDBMS access in Java.

Established Commercial Products

Let's consider commercial O/R mapping products first, as they have been around longest and are proven in production. There are many products competing in this space, but we'll discuss only the two that are most widely used: TopLink and CocoBase. Many of the points made in relation to these two products apply to all O/R mapping products.

TopLink

TopLink is probably the market leader, and predates the EJB specification. It has experienced several changes of ownership and is now part of Oracle 9i app Server, although it can be downloaded separately and used with other app servers. It offers a pluggable EJB 2.0 CMP solution, although its documentation suggests that TopLink's developers do not favor the entity bean approach. TopLink allows entity beans to be mixed with lightweight persistent Java objects. TopLink's feature list shows how far entity beans have to go beyond standard EJB 2.0 CMP to become a mature O/R mapping framework. The following are some of the TopLink features missing in the EJB specification:

The ability to map a Java object to more than one table.
Support for Java object inheritance.
Optimistic locking support.
Support for nested transactions ("units of work" in TopLink parlance). This isn't supported by J2EE, not merely EJB.
Customizable transformation of data between database and object. As well as simply mapping a column value onto an object property, TopLink allows the programmer to customize the mapping.
The ability to customize SQL instead of relying on generated SQL.
Support for stored procedures.
Caching to reduce the number of accesses to the underlying database. Caching is synchronized between nodes.

Like most O/R mapping frameworks, TopLink can be used with either database-driven or object-driven modeling. If a database exists, TopLink's mapping tool can be used to generate Java objects to use it. Alternatively, TopLink can be used to create a database schema from mapped Java objects. Mappings are defined in TopLink's GUI Mapping Workbench. Mapped objects generally don't need to contain TopLink-specific code (unlike CMP entity beans, which must implement a special API); however, code that uses mapped objects is heavily dependent on proprietary APIs. TopLink offers its own object-based query language. The following example from the TopLink 4.6 demos uses an Expression object to filter for objects of a given class with a budget property greater than equal to a given value:

 Expression exp = new ExpressionBuilder().get("budget").greaterThanEqual(4000);
 Vector projects = session.readAllObjects(LargeProject.class, exp);

TopLink 4.6 additionally supports JDO as a query API. TopLink has traditionally been expensive (comparable in price to leading J2EE app servers). Following its acquisition by Oracle in June 2002 it is free for development use and its production pricing is likely to be more attractive. TopLink's strength is that it is a proven product with powerful and configurable O/R mapping capabilities. Its main weakness is that code that uses it is dependent on proprietary libraries. However, its support for JDO may remove this objection.

Note

See http://otn.oracle.com/products/ias/toplink/content.html for more information on TopLink.

CocoBase

CocoBase, from Thought Inc, is another established O/R mapping product. Like TopLink, CocoBase includes an EJB 2.0 CMP persistence manager and offers a distributed cache. It also offers transparent persistence: objects to be persisted don't need to implement special interfaces. Mappings are held outside classes to be persisted. CocoBase supports sophisticated O/R mapping: for example, the ability to map a Java object to more than one table. As with TopLink, it is possible to tune generated SQL queries. CocoBase is more closely integrated with the underlying RDBMS than many O/R mapping solutions: it's easy to call stored procedures, and queries are written in SQL, rather than in a proprietary query language.

Note

See http://www.thoughtinc.com/ for more information on CocoBase.

Java Data Objects (JDO)

JDO is a recent specification developed under the Java Community Process (1.0 release, March 2002) that describes persistence-store-neutral persistence of Java objects. While it's beyond the scope of the present tutorial to discuss JDO in detail, I feel that JDO is a very important API to J2EE developers, and hope that the following section will serve to whet your appetite. JDO is most likely to be used as an O/R mapping, but it is not tied to RDBMSs. For example, JDO may become the standard API for Java access to ODBMSs. In contrast to the entity bean model, JDO does not require objects to be persisted to implement a special interface. Most ordinary Java objects can be persisted, so long as their persistent state is held in their instance data. While objects used in a JDO persistence manager must implement the javax.jdo.PersistenceCapable interface, developers do not normally write code that implements this interface. The additional methods required by the PersistenceCapable interface are usually added to compiled class files by a byte code enhancer provided by the JDO vendor: a tool that can take ordinary class files and "enhance" them for use with a JDO persistence engine. JDO is more lightweight than entity beans, partly because objects being persisted are much closer to ordinary Java classes. JDO is likely to have much lower overhead in creating new objects: an important consideration when queries return large result sets. app code works with a JDO PersistenceManager, which handles one transaction at a time. PersistenceManager instances must be obtained from a PersistenceManagerFactory provided by the JDO vendor. This allows for a range of caching approaches, leaving the choice to the JDO vendor. Caching can occur at PersistenceManagerFactory level as well as within transactions. Several JDO implementations support caching in a clustered environment. JDO also allows for programmatic app control over caching. Entity beans don't provide similar control. JDO also explicitly supports optimistic locking and inheritance of persistent Java objects; again, this can be controlled programmatically. JDO mappings are defined in XML documents that are simpler and easier to understand than CMP entity bean deployment descriptors. However, mappings are not fully portable between JDO implementations, as they are dependent on vendor-specific extensions. JDO is not currently part of J2EE. However, it seems likely that JDO will eventually become a required API, in the same way as JDBC and JNDI.

Important

Despite Sun's denials, there is a definite overlap between JDO and entity beans. They both amount to a object-relational mapping and define a query language for operating on persistent data. It's hard to see the justification for two quite different O/R mapping standards and, especially, two different query languages, within J2EE.

However, JDO complements J2EE and EJB (in contrast to entity beans), rather than competes with them.

The JDO query language is called JDOQL. A JDO query is a reusable, threadsafe object that defines a result class and a JDOQL filter. JDOQL filter expressions are Java expressions, although they are held in string literals or variables and cannot be validated against the persistence store at compile time. Unlike EJB QL, JDOQL is used in Java code, meaning that queries are dynamic, rather than static. The following example shows the construction of a query that looks for Customer objects with a given email address and password (both instance variables):

 Query loginQuery = persistenceManager.newQuery(Customer.class);
 loginQuery.setFilter("email == pEmail && password == pPassword");
 loginQuery.declareParameters("String pEmail,String pPassword");
 loginQuery.compile();

This query can be executed as follows:

 Collection c = (Collection) loginQuery().execute "rod.johnson@bugmenot.com", "rj");
 Customer cust = (Customer) c.iterator().next();

Changes made to objects resulting from queries, such as the Customer object above, will normally cause the backing data store to be updated transparently. However, JDO allows objects to be explicitly disconnected from the data store if desired. Once JDOs are retrieved, it is possible to navigate associations (joins in an RDBMS) transparently. The most important of the several execute() methods on the JDO Query interface takes an array of Object as a parameter; the above example uses the convenience method that takes two strings and invokes the generic method. JDO has been criticized for its use of string query filters, rather than an object-based query API such as TopLink's proprietary API. The loss of compile-time checking is unfortunate, but the approach works and is convenient to use. It is easy, for example, to use Java syntax to build complex combinations of ORs and ANDs that are easily understandable by Java developers. Although JDO, like JDBC, provides a transaction management API that can be used without a global transaction infrastructure, this is inappropriate in J2EE. JDO implementations can detect and work within JTA transactions, whether created by user code or resulting from EJB CMT. of the JDO specification details the way in which JDO can be used from EJBs to handle persistence. Integration with session beans is particularly attractive, as JDO operations can use declarative transaction delimitation provided by session beans. Although it is sometimes advocated, it's hard to see the point of using JDO from entity beans with BMP. JDO already provides an object view of the database, rendering the additional abstraction layer of entity beans unnecessary and even harmful. (The result is likely to be a lowest common denominator of the functionality supported by both JDO and EJB.) As all JDO exceptions are runtime exceptions, any code using JDO isn't forced to catch them (although it can where exceptions are likely to be recoverable). This leads to a significant reduction in the volume of code and improvement in readability compared with APIs such as JDBC that use checked exceptions. For more information about JDO, see the following resources:

http://access1.oracle.com/jdo/
Index page maintained by Craig Russell, JDO specification lead. Links to other useful resources.
http://www.jdocentral.com/index.html
JDO Portal. The forums are particularly interesting, featuring discussion of many practical issues in JDO development.
http://www.onjava.com/pub/a/onjava/2002/02/06/jdo1.html
Simple introduction to JDO, including a complete example.
http://theserverside.com/discussion/thread.jsp?thread_id=771#23527
ServerSide.com discussion comparing JDO to entity beans.
http://www.ogilviepartners.com/JdoFaq.html
JDO FAQ maintained by a private consulting organization.
http://www.solarmetric.com/
Home page of Kodo, one of the first JDO implementations. Kodo is easy to use and supports caching across a cluster. Kodo is well documented, and a good choice for getting to grips with JDO.
http://www.prismtechnologies.com/English/Products/JDO/index.html
Home of OpenFusion, a JDO implementation from PrismTech.

Important

JDO is likely to prove a very important technology for J2EE. However, at the time of writing (mid 2002), JDO is yet to prove mature enough for use in mission-critical enterprise apps.

When JDO matures, I suspect it may render entity beans obsolete. JDO preserves the good qualities of entity beans (such as hiding data access detail from app code, an O/R mapping solution, use within transactions delimited by session beans, and the potential for object caching to reduce requests to the database) and eliminates many of their problems (such as the inability to use ordinary Java classes as persistent objects, gratuitous remote access to data objects, and the heavyweight run-time infrastructure).

Choosing a Data Access Strategy for the Sample app

We noted in that the sample app can benefit little from an O/R mapping layer. There's little opportunity for caching, for example. We also saw that use of stored procedures could efficiently conceal some of the data management complexity inside the RDBMS. This means that JDBC is the obvious data access strategy. As it's important to ensure that our design remains portable, we will conceal the use of JDBC behind an abstraction layer of data-access interfaces, using the DAO pattern discussed in . These interfaces will be database-agnostic. They won't depend on JDBC or RDBMS concepts, allowing the potential for very different implementations. In the remainder of this chapter we will:

Take a closer look at JDBC, considering some subtle points that we must understand in order to write robust, efficient code.
Examine the implementation and usage of an abstraction layer of generic infrastructure classes that simplifies working with JDBC and minimizes the likelihood of JDBC-related errors.
Look at the implementation of some of the sample app's data-access requirements using JDBC and our generic abstraction layer.

Previous Next

Comments