Entity Beans in EJB 2.0

The EJB 2.0 specification, released in September 2001, introduced significant enhancements relating to entity beans, especially those using CMP. As these enhancements force a reevaluation of strategies established for EJB 1.1 entity beans, it's important to examine them.

Local Interfaces

The introduction of local interfaces for EJBs greatly reduces the overhead of using entity beans from session beans or other objects within the same JVM. (However, entity beans will always have a greater overhead than ordinary Java objects, because the EJB container performs method interception on all calls to EJBs.) The introduction of local interfaces makes entity beans much more workable, but it throws out a basic assumption about entity beans (that they should be remote objects) and renders much advice on using entity beans obsolete. It's arguable that EJB 2.0 entities no longer have a philosophical basis, or justification for being part of the EJB specification. If an object is given only a local interface, the case for making it an EJB is greatly weakened. The only remaining argument for modeling objects as entity beans is the data access capabilities that entity beans deliver, such as CMP; these must then be compared on equal terms with alternatives such as JDO.

Important

In EJB 2.0 apps, never give entity beans remote interfaces. This ensures that remote clients access entities through a layer of session beans implementing the app's use cases, minimizes the performance overhead of entity beans, and means that we don't need to get and set properties on entities using a value object.

Home Interface Business Methods

Another important EJB 2.0 enhancement is the addition of business methods on entity bean home interfaces: methods whose work is not specific to a single entity instance. Like the introduction of local interfaces, the introduction of home methods benefits both CMP and BMP entity beans.

Home interface business methods are methods other than finders, create, or remove methods defined on an entity's local or remote home interface. The container executes a home business method on any entity instance of its choosing, without access to a primary key, as the work of a home method is not restricted to any one entity. Home method implementations have the same run-time context as finders: the implementation of a home interface method can perform JNDI access, find out the caller's role, access resource managers and other entity beans, or mark the current transaction for rollback.

The only restriction on home interface method signatures is that, to avoid confusion, the method name must not begin with create, find, or remove. For example, an EJB home method on a local home interface might look like this:

 int getNumberOfAccountsWithBalanceOver (double balance);

The corresponding method on the bean implementation class must have a name beginning with ejbHome, in the same way that create methods must have names beginning with ejbCreate:

 public int ejbHomeGetNumberOfAccountsWithBalanceOver (double balance);

Home interface methods do more than anything in the history of entity beans to allow efficient access to relational databases. They provide an escape from the row-oriented approach that fine-grained entities enforce, allowing efficient operations on multiple entities using RDBMS aggregate operations.

In the case of CMP entities, home methods are often backed by another new kind of method defined in a bean implementation class: an ejbSelect() method. An ejbSelect() method is a query method. However, it's unlike a finder in that it is not exposed to clients through the bean's home or component interface. Like finders in EJB 2.0 CMP, ejbSelect() methods return the results of EJB QL queries defined in the ejb-jar.xml deployment descriptor. An ejbSelect() method must be abstract: it's impossible to implement an ejbSelect() method in an entity bean implementation class and avoid the use of an EJB QL query.

Unlike finders, ejbSelect() methods need not return entity beans. They may return entity beans or fields with container-managed persistence. Also unlike finders, ejbSelect() methods can be invoked on either an entity in the pooled state (without an identity) or an entity in the ready state (with an identity). Home business methods may call ejbSelect() methods to return data relating to multiple entities. Business methods on an individual entity may also invoke ejbSelect() methods if they need to obtain or operate on multiple entities.

There are many situations in which the addition of home interface methods allows efficient use of entity beans where this would have proven impossible under the EJB 1.1 contract. The catch is that EJB QL, the portable EJB query language, which we'll discuss below, isn't mature enough to deliver the power many entity home interface methods need. In such cases we must write our own persistence code to use efficient RDBMS operations, using JDBC or another low-level API. Home interface methods can even be used to call stored procedures if necessary.
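As a rough illustration of the hand-written persistence code described above, the following sketch shows the kind of JDBC aggregate query that an ejbHomeGetNumberOfAccountsWithBalanceOver() implementation might delegate to. The helper class name, the ACCOUNT table, and the BALANCE column are assumptions made for this example; obtaining the Connection (normally from a DataSource looked up through JNDI) is left to the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper: the table and column names are assumed for illustration.
final class AccountAggregateQueries {

    // A single aggregate query replaces loading every matching entity.
    static final String COUNT_SQL =
            "SELECT COUNT(*) FROM ACCOUNT WHERE BALANCE > ?";

    private AccountAggregateQueries() {
    }

    // The work an ejbHome method could delegate to.
    static int countAccountsWithBalanceOver(Connection con, double balance)
            throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(COUNT_SQL)) {
            ps.setDouble(1, balance);
            try (ResultSet rs = ps.executeQuery()) {
                // COUNT(*) yields one row; fall back to 0 defensively.
                return rs.next() ? rs.getInt(1) : 0;
            }
        }
    }
}
```

The point of the sketch is that the database does the counting: no entity instances are loaded, and only a single integer crosses the wire.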

Note

Note that business logic - as opposed to persistence logic - is still better placed in session beans than in home interface methods.

EJB 2.0 CMP

The most talked-about entity bean enhancement in EJB 2.0 is the addition of support for container-managed relationships between entity beans, which builds on the introduction of local interfaces. Support for CMP entities in EJB 1.1 is rudimentary and capable only of meeting simple requirements. Although the EJB 2.0 specification requires that containers continue to honor the EJB 1.1 contract for CMP entities, it also introduces a new and quite different contract for CMP entities.

Basic Concepts

In practice, EJB 1.1 CMP was limited to a means of mapping the instance variables of a Java object to columns in a single database table. It supported only primitive types and simple objects with a corresponding SQL type (such as dates). The contract was inelegant; entity bean fields with container-managed persistence needed to be public. An entity bean was a concrete class, and included fields like the following, which would be mapped onto the database by the container:

 public String firstName;
 public String lastName;

Since EJB 1.1 CMP was severely under-specified, apps using it became heavily dependent on the CMP implementation of their target server, severely compromising the portability that entity beans supposedly offered. For example, as CMP finder methods are not written by bean developers, but generated by the container, each container used its own custom query language in deployment descriptors. EJB 2.0 is a big advance, although it's still essentially based on mapping object fields to columns in a single database table. The EJB 2.0 contract for CMP is based on abstract methods, rather than public instance variables. CMP entities are abstract classes, with the container responsible for implementing the setting and retrieval of persistent properties. Simple persistent properties are known as CMP fields. The EJB 2.0 way of defining firstName and lastName CMP fields would be:

 public abstract String getFirstName();
 public abstract void setFirstName (String fname);
 public abstract String getLastName();
 public abstract void setLastName (String lname);

As in EJB 1.1 CMP, the mapping is defined outside Java code, in deployment descriptors. EJB 2.0 CMP introduces many more elements to handle its more complex capabilities. The ejb-jar.xml file describes the persistent properties of CMP entities and the relationships between them. Additional proprietary deployment descriptors, such as WebLogic's weblogic-cmp-rdbms-jar.xml, define the mapping to an actual data source.

The use of abstract methods is a much superior approach to the use of public instance variables (for example, it allows the container to tell when fields have been modified, making optimization easier). The only disadvantage is that, as the concrete entity classes are generated by the container, an incomplete (abstract) CMP entity class will compile successfully, but fail to deploy.
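To make the optimization point concrete, here is a minimal sketch of why abstract accessors help the container. The GeneratedPersonBean class is a hypothetical stand-in for whatever concrete subclass a container generates at deployment time; real containers use their own (and usually more sophisticated) strategies, and a real bean class would also implement javax.ejb.EntityBean.

```java
// The class the bean developer writes: persistent state is declared
// only through abstract accessors.
abstract class PersonBean {
    public abstract String getFirstName();
    public abstract void setFirstName(String fname);

    // Business logic uses the accessors like any other method.
    public String greeting() {
        return "Hello, " + getFirstName();
    }
}

// Hypothetical stand-in for the subclass a container might generate.
class GeneratedPersonBean extends PersonBean {
    private String firstName;
    private boolean dirty;

    @Override
    public String getFirstName() {
        return firstName;
    }

    @Override
    public void setFirstName(String fname) {
        this.firstName = fname;
        this.dirty = true; // every write passes through the container's code
    }

    // With this flag the container could skip the UPDATE for unmodified beans.
    boolean needsStore() {
        return dirty;
    }

    void markStored() {
        dirty = false;
    }
}
```

With public instance variables, as in EJB 1.1, no such interception point exists: the container cannot see a field assignment, so it has to assume the bean may have changed.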

Container-Managed Relationships (CMR)

EJB 2.0 CMP offers more than persistence of properties. It introduces the notion of CMRs (relationships between entity beans running in the same EJB container). This enables fine-grained entities to be used to model individual tables in an RDBMS. Relationships involve local, not remote, interfaces. An entity bean with a remote interface may have relationships, but these cannot be exposed through its remote interface. EJB 2.0 supports one-to-one, one-to-many and many-to-many relationships. (Many-to-many relationships will need to be backed by a join table in the RDBMS. This will be concealed from users of the entity beans.) CMRs may be unidirectional (navigable in one direction only) or bidirectional (navigable in both directions). Like CMP fields, CMRs are expressed in the bean's local interface by abstract methods. In a one-to-one relationship, the CMR will be expressed as a property with a value being the related entity's local interface:

 AddressLocal getAddress();
 void setAddress (AddressLocal p);

In the case of a one-to-many or many-to-many relationship, the CMR will be expressed as a Collection:

 Collection getInvoices();
 void setInvoices (Collection c);

Users of the bean's local interface can manipulate exposed Collections, subject to certain restrictions (for example, a Collection must never be set to null: the empty Collection must be used to indicate that no objects are in the specified role). The EJB 2.0 specification requires that containers preserve referential integrity - for example, by supporting cascading deletion.

While abstract methods in the local interface determine how callers use CMRs, deployment descriptors tell the EJB container how to map them. The standard ejb-jar.xml file contains elements that describe relationships and navigability. The details of mapping to a database (such as the use of join tables) will be container-specific. For example, WebLogic defines several elements to configure relationships in the weblogic-cmp-rdbms-jar.xml file. In JBoss 3.0, the jbosscmp-jdbc.xml file performs the same role.
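The Collection restriction mentioned above can be sketched in plain Java. Everything here is hypothetical: CustomerLocal and InvoiceLocal stand in for real local interfaces (which would extend javax.ejb.EJBLocalObject), InMemoryCustomer imitates the container's behavior only for the purpose of the demonstration, and the generic types are a modern convenience that EJB 2.0 code, written before generics, would not have had.

```java
import java.util.ArrayList;
import java.util.Collection;

// Hypothetical local interfaces for a one-to-many CMR.
interface InvoiceLocal {
    double getAmount();
}

interface CustomerLocal {
    Collection<InvoiceLocal> getInvoices();
    void setInvoices(Collection<InvoiceLocal> c);
}

final class InvoiceOperations {
    private InvoiceOperations() {
    }

    // Navigates the relationship instead of running a query.
    static double totalInvoiced(CustomerLocal customer) {
        double total = 0;
        for (InvoiceLocal invoice : customer.getInvoices()) {
            total += invoice.getAmount();
        }
        return total;
    }

    // "No related objects" is an empty Collection, never null.
    static void removeAllInvoices(CustomerLocal customer) {
        customer.setInvoices(new ArrayList<InvoiceLocal>());
    }
}

// In-memory imitation of the container, for demonstration only.
class InMemoryCustomer implements CustomerLocal {
    private Collection<InvoiceLocal> invoices = new ArrayList<InvoiceLocal>();

    public Collection<InvoiceLocal> getInvoices() {
        return invoices;
    }

    public void setInvoices(Collection<InvoiceLocal> c) {
        if (c == null) {
            // Assumed here to mirror the rule that a CMR Collection is never null.
            throw new IllegalArgumentException("CMR collection must not be null");
        }
        invoices = new ArrayList<InvoiceLocal>(c);
    }
}
```

Note that totalInvoiced() touches every related entity; against a real container this is exactly the row-oriented access pattern that an aggregate query would avoid.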

Important

Don't rely on using EJB 2.0 CMP to guarantee referential integrity of your data unless you're positive that no other processes will access the database. Use database constraints.

It is possible to use the coarse-grained entity concept of "dependent objects" in EJB 2.0. The specification (§10.3.3) terms them dependent value classes. Dependent objects are simply CMP fields, defined through abstract get and set methods, that are of Java object types with no corresponding SQL type. They must be serializable concrete classes, and will usually be persisted to the underlying data store as a binary object.

Using dependent value objects is usually a bad idea. The problem is that it treats the underlying data source as a dumb storage facility. The database probably won't understand serialized Java objects. Thus the data will only be of use to the J2EE app that created it: for example, it will be impossible to run reports over the data. If the data store is an RDBMS, aggregate operations won't be able to use it. Dependent object serialization and deserialization will prove expensive. In my experience, long-term persistence of serialized objects is vulnerable to versioning problems if the serialized class changes. The EJB specification suggests that dependent objects be used only for persisting legacy data.
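A small sketch shows what "dumb storage" means in practice. The ContactDetails class and BlobCodec helper are hypothetical, and the code only approximates what a container might do when it persists a dependent value class; the actual mechanism is container-specific.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical dependent value class.
final class ContactDetails implements Serializable {
    // Pinning the id guards against accidental incompatibility, but
    // structural changes to the class can still break stored data.
    private static final long serialVersionUID = 1L;

    final String phone;
    final String email;

    ContactDetails(String phone, String email) {
        this.phone = phone;
        this.email = email;
    }
}

// Approximates the conversion between the object and a BLOB column value.
final class BlobCodec {
    private BlobCodec() {
    }

    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException ex) {
            throw new IllegalStateException("Serialization failed", ex);
        }
        return buffer.toByteArray();
    }

    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException("Deserialization failed", ex);
        }
    }
}
```

The resulting byte array is in Java's serialization format, so a reporting tool or a SQL WHERE clause cannot see the phone or email values inside it; only Java code with a compatible ContactDetails class on its classpath can read them back.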

EJB QL

The EJB 2.0 specification introduces a new portable query language for use by entities with CMP. This is a key element of the portability promise of entity beans, intended to free developers from the need to use database-specific query languages such as SQL or proprietary query languages as used in EJB 1.1 CMP. I have grave reservations about EJB QL. I don't believe that the result it seeks to achieve - total code portability for CMP entity beans - justifies the invention (and learning) of a new query language. Reinventing the wheel is an equally bad idea, whether done by specification committees and app server vendors, or by app developers. I see the following conceptual problems with EJB QL (we'll talk about some of the practical problems shortly):

It introduces a relatively low-level abstraction that isn't necessary in the vast majority of cases, and which makes it difficult to accomplish some tasks efficiently.
It's not particularly easy to use. SQL, on the other hand, is widely understood. EJB QL will need to become even more complex to be able to meet real-world requirements.
It's purely a query language. It's impossible to use it to perform updates. The only option is to obtain multiple entities that result from an ejbSelect() method and to modify them individually. This wastes bandwidth between J2EE server and RDBMS, requires the traversal of a Collection (with the necessary casts) and the issuing of many individual updates. This preserves the object-based concepts behind entity beans, but is likely to prove inefficient in many cases. It's more complex and much slower than using SQL to perform such an update in an RDBMS.
There's no support for subqueries, which can be used in SQL as an intuitive way of composing complex queries.
It doesn't support dynamic queries. Queries must be coded into deployment descriptors at deployment time.
It's tied to entity beans with CMP. JDO, on the other hand, provides a query language that can be used in any type of object.
EJB QL is hard to test. We can only establish that an EJB QL query doesn't behave as expected by testing the behavior of entities running in an EJB container. We may only be able to establish why an EJB QL query doesn't work by looking at the SQL that the EJB container is generating. Modifying the EJB QL and retesting will involve redeploying the EJBs (how big a deal this is will vary between app servers). In contrast, SQL can be tested without any J2EE, by issuing SQL commands or running scripts in a database tool such as SQL*Plus when using Oracle.
EJB QL does not have an ORDER BY clause, meaning that sorting must take place after data is retrieved.
EJB QL seems torn in two directions, in neither of which it can succeed. If it's frankly intended to be translated to SQL (which seems to be the reality in practice), it's redundant, as SQL is already familiar and much more powerful. If it's to stay aloof from RDBMS concepts - for example, to allow implementation over legacy mainframe data sources - it's doomed to offer only a lowest common denominator of data operations and to be inadequate to solve real problems.
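The missing ORDER BY clause noted in the list above means that ordering has to be imposed in the JVM. A minimal sketch, assuming a hypothetical AccountLocal interface with a getBalance() method and a finder or ejbSelect() result passed in as a Collection:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

// Hypothetical local interface; a real one would extend javax.ejb.EJBLocalObject.
interface AccountLocal {
    double getBalance();
}

final class AccountSorting {
    private AccountSorting() {
    }

    // Copies the query result and sorts the copy by ascending balance.
    static List<AccountLocal> sortedByBalance(Collection<AccountLocal> found) {
        List<AccountLocal> copy = new ArrayList<>(found);
        copy.sort(Comparator.comparingDouble(AccountLocal::getBalance));
        return copy;
    }
}
```

Every comparison calls getBalance() on an entity, and each such call passes through the container, so this is considerably more expensive than letting the database sort the rows it is already reading.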

To redress some of these problems, EJB containers such as WebLogic implement extensions to EJB QL. However, given that the entire justification for EJB QL is its portability, the necessity for proprietary extensions severely reduces its value (although SQL dialects differ, the subset of SQL that will work across most RDBMSs is far more powerful than EJB QL).

Important

EJB 2.1 addresses some of the problems with EJB QL by introducing support for aggregate functions such as AVG, MAX, and SUM, and introducing an ORDER BY clause. However, it still does not support updates, and is never likely to. Other important features such as subqueries and dynamic queries are still deferred to future releases of the EJB specification.

Limitations of O/R Modeling with EJB 2.0 Entities

Despite the significant enhancements, CMP entity beans as specified remain a basic form of O/R mapping. The EJB specification ignores some of the toughest problems of O/R mapping, and makes it impossible to take advantage of some of the capabilities of relational databases. For example:

There is no support for optimistic locking.
There is poor support for batch updates (EJB 2.0 home methods at least make them possible, but the container - and EJB QL - provide no assistance in implementing them).
The concept of a mapping from an object to a single table is limiting, and the EJB 2.0 specification does not suggest how EJB containers should address this.
There is no support for inheritance in mapped objects. Some EJB containers such as WebSphere implement this as a proprietary extension. See http://www.transarc.ibm.com/Library/documentation/websphere/appserv/atswfg/atswfg12.htm#HDREJB_ENTITY_BEANS.

Custom Entity Behavior with CMP/BMP Hybrids

I previously mentioned the use of custom code to implement persistence operations that cannot be achieved using CMP, CMR, and EJB QL. This results in CMP/BMP hybrids. These are entities whose lifecycle is managed by the EJB container's CMP implementation, and which use CMP to persist their fields and simple relationships, but database-specific BMP code to handle more complex queries and updates.

In general, home interface methods are the likeliest candidates to benefit from such BMP code. Home interface methods can also be implemented using JDBC when the SQL generated from EJB QL proves slow and the container does not permit it to be tuned. Unlike ejbSelect() methods and finders on CMP entities, home interface business methods are implemented by the bean developer, not the EJB container. If ejbSelect() methods cannot provide the necessary persistence operations, the developer is free to take control of database access.

An entity bean with CMP is not restricted from performing resource manager access; it has merely chosen to leave most persistence operations to the container. It will need a datasource to be made available in the ejb-jar.xml deployment descriptor, as for an entity with BMP; datasource objects are not automatically exposed to entities with CMP. It's also possible to write custom extensions to data loading and storage, as the EJB container invokes the ejbLoad() and ejbStore() methods on entities with CMP. Section 10.3.9 of the EJB 2.0 Specification describes the contract for these methods.

CMP/BMP hybrid beans are inelegant, but they are sometimes necessary given the present limitations of EJB QL. The only serious complication with CMP/BMP hybrids is the potential effect on an EJB container's ability to cache entity beans if custom code updates the database. The EJB container has no way of knowing what the custom code is doing to the underlying data source, so it must treat such changes in the same way as changes made by separate processes. Whether or not this will impair performance will depend on the locking strategy in use (see discussion on locking and caching later). Some containers (such as WebLogic) allow users to flush cached entities whose underlying data has changed as a result of aggregate operations.

Important

When using entity beans, if a CMP entity bean fails to accommodate a subset of the necessary operations, it's usually better to add custom data access code to the CMP entity than to switch to BMP. CMP/BMP hybrids are inelegant. However, they're sometimes the only way to use entity beans effectively.

When using CMP/BMP hybrids, remember that:

Updating data may break entity bean caching. Make sure you understand how any caching works in your container, and the implications of any updates performed by custom data access code.
The portability of such beans may improve as the EJB specification matures - provided that the BMP code queries, rather than updates. For example, EJB home methods that must be implemented with BMP because EJB 2.0 doesn't offer aggregate functions might be implementable in EJB QL under EJB 2.1.
If possible, isolate database-specific code in a helper class that implements a database-agnostic interface.
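The last recommendation might look like the following sketch. All names are hypothetical, as are the ACCOUNT table and BALANCE column; AccountBeanSketch is a plain-Java stand-in for the entity bean class, which in a real bean would obtain its helper in a lifecycle method such as setEntityContext() rather than through a constructor, and would probably wrap failures in an EJBException.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

// Database-agnostic contract: the only type the bean depends on.
interface AccountMaintenance {
    // Returns the number of accounts updated.
    int applyInterest(double rate);
}

// One implementation per database; this one uses plain JDBC.
final class JdbcAccountMaintenance implements AccountMaintenance {
    private final DataSource dataSource;

    JdbcAccountMaintenance(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public int applyInterest(double rate) {
        // A single statement updates every row; no entities are loaded.
        String sql = "UPDATE ACCOUNT SET BALANCE = BALANCE * (1 + ?)";
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setDouble(1, rate);
            return ps.executeUpdate();
        } catch (SQLException ex) {
            throw new IllegalStateException("Bulk update failed", ex);
        }
    }
}

// Stand-in for the bean class: the home method sees only the interface.
class AccountBeanSketch {
    private final AccountMaintenance maintenance;

    AccountBeanSketch(AccountMaintenance maintenance) {
        this.maintenance = maintenance;
    }

    public int ejbHomeApplyInterest(double rate) {
        return maintenance.applyInterest(rate);
    }
}
```

Because the bean depends only on AccountMaintenance, a database-specific implementation can be swapped without touching the bean, and the home method can be exercised in a test with a trivial in-memory implementation. A bulk update like this one is also exactly the kind of operation that can leave cached entities stale, as the first point in the list warns.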
