Entity Bean Caching

Entity bean performance hinges largely on the EJB container's entity bean caching strategy. Caching in turn depends on the locking strategy the container applies.

Important 

In my opinion, the value of entity beans hinges on effective caching. Unfortunately, the effectiveness of caching varies widely between application scenarios and between EJB containers.

If you can achieve a high cache hit rate, whether by using read-only entity beans or because your container has an efficient cache, entity beans are a good choice and will perform well.

Entity Bean Locking Strategies

There are two main locking strategies for entity beans, both foreshadowed in the EJB specification (§10.5.9 and §10.5.10). The terminology used to describe them varies between containers, but I have chosen to use the WebLogic terminology, as it's clear and concise. It's essential to understand how locking strategies are implemented by your EJB container before developing apps using entity beans. Entity beans do not allow us to ignore basic persistence issues.

Exclusive Locking

Exclusive locking was the default strategy used by WebLogic 5.1 and earlier generations of the WebLogic container. Many other EJB containers at least initially used this caching strategy. Exclusive locking is described as "Commit Option A" in the EJB specification (§10.5.9), and JBoss 3.0 documentation uses this name for it.

With this locking strategy, the container will maintain a single instance of each entity in use. The state of the entity will usually be cached between transactions, which may minimize calls to the underlying database. The catch (and the reason for terming this "exclusive" locking) is that the container must serialize accesses to the entity, locking out users waiting to use it.

Exclusive locking has the following advantages:

Exclusive locking has the following disadvantages:

Database Locking

With the database locking strategy, the responsibility for resolving concurrency issues lies with the database. If multiple clients access the same logical entity, the EJB container simply instantiates multiple entity objects with the same primary key. The locking strategy is up to the database, and will be determined by the transaction isolation level on entity bean methods. Database locking is described in "Commit Options B and C" in the EJB specification (§10.5.9), and JBoss documentation follows this terminology.

Database locking has the following advantages:

Database locking has the following disadvantages:

WebLogic versions 6.0 and later support both exclusive and database locking, but default to using database locking. Other servers supporting database locking include JBoss, Sybase EAServer and Inprise Application Server.
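The difference between the two strategies is easier to see in code. The following is a minimal plain-Java sketch, not real container code: AccountEntity, ExclusiveLockingContainer and DatabaseLockingContainer are hypothetical names invented for illustration, and an in-memory map stands in for the database. With exclusive locking, every caller receives the same instance and must wait for its lock. With database locking, every caller receives a fresh instance loaded from the database.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for an entity instance; not a real EJB class.
class AccountEntity {
    final String primaryKey;
    int balance;

    AccountEntity(String primaryKey, int balance) {
        this.primaryKey = primaryKey;
        this.balance = balance;
    }
}

// Exclusive locking: one cached instance per primary key, and callers
// are serialized by a per-key lock held until release() is called.
class ExclusiveLockingContainer {
    private final Map<String, AccountEntity> instances = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    AccountEntity acquire(String pk) {
        // Blocks while another thread holds the lock for this entity.
        locks.computeIfAbsent(pk, k -> new ReentrantLock()).lock();
        // State stays cached between uses, so no reload is needed.
        return instances.computeIfAbsent(pk, k -> new AccountEntity(k, 0));
    }

    void release(String pk) {
        locks.get(pk).unlock();
    }
}

// Database locking: a new instance per caller, with state loaded from the
// database each time. Concurrency control is left to the database itself.
class DatabaseLockingContainer {
    // Assumption: this map plays the role of the underlying database.
    private final Map<String, Integer> database = new ConcurrentHashMap<>();

    AccountEntity acquire(String pk) {
        return new AccountEntity(pk, database.getOrDefault(pk, 0));
    }

    void store(AccountEntity entity) {
        database.put(entity.primaryKey, entity.balance);
    }
}
```

Note that the sketch makes the trade-off visible: the exclusive container never reloads state but forces callers to queue, while the database container never makes callers wait in the container but reloads state for every caller.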

Note 

WebLogic 7.0 adds an "Optimistic Concurrency" strategy, in which no locks are held in the EJB container or the database, but a check for competing updates is made by the EJB container before committing a transaction. We discussed the advantages and disadvantages of optimistic locking earlier.
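As a reminder of how such a check for competing updates can work, here is a minimal sketch in plain Java. It assumes a version counter on each row, which is only one possible way to detect a competing update; how WebLogic performs its check is not described here. VersionedRow and OptimisticStore are hypothetical names, and a map stands in for the database.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable snapshot of a row together with its version.
class VersionedRow {
    final int version;
    final String value;

    VersionedRow(int version, String value) {
        this.version = version;
        this.value = value;
    }
}

// Assumption: an in-memory map stands in for the database table.
class OptimisticStore {
    private final Map<String, VersionedRow> rows = new HashMap<>();

    synchronized void insert(String pk, String value) {
        rows.put(pk, new VersionedRow(1, value));
    }

    // Reading takes no lock that outlives the call.
    synchronized VersionedRow read(String pk) {
        return rows.get(pk);
    }

    // The commit succeeds only if nobody has changed the row since the
    // caller read it; otherwise the caller must retry or report a conflict.
    synchronized boolean commit(String pk, VersionedRow readSnapshot, String newValue) {
        VersionedRow current = rows.get(pk);
        if (current == null || current.version != readSnapshot.version) {
            return false; // a competing update was committed first
        }
        rows.put(pk, new VersionedRow(current.version + 1, newValue));
        return true;
    }
}
```

If two transactions read the same row and both try to commit, the first commit succeeds and the second is rejected, because the version it read is no longer current.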

Read-only and "Read-mostly" Entities

How data is accessed affects the locking strategy we should use. Accordingly, some containers offer special locking strategies for read-only data. Again, the following discussion reflects WebLogic terminology, although the concepts aren't unique to WebLogic.

WebLogic 6.0 and above provide a special locking strategy called read-only locking. A read-only entity bean is never updated by a client, but may periodically be updated (for example, to respond to changes in the underlying database). WebLogic never invokes the ejbStore() method of an entity bean with read-only locking. However, it invokes the ejbLoad() method at a regular interval set in the deployment descriptor. The deployment descriptor distinguishes between normal (read/write) and read-only entities. JBoss 3.0 provides similar functionality, terming this "Commit Option D".

WebLogic allows user control over the cache by making the container-generated home interface implementations implement a special CachingHome interface. This interface provides the ability to invalidate individual entities, or all entities (the home interface of a read-only bean can be cast to WebLogic's proprietary CachingHome subinterface). In WebLogic 6.1 and above, invalidation works in a cluster.

Read-only beans provide good performance if we know that data won't be modified by clients. They also make it possible to implement a "read-mostly" pattern. This is achieved by mapping a read-only and a normal read/write entity to the same data. The two beans will have different JNDI names. Reads are performed through the read-only bean, while updates use the read/write bean. Updates can also use the CachingHome to invalidate the read-only entity.

Dmitri Rakitine has proposed the "Seppuku" pattern, which achieves the same thing more portably. Seppuku requires only read-only beans (not proprietary invalidation support) to work. It invalidates read-only beans by relying on the container's obligation to discard a bean instance if a non-application exception is encountered (we'll discuss this mechanism later). One catch is that the EJB container is also obliged to log the error, meaning that server logs will soon fill with error messages resulting from "normal" activity. The Seppuku pattern, like the Fat Key pattern, is an inspired flight of invention, but one that suggests that it is preferable to find a workaround for the entire problem. See http://dima.dhs.org/misc/readOnlyUpdates.html for details.
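The read-mostly pattern with explicit invalidation can be illustrated without any container. The following is a plain-Java simulation of the idea only: it does not use WebLogic's CachingHome API, and ReadOnlyCache, ReadMostlyFacade and changeBehindCache() are hypothetical names. The loader function plays the role of ejbLoad(), the timeout plays the role of the reload interval from the deployment descriptor, and the current time is passed in explicitly so that the behavior is easy to follow.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical read-only cache: it reloads an entry once the timeout has
// elapsed and never writes anything back (there is no ejbStore() analogue).
class ReadOnlyCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAt;

        Entry(V value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final long readTimeoutMillis;

    ReadOnlyCache(Function<K, V> loader, long readTimeoutMillis) {
        this.loader = loader;
        this.readTimeoutMillis = readTimeoutMillis;
    }

    V get(K key, long now) {
        Entry<V> entry = entries.get(key);
        if (entry == null || now - entry.loadedAt >= readTimeoutMillis) {
            entry = new Entry<>(loader.apply(key), now); // reload from the source
            entries.put(key, entry);
        }
        return entry.value;
    }

    void invalidate(K key) {
        entries.remove(key);
    }

    void invalidateAll() {
        entries.clear();
    }
}

// Read-mostly: reads go through the read-only cache, while updates go
// straight to the data and then invalidate the cached copy.
class ReadMostlyFacade {
    // Assumption: this map stands in for the underlying database.
    private final Map<String, String> database = new ConcurrentHashMap<>();
    private final ReadOnlyCache<String, String> readSide;

    ReadMostlyFacade(long readTimeoutMillis) {
        this.readSide = new ReadOnlyCache<>(database::get, readTimeoutMillis);
    }

    String read(String pk, long now) {
        return readSide.get(pk, now);
    }

    void update(String pk, String value) {
        database.put(pk, value);
        readSide.invalidate(pk);
    }

    // Simulates another process changing the database without invalidation.
    void changeBehindCache(String pk, String value) {
        database.put(pk, value);
    }
}
```

A change made through update() is visible to the next read. A change made behind the cache's back stays invisible until the timeout expires, which is the price of the read-only strategy.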

Note 

The name Seppuku was suggested by Cedric Beust of BEA, and refers to Japanese ritual disembowelment. It's certainly more memorable than prosaic names such as "Service-to-Worker"!

Tyler Jewell of BEA hails read-mostly entities as the savior of EJB performance (see his article in defense of entity beans at http://www.onjava.com/lpt/a//onjava/2001/12/19/eejbs.html). He argues that a "develop once, deploy n times" model for entity beans is necessary to unleash their "true power", and proposes criteria to determine how entity beans should be deployed based on usage patterns. He advocates a separate deployment for each entity for each usage pattern. The multiple deployment approach has the potential to deliver significant performance improvements compared to traditional entity bean deployment. However, it has many disadvantages:

The performance benefits of multiple deployment apply only if data is read often and updated occasionally. Where static reference data is concerned, it will be better to cache closer to the user (such as in the web tier). Multiple deployment won't help in situations where we need aggregate operations, and the simple O/R mapping provided by EJB CMP is inadequate.

Even disregarding these problems, the multiple deployment approach would only demonstrate the "true power" of entity beans if it weren't possible to achieve its goals in any other way. In fact, entity beans are not the only way to deliver such multiple caches. JDO and other O/R mapping solutions also enable us to maintain several caches to support different usage patterns.

Transactional Entity Caching

Using read-only entity beans and multiple deployment is a cumbersome form of caching that requires substantial developer effort to configure. It's unsatisfactory because it's not truly portable and requires the developer to resort to devious tricks, based on the assumption that out-of-the-box entity bean performance is inadequate. What if entity bean caching were good enough to work without the developers' help?

Persistence PowerTier (http://www.persistence.com/products/powertier/index.php) is an established product with a transactional and distributed entity bean cache. Persistence built its J2EE server around its C++ caching solution, rather than adding caching support to an EJB container. PowerTier's support for entity beans is very different from that of most other vendors. PowerTier effectively creates an in-memory object database to lessen the load on the underlying RDBMS.

PowerTier uses a shared transactional cache, which allows in-memory data access and relationship navigation. (Relationships are cached in memory as pointers, avoiding the need to run SQL joins whenever relationships are traversed.) Each transaction is also given its own private cache. Committed changes to cached data are replicated to the shared cache and transparently synchronized to the underlying database to maintain data integrity. Persistence claims that this can boost performance up to 50 times for applications (such as many web applications) that are biased in favor of reads. PowerTier's performance optimizations include support for optimistic locking.

Persistence promotes a fine-grained entity bean model, and provides tools to generate entities (including finder methods) from RDBMS tables. PowerTier also supports the generation of RDBMS tables from an entity bean model.

Third-party EJB 2.0 persistence providers such as TopLink also claim to implement distributed caching. (Note that TopLink provides similar caching services without the need to use entity beans, through its proprietary O/R mapping APIs.) I haven't worked with either of these products in production, so I can't verify the claims of their sales teams. However, Persistence boasts some very high volume, mission-critical J2EE installations, such as the Reuters Instinet online trading system and FedEx's logistics system.
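To make the idea of a shared cache combined with per-transaction private caches concrete, here is a minimal plain-Java sketch. It is not based on PowerTier's or TopLink's implementation or APIs, which I have not examined: SharedCache and CachedTransaction are hypothetical names, a map stands in for the RDBMS, and distribution, isolation levels and conflict detection are deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical shared cache sitting in front of the database.
class SharedCache {
    // Assumption: this map stands in for the underlying RDBMS.
    private final Map<String, String> database = new HashMap<>();
    private final Map<String, String> shared = new HashMap<>();
    private int databaseReads = 0;

    // Reads hit the database only the first time a key is requested.
    synchronized String read(String pk) {
        if (!shared.containsKey(pk)) {
            databaseReads++;
            shared.put(pk, database.get(pk));
        }
        return shared.get(pk);
    }

    // Committed changes are written through to the database and
    // published to the shared cache in one step.
    synchronized void applyCommit(Map<String, String> changes) {
        database.putAll(changes);
        shared.putAll(changes);
    }

    synchronized int databaseReads() {
        return databaseReads;
    }

    CachedTransaction begin() {
        return new CachedTransaction(this);
    }
}

// Each transaction keeps its uncommitted changes in a private cache.
class CachedTransaction {
    private final SharedCache sharedCache;
    private final Map<String, String> privateCache = new HashMap<>();

    CachedTransaction(SharedCache sharedCache) {
        this.sharedCache = sharedCache;
    }

    String get(String pk) {
        // A transaction sees its own pending changes first.
        if (privateCache.containsKey(pk)) {
            return privateCache.get(pk);
        }
        return sharedCache.read(pk);
    }

    void put(String pk, String value) {
        privateCache.put(pk, value);
    }

    void commit() {
        sharedCache.applyCommit(privateCache);
        privateCache.clear();
    }

    void rollback() {
        privateCache.clear(); // pending changes never reach the shared cache
    }
}
```

Even in this toy form, the benefit for read-heavy workloads is visible: once a value is in the shared cache, further reads from any transaction are served from memory, and only commits touch the database.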

Important 

A really good entity bean cache will greatly improve the performance of entity beans. However, remember that entity beans are not the only way to deliver caching. The JDO architecture allows JDO persistence managers to offer caching that's at least as sophisticated as any entity bean cache.
