JaVa
   

Working with Relational Databases

As relational databases are used in most J2EE apps, let's examine some important RDBMS characteristics and capabilities, which we should take into account when formulating a data access strategy.

Referential Integrity

All RDBMSs offer sophisticated mechanisms for enforcing referential integrity, such as constraints (which prevent the addition of data that does not honor existing relationships) and cascade delete (in which related data is automatically deleted when its "parent" row is deleted). In my experience, it's unusual for a J2EE app to have a database to itself in an enterprise system. As databases are a good means of communication between different apps, rather than extensions of J2EE apps, this makes sense. Hence RDBMS-based referential integrity mechanisms are essential in most enterprise apps, and we should not rely on our app'sjava code as the sole guardian of data integrity.

Note 

EJB 2.0 offers a referential integrity mechanism for entity beans with CMP, as do many other 0/R mappings. This is useful, but it's only a supplement to referential integrity enforced by the RDBMS.

Stored Procedures, Triggers, and Views

Most RDBMSs offer stored procedures: operations on stored data that run within the database. Unfortunately, the language used differs between databases, although stored procedure languages tend to be SQL-oriented, such as Oracle's PL/SQL. Many RDBMSs offer triggers: stored procedures associated with a particular table that is automatically invoked on an event such as insertion, and don't need to be called by app code. Clearly, stored procedures and triggers can be abused. While an obvious use is to ensure referential integrity, what about the use of a trigger to enforce a business logic constraint? For example, what if a trigger was to veto an attempt to add a row to a order table based on the business rule that orders over $700 can only be accepted from Albania if the customer has previously made (and paid for) at least three orders over $50? This is business logic, and shouldn't be in the database in a J2EE app. Such business rales should be enforced by business objects. Views – virtual tables, usually based on a query executed transparentiy as the view is accessed – can be useful to simplify queries and enable O/R mappings to joins. However, the level of support depends on the underlying database (for example, join views are partially updateable in Oracle 7.3 and later, but not in Cloudscape 3.6.). Usually, views are suitable only for backing read-only objects. These implementation-specific features of RDBMSs have a place in J2EE apps, so long as the following criteria are satisfied:

Stored procedures are particularly important, and deserve more detailed discussion. J2EE developers (and Java developers in general) tend to hate stored procedures. There is some justification for this:

Some other common objections have less validity:

The use of stored procedures from J2EE apps is an area where we should be pragmatic, and avoid rigid positions. I feel that many J2EE developers' blanket rejection of stored procedures is a mistake. There are clear benefits in using stored procedures to implement persistence logic in some situations:

The danger in using stored procedures is the temptation to use them to implement business logic. This has many negative consequences, for example:

If we distinguish between persistence logic and business logic, using stored procedures will not break our architecture. Using a stored procedure is a good choice if it meets the following criteria:

Important 

Do not use stored procedures to implement business logic. This should be done in Java business objects. However, stored procedures are a legitimate choice to implement some of the functionality of a DAO. There is no reason to reject use of stored procedures on design grounds.

Note 

A case in point: In late 2001, Microsoft released a .NET version of Sun's Java Pet Store which they claimed to be 28 times faster. This performance gain appeared largely due to Microsoft's data access approach, which replaced entity beans with SQL Server stored procedures. I found reaction to Microsoft's announcement in the J2EE community disappointing and worrying (see, for example http://www.theserverside.com/discussion/thread.jsp?thread_id=9797). J2EE purists reacted in horror at Microsoft's use of stored procedures, arguing that the Java Pet Store reflected afar superior design, with the benefits of "an object oriented domain model" and portability between databases. Most of all, the purists were concerned about the likely corrupting effect of the Microsoft benchmark on managers, who should clearly never be allowed to determine how fast an app should run. I'm an enthusiastic advocate of OO design principles, as you know after digesting , but I read such responses with incredulity. Design is a tool to an end. Real apps must meet performance requirements, and design that impedes this is bad design.

Also, the panic and denial was unnecessary. The benchmark did not prove that J2EE is inherently less performant than .NET. The architectural approach Microsoft used could equally be implemented in J2EE (more easily than Sun's original Pet Shop example, in fact), but it did prove that J2EE orthodoxy can be dangerous.

Important 

Isn't the use of stored procedures going back to the bad old days of two-tiered apps? No, it's not; two-tiered solutions went to the database to perform business logic. Stored procedures should only be used in a J2EE system to perform operations that will always use the database heavily, whether they're implemented in the database or in Java code that exchanges a lot of data with the database.

RDBMS Performance Issues

The heavier a J2EE app's use of an RDBMS, the more important it will be to ensure that the schema is efficient and the database is tuned.

RDBMS Performance Tuning

RDBMSs, like J2EE app servers, are complex pieces of software: much more complex than the vast majority of user apps. As with J2EE servers, app performance can be significantly affected by a host of tuning options that require expert knowledge. Hire a DBA. Clearly, performance tuning is a losing battle if:

One potential quick win is the creation of indexes. Indexes, as the name implies, enable the RDBMS to locate a row very quickly, and based on values in one or more columns. These are automatically created on primary keys in most databases, but may need to be created to support some use cases regardless of what data access strategy we use in our Java code. For example, if we have several million users in our user table. Data for each user includes a numeric primary key (indexed by default), an e-mail address, and password. If the e-mail address is the user's login, we'll need to locate rows quickly by e-mail address and password when a user attempts to login. Without an index on these columns, this will require a full table scan. On a table this size this will take many seconds, and heavy disk access. With an index on e-mail and password, locating a user's row will be almost instant, and load on the RDBMS minimal.

Denormalization

Occasionally, certain queries remain slow regardless of query optimization and performance tuning, usually because they involve multi-table joins. In such cases, there is a last resort: denormalization, or the holding of redundant data in the database for performance reasons. Typically, this greatly reduces the complexity of joins required. I assume that the reader is familiar with the relational concept of normalization. This is essential knowledge for any J2EE developer, so please refer to a relational database primer if necessary. Denormalization carries serious risks. It increases the size of the database, which may prove a problem if there is a lot of data. Most importantly, it can impair data integrity. Whenever something is stored more than once, there's the potential for the copies to get out of sync. Sometimes it's possible to denormalize in Java, rather than in the database. This creates non-permanent redundancy, so is preferable in principle. This is usually only an option when we can cache a manageable amount of reference data in a partially read-only data structure: Java code may be able to navigate this structure more efficiently than it could be in the database. Occasionally, denormalization is attempted simply to enable J2EE apps to work more easily with an RDBMS. This is seldom a good idea, because of the risk involved.

Important 

Don't denormalize a relational database purely to support a J2EE app. The database schema may well outlive the J2EE app, in which case so will the costs of denormalization.

JaVa
   
Comments