JaVa
   

Coding Standards

J2EE projects tend to be big projects. Big projects require teamwork, and teamwork depends on consistent coding practices. We know that more effort is spent on software maintenance than initial development, so it's vital to ensure that apps are easy to work on. This makes good Java coding standards – as well as the practice of sound OO design principles – vital across J2EE projects. Coding standards are particularly important if we choose to use XP. Collective code ownership can only work if all code is written to the same standards, and there are no significant discrepancies in style within a team. Why does a section on Java coding standards (albeit with a J2EE emphasis) belong in a tutorial on J2EE? Because there's a danger in getting lost in the details of J2EE technology, and losing sight of good coding practice. This danger is shown by many J2EE sample apps, which contain sloppy code. Sun are serious offenders in this respect. For example, the Smart Ticket Demo version 1.1 contains practically no comments, uses meaningless method parameter names such as u, p, zc and cc, and contains serious coding errors such as consistently failing to close JDBC connections correctly in the event of exceptions. Code that isn't good enough to go into a production app is definitely not good enough to serve as an example. Perhaps the authors of such apps believe that omitting such "refinements" clarifies the architectural patterns they seek to illustrate. This is a mistake. J2EE is often used for large projects in which sloppy practices will wreak havoc. Furthermore, bringing code to production standard may expose inadequacies in the original, naïve implementation. As with design principles, this is a huge area, so the following discussion is far from comprehensive. However, it tries to address issues that I've found to be of particular importance in practice. Again, there are necessarily matters of opinion, and the discussion is based on my opinions and practical experience.

Start from the Standard

Don't invent your own coding conventions or import those from other languages you've worked in. Java is a relatively simple language, offering only one way to do many things. In contrast, Java's predecessor C++ usually offered several. Partly for this reason, there's a greater degree of standardization in the way developers write in Java, which should be respected. For example, you may be familiar with "Hungarian notation" or Smalltalk naming conventions. However, Hungarian Notation exists to solve problems (the proliferation of types in the Windows API) that don't exist in Java. A growing proportion of Java developers haven't worked in other languages, and will be baffled by code that imports naming conventions. Start from Sun's Java coding conventions (available at http://java.oracle.com/docs/codeconv/html/CodeConvTOC.doc.html). Introduce refinements and variations if you prefer, but don't stray too far from common Java practice. If you organization already has coding standards, work within them unless they are seriously non-standard or questionable. In that case, don't ignore them: initiate discussion on how to improve them. Some other coding standards worth a look are:

It is, however, worth mentioning one common problem that results from adhering to standard Java practice. This concerns the convention of using the instance variable name as a parameter, and resolving ambiguity using this. This is often used in property setters. For example:

 private String name;
 public void setName (String name) {
 this.name = name;
 }


On the positive side, this is a common Java idiom, so it's widely understood. On the negative, it's very easy to forget to use this to distinguish between the two variables with the same name (the parameter will mask the instance variable). The following form of this method will compile:

 public void setName(String name) {
 name = name;
 }


As will this, which contains a typo in the name of the method parameter:

 public void setName(String nme) {
 name = name;


In both these cases (assuming that the instance variable name started off as null) mysterious null pointer exceptions will occur at runtime. In the first erroneous version, we've assigned the method parameter to itself, accomplishing nothing. In the second, we've assigned the instance variable to itself, leaving it null. I don't advocate using the C++ convention of prefixing instance or member variables with m_ (for example, m_name), as it's ugly and inconsistent with other Java conventions (underscores are normally only used in constants in Java). However, I recommend the following three practices to avoid the likelihood of the two errors we've just seen:

Note 

See http://www.beust.com/cedric/naming/index.html for arguments against standard Java convention in this area, from Cedric Beust, lead developer of the WebLogic EJB container.

Consistent file organization is important, as it enables all developers on a project to grasp a class's structure quickly. I use the following conventions, which supplement Sun's conventions:

I use section delimiters like this:

 //---------------------------------------------------------------------
 // Implementation of interface MyInterface
 //---------------------------------------------------------------------


Please refer to the classes in the /framework/src directory in the download accompanying this tutorial for examples of use of the layout and conventions described here. The com.interface21.beans.factory.support.AbstractBeanFactory class is one good example.

Note 

If you need to be convinced of the need for coding standards, and have some time to spare, read http://www.mindprod.com/unmain.html.

Allocation of Responsibilities

Every class should have a clear responsibility. Code that doesn't fit should be refactored, usually into a helper class (inner classes are often a good way to do this in Java). If code at a different conceptual level will be reused by related objects, it may be promoted into a superclass. However, as we've seen, delegation to a helper is often preferable to concrete inheritance. Applying this rule generally prevents class size blowout. Even with generous Javadoc and internal comments, any class longer than 500 lines of code is a candidate for refactoring, as it probably has too much responsibility. Such refactoring also promotes flexibility. If the helper class's functionality might need to be implemented differently in different situations, an interface can be used to decouple the original class from the helper (in the Strategy design pattern). The same principle should be applied to methods:

Important 

A method should have a single clear responsibility, and all operations should be at the same level of abstraction.

Where this is not the case, the method should be refactored. In practice, I find that this prevents methods becoming too long.

I don't use any hard and fast rules for method lengths. My comfort threshold is largely dictated by how much code I can see at once on screen (given that I normally devote only part of my screen to viewing code, and sometimes work on a laptop). This tends to be 30 to 40 lines (including internal implementation comments, but not Javadoc method comments). I find that methods longer than this can usually be refactored. Even if a unit of several individual tasks within a method is invoked only once, it's a good idea to extract them into a private method. By giving such methods appropriate names (there are no prizes for short method names!) code is made easier to read and self-documenting.

Avoid Code Duplication

It may seem an obvious point, but code duplication is deadly. A simple example from the Java Pet Store 1.3 illustrates the point. One EJB implementation contains the following two methods:

 public void ejbCreate() {
 try {
 dao = CatalogDAOFactory.getDAO();
 } catch (CatalogDAOSysException se) {
 Debug.println("Exception getting dao " + se);
 throw new EJBException(se.getMessage());
 }
 }


and:

 public void ejbActivate() {
 try {
 dao = CatalogDAOFactory.getDAO();
 } catch (CatalogDAOSysException se) {
 throw new EJBException(se.getMessage());
 }
 }


This may seem trivial, but such code duplication leads to serious problems, such as:

The following refactoring is simpler and much more maintainable:

 public void ejbCreate() {
 initializeDAO();
 }
 public void ejbActivate() {
 initializeDAO();
 }
 private void initializeDAO() {
 try {
 dao = CatalogDAOFactory.getDAO();
 } catch (CatalogDAOSysException se) {
 Debug.println("Exception getting dao " + se);
 throw new EJBException(se.getMessage());
 }
 }


Note that we've consolidated the code; we can make a single line change to improve it to use the new EJBException constructor in EJB 2.0 that takes a message along with a nested exception. We'll also include information about what we were trying to do:

 throw new EJBException("Error loading data access object: " +
 se.getMessage(), se);


EJB 1.1 allowed EJBExceptions to contain nested exceptions, but it was impossible to construct an EJBException with both a message and a nested exception, forcing us to choose between including the nested exception or a meaningful message about what the EJB was trying to do when it caught the exception.

Avoid Literal Constants

Important 

With the exception of the well-known distinguished values 0, null and "" (the empty string) do not use literal constants inside Java classes.

Consider the following example. A class that contains the following code as part of processing an order:

 if (balance > 10000) {
 throw new SpendingLimitExceededException (balance, 10000);
 }


Unfortunately, we often see this kind of code. However, it leads to many problems:

It's better to use a constant. In Java, this means a static final instance variable. For example:

 private static final int SPENDING_LIMIT = 10000;
 if (balance > SPENDING_LIMIT) {
 throw new SpendingLimitExceededException(balance, SPENDING_LIMIT);
 }


This version is much more readable and much less error prone. In many cases, it's good enough. However, it's still problematic in some circumstances. What if the spending limit isn't always the same? Today's constant might be tomorrow's variable. The following alternative allows us more control:

 private static final int DEFAULT_SPENDING_LIMIT = 10000;
 protected int spendingLimit() {
 return DEFAULT_SPENDING_LIMIT;
 {
 if (balance > spendingLimit()) {
 throw new SpendingLimitExceededException(balance, spendingLimit());
 }


At the cost of a little more code, we can now calculate the spending limit at runtime if necessary. Also, a subclass can override the protected spendingLimit() method. In contrast, it's impossible to override a static variable. A subclass might even expose a bean property enabling the spending limit to be set outside Java code, by a configuration manager class (see the Avoiding a proliferation of Singletons by Using an app Registry section earlier). Whether the spendingLimit() method should be public is a separate issue. Unless other classes are known to need to use it, it's probably better to keep it protected. I suggest the following criteria to determine how to program a constant:

Requirement

Example

Recommendation

String constant that is effectively part of app code

Simple SQL SELECT statement used once only and which won't vary between databases.
JDO query used once only.

This is a rare exception to the overall rule when there's little benefit in using a named constant or method value instead of a literal string. In this case, it makes sense for the string to appear at the point in the app where it is used, as it's effectively part of app code.

Constant that will never vary

JNDI name – such as the name of an EJB – that will be same in all app servers.

Use a static final variable. Shared constants can be declared in an interface, which can be implemented by multiple classes to simplify syntax.

Constant that may vary at compile time

JNDI name – such as the name of the TransactionManager – that is likely to vary between app servers.

Use a protected method, which subclasses may override, or which may return a bean property, allowing external configuration,

Constant that may vary at runtime

Spending limit.

Use a protected method.

Constant subject to internationalization

Error message or other string that may need to vary in different locales.

Use a protected method or a ResourceBundle lookup. Note that a protected method may return a value that was obtained from a ResourceBundle lookup, possibly outside the class.

Visibility and Scoping

The visibility of instance variables and methods is one of the important questions on the boundary between coding standards and OO design principles. As field and method visibility can have a big impact on maintainability, it's important to apply consistent standards in this area. I recommend the following general rule:

Important 

Variables and methods should have the least possible visibility (of private, package, protected and public). Variables should be declared as locally as possible.

Let's consider some of the key issues in turn.

Public Instance Variables

The use of public instance variables is indefensible, except for rare special cases. It usually reflects bad design or programmer laziness. If any caller can manipulate the state of an object without using the object's methods, encapsulation is fatally compromised. We can never maintain any invariants about the object's state. Core J2EE Patterns suggests the use of public instance variables as an acceptable strategy in the Value Object J2EE pattern (value objects are serializable parameters containing data, rather than behavior, exchanged between JVMs in remote method calls). I believe that this is only acceptable if the variables are made final (preventing their values from being changed after object construction and avoiding the potential for callers to manipulate object state directory). However, there are many serious disadvantages that should be considered with any use of public instance variables in value objects, which I believe should rule it out. For example:

A value object using public instance variables is really a special case of a struct: a group of variables without any behavior. Unlike C++ (which is a superset of C) Java does not have a struct type. However, it is easy to define structs in Java, as objects containing only public instance variables. Due to their inflexibility, structs are only suited to local use: for example, as private and protected inner classes. A struct might be used to return multiple values from method, for example, given that Java doesn't support call by reference for primitive types. I don't see such concealed structs as a gross violation of OO principles. However, structs usually require constructors, bringing them closer to true objects. As IDEs make it easy to generate getter and setter methods for instance variables, using public instance variables is a very marginal time saving during development. In modern JVMs, any performance gain will be microscopic, except for very rare cases. I find that structs are usually elevated into true objects by refactoring, making it wiser to avoid their use in the first place.

Important 

The advantages in the rare legitimate uses of public instance variables are so marginal, and the consequence of misuse of public instance variables so grave, that I recommend banning the use of public instance variables altogether.

Protected and Package Protected Instance Variables

Instance variables should be private, with few exceptions. Expose such variables through protected accessor methods if necessary to support subclasses. I strongly disagree with coding standards (such as Doug Lea's) that advocate making instance variables protected, in order to maximize the freedom for subclasses. This is a questionable approach to concrete inheritance. It means that the integrity and invariants of the superclass can be compromised by buggy subclass code. In practice, I find that subclassing works as perfectly as a "black box" operation. There are many better ways of allowing class behavior to be modified than by exposing instance variables for subclasses to manipulate as they please, such as using the Template Method and Strategy design patterns (discussed above) and providing protected methods as necessary to allow controlled manipulation of superclass state. Allowing subclasses to access protected instance variables produces tight coupling between classes in an inheritance hierarchy, making it difficult to change the implementation of classes within it. Scott Ambler argues strongly that all instance variables should be private and, further, that "the ONLY member functions that are allowed to directly work with a field are the accessor member functions themselves" (that is, even methods within the declaring class should use getter and setter methods, rather than access the private instance variable directly). I feel that a protected instance variable is only acceptable if it's final (say, a logger that subclasses will use without initializing or modifying). This has the advantage of avoiding a method call, offering slightly simpler syntax. However, even in this case there are disadvantages. It's impossible to return a different object in different circumstances, and subclasses cannot override a variable as they can a method. I seldom see a legitimate use for Java's package (default) visibility for instance variables. It's a bit like C++'s friend mechanism: the fair-weather friend of lazy programmers.

Important 

Avoid protected instance variables. They usually reflect bad design: there's nearly always a better solution. The only exception is the rare case when an instance variable can be made final.

Method Visibility

Although method invocations can never pose the same danger as direct manipulation of instance variables, there are many benefits in reducing method visibility as far as possible. This is another way to reduce the coupling between classes. It's important to distinguish between the requirements of classes that use a class (even subclasses) and the class's internal requirements. This can both prevent accidental corruption of the class's internal state and simplify the task of developers working with the class, by offering them only the choices they need.

Important 

Hide methods as much as possible. The fewer methods that are public, package protected or protected, the cleaner a class is and the easier it will be to test, use, subclass and refactor. Often, the only public methods that a class exposes will be the methods of the interfaces it implements and methods exposing JavaBean properties.

It's a common practice to make a class's implementation methods protected rather than private, to allow them to be used by subclasses. This is inadvisable. In my experience, inheritance is best approached as a black box operation, rather than a white box operation. If class Dog extends Animal, this should mean that a Dog can be used where an Animal can be used, not that the Dog class needs to know the details of Animal's implementation. The protected modifier is best used for abstract methods (as in the Template Method design pattern), or for read-only helper methods required by subclasses. In both these cases, there are real advantages in making methods protected, rather than public.

I find that I seldom need to use package protected (default visibility) methods, although the objections to them are less severe than to protected instance variables. Sometimes package protected methods revealing class state can be helpful to test cases. Package protected classes are typically far more useful, enabling an entire class to be concealed within a package.

Variable Scoping

Variables should be declared as close as possible to where they are used. The fewer variables in scope, the easier code is to read and debug. It's a serious mistake to use an instance variable where an automatic method variable and/or additional method parameters can be used. Use C++/Java local declarations, in which variables are declared just before they're used, rather than C-style declarations at the beginning of methods.

Inner Classes and Interfaces

Inner classes and interfaces can be used in Java to avoid namespace pollution. Inner classes are often helpers, and can be used to ensure that the outer class has a consistent responsibility. Understand the difference between static and non-static inner classes. Static inner classes can be instantiated without the creation of an object of the enclosing type; non-static inner classes are linked to an instance of the enclosing type. There's no distinction for interfaces, which are always static. Inner interfaces are typically used when a class requires a helper that may vary in concrete class, but not in type, and when this helper is of no interest to other classes (we've already seen an example of this). Anonymous inner classes offer convenient implementation of simple interfaces, or overrides that add a small amount of new behavior. Their most idiomatic use is for action handlers in Swing GUIs, which is of limited relevance to J2EE apps. However, they can be useful when implementing callback methods, which we discussed above. For example, we could implement a JDBC callback interface with an anonymous inner class as follows:

 public void anonClass() {
 JdbcTemplate template = new JdbcTemplate(null);
 template.update(new PreparedStatementCreator() {
 public PreparedStatement createPreparedStatement
 (Connection conn) throws SQLException {
 PreparedStatement ps =
 conn.prepareStatement("DELETE FROM TAB WHERE ID=?");
 ps.setInt(1, 1);
 return ps;
 }
 }); 
 }



Anonymous inner classes have the disadvantages that they don't promote code reuse, can't have constructors that take arguments and are only accessible in the single method call. In the above example, these restrictions aren't a problem, as the anonymous inner class doesn't need constructor arguments and doesn't need to return data. Any inner class (including anonymous inner classes) can access superclass instance variables, which offers a way to read information from and update the enclosing class, to work around these restrictions. Personally I seldom use anonymous inner classes except when using Swing, as I've found that they're nearly always refactored into named inner classes. A halfway house between top-level inner classes (usable by all methods and potentially other objects) and anonymous inner classes is a named inner class defined within a method. This avoids polluting the class's namespace, but allows the use of a normal constructor. However, like anonymous inner classes, local classes may lead to code duplication. Named classes defined within methods have the advantages that they can implement constructors that take arguments and can be invoked multiple times. In the following example, the named inner class not only implements a callback interface, but adds a new public method, which we use to obtain data after its work is complete:

 public void methodClass() {
 JdbcTemplate template = new JdbcTemplate(dataSource);
 class Counter implements RowCallbackHandler {
 private int count = 0;
 public void processRow(ResultSet rs) throws SQLException {
 count++;
 }
 public int getCount() {
 return count;
 }
 } 
 Counter counter = new Counter();
 template.query("SELECT ID FROM MYTABLE", counter);
 int count = counter.getCount();
 }



It would be impossible to implement the above example with an anonymous inner class without making (inappropriate) use of an instance variable in the enclosing class to hold the count value.

Using the final Keyword

The final keyword can be used in several situations to good effect.

Method Overriding and Final Methods

There is a common misconception that making methods final reduces the reusability of a class, because it unduly constrains the implementation of subclasses. In fact, overriding concrete methods is a poor way of achieving extensibility. I recommend making public and protected non-abstract methods final. This can help to eliminate a common cause of bugs: subclasses corrupting the state of their superclasses. Overriding methods is inherently dangerous. Consider the following problems and questions:

Another OO principle – the Open Closed Principle – states that an object should be open to extension, but closed to modification. By overriding concrete methods, we effectively modify an object, and can no longer guarantee its integrity. Following the Open Closed Principle helps to reduce the likelihood of bugs as new functionality is added to an app, because the new functionality is added in new code, rather than by modifying existing code, potentially breaking it. Especially in the case of classes that will be overridden by many different subclasses, making superclass methods final when methods cannot be private (for example, if they implement an interface and hence must be public) will simplify the job of programmers developing subclass implementations. For example, most programmers will create subclasses using IDEs offering code helpers: it's much preferable if these present a list of just those non-final methods that can – or, in the case of abstract methods, must – be overridden. Making methods final will produce a slight performance gain, although this is likely to be too marginal to be a consideration in most cases. Note that there are better ways of extending an object than by overriding concrete methods. For example, the Strategy design pattern (discussed earlier) can be used to parameterize some of the object's behavior by delegating to an interface. Different implementations of the interface can be provided at runtime to alter the behavior (but not compromise the integrity) of the object. I've used final methods as suggested here in several large projects, and the result has been the virtual elimination of bugs relating to corruption of superclass state, with no adverse impact on class reusability. Final methods are often used in conjunction with protected abstract methods. An idiomatic use of this is what I call "chaining initializers". Consider a hypothetical servlet superclass, AbstractServlet. Suppose that one of the purposes of this convenient superclass is to initialize a number of helper classes required by subclasses. The AbstractServlet class initializes these helper classes in its implementation of the Servlet API init() method. To preserve the integrity of the superclass, this method should be made final (otherwise, a subclass could override init() without invoking AbstractServlet's implementation of this method, meaning that the superclass state wouldn't be correctly initialized). However, subclasses may need to implement their own initialization, distinct from that of the superclass. The answer is for the superclass to invoke a chained method in a final implementation of init(), like this:

 public final void init() {
 // init helpers
 //...
 onInit();
 }
protected abstract void onInit();


The onInit() method is sometimes called a hook method. A variation in this situation is to provide an empty implementation of the onInit() method, rather than making it abstract. This prevents subclasses that don't need their own initialization from being forced to implement this method. However, it has the disadvantage that a simple typo could result in the subclass providing a method that is never invoked: for example, by calling it oninit(). This technique can be used in many situations, not just initialization. In my experience, it's particularly important in frameworks, whose classes will often be subclassed, and for which developers of subclasses should have no reason to manipulate (or closely examine) superclass behavior. I recommend that public or protected non-abstract methods should usually be made final, unless one of the following conditions applies:

My views in this area are somewhat controversial. However, experience in several large projects has convinced me of the value of writing code that helps to minimize the potential for errors in code written around it. This position was summarized by the distinguished computer scientist (and inventor of quicksort) C.A.R. Hoare as follows:

Note 

"I was eventually persuaded of the need to design coding notations so as to maximize the number of errors which cannot be made, or if made, can be reliably detected at compile time" (1980 Turing Award Lecture).

Final Classes

Final classes are used less frequently than final methods, as they're a more drastic way of curtailing object modification. The UML Reference Manual (Oracle; X) goes so far as to recommend that only abstract classes should be sub-classed (for the reasons we've discussed when considering final methods). However, I feel that if final methods are used appropriately, there's little need to make classes final to preserve object integrity.

I tend to use final classes only for objects that must be guaranteed to be immutable: for example, value objects that contain data resulting from an insurance quotation.

Final instance Variables

I've already mentioned the use of final protected instance variables. A final instance variable may be initialized at most once, either at its declaration or in a constructor. Final instance variables are the only way to define constants in Java, which is their normal use. However, they can occasionally be used to allow superclasses to expose protected instance variables without allowing subclasses to manipulate them, or to allow any class to expose public instance variables that cannot be manipulated.

Note 

Java language gurus will also note that final instance variables can be initialized in a class initializer: a block of code that appears in a class outside a method body, and is evaluated when an object is instantiated. Class initializers are used less often than static initializers, as constructors are usually preferable.

Implementing toString() Methods Useful for Diagnostics

It's good practice for classes to implement toString() methods that summarize their state. This can be especially helpful in generating log messages (we'll discuss logging below). For example, consider the following code, which might be used in a value object representing a user, and which provides a concise, easily readable dump of the object's state which will prove very useful in debugging:

 public String toString() {
 StringBuffer sb = new StringBuffer(getClass() .getName() + ": ");
 sb.append("pk=" + id + "; ");
 sb.append("surname="' + getSurname() + "'; ");
 sb.append("forename="' + getForename() + "'; ");
 sb.append(" systemHashCode=" + System.identityHashCode());
 return sb.toString();
 }


Note the use of a StringBuffer, which is more efficient than concatenating strings with the + operator. Also note that the string forename and surname values are enclosed in single quotes, which will make any white space which may be causing unexpected behavior easy to detect. Note also that the state string includes the object's hash code. This can be very useful to verify if objects are distinct at runtime. The example uses System.identityHashCode() instead of the object's hashCode() method as the System.identityHashCode() method returns the default Object hash code, which in most JVMs will be based on an object's location in memory, rather than any override of this method that the object may implement.

Another important use of toString() values is to show the type and configuration of an implementation of an interface.

Defensive Coding Practices

NullPointerExceptions are a common cause of bugs. Since NullPointerExceptions don't carry helpful messages, the problems they cause can be hard to track down. Let's consider some coding standards we can apply to reduce the likelihood of them occurring at runtime.

Handle Nulls Correctly

It's particularly important to consider what will happen when an object is null. I recommend the following guidelines for handling the possibility of nulls:

Consider the Ordering of Object Comparisons

The following two lines of code will produce the same result in normal operation:

 if (myStringVariable.equals(MY_STRING_CONSTANT))
 if (MY_STRING_CONSTANT.equals(myStringVariable))


However, the second form is more robust. What if myStringVariable is null? The second condition will evaluate to false, without error, while the first will throw a NullPointerException. It's usually a good idea to perform object comparisons by calling the equals() method on the object less likely to be null. If it's an error for the other object to be null, perform an explicit check for null and throw the appropriate exception (which won't be NullPointerException).

Use Short-circuit Evaluation

Sometimes we can rely on Java's short-circuit evaluation of Boolean expressions to avoid potential errors: for example, with null objects. Consider the following code fragment:

 if ( (o != null) && (o.getValue() < 0))


This is safe even if the object o is null. In this case, the second test won't be executed, as the condition has already evaluated to false. Of course, this idiom can only be used if it reflects the intention of the code. Something quite different might need to be done (besides evaluating this condition to false) if o is null. However, it's a safe bet that we don't want a NullPointerException.

An alternative is to perform the second check in an inner if statement, only after an outer if statement has established that the object is non-null. However, I don't recommend this approach unless there is some other justification for the nested if statements (which, however, there often will be), as statement nesting adds complexity.

Distinguish Whitespace in Debug Statements and Error Messages

Consider the following scenario. A web app fails with the following error: Error in com.foo.bar.MagicServlet: Cannot load class com.foo.bar.Magic The developer checks and establishes that the class com.foo.bar.Magic, as expected, is in the web app's classpath, in a JAR file in the /WEB–INF/lib directory. The problem makes no sense: is it an obscure J2EE classloading issue? The developer writes a JSP that successfully loads the class by name, and is still more puzzled. Now, consider the alternative error message: Error in com.foo.bar.MagicServlet: Cannot load class ‘com.foo.bar.Magic’

Now the problem is obvious: com.foo.bar.MagicServlet is trying to load class com.foo.bar.Magic by name, and somehow a trailing space has gotten into the class name. The moral of the story is that white space is important in debug statements and error messages. String literals should be enclosed in delimiters that clearly show what is part of the string and what isn't. Where possible, the delimiters should be illegal in the variable itself.

Prefer Arrays to Collections in Public Method Signatures

Java's lack of generic types mean that whenever we use a collection, we're forced to cast to access its elements, even when – as we usually do – we know that all its elements are of the same type. This longstanding issue may be addressed in Java 1.5 with the introduction of a simpler analog of C++'s template mechanism. Casts are slow, complicate code, and are potentially fragile. Using collections seldom poses seriously problems within a class's implementation. However, it's more problematic when collections are used as parameters in a class's public interface, as there's a risk that external callers may supply collections containing elements of incorrect types. Public interface methods returning a collection will require callers to cast.

Important 

Use a typed array in preference to a collection if possible when defining the signatures for public methods.

Preferring collections to arrays provides a much clearer indication of method purpose and usage, and may eliminate the need to perform casts, which carry a heavy performance cost. This recommendation shouldn't be applied rigidly. Note that there are several situations where a collection is the correct choice:

Note that it's possible to convert a collection to a typed array in a single line of code, if we know that all the elements are of the required type. For example, if we know that the collection c consists of Product objects we can use the following code:

 Product[] products = (Product[]) c.toArray(new Product(c.size()]);


Documenting Code

There is no excuse for inadequate code documentation, in any language. Java goes a step further than most languages in helping developers to document code by standardizing documentation conventions with Javadoc.

Important 

Code that isn't fully documented is unfinished and potentially useless.

Remember that documentation should serve to:

I suggest the following documentation guidelines:

Finally, if you don't already, learn to touch type. It's much easier to write comments if you can type fluently. It's surprisingly easy to learn to touch type (and no, non-touch typists never approach the speed of touch typists, even if they seem to have a flurry of activity).

Logging

It's important to instrument code: to add logging capabilities that help to trace the app's execution. Adequate instrumentation is so important that it should be a required coding standard. Logging has many uses, but the most important is probably to facilitate debugging. It's not a fashionable position, but I think that debugging tools are overrated. However, I'm in good company; coding gurus Brian Kernighan and Rob Pike argue this point in The Practice of Programming, from Oracle (X). I find that I seldom need to use debuggers when working in Java. Writing code to emit log messages is a lower-tech but more lasting solution. Consider the following issues:

A good logging framework can provide detailed information about program flow. Both Java 1.4 logging and the Log4j logging package offer settings that show the class, method and line number that generated the log output. As with configuration in general, it's best to configure log output outside Java classes. It's common to see "verbose" flags and the like in Java classes themselves, enabling logging to be switched on. This is poor practice. It necessitates recompiling classes to reconfigure logging. Especially when using EJB, this can mean multiple deployments as debugging progresses. If logging options are held outside Java code, they can be changed without the need to change object code itself. Requirements of a production logging package should include:

Important 

Never use System.out for logging. Console output can't be configured. For example, we can't switch it off for a particular class, or choose to display a subset of messages. Console output may also seriously degrade performance when running in some servers.

Even code that is believed to be "finished" and bug free should be capable of generating log output. There may turn out to be bugs after all, bugs may be introduced by changes, or it may be necessary to switch on logging in a trusted module to see what's going wrong with other classes in development. For this reason, all app servers are capable of generating detailed log messages, if configured to do so. This is not only useful for the server's developers, but can help to track down problems in apps running on them.

Important 

Remember that unit tests are valuable in indicating what may be wrong with an object, but won't necessarily indicate where the problem is. Logging can provide valuable assistance here.

Instrumentation is also vital in performance tuning. By knowing what an app is doing and how it's doing it, it's much easier to establish which operations are unreasonably slow.

Important 

Code isn't ready for production unless it is capable of generating log messages and its log output can easily be configured.

Log messages should be divided into different priorities, and debug messages should indicate the whole workflow through a component. Debug log messages should often show object state (usually by invoking toString() methods).

Choosing a Logging API

Until the release of Java 1.4, Java had no standard logging functionality. Some APIs such as the Servlet API provided primitive logging functionality, but developers were forced to rely on third-party logging products such as Apache Log4j to achieve an app-wide logging solution. Such products added dependencies, as app code referenced them directly, and were potentially problematic in the EJB tier.

Java 1.4 Logging and a Pre-1.4 Emulation Package

Java 1.4 introduces a new package – java.util.logging – that provides a standard logging API meeting the criteria we've discussed. Since this tutorial is about J2EE 1.3, the following discussion assumes that Java 1.4 isn't available – if it is, simply use standard Java 1.4 logging functionality. Fortunately, it's possible to benefit from the standard API introduced in Java 1.4 even when running Java 1.3. This approach avoids dependence on proprietary logging APIs and makes eventual migration to Java 1.4 logging trivial. It also eliminates the need to learn a third-party API. Java 1.4 logging is merely an addition to the core Java class library, rather than a language change like Java 1.4 assertion support. Thus it is possible to provide an API emulating the Java 1.4 API and use it in Java 1.2 and 1.3 apps. app code can then use the Java 1.4 API. Although the full Java 1.4 logging infrastructure won't be available, actual log output can be generated by another logging package such as Log4j (Log4j is the most powerful and widely used pre-Java 1.4 logging solution). Thus the Java 1.4 emulation package is a fairly simple wrapper, which imposes negligible runtime overhead. The only catch is that Java 1.4 defines the logging classes in a new java.util.logging package. Packages under java are reserved for Sun. Hence we must import a distinctly named emulation package – I've chosen java14.java.util.logging – in place of the Java 1.4 java.util.logging package. This import can be changed when code is migrated to Java 1.4. See Appendix A for a discussion of the implementation of the Java 1.4 logging emulation package used in the infrastructure code and sample app accompanying this tutorial.

Note 

Log4j is arguably more powerful than Java 1.4 logging, so why not use Log4j directly ? Using Log4j may be problematic in some app servers; there is a clear advantage in using a standard Java API, and it's possible to use the powerful log output features of Log4j while using the Java 1.4 API (which differs comparatively little). However, using Log4j directly may be a good choice when using a third-party product (such as many open source projects) that already uses Log4j.

Important 

We have yet another choice for logging in web apps. The Servlet API provides logging methods available to any web component with access to the app's ServletContext. The javax.servlet.GenericServlet servlet superclass provided by the Servlet API provides convenient access to the same logging functionality. Don't use Servlet API logging. Most of an app's work should be done in ordinary Java classes, without access to Servlet API objects. Don't end up with components logging to different logs. Use the one solution for all logging, including from servlets.

Java 1.4 Logging Idioms

Once we've imported the emulation package, we can use the Java 1.4 API. Please refer to the Java 1.4 Javadoc for details. The most important class is the java.util.logging.Logger class, used both to obtain a logger and to write log output The most important methods are:

 Logger.getLogger(String name)


This obtains a logger object associated with a given component. The convention is that the name for a component should be the class name. For example:

 Logger logger = Logger.getLogger(getclass().getName());


Loggers are threadsafe, so it's significantly more efficient and results in simpler code to obtain and cache a logger to be used throughout the class's lifecycle. I normally use the following instance variable definition:

 protected final Logger logger = Logger.getLogger(getclass().getName());


Often an abstract superclass will include this definition, allowing subclasses to perform logging without importing any logging classes or obtaining a logger. Note that the protected instance variable is final, in accordance with the visibility guidelines discussed earlier. Logging calls will look like this:

 logger.fine("Found error number element <" +
 ERROR_NUMBER_ELEMENT + " >: checking numeric value");


Java 1.4 logging defines the following log level constants in the java.util.logging.Level class:

Each level has a corresponding convenience method, such as severe() and fine(). Generic methods allow the assigning of a level to a message and logging an exception.

Each message must be assigned one of these logging levels, to ensure that the granularity of logging can be controlled easily at runtime.

Logging and Performance

Correct use of a logging framework should have negligible effect on performance, as a logging framework should consume few resources. apps should usually be configured to log only errors in production, to avoid excessive overhead and the generation of excessively large log files. It's important to ensure that generating log messages doesn't slow down the app, even if these messages are never displayed. A common offender in this regard is using toString() methods on complex objects that access many methods and build large strings. If a log message might be slow to generate, it's important to check whether or not it will be displayed before generating it. A logging framework must provide fast methods that indicate whether messages with a given log priority will be displayed at runtime. Java 1.4 allows the ability to perform checks such as the following:

 if (logger. isLoggable (Level.FINE)) {
 logger.fine ("The state of my complex object is " + complexObject);
}


This code will execute very quickly if FINE log output is disabled for the given class, as the toString() method won't be invoked on complexObject. String operations are surprisingly expensive, so this is a very important optimization. Also remember to take care that logging statements cannot cause failures, by ensuring that objects they will call toString() cannot be null. An equally important performance issue with logging concerns log output. Both Java 1.4 logging and Log4j offer settings that show the class, method and line number that generated the log output. This setting should be switched off in production, as it's very expensive to generate this information (it can only be done by generating a new exception and parsing its stack trace string as generated by one of its printStackTrace() methods). However, it can be very useful during development. Java 1.4 logging allows the programmer to supply the class and method name through the logging API. At the cost of making logging messages harder to write and slightly more troublesome to read, this guarantees that this information will be available efficiently, even if a JIT makes it impossible to find sufficient detail from a stack trace. Other logging system configuration options with a significant impact on performance are:

Logging in the EJB Tier

In logging as in many other respects, the EJB tier poses special problems.

Let's discuss each issue in turn. Logging configuration isn't a major problem. We can load logging configuration from the classpath, rather than the file system, allowing it be included in EJB JAR files. What to do with log output is a more serious problem. Two solutions sometimes proposed are to write log output using enterprise resources that EJBs are allowed to use, such as databases; or to use JMS to publish log messages, hoping that a JMS message consumer will be able to do something legal with them. Neither of these solutions is attractive. Using a database will cause logging to have a severe impact on performance, which calls the viability of logging in question. Nor is a database a logical place to look for log messages. Using JMS merely pushes the problem somewhere else, and is also technological overkill (JMS is also likely to have a significant overhead). Another powerful argument against using enterprise resources such as databases and JMS topics or queues for logging is the real possibility that we will need to log a failure in the enterprise resource being used to generate the log output Imagine that we need to log the failure of the app server to access its database. If we attempt to write a log message to the same database, we'll produce another failure, and fail to generate a log message. It's important not to be too doctrinaire about EJB coding restrictions. Remember that EJB should be used to help us achieve our goals; we shouldn't let adopting it make life more difficult. The destination of log messages is best handled in logging system configuration, not Java code. In my view it's best to ignore these restrictions and log to a file, unless your EJB container objects (remember that EJB containers must perform logging internally; JBoss, for example, uses Log4j). Logging configuration can be changed if it is necessary to use a database or other output destination (this may be necessary if the EJB container doesn't necessarily sit on a file system; for example, if it is implemented on a database). I feel that the synchronization issue calls for a similar tempering of rigid interpretation of the EJB specification with practical considerations. It's impracticable to avoid using libraries that use synchronization in EJB (for example, it would rule out using all pre Java 1.2 collections, such as java.util.Vector; while there's seldom good reason to use these legacy classes today, vast amounts of existing code does and it's impossible to exclude it from EJB world). In we'll discuss the EJB coding restrictions in more detail. Finally, where distributed apps using EJB are concerned, we must consider the issue of remote method invocation. Java 1.4 loggers aren't serializable. Accordingly, we need to take special care when using logging in objects that will be passed between architectural tiers, such as value objects created in the EJB container and subsequently accessed in a remote client JVM. There are three plausible alternative approaches:

The third method allows caching, and will offer the best performance, although the complexity isn't usually justified. The following code fragment illustrates the approach. Note that the logger instance variable is transient. When such an object is passed as a remote parameter, this value will be left null, prompting the getLogger() method to cache the logger for the new JVM:

 private transient Logger logger;
 private Logger getLogger ( ) {
 if (this. logger == null) {
 // Need to get logger
 this.logger = Logger.getLogger(getClass().getNarae()) ; 
 }
 return this.logger;
}



A race condition is possible at the highlighted line. However, this isn't a problem, as object references (such as the logger instance variable) are atomic. The worse that can happen is that heavy concurrent access may result in multiple threads making unnecessary calls to Logger.getLogger(). The object's state cannot be corrupted, so there's no reason to synchronize this call (which would be undesirable when the object is used within the EJB container).

JaVa
   
Comments