Core Java 2: Volume I - Fundamentals

Table of Contents
 12.  Streams and Files


Object Streams

Using a fixed-length record format is a good choice if you need to store data of the same type. However, objects that you create in an object-oriented program are rarely all of the same type. For example, you may have an array called staff that is nominally an array of Employee records but contains objects that are actually instances of a child class such as Manager. If we want to save files that contain this kind of information, we must first save the type of each object and then the data that defines the current state of the object. When we read this information back from a file, we must:

  • Read the object type;
  • Create a blank object of that type;
  • Fill it with the data that we stored in the file.

It is entirely possible (if very tedious) to do this by hand, and in the first edition of this book we did exactly this. However, Sun Microsystems developed a powerful mechanism that allows this to be done with much less effort. As you will soon see, this mechanism, called object serialization, almost completely automates what was previously a very tedious process. (You will see later in this chapter where the term "serialization" comes from.)

Storing Objects of Variable Type

To save object data, you first need to open an ObjectOutputStream object:

ObjectOutputStream out = new ObjectOutputStream(new
 FileOutputStream("employee.dat"));


Now, to save an object, you simply use the writeObject method of the ObjectOutputStream class as in the following fragment:

Employee harry = new Employee("Harry Hacker", 50000,
 1989, 10, 1);
Manager boss = new Manager("Carl Cracker", 80000,
 1987, 12, 15);
out.writeObject(harry);
out.writeObject(boss);


To read the objects back in, first get an ObjectInputStream object:

ObjectInputStream in = new ObjectInputStream(new
 FileInputStream("employee.dat"));


Then, retrieve the objects in the same order in which they were written, using the readObject method.

Employee e1 = (Employee)in.readObject();
Employee e2 = (Employee)in.readObject();


When reading back objects, you must carefully keep track of the number of objects that were saved, their order, and their types. Each call to readObject reads in another object of the type Object. You, therefore, will need to cast it to its correct type.

If you don't need the exact type or you don't remember it, then you can cast it to any superclass or even leave it as type Object. For example, e2 is an Employee object variable even though it actually refers to a Manager object. If you need to dynamically query the type of the object, you can use the getClass method that we described earlier in this book.

You can write and read only objects with the writeObject/readObject methods, not numbers. To write and read numbers, you use methods such as writeInt/readInt or writeDouble/readDouble. (The object stream classes implement the DataInput/DataOutput interfaces.) Of course, numbers inside objects (such as the salary field of an Employee object) are saved and restored automatically. Recall that, in Java, strings and arrays are objects and can, therefore, be saved and restored with the writeObject/readObject methods.

There is, however, one change you need to make to any class that you want to save and restore in an object stream. The class must implement the Serializable interface:

class Employee implements Serializable { . . . }
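
As a minimal sketch of the rules above, the following program writes one number and two objects to a single stream and reads them back in the same order. The classes MixedStreamDemo, Person, and Boss are hypothetical stand-ins for the chapter's classes, and an in-memory byte array is assumed in place of a file so that the example is self-contained.

```java
import java.io.*;

class MixedStreamDemo
{
   // Hypothetical stand-ins for the Employee and Manager classes
   static class Person implements Serializable
   {
      Person(String n) { name = n; }
      String name;
   }

   static class Boss extends Person
   {
      Boss(String n) { super(n); }
   }

   /** Writes a count and two objects, reads them back, and summarizes the result. */
   static String roundTrip()
   {
      try
      {
         ByteArrayOutputStream buffer = new ByteArrayOutputStream();
         ObjectOutputStream out = new ObjectOutputStream(buffer);
         out.writeInt(2); // numbers use the DataOutput methods
         out.writeObject(new Person("Harry Hacker"));
         out.writeObject(new Boss("Carl Cracker"));
         out.close();

         ObjectInputStream in = new ObjectInputStream(
            new ByteArrayInputStream(buffer.toByteArray()));
         int count = in.readInt(); // read in the order of writing
         Person p1 = (Person) in.readObject();
         Person p2 = (Person) in.readObject(); // refers to a Boss
         in.close();

         // getClass reveals the actual type behind the Person variable
         return count + " " + p1.name + " " + p2.getClass().getSimpleName();
      }
      catch (IOException | ClassNotFoundException e)
      {
         throw new IllegalStateException(e);
      }
   }

   public static void main(String[] args)
   {
      System.out.println(roundTrip()); // prints "2 Harry Hacker Boss"
   }
}
```

Casting the second object to Person works even though a Boss was saved, because the stream restores the object with its original class.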


The Serializable interface has no methods, so you don't need to change your classes in any other way. In this regard, it is similar to the Cloneable interface that we also discussed earlier in this book. However, to make a class cloneable, you still had to override the clone method of the Object class. To make a class serializable, you do not need to do anything else.

Why aren't all classes serializable by default? We will discuss this in the section "Security."

Example 12-4 is a test program that writes an array containing two employees and one manager to disk and then restores it. Writing an array is done with a single operation:

Employee[] staff = new Employee[3];
. . .
out.writeObject(staff);


Similarly, reading in the result is done with a single operation. However, we must apply a cast to the return value of the readObject method:

Employee[] newStaff = (Employee[])in.readObject();


Once the information is restored, we give each employee a 100% raise, not because we are feeling generous, but because you can then easily distinguish employee and manager objects by their different raiseSalary actions. This should convince you that we did restore the correct types.

Example 12-4 ObjectFileTest.java
import java.io.*;
import java.util.*;

class ObjectFileTest
{
   public static void main(String[] args)
   {
      Manager boss = new Manager("Carl Cracker", 80000,
         1987, 12, 15);
      boss.setBonus(5000);

      Employee[] staff = new Employee[3];

      staff[0] = boss;
      staff[1] = new Employee("Harry Hacker", 50000,
         1989, 10, 1);
      staff[2] = new Employee("Tony Tester", 40000,
         1990, 3, 15);

      try
      {
         // save all employee records to the file employee.dat
         ObjectOutputStream out = new ObjectOutputStream(new
            FileOutputStream("employee.dat"));
         out.writeObject(staff);
         out.close();

         // retrieve all records into a new array
         ObjectInputStream in = new ObjectInputStream(new
            FileInputStream("employee.dat"));
         Employee[] newStaff = (Employee[])in.readObject();
         in.close();

         // print the newly read employee records
         for (int i = 0; i < newStaff.length; i++)
            System.out.println(newStaff[i]);
      }
      catch (Exception e)
      {
         e.printStackTrace();
      }
   }
}

class Employee implements Serializable
{
   public Employee() {}

   public Employee(String n, double s,
      int year, int month, int day)
   {
      name = n;
      salary = s;
      GregorianCalendar calendar
         = new GregorianCalendar(year, month - 1, day);
         // GregorianCalendar uses 0 = January
      hireDay = calendar.getTime();
   }

   public String getName()
   {
      return name;
   }

   public double getSalary()
   {
      return salary;
   }

   public Date getHireDay()
   {
      return hireDay;
   }

   public void raiseSalary(double byPercent)
   {
      double raise = salary * byPercent / 100;
      salary += raise;
   }

   public String toString()
   {
      return getClass().getName()
         + "[name=" + name
         + ",salary=" + salary
         + ",hireDay=" + hireDay
         + "]";
   }

   private String name;
   private double salary;
   private Date hireDay;
}

class Manager extends Employee
{
   /**
      @param n the employee's name
      @param s the salary
      @param year the hire year
      @param month the hire month
      @param day the hire day
   */
   public Manager(String n, double s,
      int year, int month, int day)
   {
      super(n, s, year, month, day);
      bonus = 0;
   }

   public double getSalary()
   {
      double baseSalary = super.getSalary();
      return baseSalary + bonus;
   }

   public void setBonus(double b)
   {
      bonus = b;
   }

   public String toString()
   {
      return super.toString()
         + "[bonus=" + bonus
         + "]";
   }

   private double bonus;
}


java.io.ObjectOutputStream 1.1

  • ObjectOutputStream(OutputStream out)

    creates an ObjectOutputStream so that you can write objects to the specified OutputStream.

  • void writeObject(Object obj)

    writes the specified object to the ObjectOutputStream. This method saves the class of the object, the signature of the class, and the values of any non-static, non-transient field of the class and its superclasses.

java.io.ObjectInputStream 1.1

  • ObjectInputStream(InputStream is)

    creates an ObjectInputStream to read back object information from the specified InputStream.

  • Object readObject()

    reads an object from the ObjectInputStream. In particular, this reads back the class of the object, the signature of the class, and the values of the nontransient and nonstatic fields of the class and all of its superclasses. It deserializes the data in a way that allows multiple references to the same object to be recovered.

Object Serialization File Format

Object serialization saves object data in a particular file format. Of course, you can use the writeObject/readObject methods without having to know the exact sequence of bytes that represents objects in a file. Nonetheless, we found studying the data format to be extremely helpful for gaining insight into the object streaming process. We did this by looking at hex dumps of various saved object files. However, the details are somewhat technical, so feel free to skip this section if you are not interested in the implementation. Every file begins with the 2-byte "magic number"

AC ED


followed by the version number of the object serialization format, which is currently

00 05


(We will be using hexadecimal numbers throughout this section to denote bytes.) Then, it contains a sequence of objects, in the order that they were saved. String objects are saved as

74   2-byte length   characters

For example, the string "Harry" is saved as

74 00 05 Harry
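
If you would like to check the header and string encoding yourself, the following sketch serializes a single string into memory and prints the bytes in hexadecimal. The class name StreamHeaderDemo and the helper hexDump are assumptions made for this illustration.

```java
import java.io.*;

class StreamHeaderDemo
{
   /** Serializes one string and returns the raw stream bytes in hexadecimal. */
   static String hexDump(String s)
   {
      try
      {
         ByteArrayOutputStream buffer = new ByteArrayOutputStream();
         ObjectOutputStream out = new ObjectOutputStream(buffer);
         out.writeObject(s);
         out.close();

         StringBuilder hex = new StringBuilder();
         for (byte b : buffer.toByteArray())
            hex.append(String.format("%02X ", b & 0xFF));
         return hex.toString().trim();
      }
      catch (IOException e)
      {
         throw new UncheckedIOException(e);
      }
   }

   public static void main(String[] args)
   {
      // magic number AC ED, version 00 05, string tag 74,
      // length 00 05, then the five characters of "Harry"
      System.out.println(hexDump("Harry"));
      // prints "AC ED 00 05 74 00 05 48 61 72 72 79"
   }
}
```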


The Unicode characters of the string are saved in the "modified UTF-8" format.

When an object is saved, the class of that object must be saved as well. The class description contains

  1. The name of the class;

  2. The serial version unique ID, which is a fingerprint of the data field types and method signatures;

  3. A set of flags describing the serialization method;

  4. A description of the data fields.

Java gets the fingerprint by:

  1. Ordering descriptions of the class, superclass, interfaces, field types, and method signatures in a canonical way;

  2. Then applying the so-called Secure Hash Algorithm (SHA) to that data.

SHA is a very fast algorithm that gives a "fingerprint" to a larger block of information. This fingerprint is always a 20-byte data packet, regardless of the size of the original data. It is created by a clever sequence of bit operations on the data that makes it essentially 100 percent certain that the fingerprint will change if the information is altered in any way. SHA is a U.S. standard, recommended by the National Institute of Standards and Technology (NIST). (For more details on SHA, see, for example, Cryptography and Network Security: Principles and Practice, by William Stallings [Prentice Hall].) However, Java uses only the first 8 bytes of the SHA code as a class fingerprint. It is still very likely that the class fingerprint will change if the data fields or methods change in any way.

Java can then check the class fingerprint to protect us from the following scenario: An object is saved to a disk file. Later, the designer of the class makes a change, for example, by removing a data field. Then, the old disk file is read in again. Now the data layout on the disk no longer matches the data layout in memory. If the data were read back in its old form, it could corrupt memory. Java takes great care to make such memory corruption close to impossible. Hence, it checks, using the fingerprint, that the class definition has not changed when restoring an object. It does this by comparing the fingerprint on disk with the fingerprint of the current class.
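
The fingerprint can be inspected from a program. The sketch below is only an illustration: it assumes a hypothetical class Point, asks ObjectStreamClass for the 8-byte serial version ID of a class, and separately confirms that an SHA-1 digest is 20 bytes long. It does not reproduce the exact canonical ordering that the serialization mechanism feeds into the hash.

```java
import java.io.*;
import java.security.*;

class FingerprintDemo
{
   // Hypothetical class, used only to have something to fingerprint
   static class Point implements Serializable
   {
      int x;
      int y;
   }

   /** Returns the 8-byte fingerprint that serialization uses for a class. */
   static long fingerprint(Class<?> cl)
   {
      return ObjectStreamClass.lookup(cl).getSerialVersionUID();
   }

   /** Returns the length in bytes of the SHA-1 digest of the given data. */
   static int shaLength(byte[] data)
   {
      try
      {
         return MessageDigest.getInstance("SHA-1").digest(data).length;
      }
      catch (NoSuchAlgorithmException e)
      {
         throw new IllegalStateException(e);
      }
   }

   public static void main(String[] args)
   {
      System.out.println(Long.toHexString(fingerprint(Point.class)));
      System.out.println(shaLength("any data".getBytes())); // prints 20
   }
}
```

The value printed for Point depends on the class definition and on the name of the enclosing class, so no particular number should be expected.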

Note

Technically, as long as the data layout of a class has not changed, it ought to be safe to read objects back in. But Java is conservative and checks that the methods have not changed either. (After all, the methods describe the meaning of the stored data.) Of course, in practice, classes do evolve, and it may be necessary for a program to read in older versions of objects. We will discuss this in the section entitled "Versioning."

Here is how a class identifier is stored:

  • 72
  • 2-byte length of class name
  • class name
  • 8-byte fingerprint
  • 1-byte flag
  • 2-byte count of data field descriptors
  • data field descriptors
  • 78 (end marker)
  • superclass type (70 if none)

The flag byte is composed of three bit masks, defined in java.io.ObjectStreamConstants:

static final byte SC_WRITE_METHOD = 1;
   // class has writeObject method that writes additional data
static final byte SC_SERIALIZABLE = 2;
   // class implements Serializable interface
static final byte SC_EXTERNALIZABLE = 4;
   // class implements Externalizable interface
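
To see the start of a class descriptor in an actual stream, here is a hedged sketch that serializes an object of a hypothetical class Point and reads the descriptor fields back with a plain DataInputStream. The class names DescriptorDemo and Point are assumptions; the sketch only handles this simple case (a new object of a new, ordinary class at the start of the stream).

```java
import java.io.*;

class DescriptorDemo
{
   // Hypothetical class; it has no writeObject method,
   // so only the SC_SERIALIZABLE bit should be set
   static class Point implements Serializable
   {
      int x = 1;
      int y = 2;
   }

   /** Serializes a Point and returns "name flags fieldCount" from its descriptor. */
   static String describe()
   {
      try
      {
         ByteArrayOutputStream buffer = new ByteArrayOutputStream();
         ObjectOutputStream out = new ObjectOutputStream(buffer);
         out.writeObject(new Point());
         out.close();

         DataInputStream in = new DataInputStream(
            new ByteArrayInputStream(buffer.toByteArray()));
         in.readInt();                    // AC ED 00 05: magic number and version
         in.readByte();                   // 73: new object
         in.readByte();                   // 72: new class descriptor
         String name = in.readUTF();      // 2-byte length, then the class name
         in.readLong();                   // 8-byte fingerprint
         int flags = in.readByte();       // 1-byte flag
         int fieldCount = in.readShort(); // 2-byte count of data field descriptors
         return name + " " + flags + " " + fieldCount;
      }
      catch (IOException e)
      {
         throw new UncheckedIOException(e);
      }
   }

   public static void main(String[] args)
   {
      System.out.println(describe());
      System.out.println(ObjectStreamConstants.SC_WRITE_METHOD
         | ObjectStreamConstants.SC_SERIALIZABLE); // prints 3
   }
}
```

The second line printed by main shows how two masks combine into a single flag byte.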


We will discuss the Externalizable interface later in this chapter. Externalizable classes supply custom read and write methods that take over the output of their instance fields. The classes that we write implement the Serializable interface and will have a flag value of 02. However, the java.util.Date class is serializable and also supplies its own writeObject method, so it has a flag of 03 (SC_WRITE_METHOD combined with SC_SERIALIZABLE).

Each data field descriptor has the format:

  • 1-byte type code
  • 2-byte length of field name
  • field name
  • class name (if field is an object)

The type code is one of the following:

B    byte
C    char
D    double
F    float
I    int
J    long
L    object
S    short
Z    boolean
[    array

When the type code is L, the field name is followed by the field type. Class and field name strings do not start with the string code 74, but field types do. Field types use a slightly different encoding of their names, namely, the format used by native methods. (See Volume 2 for native methods.) For example, the salary field of the Employee class is encoded as:
D 00 06 salary
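
The same type codes and field type strings are available through the ObjectStreamField class. The following sketch lists them for a cut-down, hypothetical copy of the Employee class; the names FieldCodeDemo and describeFields are assumptions made for this illustration.

```java
import java.io.*;
import java.util.Date;

class FieldCodeDemo
{
   // Hypothetical cut-down version of the Employee class
   static class Employee implements Serializable
   {
      private String name;
      private double salary;
      private Date hireDay;
   }

   /** Lists the type code, name, and (for objects) field type of each serializable field. */
   static String describeFields(Class<?> cl)
   {
      StringBuilder result = new StringBuilder();
      for (ObjectStreamField f : ObjectStreamClass.lookup(cl).getFields())
      {
         result.append(f.getTypeCode()).append(' ').append(f.getName());
         if (!f.isPrimitive())
            result.append(' ').append(f.getTypeString());
         result.append('\n');
      }
      return result.toString();
   }

   public static void main(String[] args)
   {
      System.out.print(describeFields(Employee.class));
   }
}
```

For the salary field this reports the type code D, and for the hireDay field it reports the code L together with the field type Ljava/util/Date; in the encoding used by native methods.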


Here is the complete class descriptor of the Employee class:

Employee
  E6 D2 86 7D AE AC 18 1B 02    Fingerprint and flags
  03                            Number of instance fields
  D 00 06 salary                Instance field type and name
  L 00 07 hireDay               Instance field type and name
  Ljava/util/Date;              Instance field class name: String
  L 00 04 name                  Instance field type and name
  Ljava/lang/String;            Instance field class name: String
  78                            End marker
  70                            No superclass

These descriptors are fairly long. If the same class descriptor is needed again in the file, then an abbreviated form is used:

71   4-byte serial number

The serial number refers to the previous explicit class descriptor. We will discuss the numbering scheme later. An object is stored as

73   class descriptor   object data

For example, here is how an Employee object is stored:

E8 6A 00 00 00 00 00            salary field value: double
73                              hireDay field value: new object
  7E 00 08                      Existing class java.util.Date
  1B 4E B1 80 78                External storage (details later)
0C Harry Hacker                 name field value: String

As you can see, the data file contains enough information to restore the Employee object. Arrays are saved in the following format:

75   class descriptor   4-byte number of entries   entries

The array class name in the class descriptor is in the same format as that used by native methods (which is slightly different from the format of class names in other class descriptors). In this format, class names start with an L and end with a semicolon. For example, an array of three Employee objects starts out like this:

75                                  Array
  0B [LEmployee;                    New class, string length, class name Employee[]
    FC BF 36 11 C5 91 11 C7 02      Fingerprint and flags
    00                              Number of instance fields
    78                              End marker
    70                              No superclass
    03                              Number of array entries

Note that the fingerprint for an array of Employee objects is different from a fingerprint of the Employee class itself. Of course, studying these codes can be about as exciting as reading the average phone book. But it is still instructive to know that the object stream contains a detailed description of all the objects that it contains, with sufficient detail to allow reconstruction of both objects and arrays of objects.
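
The array class names described above can be obtained at run time with getClass().getName(). The sketch below is a small illustration with a hypothetical class ArrayNameDemo; note that getName separates package names with periods.

```java
class ArrayNameDemo
{
   /** Returns the run-time class name of an array object. */
   static String arrayClassName(Object array)
   {
      return array.getClass().getName();
   }

   public static void main(String[] args)
   {
      // object arrays: "[L", then the element class name, then a semicolon
      System.out.println(arrayClassName(new String[3])); // prints "[Ljava.lang.String;"
      // arrays of primitive types use the one-letter type codes
      System.out.println(arrayClassName(new double[3])); // prints "[D"
   }
}
```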

The Problem of Saving Object References

We now know how to save objects that contain numbers, strings, or other simple objects. However, there is one important situation that we still need to consider. What happens when one object is shared by several objects as part of their state? To illustrate the problem, let us make a slight modification to the Manager class. Let's assume that each manager has a secretary, implemented as an instance variable secretary of type Employee. (It would make sense to derive a class Secretary from Employee for this purpose, but we will not do that here.)

class Manager extends Employee
{
 . . .
 private Employee secretary;
}


Having done this, you must keep in mind that the Manager object now contains a reference to the Employee object that describes the secretary, not a separate copy of the object. In particular, two managers can share the same secretary, as is the case in Screenshot-5 and the following code:

Employee harry = new Employee("Harry Hacker", . . .);
Manager carl = new Manager("Carl Cracker", . . .);
carl.setSecretary(harry);
Manager tony = new Manager("Tony Tester", . . .);
tony.setSecretary(harry);


Screenshot-5. Two managers can share a mutual employee



Now, suppose we write the employee data to disk. What we don't want is for the Manager to save its information according to the following logic:

  • Save employee data;
  • Save secretary data.

Then, the data for harry would be saved three times. When reloaded, the objects would have the configuration shown in Screenshot-6.

Screenshot-6. Here, Harry is saved three times



This is not what we want. Suppose the secretary gets a raise. We would not want to hunt for all other copies of that object and apply the raise as well. We want to save and restore only one copy of the secretary. To do this, we must save and restore the original references to the objects. In other words, we want the object layout on disk to be exactly like the object layout in memory. This is called persistence in object-oriented circles.

Of course, we cannot save and restore the memory addresses for the secretary objects. When an object is reloaded, it will likely occupy a completely different memory address than it originally did. Instead, Java uses a serialization approach, hence the name object serialization for this mechanism. Here is the algorithm:

  • All objects that are saved to disk are given a serial number (1, 2, 3, and so on, as shown in Screenshot-7).
    Screenshot-7. An example of object serialization



  • When saving an object to disk, find out if the same object has already been stored.
  • If it has been stored previously, just write "same as previously saved object with serial number x." If not, store all its data.

When reading back the objects, simply reverse the procedure. For each object that you load, note its sequence number and remember where you put it in memory. When you encounter the tag "same as previously saved object with serial number x," you look up where you put the object with serial number x and set the object reference to that memory address. Note that the objects need not be saved in any particular order. Screenshot-8 shows what happens when a manager occurs first in the staff array.

Screenshot-8. Objects saved in random order



All of this sounds confusing, and it is. Fortunately, when object streams are used, the process is also completely automatic. Object streams assign the serial numbers and keep track of duplicate objects. The exact numbering scheme is slightly different from that used in the figures—see the next section.
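
The saving half of the algorithm can be sketched in a few lines. This is a toy model under stated assumptions, not the code that object streams actually use: the class SerialNumberDemo is hypothetical, it numbers objects from 1 as in the figures, and it records a log line instead of writing bytes.

```java
import java.util.*;

class SerialNumberDemo
{
   // Maps each object already saved to its serial number.
   // Objects are compared by identity, not with equals.
   private final Map<Object, Integer> saved = new IdentityHashMap<>();
   private final List<String> log = new ArrayList<>();

   /** Saves an object in full the first time, and as a back reference afterwards. */
   void save(Object obj)
   {
      Integer serial = saved.get(obj);
      if (serial != null)
         log.add("same as #" + serial);
      else
      {
         serial = saved.size() + 1;
         saved.put(obj, serial);
         log.add("new #" + serial + ": " + obj);
      }
   }

   /** Saves a manager name once and a shared secretary name twice. */
   static List<String> demo()
   {
      String carl = "Carl Cracker";
      String harry = "Harry Hacker";
      SerialNumberDemo out = new SerialNumberDemo();
      out.save(carl);
      out.save(harry);
      out.save(harry); // second occurrence of the same object
      return out.log;
   }

   public static void main(String[] args)
   {
      System.out.println(demo());
      // prints "[new #1: Carl Cracker, new #2: Harry Hacker, same as #2]"
   }
}
```

An identity-based map is used because two distinct objects with equal contents must still be saved as two objects.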

Note

In this chapter, we use serialization to save a collection of objects to a disk file and retrieve it exactly as we stored it. Another very important application is the transmittal of a collection of objects across a network connection to another computer. Just as raw memory addresses are meaningless in a file, they are also meaningless when communicating with a different processor. Since serialization replaces memory addresses with serial numbers, it permits the transport of object collections from one machine to another. We will study that use of serialization when discussing remote method invocation in Volume 2.

Example 12-5 is a program that saves and reloads a network of employee and manager objects (some of which share the same employee as a secretary). Note that the secretary object is unique after reloading—when newStaff[1] gets a raise, that is reflected in the secretary fields of the managers.
Example 12-5 ObjectRefTest.java
import java.io.*;
import java.util.*;

class ObjectRefTest
{
   public static void main(String[] args)
   {
      Employee harry = new Employee("Harry Hacker", 50000,
         1989, 10, 1);
      Manager boss = new Manager("Carl Cracker", 80000,
         1987, 12, 15);
      boss.setSecretary(harry);

      Employee[] staff = new Employee[3];

      staff[0] = boss;
      staff[1] = harry;
      staff[2] = new Employee("Tony Tester", 40000,
         1990, 3, 15);

      try
      {
         // save all employee records to the file employee.dat
         ObjectOutputStream out = new ObjectOutputStream(new
            FileOutputStream("employee.dat"));
         out.writeObject(staff);
         out.close();

         // retrieve all records into a new array
         ObjectInputStream in = new ObjectInputStream(new
            FileInputStream("employee.dat"));
         Employee[] newStaff = (Employee[])in.readObject();
         in.close();

         // raise secretary's salary
         newStaff[1].raiseSalary(10);

         // print the newly read employee records
         for (int i = 0; i < newStaff.length; i++)
            System.out.println(newStaff[i]);
      }
      catch (Exception e)
      {
         e.printStackTrace();
      }
   }
}

class Employee implements Serializable
{
   public Employee() {}

   public Employee(String n, double s,
      int year, int month, int day)
   {
      name = n;
      salary = s;
      GregorianCalendar calendar
         = new GregorianCalendar(year, month - 1, day);
         // GregorianCalendar uses 0 = January
      hireDay = calendar.getTime();
   }

   public String getName()
   {
      return name;
   }

   public double getSalary()
   {
      return salary;
   }

   public Date getHireDay()
   {
      return hireDay;
   }

   public void raiseSalary(double byPercent)
   {
      double raise = salary * byPercent / 100;
      salary += raise;
   }

   public String toString()
   {
      return getClass().getName()
         + "[name=" + name
         + ",salary=" + salary
         + ",hireDay=" + hireDay
         + "]";
   }

   private String name;
   private double salary;
   private Date hireDay;
}

class Manager extends Employee
{
   /**
      Constructs a Manager without a secretary
      @param n the employee's name
      @param s the salary
      @param year the hire year
      @param month the hire month
      @param day the hire day
   */
   public Manager(String n, double s,
      int year, int month, int day)
   {
      super(n, s, year, month, day);
      secretary = null;
   }

   /**
      Assigns a secretary to the manager.
      @param s the secretary
   */
   public void setSecretary(Employee s)
   {
      secretary = s;
   }

   public String toString()
   {
      return super.toString()
         + "[secretary=" + secretary
         + "]";
   }

   private Employee secretary;
}


Output Format for Object References

This section continues the discussion of the output format of object streams. If you skipped the previous discussion, you should skip this section as well.

All objects (including arrays and strings) and all class descriptors are given serial numbers as they are saved in the output file. This process is referred to as serialization because every saved object is assigned a serial number. (The count starts at 7E 00 00.)

We already saw that a full class descriptor for any given class occurs only once. Subsequent descriptors refer to it. For example, a later reference to the class descriptor with serial number 2 is coded as

71 00 7E 00 02


The same mechanism is used for objects. If a reference to a previously saved object is written, it is saved in exactly the same way, that is, 71 followed by the serial number. It is always clear from the context whether the particular serial reference denotes a class descriptor or an object. Finally, a null reference is stored as

70
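
The reference and null encodings are easy to observe with strings. The sketch below, which assumes a hypothetical class BackReferenceDemo and an in-memory stream, writes the same String object twice, followed by a null reference.

```java
import java.io.*;

class BackReferenceDemo
{
   /** Serializes the given objects in order and returns the stream bytes in hexadecimal. */
   static String hexDump(Object... objects)
   {
      try
      {
         ByteArrayOutputStream buffer = new ByteArrayOutputStream();
         ObjectOutputStream out = new ObjectOutputStream(buffer);
         for (Object obj : objects)
            out.writeObject(obj);
         out.close();

         StringBuilder hex = new StringBuilder();
         for (byte b : buffer.toByteArray())
            hex.append(String.format("%02X ", b & 0xFF));
         return hex.toString().trim();
      }
      catch (IOException e)
      {
         throw new UncheckedIOException(e);
      }
   }

   public static void main(String[] args)
   {
      String name = "Harry";
      // first occurrence: 74 00 05 Harry
      // second occurrence: 71 followed by the serial number 00 7E 00 00
      // null reference: 70
      System.out.println(hexDump(name, name, null));
      // prints "AC ED 00 05 74 00 05 48 61 72 72 79 71 00 7E 00 00 70"
   }
}
```

The string is the first object in this stream, so it receives the first serial number, and the second occurrence costs only five bytes.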


Here is the commented output of the ObjectRefTest program of the preceding section. If you like, run the program, look at a hex dump of its data file employee.dat, and compare it with the commented listing. The important lines toward the end of the output show the reference to a previously saved object.

AC ED 00 05                         File header
75                                  Array staff (serial #1)
  0B [LEmployee;                    New class, string length, class name Employee[] (serial #0)
    FC BF 36 11 C5 91 11 C7 02      Fingerprint and flags
    00                              Number of instance fields
    78                              End marker
    70                              No superclass
    03                              Number of array entries
  73                                staff[0]: new object (serial #7)
    Manager                         New class, string length, class name (serial #2)
      AE 13 63 8F 59 B7 02          Fingerprint and flags
      01                            Number of data fields
      L 00 09 secretary             Instance field type and name
      0A LEmployee;                 Instance field class name: String (serial #3)
      78                            End marker
      Employee                      Superclass: new class, string length, class name (serial #4)
        E6 D2 86 7D AE AC 18 1B 02  Fingerprint and flags
        03                          Number of instance fields
        D 00 06 salary              Instance field type and name
        L 00 07 hireDay             Instance field type and name
        Ljava/util/Date;            Instance field class name: String (serial #5)
        L 00 04 name                Instance field type and name
        Ljava/lang/String;          Instance field class name: String (serial #6)
        78                          End marker
        70                          No superclass
    F3 88 00 00 00 00 00            salary field value: double
    73                              hireDay field value: new object (serial #9)
      0E java.util.Date             New class, string length, class name (serial #8)
        6A 81 01 4B 59 74 19 03     Fingerprint and flags
        00                          No instance variables
        78                          End marker
        70                          No superclass
      08                            External storage, number of bytes
      E9 39 E0 00                   Date
      78                            End marker
    0C Carl Cracker                 name field value: String (serial #10)
    73                              secretary field value: new object (serial #11)
      7E 00 04                      Existing class (use serial #4)
      E8 6A 00 00 00 00 00          salary field value: double
      73                            hireDay field value: new object (serial #12)
        7E 00 08                    Existing class (use serial #8)
        08                          External storage, number of bytes
        1B 4E B1 80                 Date
        78                          End marker
      0C Harry Hacker               name field value: String (serial #13)
  7E 00 0B                          staff[1]: existing object (use serial #11)
  73                                staff[2]: new object (serial #14)
    7E 00 04                        Existing class (use serial #4)
    E3 88 00 00 00 00 00            salary field value: double
    73                              hireDay field value: new object (serial #15)
      7E 00 08                      Existing class (use serial #8)
      08                            External storage, number of bytes
      6D 3E EC 00 00                Date
      78                            End marker
    0B Tony Tester                  name field value: String (serial #16)

It is usually not important to know the exact file format (unless you are trying to create an evil effect by modifying the data—see the next section). What you should remember is this:
  • The object stream output contains the types and data fields of all objects.
  • Each object is assigned a serial number.
  • Repeated occurrences of the same object are stored as references to that serial number.

Modifying the Default Serialization Mechanism

Certain data fields should never be serialized, for example, integer values that store file handles or handles of windows that are only meaningful to native methods. Such information is guaranteed to be useless when you reload an object at a later time or transport it to a different machine. In fact, improper values for such fields can actually cause native methods to crash. Java has an easy mechanism to prevent such fields from ever being serialized: mark them with the keyword transient. You also need to tag fields as transient if they belong to nonserializable classes. Transient fields are always skipped when objects are serialized.

The serialization mechanism provides a way for individual classes to add validation or any other desired action to the default read and write behavior. A serializable class can define methods with the signature

private void readObject(ObjectInputStream in)
 throws IOException, ClassNotFoundException;
private void writeObject(ObjectOutputStream out)
 throws IOException;


Then, the data fields are no longer automatically serialized, and these methods are called instead.

Here is a typical example. A number of classes in the java.awt.geom package, such as Point2D.Double, are not serializable. Now suppose you want to serialize a class LabeledPoint that stores a String and a Point2D.Double. First, you need to mark the Point2D.Double field as transient to avoid a NotSerializableException.

public class LabeledPoint implements Serializable
{
 . . .
 private String label;
 private transient Point2D.Double point;
}


In the writeObject method, we first write the object descriptor and the String field, label, by calling the defaultWriteObject method. This is a special method of the ObjectOutputStream class that can only be called from within a writeObject method of a serializable class. Then we write the point coordinates, using the standard DataOutput calls.

private void writeObject(ObjectOutputStream out)
 throws IOException
{
 out.defaultWriteObject();
 out.writeDouble(point.getX());
 out.writeDouble(point.getY());
}


In the readObject method, we reverse the process:

private void readObject(ObjectInputStream in)
 throws IOException, ClassNotFoundException
{
 in.defaultReadObject();
 double x = in.readDouble();
 double y = in.readDouble();
 point = new Point2D.Double(x, y);
}
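
Putting the pieces together, the following sketch shows a complete LabeledPoint class with a round trip through an in-memory stream. The enclosing class LabeledPointDemo, the constructor, the toString method, and the roundTrip helper are assumptions added for illustration. Note that in later JDK releases Point2D.Double itself became serializable, so the transient marker is no longer strictly required there; the technique is the same for any nonserializable field type.

```java
import java.awt.geom.Point2D;
import java.io.*;

class LabeledPointDemo
{
   static class LabeledPoint implements Serializable
   {
      LabeledPoint(String aLabel, double x, double y)
      {
         label = aLabel;
         point = new Point2D.Double(x, y);
      }

      private void writeObject(ObjectOutputStream out)
         throws IOException
      {
         out.defaultWriteObject(); // writes the label field
         out.writeDouble(point.getX());
         out.writeDouble(point.getY());
      }

      private void readObject(ObjectInputStream in)
         throws IOException, ClassNotFoundException
      {
         in.defaultReadObject(); // restores the label field
         double x = in.readDouble();
         double y = in.readDouble();
         point = new Point2D.Double(x, y);
      }

      public String toString()
      {
         return label + "(" + point.getX() + "," + point.getY() + ")";
      }

      private String label;
      private transient Point2D.Double point;
   }

   /** Serializes a labeled point, reads it back, and returns its string form. */
   static String roundTrip(LabeledPoint p)
   {
      try
      {
         ByteArrayOutputStream buffer = new ByteArrayOutputStream();
         ObjectOutputStream out = new ObjectOutputStream(buffer);
         out.writeObject(p);
         out.close();

         ObjectInputStream in = new ObjectInputStream(
            new ByteArrayInputStream(buffer.toByteArray()));
         return in.readObject().toString();
      }
      catch (IOException | ClassNotFoundException e)
      {
         throw new IllegalStateException(e);
      }
   }

   public static void main(String[] args)
   {
      System.out.println(roundTrip(new LabeledPoint("origin", 3, 4)));
      // prints "origin(3.0,4.0)"
   }
}
```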


Another example is the java.util.Date class that supplies its own readObject and writeObject methods. These methods write the date as a number of milliseconds from the epoch (January 1, 1970, midnight UTC). The Date class has a complex internal representation that stores both a Calendar object and a millisecond count, to optimize lookups. The state of the Calendar is redundant and does not have to be saved.

The readObject and writeObject methods only need to save and load their data fields. They should not concern themselves with superclass data or any other class information.

Rather than letting the serialization mechanism save and restore object data, a class can define its own mechanism. To do this, a class must implement the Externalizable interface. This in turn requires it to define two methods:

public void readExternal(ObjectInput in)
 throws IOException, ClassNotFoundException;
public void writeExternal(ObjectOutput out)
 throws IOException;


Unlike the readObject and writeObject methods that were described above, these methods are fully responsible for saving and restoring the entire object, including the superclass data. The serialization mechanism merely records the class of the object in the stream. When reading an externalizable object, the object stream creates an object with the public no-argument constructor and then calls the readExternal method. Here is how you can implement these methods for the Employee class:

public void readExternal(ObjectInput s)
 throws IOException
{
 name = s.readUTF();
 salary = s.readDouble();
 hireDay = new Date(s.readLong());
}
public void writeExternal(ObjectOutput s)
 throws IOException
{
 s.writeUTF(name);
 s.writeDouble(salary);
 s.writeLong(hireDay.getTime());
}
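
For a complete picture, here is a hedged sketch of an externalizable Employee class with a round trip through an in-memory stream. The enclosing class ExternalizableDemo, the three-argument constructor that takes a Date, the toString method, and the roundTrip helper are assumptions made for this illustration.

```java
import java.io.*;
import java.util.Date;

class ExternalizableDemo
{
   public static class Employee implements Externalizable
   {
      // The object stream needs a public no-argument constructor
      public Employee() {}

      public Employee(String n, double s, Date d)
      {
         name = n;
         salary = s;
         hireDay = d;
      }

      public void readExternal(ObjectInput s)
         throws IOException
      {
         // read the fields in the order in which they were written
         name = s.readUTF();
         salary = s.readDouble();
         hireDay = new Date(s.readLong());
      }

      public void writeExternal(ObjectOutput s)
         throws IOException
      {
         s.writeUTF(name);
         s.writeDouble(salary);
         s.writeLong(hireDay.getTime());
      }

      public String toString()
      {
         return name + "," + salary + "," + hireDay.getTime();
      }

      private String name;
      private double salary;
      private Date hireDay;
   }

   /** Serializes an employee, reads it back, and returns its string form. */
   static String roundTrip(Employee e)
   {
      try
      {
         ByteArrayOutputStream buffer = new ByteArrayOutputStream();
         ObjectOutputStream out = new ObjectOutputStream(buffer);
         out.writeObject(e);
         out.close();

         ObjectInputStream in = new ObjectInputStream(
            new ByteArrayInputStream(buffer.toByteArray()));
         return in.readObject().toString();
      }
      catch (IOException | ClassNotFoundException ex)
      {
         throw new IllegalStateException(ex);
      }
   }

   public static void main(String[] args)
   {
      Employee harry = new Employee("Harry Hacker", 50000, new Date(0L));
      System.out.println(roundTrip(harry)); // prints "Harry Hacker,50000.0,0"
   }
}
```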


Tip

Serialization is somewhat slow because the virtual machine must discover the structure of each object. If you are very concerned about performance and if you read and write a large number of objects of a particular class, you should investigate the use of the Externalizable interface. The tech tip http://developer.java.oracle.com/developer/TechTips/txtarchive/Apr00_Stu.txt demonstrates that in the case of an employee class, using external reading and writing was about 35-40% faster than the default serialization.

CAUTION

Unlike the readObject and writeObject methods, which are private and can only be called by the serialization mechanism, the readExternal and writeExternal methods are public. In particular, readExternal potentially permits modification of the state of an existing object.

NOTE

For even more exotic variations of serialization, see http://www.absolutejava.com/serialization.

Serializing Typesafe Enumerations

You have to pay particular attention when serializing and deserializing objects that are assumed to be unique. This commonly happens when implementing typesafe enumerations. An enumerated type is a data type with a finite number of values. The Java programming language has no built-in mechanism for enumerated types. They are often simulated with sets of numbers or strings, but such a simulation is not typesafe.

Consider for example the JSlider class. You can construct a slider by specifying an orientation, minimum and maximum values, and the current value. Here is an example:

JSlider slider = new JSlider(SwingConstants.HORIZONTAL,
 0, 100, 50);


The SwingConstants interface defines the constant HORIZONTAL as an integer with value 0. Now suppose a harried programmer doesn't remember the order of the parameters and writes

JSlider slider = new JSlider(0, 100, 50,
 SwingConstants.HORIZONTAL); // wrong order of parameters


This call compiles with no error since the compiler just looks for four values of type int. The problem could be solved if the first parameter had a separate type, say, Orientation. Then the compiler can report a type error if an int is passed instead of a value of type Orientation.

In the Java programming language, all types need to be implemented as classes. A class representing an enumerated type is special: we want to make sure that only a finite number of objects can be created. This is achieved in the following way:

public class Orientation
{
 public static final Orientation HORIZONTAL
 = new Orientation(1);
 public static final Orientation VERTICAL
 = new Orientation(2);
 private Orientation(int v) { value = v; }
 private int value;
}


Note that the constructor is private. Thus, no objects can be created beyond Orientation.HORIZONTAL and Orientation.VERTICAL. In particular, you can use the == operator to test for object equality:

if (orientation == Orientation.HORIZONTAL) . . .


This coding idiom is called a typesafe enumeration. There is an important twist that you need to remember when a typesafe enumeration implements the Serializable interface. The default serialization mechanism is not appropriate. Suppose we write a value of type Orientation and read it in again:

Orientation original = Orientation.HORIZONTAL;
ObjectOutputStream out = . . .;
out.writeObject(original);
out.close();
ObjectInputStream in = . . .;
Orientation saved = (Orientation)in.readObject();


Now the test

if (saved == Orientation.HORIZONTAL) . . .


will fail. In fact, the saved value is a completely new object of the Orientation type and not equal to any of the predefined constants. Even though the constructor is private, the serialization mechanism can create new objects!

To solve this problem, you need to define another special serialization method, called readResolve. If the readResolve method is defined, it is called after the object is deserialized. It must return an object that then becomes the return value of the readObject method. In our case, the readResolve method will inspect the value field and return the appropriate enumerated constant:

protected Object readResolve() throws ObjectStreamException
{
 if (value == 1) return Orientation.HORIZONTAL;
 if (value == 2) return Orientation.VERTICAL;
 return null; // this shouldn't happen
}


Remember to add a readResolve method to all typesafe enumerations. Also note that the enumeration class must store a value from which the constant can be recovered.
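
The following sketch puts the pieces together and checks that a deserialized value is again identical to the predefined constant. It is an illustration under a few assumptions: the ReadResolveDemo class and its roundTrip helper are hypothetical names, and, unlike the fragment above, this version throws an InvalidObjectException for an unknown value instead of returning null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

class Orientation implements Serializable
{
   public static final Orientation HORIZONTAL = new Orientation(1);
   public static final Orientation VERTICAL = new Orientation(2);

   private final int value;

   private Orientation(int v) { value = v; }

   // Called by the serialization mechanism after the object has been read;
   // replaces the freshly created copy with the shared constant.
   protected Object readResolve() throws ObjectStreamException
   {
      if (value == 1) return HORIZONTAL;
      if (value == 2) return VERTICAL;
      // Design choice of this sketch: reject unknown values outright.
      throw new InvalidObjectException("unknown orientation " + value);
   }
}

class ReadResolveDemo
{
   // Hypothetical helper: serializes obj to a byte array and reads it back.
   static Object roundTrip(Object obj)
      throws IOException, ClassNotFoundException
   {
      ByteArrayOutputStream bout = new ByteArrayOutputStream();
      ObjectOutputStream out = new ObjectOutputStream(bout);
      out.writeObject(obj);
      out.close();

      ObjectInputStream in = new ObjectInputStream(
         new ByteArrayInputStream(bout.toByteArray()));
      Object result = in.readObject();
      in.close();
      return result;
   }

   public static void main(String[] args) throws Exception
   {
      Orientation saved = (Orientation) roundTrip(Orientation.HORIZONTAL);
      // With readResolve in place, this identity test succeeds.
      System.out.println(saved == Orientation.HORIZONTAL);
   }
}
```

If you remove the readResolve method from this sketch, the identity comparison in main no longer holds, which is exactly the failure described above.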

Versioning

In the preceding sections, we showed you how to save relatively small collections of objects via an object stream. But those were just demonstration programs. With object streams, it helps to think big. Suppose you write a program that lets the user produce a document. This document contains paragraphs of text, tables, graphs, and so on. You can stream out the entire document object with a single call to writeObject:

out.writeObject(doc);


The paragraph, table, and graph objects are automatically streamed out as well. One user of your program can then give the output file to another user who also has a copy of your program, and that program loads the entire document with a single call to readObject:

doc = (Document)in.readObject();


This is very useful, but your program will inevitably change, and you will release a version 1.1. Can version 1.1 read the old files? Can the users who still use 1.0 read the files that the new version is now producing? Clearly, it would be desirable if object files could cope with the evolution of classes.

At first glance it seems that this would not be possible. When a class definition changes in any way, then its SHA fingerprint also changes, and you know that object streams will refuse to read in objects with different fingerprints. However, a class can indicate that it is compatible with an earlier version of itself. To do this, you must first obtain the fingerprint of the earlier version of the class. You use the stand-alone serialver program that is part of the SDK to obtain this number. For example, running

serialver Employee


prints out

Employee: static final long serialVersionUID =
-1814239825517340645L;


If you start the serialver program with the -show option, then the program brings up a graphical dialog box (see Screenshot-9).

Screenshot-9. The graphical version of the serialver program



All later versions of the class must define the serialVersionUID constant to the same fingerprint as the original.

class Employee // version 1.1
{ . . .
 public static final long serialVersionUID
 = -1814239825517340645L;
}
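
You can also inspect the fingerprint from inside a program. The sketch below uses ObjectStreamClass.lookup to ask the serialization mechanism which value it will use for a class. The classes PinnedEmployee, UnpinnedEmployee, and SerialVersionDemo are hypothetical and exist only for this illustration. For a class that declares serialVersionUID, the declared value comes back; for a class that does not, the mechanism computes one from the class definition.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class that pins its fingerprint, as in the fragment above.
class PinnedEmployee implements Serializable
{
   public static final long serialVersionUID = -1814239825517340645L;
   private String name;
   private double salary;
}

// Hypothetical class that declares no serialVersionUID,
// so the fingerprint is computed from the class definition.
class UnpinnedEmployee implements Serializable
{
   private String name;
   private double salary;
}

class SerialVersionDemo
{
   // Returns the fingerprint that object streams use for the given
   // serializable class.
   static long fingerprint(Class<?> cl)
   {
      return ObjectStreamClass.lookup(cl).getSerialVersionUID();
   }

   public static void main(String[] args)
   {
      System.out.println(fingerprint(PinnedEmployee.class));
      System.out.println(fingerprint(UnpinnedEmployee.class));
   }
}
```

The computed value for the second class depends on its name, fields, and methods, so no particular number is shown here.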


When a class has a static data member named serialVersionUID, the serialization system does not compute the fingerprint itself but uses that value instead.

Once that static data member has been placed inside a class, the serialization system is willing to read in different versions of objects of that class. If only the methods of the class change, then there is no problem with reading the new object data. However, if data fields change, then you may have problems. For example, the old file object may have more or fewer data fields than the one in the program, or the types of the data fields may be different. In that case, the object stream makes an effort to convert the stream object to the current version of the class.

The object stream compares the data fields of the current version of the class with the data fields of the version in the stream. Of course, the object stream considers only the nontransient and nonstatic data fields. If two fields have matching names but different types, then the object stream makes no effort to convert one type to the other; the objects are incompatible. If the object in the stream has data fields that are not present in the current version, then the object stream ignores the additional data. If the current version has data fields that are not present in the streamed object, the added fields are set to their default (null for objects, zero for numbers, and false for Boolean values).

Here is an example. Suppose we have saved a number of employee records on disk, using the original version (1.0) of the class. Now we change the Employee class to version 2.0 by adding a data field called department. Screenshot-10 shows what happens when a 1.0 object is read into a program that uses 2.0 objects. The department field is set to null. Screenshot-11 shows the opposite scenario: a program using 1.0 objects reads a 2.0 object. The additional department field is ignored.

Screenshot-10. Reading an object with fewer data fields



Screenshot-11. Reading an object with more data fields



Is this process safe? It depends. Dropping a data field seems harmless—the recipient still has all the data that it knew how to manipulate. Setting a data field to null may not be so safe. Many classes work hard to initialize all data fields in all constructors to non-null values, so that the methods don't have to be prepared to handle null data. It is up to the class designer to implement additional code in the readObject method to fix version incompatibilities or to make sure the methods are robust enough to handle null data.
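
As an illustration of the first option, here is a sketch of a readObject method that repairs a missing field. Everything in it is an assumption made for the example: the class names VersionedEmployee and VersionRepairDemo, the serialVersionUID value, and the default department "Unassigned". Because a single source file cannot contain two versions of one class, the demo imitates an old stream by writing an object whose department is null, which is the state the new field would have after reading a version 1.0 object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical later version of an employee class with an added field.
class VersionedEmployee implements Serializable
{
   // Assumed value for this sketch; a real class would use the
   // fingerprint of its original version.
   private static final long serialVersionUID = 1L;

   private String name;
   private String department; // field added in the later version

   VersionedEmployee(String n, String d)
   {
      name = n;
      department = d;
   }

   String getName() { return name; }
   String getDepartment() { return department; }

   private void readObject(ObjectInputStream in)
      throws IOException, ClassNotFoundException
   {
      // Let the default mechanism fill in whatever the stream contains.
      in.defaultReadObject();
      // A field that was absent from the stream is left at null; repair it.
      if (department == null) department = "Unassigned";
   }
}

class VersionRepairDemo
{
   // Hypothetical helper: serializes e to a byte array and reads it back.
   static VersionedEmployee roundTrip(VersionedEmployee e)
      throws IOException, ClassNotFoundException
   {
      ByteArrayOutputStream bout = new ByteArrayOutputStream();
      ObjectOutputStream out = new ObjectOutputStream(bout);
      out.writeObject(e);
      out.close();

      ObjectInputStream in = new ObjectInputStream(
         new ByteArrayInputStream(bout.toByteArray()));
      VersionedEmployee result = (VersionedEmployee) in.readObject();
      in.close();
      return result;
   }

   public static void main(String[] args) throws Exception
   {
      // A null department stands in for a field missing from an old stream.
      VersionedEmployee old = new VersionedEmployee("Harry Hacker", null);
      System.out.println(roundTrip(old).getDepartment());
   }
}
```

With this repair in place, the other methods of the class can keep assuming that department is never null.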

Using Serialization for Cloning

There is an amusing (and, occasionally, very useful) use for the serialization mechanism: it gives you an easy way to clone an object, provided the class is serializable. (Recall that you need to do a bit of work to allow an object to be cloned.) To clone a serializable object, simply serialize it to an output stream and then read it back in. The result is a new object that is a deep copy of the existing object. You don't have to write the object to a file; you can use a ByteArrayOutputStream to save the data into a byte array. As Example 12-6 shows, to get the clone method for free, simply derive from the SerialCloneable class, and you are done.

You should be aware that this method, although clever, will usually be much slower than a clone method that explicitly constructs a new object and copies or clones the data fields.

Example 12-6 SerialCloneTest.java
import java.io.*;
import java.util.*;

public class SerialCloneTest
{
   public static void main(String[] args)
   {
      Employee harry = new Employee("Harry Hacker", 35000,
         1989, 10, 1);
      // clone harry
      Employee harry2 = (Employee)harry.clone();

      // mutate harry
      harry.raiseSalary(10);

      // now harry and the clone are different
      System.out.println(harry);
      System.out.println(harry2);
   }
}

/**
   A class whose clone method uses serialization.
*/
class SerialCloneable implements Cloneable, Serializable
{
   public Object clone()
   {
      try
      {
         // save the object to a byte array
         ByteArrayOutputStream bout = new
            ByteArrayOutputStream();
         ObjectOutputStream out
            = new ObjectOutputStream(bout);
         out.writeObject(this);
         out.close();

         // read a clone of the object from the byte array
         ByteArrayInputStream bin = new
            ByteArrayInputStream(bout.toByteArray());
         ObjectInputStream in = new ObjectInputStream(bin);
         Object ret = in.readObject();
         in.close();

         return ret;
      }
      catch (Exception e)
      {
         return null;
      }
   }
}

/**
   The familiar Employee class, redefined to extend the
   SerialCloneable class.
*/
class Employee extends SerialCloneable
{
   public Employee(String n, double s,
      int year, int month, int day)
   {
      name = n;
      salary = s;
      GregorianCalendar calendar
         = new GregorianCalendar(year, month - 1, day);
         // GregorianCalendar uses 0 = January
      hireDay = calendar.getTime();
   }

   public String getName()
   {
      return name;
   }

   public double getSalary()
   {
      return salary;
   }

   public Date getHireDay()
   {
      return hireDay;
   }

   public void raiseSalary(double byPercent)
   {
      double raise = salary * byPercent / 100;
      salary += raise;
   }

   public String toString()
   {
      return getClass().getName()
         + "[name=" + name
         + ",salary=" + salary
         + ",hireDay=" + hireDay
         + "]";
   }

   private String name;
   private double salary;
   private Date hireDay;
}
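
If you cannot change a class's superclass, the same trick can be packaged as a static helper instead. The sketch below is one possible variant, not part of the example program: the SerialCopy class and its deepCopy method are hypothetical names, and the code uses generics and try-with-resources, so it assumes a more recent Java version than the listing above. The demo also shows that the copy is deep: the Date inside the copied list is a different object from the original.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Date;

class SerialCopy
{
   // Hypothetical helper: returns a deep copy of any serializable object
   // by writing it to a byte array and reading it back.
   static <T extends Serializable> T deepCopy(T original)
      throws IOException, ClassNotFoundException
   {
      ByteArrayOutputStream bout = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bout))
      {
         out.writeObject(original);
      }
      try (ObjectInputStream in = new ObjectInputStream(
              new ByteArrayInputStream(bout.toByteArray())))
      {
         @SuppressWarnings("unchecked")
         T copy = (T) in.readObject();
         return copy;
      }
   }

   public static void main(String[] args) throws Exception
   {
      ArrayList<Date> dates = new ArrayList<>();
      dates.add(new Date(0L));

      ArrayList<Date> copy = deepCopy(dates);

      // Equal contents, but neither the list nor its element is shared.
      System.out.println(copy.equals(dates));
      System.out.println(copy != dates);
      System.out.println(copy.get(0) != dates.get(0));
   }
}
```

Unlike the clone method of SerialCloneable, this helper propagates exceptions to the caller rather than returning null, which makes a failed copy harder to overlook.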

