TESTING CLASS MEMBERSHIP

Some object-oriented languages allow the program to test membership of an object in a class at run time, as summarized in Table 14.6.

Table 14.6. Facilities for type testing and safe casting.
	Modula-3	Java

Test whether object `x` belongs class `C`, or to any subclass of `C`.	ISTYPE(`x,C`)	`x instanceof C`

Given a variable `x` of class `C`, where `x` actually points to an object of class `D` that extends `C`, yield an expression whose compile-time type is class `D`.	NARROW(`x,D`)	`(D)x`

Since each object points to its class descriptor, the address of the class descriptor can serve as a "type-tag." However, if x is an instance of D, and D extends C, then x is also an instance of C. Assuming there is no multiple inheritance, a simple way to implement x instanceof C is to generate code that performs the following loop at run time: Java ScreenShot

goto L₁

where t₁.super is the superclass (parent class) of class t₁. However, there is a faster approach using a display of parent classes. Assume that the class nesting depth is limited to some constant, such as 20. Reserve a 20-word block in each class descriptor. In the descriptor for a class D whose nesting depth is j, put a pointer to descriptor D in the jth slot, a pointer to D.super in the (j − 1)th slot, a pointer to D.super.super in slot j − 2, andsoonupto Object in slot 0. In all slots numbered greater than j, put nil. Now, if x is an instance of D, or of any subclass of D, then the jth slot of x's class descriptor will point to the class descriptor D. Otherwise it will not. So x instanceof D requires

Fetch the class descriptor d at offset 0 from object c.
Fetch the jth class-pointer slot from d.
Compare with the class descriptor D.

This works because the class-nesting depth of D is known at compile time. Type coercions Given a variable c of type C, it is always legal to treat c as any supertype of C - if C extends B, and variable b has type B, then the assignment b ← c is legal and safe. But the reverse is not true. The assignment c ← b is safe only if b is really (at run time) an instance of C, which is not always the case. If we have b ← new B, c ← b, followed by fetching some field of c that is part of class C but not class B, then this fetch will lead to unpredictable behavior. Thus, safe object-oriented languages (such as Modula-3 and Java) accompany any coercion from a superclass to a subclass with a run-time type-check that raises an exception unless the run-time value is really an instance of the subclass (e.g., unless b instanceof C). It is a common idiom to write

Modula-3: Java:
IF ISTYPE(b,C) if (b instanceof C)
 THEN f(NARROW(b,C)) f((C)b)
 ELSE ... else ...

Now there are two consecutive, identical type tests: one explicit (ISTYPE or instanceof) and one implicit (in NARROW or the cast). A good compiler will do enough flow analysis to notice that the then-clause is reached only if b is in fact an instance of C, so that the type-check in the narrowing operation can be eliminated. C++ is an unsafe object-oriented language. It has a static cast mechanism without run-time checking; careless use of this mechanism can make the program "go wrong" in unpredictable ways. C++ also has dynamic_cast with run-time checking, which is like the mechanisms in Modula-3 and Java. Typecase Explicit instanceof testing, followed by a narrowing cast to a subclass, is not a wholesome "object-oriented" style. Instead of using this idiom, programmers are expected to use dynamic methods that accomplish the right thing in each subclass. Nevertheless, the test-then-narrow idiom is fairly common. Modula-3 has a typecase facility that makes the idiom more beautiful and efficient (but not any more "object-oriented"):

TYPECASE expr
OF C₁ (v₁) => S₁
 | C₂ (v₂) => S₂
 ⋮
 | C_n (v_n) => S_n
ELSE S₀
END

If the expr evaluates to an instance of class C_i, then a new variable v_i of type C_i points to the result of the expr, and statement S_i is executed. The declaration of v_i is implicit in the TYPECASE, and its scope covers only S_i. If more than one of the C_i match (which can happen if, for example, one is a superclass of another), then only the first matching clause is taken. If none of the C_i match, then the ELSE clause is taken (statement S₀ is executed). Typecase can be converted straightforwardly to a chain of else-ifs, with each if doing an instance test, a narrowing, and a local variable declaration. However, if there are very many clauses, then it can take a long time to go through all the else-ifs. Therefore it is attractive to treat it like a case (switch) statement on integers, using an indexed jump (computed goto). That is, an ordinary case statement on integers:

ML: C, Java:
case i switch (i) {
 of 0 => s₀ case 0: s₀; break;
 | 1=> s₁ case 1: s₁; break;
 | 2=> s₂ case 2: s₂; break;
 | 3=> s₃ case 3: s₃; break;
 | 4=> s₄ case 4: s₄; break;
 |_=> s_d default: s_d;
 }

is compiled as follows: First a range-check comparison is made to ensure that i is within the range of case labels (0-4, in this case); then the address of the ith statement is fetched from the ith slot of a table, and control jumps to s_i. This approach will not work for typecase, because of subclassing. That is, even if we could make class descriptors be small integers instead of pointers, we cannot do an indexed jump based on the class of the object, because we will miss clauses that match superclasses of that class. Thus, Modula-3 typecase is implemented as a chain of else-ifs. Assigning integers to classes is not trivial, because separately compiled modules can each define their own classes, and we do not want the integers to clash. But a sophisticated linker might be able to assign the integers at link time.

If all the classes in the typecase were final classes (in the sense used by Java, that they cannot be extended), then this problem would not apply. Modula-3 does not have final classes; and Java does not have typecase. But a clever Java system might be able to recognize a chain of else-ifs that do instanceof tests for a set of final classes, and generate a indexed jump.