POLYMORPHIC TYPE-CHECKING

Type-checking for a polymorphic language is not as straightforward as for a monomorphic language. Before embarking on an implementation, we must be clear about what the typing rules are. The types used in the basic MiniJava compiler could all be represented as atomic strings: int, boolean, int[], and class identifiers such as IntList. In GJ we have three kinds of type expressions:

Primitive types such as int and boolean;
Type apps of the form c 〈t₁, t₂,… t_n〉, where c is a type constructor - a polymorphic class such as List in our example - and t₁ through t_n are type expressions; and Type variables such as the X in the type expression ListX.

All class identifiers will be considered polymorphic, but those with no type arguments can be considered as types of the form C<>. In this chapter we will omit discussion of array types, though in a "MiniGJ" compiler int[] could be treated as a primitive type. Syntactic conventions The type-checking algorithm will have several kinds of variables:

T, T_i, U, V stand for type expressions;
N, N_i, P, Q stand for nonvariable type expressions (that do not contain type variables);
X, X_i, Y, Z stand for type variables;
C stands for class names;
m stands for method names;
f stands for field names;
e stands for program expressions; and
x stands for program variables (local variables of methods).

We will often abbreviate the keyword extends with the symbol ◃. In discussing the type-checking algorithm we will require that every TyParams have an explicit bound, so instead of writing class List<X> we would write class List<X◃Object>. Method types It's convenient to separate the type of a method from the name of the method. For example, consider the declaration

<X extends Printable> GPair<X> firstTwo(List<X> x) { ... }

For any type X that's a subclass of Printable, the firstTwo method takes a list of X's, and returns a pair of X's. The type of the firstTwo method is 〈X ◃ Printable〉 List 〈X〉 → GPair 〈X〉. (Notice that the binding occurrence of X requires a bounding clause 〈X ◃ Printable〉, but the applied occurrences of X do not use the ◃ symbol.) In general, the form of a method type is Java ScreenShot

meaning that, with type parameters TyParams and value parameters whose types are TyList, the method returns a value of type Ty. Substitution Suppose we want to know the type of the firstTwo method when it's applied to a list of PrintableInt. We take List 〈X〉 → GPair 〈X〉 and substitute PrintableInt for X, yielding the method type

List 〈PrintableInt〉 → GPair 〈PrintableInt〉

We can write a substitution function in the compiler's type-checker to do this; in this chapter we'll write [V₁,…, V_k/X₁,…, X_k]U to mean the substitution of type V_i for every occurrence of type variable X_i in type expression (or function prototype) U. Because class parameterization is not nested in GJ - that is, U cannot contain class declarations - we don't need to worry about internal redeclarations of the type variables X_i; the problem of avoiding variable capture (described in ) does not occur. Class table, type table, and var table explained that, because classes can mutually refer to each other, type-checking must proceed in two phases: First, build a symbol table (the class table) mapping class names to class declarations; second, type-check the interior of the class declarations. The same approach works for GJ. The first phase is quite straightforward; the rest of this section explains the algorithm for the second phase. This phase will use the class table, and maintain a type table mapping formal type parameters to their bounds, and a var table mapping ordinary (value) variables to their types. For example, in processing the firstTwo method (described above), the type table would map X to Printable, and the var table would map x to List<X>. Well-formed types A type is well formed if every type variable has a bound in the appropriate type table. Thus, the type List<X> is well formed when processing the declaration of firstTwo because X ◃ Printable is in the type table. In general, we must also account for substitutions. Algorithm 16.1 shows the procedure checktype for checking wellformedness of types.ALGORITHM 16.1: Checking wellformedness of types and subtyping.

checkType(T) =
 if T is primitive (int, boolean, etc.) then OK
 else if T is Object then OK
 else if (T ↦ ...) is in the type table then OK
 else if T is C 〈T₁,..., T_n〉
 look up C in the class table, yielding class C 〈X₁ ◃ N₁,..., X_n ◃ N_n〉 ◃ N{...}
 for each T_i do checkType(T_i
 for each T_i do checkSubtype(T_i, [T₁,..., Tⁿ)X₁,..., Xⁿ]N_i
checkSubtype(T, U) =
 if T = U then OK
 else if (T ↦ T′) is in the type table then subtype(T′, U
 else if T is C 〈T₁,..., T_n〉
 look up C in the class table, yielding class C 〈X₁ ◃ N₁,..., X_n ◃ N_n〉 ◃ N{...}
 subtype([T₁,..., T_n/X₁,..., X_n]N, U)
 else error

Subtyping In type-checking Java (and GJ) it is often necessary to test whether one type is a subtype of another: A is a subtype of B if, whenever a value of type B is required, a value of type A will suffice. A subtype is not exactly the same as a subclass, but still, in class-based languages like Java, if class A extends B, then there is also a subtype relationship. Checking subtyping in GJ is made more complex by the need to perform type substitution and the need to look up type variables in the type table; Algorithm 16.1 shows how to do it. Finding the bound of a type The formal type parameters of a polymorphic class or method are of the form hX ◃ N_i. Inside that class or method, suppose there's a local variable declaration X x, where X is the type and x is the variable. When type-checking expressions such as the field access x. f, we need to know what class x is an instance of. We don't know exactly what class x belongs to - it belongs to some unknown class X - but we do know that X is a subclass of N. Therefore, we take the type X and look up its bound, N, in the type table. The function getBound(T) looks up type bounds. Looking up field and method types When type-checking a field access e: f or a method call e.m 〈T 〉 (), one must look up the field f or method m in e's class. This may involve a search through superclasses, and in the process, types must be substituted for type variables. These searches are handled by fieldType and methodType in Algorithm 16.2. Algorithm 16.3 shows how to type-check expressions using all the auxiliary procedures defined thus far.ALGORITHM 16.2: Field and method search.

getBound(T) =
 if T is a type variable X
 then if (X ↦ N) is in the type table then N else error
 else if T is a nonvariable type N then N
 else error
fieldType(f;) =
 if T is C 〈T₁,..., T_n〉
 look in the class table for class C 〈X₁ ◃ N₁,..., X_n ◃ N_n〉 ◃ N{fields, ...}
 if field f with type S is in fields
 return [T₁,..., T_n/X₁,..., X_n]S
 else return fieldType(f, [T₁,..., T_n/X₁,..., Xⁿ]N)
 else error
methodType(m, T ) =
 if T is C 〈T₁,..., T_n〉
 look in the class table for class C 〈X₁ ◃ N₁,..., X_n ◃ N_n〉 ◃ N{..., methods}
 if method m is in methods; with the declaration
 〈Y₁ ◃ P₁,..., Y_k ◃ P_k〉 U m(U₁ x₁ ... U_l x_l){return e}
 then return [T₁,..., T_n/X₁,..., X_n](〈Y₁ ◃ P₁,..., Y_k ◃ P_k〉 (U₁ x₁,..., U_l x_l) → U)
 else return methodType(m, [T₁,..., T_n/X₁,..., X_n]N)

ALGORITHM 16.3: Type-checking expressions. Expressions with integer type are omitted, because they are checked just as in MiniJava

checkExp(e) =
 if e is a variable x
 look up (x ↦ T ) in the var table; return T
 else if e is a field-selection e₀. f
 return fieldType(f; getBound(checkExp(e₀)))
 else if e is a method-call e₀m 〈V₁,..., V_n〉 (e₁,..., e^l)
 call methodType(m, getBound(checkExp(e₀))), yielding
 〈Y₁ ◃ P₁,..., Y_k ◃ P_k〉 (U₁ x₁,..., U^l x_l) → U)
 for each V_i do checkType(V_i)
 for each V_i do checkSubtype(V_i, [V₁,..., V^k/Y₁,..., Y_k]Pⁱ)
 for i ∈ 1 ... l do checkSubtype(checkExp(e_i), [V₁,..., V_k/Y₁,..., Y_k]U_i)
 return [V₁,..., V_k/Y₁,..., Y_k]U
 else if e is new N)
 checktype(N/)
 return N

Algorithm 16.4 shows how to check class and method declarations. The formal type parameters of the class are added to the type table, then all the methods are checked. For each method, the formal type parameters and the (ordinary) value parameters of the method are added to the type table and value table, then the body of the method is type-checked. The four lines ending at the last call to checkSubtype are to make sure that the method over-riding is valid; in GJ as in Java, one can only override a method at the same argument and result type as in the superclass, but for GJ there are more substitutions to perform in order to check this.ALGORITHM 16.4: Type-checking class declarations.

checkClass(cdecl) =
 suppose cdecl is class C 〈X₁ ◃ N₁,..., X_n ◃ N_n〉 ◃ N{fields, methods}
 add (X₁ ↦ N₁,... X_k ↦ N_k) to the type table
 checkType(N)
 for each N_i do checkType(N_i)
 for each mdecl in methods
 suppose mdecl is 〈Y₁ ◃ P₁,..., Y_j〉 ◃ P_j〉 T m(T₁ x₁ ... T_l x_l) {return e}
 add (Y₁ ↦ P₁,... Y_j ↦ P_j) to the type table
 checkType(T)
 for each T_i do checkType(T_i)
 for each P_i do checkType(P_i)
 add (this ↦ C 〈X₁,..., X_n〉) to the var table
 add (x₁ ↦ T₁,... x_l ↦ T_l) to the var table
 checkSubtype(checkType(e), T )
 suppose methodType(m, N) is 〈Z₁ ◃ Q₁,..., Z_j ◃ Q_j 〉 (U₁ x₁,..., U_l x_l) → U
 for each T_i check that T_i = ([Y₁,..., Y_j/Z₁,..., Z_j]Uⁱ)
 for each P_i check that P_i = ([Y₁,..., Y_j/Z₁,..., Z_j]Q_i)
 checkSubtype(T, [Y₁,..., Y_j/Z₁,..., Z_j]U)
 pop j most recent bindings from the type table
 pop l + 1 most recent bindings from the var table
 pop k most recent bindings from the type table