
Advanced Classes

Special attributes, those identifiers with two leading and trailing underscores, are abundant in Jython classes. They are the primary means of creating highly customized objects with complex behaviors. This chapter considers advanced Jython classes as those that leverage these special attributes. Describing these objects as advanced might be misconstrued to mean difficult or reserved for those more studied in Python, but that is not the case. Adding special attributes to classes is part of the ongoing battle against complexity. The ability to tune an object's behavior to act like a list or the ability to intercept attribute access only costs a few special methods, while potential gains in design, reusability, and flexibility are great.

Pre-Existing Class Attributes

Classes and instances implicitly have certain special attributes—those that automatically appear when a class definition executes or an instance is created. Jython wraps Java classes and instances so that they also have special attributes. Jython classes have five special attributes, whereas Java classes have three of the same five. Note that while all of these attributes are readable, only some allow assignment to them. To further examine these special attributes, let's first define a minimal Jython class suitable for exploring. Listing 7.1 is a Jython module that contains an import statement and a single class definition: LineDatum. The LineDatum class merely defines a line expression (datum) based on the slope and intercept supplied to the constructor. The instance then may add points that lie on the line with the addPoint method.

Listing 7.1 A Jython Class Tracking Points on a Line
# file: datum.py
import java

class LineDatum(java.util.Hashtable):
    """Collects points that lie on a line.
    Instantiate with line slope and intercept: e.g. LineDatum(.5, 3)"""

    def __init__(self, slope, incpt):
        self.slope = slope
        self.incpt = incpt

    def addPoint(self, x, y):
        """addPoint(x, y) -> 1 or 0
        Accepts coordinates for a cartesian point (x,y). If point is on the
        line, it adds the point to the instance."""
        if y == self.slope * x + self.incpt:
            self.put((self.slope, self.incpt), (x, y))
            return 1
        return 0


Using the LineDatum class from within the interactive interpreter looks like this:

>>> import datum
>>> ld = datum.LineDatum(.5, 3)
>>> ld.addPoint(0, 3)
1
>>> ld.addPoint(2, 3)
0
>>> ld.addPoint(2, 4)
1


The special class variables are described in the next few sections.

__name__

This read-only attribute contains the name of the class. The name of the LineDatum class in Listing 7.1 is LineDatum. Even if the import as syntax is used, the __name__ is LineDatum. An example of this follows:

>>> from datum import LineDatum
>>> LineDatum.__name__
'LineDatum'
>>> from datum import LineDatum as ld
>>> ld.__name__
'LineDatum'


Java classes used within Jython also have the __name__ attribute as demonstrated here:

>>> import java
>>> java.util.Hashtable.__name__
'java.util.Hashtable'


__doc__

This attribute contains the class documentation string, or None if it is not provided. You can assign to __doc__.

>>> from datum import LineDatum
>>> print LineDatum.__doc__
Collects points that lie on a line.
    Instantiate with line slope and intercept: LineDatum(.5, 3)


Java classes used within Jython do not have a __doc__ attribute.

__module__

This attribute contains the name of the module in which the class is defined. You can assign to __module__. For the LineDatum class in Listing 7.1, it is defined in the module datum and is confirmed with this example:

>>> from datum import LineDatum
>>> LineDatum.__module__
'datum'


Java classes used within Jython do not have a __module__ attribute. Java doesn't have modules, so this makes sense.
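The reading and assigning of these class attributes can be sketched with a small pure-Python class. The class Point and the variable names below are hypothetical and are not part of Listing 7.1. The sketch is written to run on a current Python interpreter as well, so its behavior under a particular Jython release is an assumption.

```python
# Hypothetical class used only to illustrate the special class attributes.
class Point:
    """A point in two dimensions."""

# __name__ holds the class name, even when the class is bound to another name.
alias = Point
class_name = alias.__name__

# __doc__ holds the documentation string and accepts assignment.
original_doc = Point.__doc__
Point.__doc__ = "A point on a plane."

# __module__ names the module in which the class was defined.
defining_module = Point.__module__

print(class_name)
print(original_doc)
print(Point.__doc__)
```

The alias shows the same behavior as the import as example: the name a class is bound to has no effect on its __name__.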

__dict__

This attribute is a PyStringMap object containing all the class attributes. We can guess from looking at the LineDatum class definition in Listing 7.1 what keys should be found in LineDatum.__dict__: There must be an __init__ and an addPoint because those are the two methods defined. There should also be __doc__ and __module__ keys as described previously. The __name__ key is a good guess, but it actually doesn't appear in the class.__dict__ in Jython. The LineDatum class in Listing 7.1 is more complex than a normal class because it actually subclasses a Java class. The implementation of this requires additional attributes as seen in this example:

>>> from datum import LineDatum
>>> LineDatum.__dict__
{'__doc__': 'Collects points that lie on a line.\n Instantiate with line slope and intercept: LineDatum(.5, 3)\n ',
 'rehash': <java function rehash at 1298249>,
 '__init__': <function __init__ at 913493>,
 'addPoint': <function addPoint at 1939121>,
 'finalize': <java function finalize at 1068455>,
 '__module__': 'datum'}


Attributes defined in a class appear in the class's __dict__. The conventional notation for accessing attribute b in class A is A.b; however, the class __dict__ allows an alternate notation of A.__dict__['b']. Here's an example of the differing syntaxes for attribute access:

>>> from datum import LineDatum
>>> LineDatum.__module__
'datum'
>>> LineDatum.__dict__['__module__'] # same as 'LineDatum.__module__'
'datum'


You can even call methods with this alternate naming. A bit of a trick is required for the addPoint method of Listing 7.1 because it is an instance method. An instance must be the first parameter. Fortunately, Jython isn't fussy about which instance, so you can just create an instance before testing the syntax and pass it as the first argument:

>>> from datum import LineDatum
>>> inst = LineDatum(.5, 3) # get a surrogate instance
>>> LineDatum.addPoint(inst, 5, 4) # (5, 4) is not on the line, so 0
0
>>> LineDatum.__dict__['addPoint'](inst, 5, 4) # does same as above
0


This indirect way of accessing class and instance attributes is part of some popular Python patterns. Directly using a class __dict__ is something many flexible Jython object designs employ, and is sometimes required when certain special methods are defined, such as the __setattr__ method described later. Even more interesting is that Java classes used within Jython also have a __dict__, which contains their members. Looking at the java.io.File class's __dict__ in Jython requires the following:

>>> import java
>>> java.io.File.__dict__
{'createNewFile': <java function createNewFile at 4294600>,
 'lastModified': <java function lastModified at 3759986>, ... }


The full result of looking at java.io.File.__dict__ is left to the reader to discover due to its length. If you just want to look at the member names, use the following:

>>> java.io.File.__dict__.keys()
['mkdirs', 'exists', ...]


Additionally, you can call a method such as listRoots by using its key in java.io.File.__dict__. What would traditionally be called with java.io.File.listRoots() works with java.io.File.__dict__['listRoots']() as demonstrated in this example:

>>> java.io.File.__dict__['listRoots']()
array([A:\, C:\, D:\, G:\], java.io.File)


Currently, you can also assign to a Java class's __dict__. While this is likely bad practice for beginners, it clarifies the nature of __dict__. If you wish to alter the lookup of a Java class member, you can change the value of that key in its __dict__. Suppose you wanted a different method called for java.io.File.listRoots(); you can alter it this way:

>>> import java
>>> def newListRoots():
...     return (['c:\\'])
...
>>> java.io.File.__dict__['listRoots'] = newListRoots
>>> java.io.File.listRoots() # try the circumvented method
['c:\\']


__bases__

This is a tuple of bases, or superclasses. Jython version 2.0 and the first alpha versions of Jython 2.1 designate this variable as read-only; however, versions after 2.1a1 allow assignments to this variable. The implementation at the time of this writing differs slightly from CPython, in which you can alter __bases__. In Listing 7.1, the superclass of LineDatum is java.util.Hashtable. The special variable __bases__ confirms this:

>>> from datum import LineDatum
>>> LineDatum.__bases__
(<jclass java.util.Hashtable at 5012120>,)


Java classes used within Jython also have the special __bases__ variable, which includes base classes and interfaces implemented:

>>> import java
>>> java.io.File.__bases__
(<jclass java.lang.Object at 7290061>,
 <jclass java.io.Serializable at 62789>,
 <jclass java.lang.Comparable at 6728374>)
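For interpreters that allow assignment to __bases__, the following minimal sketch shows what rebinding it does to method lookup. The classes Walker, Swimmer, and Animal are hypothetical. The sketch assumes a pure-Python class hierarchy and an interpreter that permits the assignment. It is not a statement about how any specific Jython release behaves.

```python
class Walker:
    def move(self):
        return "walking"

class Swimmer:
    def move(self):
        return "swimming"

class Animal(Walker):
    pass

pet = Animal()
before = pet.move()              # resolved through Walker

Animal.__bases__ = (Swimmer,)    # replace the tuple of superclasses
after = pet.move()               # the same instance now resolves through Swimmer

print(before)
print(after)
```

Because existing instances pick up the change, altering __bases__ is best reserved for cases where that effect is intended.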


Pre-Existing Instance Attributes

Jython instances have two special variables that are implicitly defined, while Java instances used within Jython have one. These attributes are readable and assignable.

__class__

The __class__ variable denotes the class that the current object is an instance of. If we continue abusing Listing 7.1, we can demonstrate how an instance of LineDatum knows its class.

>>> from datum import LineDatum
>>> ld = LineDatum(.66, -2)
>>> ld.__class__
<class datum.LineDatum at 867682>


You can see that the __class__ variable is not just a string representing the class, but an actual reference to the class. If you know the required parameters, you can create an instance of an instance's class. Continuing the previous example to do so looks like this:

>>> ld2 = ld.__class__(1.2, 6)
>>> ld2.__class__
<class datum.LineDatum at 867682>


You can examine all class properties of instance.__class__ just as you can with the actual class. This is especially advantageous when examining Java instances. The dir() of a Java instance isn't very informative about its instance members because of the nature of the proxy used to access them. That means being able to examine a Java instance's __class__ dictionary aids in exploring a Java instance in the interactive interpreter. If you forget the methods available in a java.io.File instance, you can examine the instance's class for attributes.

>>> import java
>>> f = java.io.File("c:\\jython")
>>> dir(f)
[]
>>> dir(f.__class__)
['__init__', 'absolute', 'absoluteFile', 'absolutePath', 'canRead',
 'canWrite', 'canonicalFile', 'canonicalPath', 'compareTo', 'createNewFile',
 'createTempFile', 'delete', 'deleteOnExit', 'directory', 'exists', 'file',
 'getAbsoluteFile', 'getAbsolutePath', 'getCanonicalFile', 'getCanonicalPath',
 'getName', 'getParent', 'getParentFile', 'getPath', 'hidden', 'isAbsolute',
 'isDirectory', 'isFile', 'isHidden', 'lastModified', 'length', 'list',
 'listFiles', 'listRoots', 'mkdir', 'mkdirs', 'name', 'parent', 'parentFile',
 'path', 'pathSeparator', 'pathSeparatorChar', 'renameTo', 'separator',
 'separatorChar', 'setLastModified', 'setReadOnly', 'toURL']


__dict__

This represents the instance's name space. It is the same idea as the class __dict__, except that it instead contains instance attributes. You can read and alter the contents of an instance's __dict__ just like a class __dict__.
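A minimal sketch of reading and altering an instance's __dict__ follows. The class Account and its attribute names are hypothetical. The code is plain Python that also runs on a current interpreter.

```python
class Account:
    def __init__(self, owner):
        self.owner = owner

acct = Account("Gomez")

# Reading: the instance __dict__ holds the attributes set on the instance.
owner_via_dict = acct.__dict__["owner"]

# Altering: adding a key creates an attribute, changing a key rebinds one.
acct.__dict__["balance"] = 100
acct.__dict__["owner"] = "Fester"

print(owner_via_dict)
print(sorted(acct.__dict__.keys()))
print(acct.owner)
print(acct.balance)
```

Attributes added through the dictionary are indistinguishable from those added with the usual acct.balance = 100 notation.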

Special Methods for General Customization

Although three of the four general customization methods for objects were introduced in the earlier chapter "Classes, Instances, and Inheritance," they are reiterated here to make this chapter a more complete reference of special attributes.

__init__

The __init__ method is a Jython object's constructor; it gets called for instance creation. Jython superclasses requiring explicit initialization should be initialized in a constructor with baseclass.__init__(self, [args...]). The same syntax works for explicitly initializing a Java superclass. If a Java superclass is not explicitly initialized, its empty constructor is called at the completion of a Jython subclass's __init__ method. See "Classes, Instances, and Inheritance" for more on constructors.
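The baseclass.__init__(self, [args...]) syntax can be sketched with two hypothetical pure-Python classes, Shape and Circle. Only the Jython-superclass case is shown. Initializing a Java superclass is not demonstrated here.

```python
class Shape:
    def __init__(self, name):
        self.name = name

class Circle(Shape):
    def __init__(self, radius):
        # Explicitly initialize the superclass, passing self as the first argument.
        Shape.__init__(self, "circle")
        self.radius = radius

c = Circle(2)
print(c.name)
print(c.radius)
```

Without the call to Shape.__init__, the name attribute would never be set on the instance.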

__del__

The __del__ method is a Jython object's destructor or finalizer. It accepts no arguments, so its parameter list should only contain self. There is no guarantee as to when garbage collection will collect an object and thus call the __del__ method. Java does not even guarantee it will be called at all. Because of this, it is best to plan objects so that contents of the __del__ method are minimal or so that a finalizer is unnecessary. Also note that Jython classes that do define __del__ incur a performance penalty. Finalizing methods of Java superclasses are automatically called along with the __del__ method of a Jython instance, but Jython superclass destructors must be explicitly called when their execution is required. Syntax for calling a parent class's destructor is demonstrated here:

>>> class superclass:
...     def __del__(self):
...         print "superclass destroyed"
...
>>> class subclass(superclass):
...     def __del__(self):
...         superclass.__del__(self)
...         print "subclass destroyed"
...
>>> s = subclass()
>>> del s
>>>
>>> # wait for a while and hit enter a few times until GC comes around
superclass destroyed
subclass destroyed


Exceptions that either Java or Jython finalizing methods raise are all ignored. The only effect a raised exception has is that the finalizing method returns at the point of the exception rather than running to normal completion.

__repr__

The __repr__ method provides an object with string conversion behavior. The use of reverse-quotes or the repr() built-in method calls an object's __repr__ method. Also, if no __str__ attribute exists in the object, the object's __repr__ method is called when it is printed. The __repr__ method should return a valid Python expression as a string that represents the formal data structure of the object. If an appropriate expression is not possible, convention suggests a technical description within angle brackets (<>). Assume that you have an object that is supposed to act like a list. The __repr__ method of this object should return a string that looks like a list (for example, '[1, 2, 3, 4]').
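The list-like case just described can be sketched as follows. The class Stack is hypothetical, and delegating to the underlying list's own repr() is only one possible way to produce a list-looking string.

```python
class Stack:
    """Hypothetical list-like container."""
    def __init__(self, items):
        self.items = list(items)

    def __repr__(self):
        # Delegate to the list's own repr so the result looks like a list
        # and is a valid Python expression.
        return repr(self.items)

s = Stack([1, 2, 3, 4])
text = repr(s)
print(text)

# Because the string is a valid expression, it can be evaluated back into data.
rebuilt = eval(text)
print(rebuilt == s.items)
```

Since Stack defines no __str__ method, printing an instance also uses the __repr__ result.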

__str__

The __str__ method provides an informal representation of an object called when the object is printed, or when the built-in str() method is used on the object. This differs from __repr__. The __repr__ method returns an expression or data-full representation of an object while the __str__ method usually returns a brief description or characterization of the object. Listing 7.2 demonstrates the implementation of both the special methods __str__ and __repr__. The class in Listing 7.2 implements both these methods so there may be a canonical data object representation (the __repr__ results) and an HTML characterization (the __str__ results).

Listing 7.2 Implementing __str__ and __repr__
# file: html.py
class HtmlMetaTag:
    """Constructor requires "name" field of metatag.
    Use the instance's "append" method to add to the list"""

    def __init__(self, name):
        self.name = name
        self.list = []

    def append(self, item):
        self.list.append(item)

    def __repr__(self):
        return `{'name':self.name, 'list':self.list}`

    def __str__(self):
        S = '<meta name="%s" content="%s">'
        return S % (self.name, ", ".join(self.list))

if __name__=='__main__':
    mt = HtmlMetaTag("keywords")
    map(mt.append, ['Jython', 'Python', 'programming'])
    print "The __str__ results are:\n ", mt
    print
    print "The __repr__ results are:\n ", repr(mt)


The results from running jython html.py:

The __str__ results are:
  <meta name="keywords" content="Jython, Python, programming">

The __repr__ results are:
  {'name': 'keywords', 'list': ['Jython', 'Python', 'programming']}


Dynamic Attribute Access

Jython allows programmers to customize the access, setting and deletion of instance attributes with the corresponding special methods __getattr__, __setattr__, and __delattr__. There is an implied interrelationship between these methods, but there is no requirement to define certain ones. If you want dynamic access, define __getattr__. If you want dynamic attribute assignments, define __setattr__, and if you want dynamic attribute deletion, define __delattr__. What makes these different than using normal attribute access is that they are dynamic: They evaluate at the time the attribute is requested during runtime.

__getattr__

In a class that lacks dynamic attribute lookups, accessing a non-existing attribute is an AttributeError:

>>> class test:
...     pass
...
>>> t = test()
>>> t.a
Traceback (innermost last):
  File "<console>", line 1, in ?
AttributeError: instance of 'test' has no attribute 'a'


A minimal example of dynamic attribute access is the ability to avoid such AttributeErrors, as is done in Listing 7.3. Adding dynamic attribute access to an instance requires defining the __getattr__ method. This method must have two parameter slots, the first for self and the second for the attribute name. Once __getattr__ is defined, instance attribute lookups that fail by traditional means go on to call the __getattr__ method to fulfill the request. Listing 7.3 is a module containing a class that merely avoids the AttributeError by supplying a default __getattr__ value of None.

Listing 7.3 Adding Dynamic Attribute Access to a Class
# file: getattr.py
class test:
    a = 10
    def __getattr__(self, name):
        return None

if __name__ == "__main__":
    t = test()
    print "The value of t.a is:", t.a
    print "The value of t.b is:", t.b


Results from running jython getattr.py:

The value of t.a is: 10
The value of t.b is: None


The __getattr__ in Listing 7.3 provides a default value of None for missing attributes. It is a succinct example of using __getattr__, but I should mention that it is somewhat suspect in design. An implementation of __getattr__ normally returns a useful value or raises an AttributeError if it's unable to compute a useful value. Default values are certainly appropriate at times, but they can also hide design flaws. Listing 7.4 more closely resembles common implementations of __getattr__. The valuable principle exploited in Listing 7.4 is the use of an object separate from the class and instance for locating object attributes. Why this is valuable is related to an asymmetry between __getattr__ and __setattr__, which is explained later. In Listing 7.4 the data object used to hold instance attributes is a module-level dictionary, but it could just as easily be a list, an instance of another class, or a network resource. It all depends on what you do in the __getattr__ method (and __setattr__).

Listing 7.4 Attributes Supplied from a Separate Object
# file: extern.py
data = {"a":1, "b":2}

class test:
    def __getattr__(self, attr):
        if data.has_key(attr): # lookup attribute in module-global "data"
            return data[attr]
        else:
            raise AttributeError

if __name__=="__main__":
    t = test()
    print "attribute a =", t.a
    print "attribute b =", t.b
    print "attribute c =", t.c # doesn't exist in "data"- is error


Results from running jython extern.py

attribute a = 1
attribute b = 2
attribute c =Traceback (innermost last):
  File "extern.py", line 15, in ?
AttributeError: instance of "test" has no attribute "c"
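The text before Listing 7.4 notes that the data object could just as easily be an instance of another class. A minimal sketch of that variation follows. The classes Engine and Car and the attribute horsepower are hypothetical, and the code is plain Python rather than a reproduction of any listing in this chapter.

```python
class Engine:
    """Hypothetical object that actually holds the data."""
    def __init__(self):
        self.horsepower = 150

class Car:
    def __init__(self, engine):
        self.engine = engine

    def __getattr__(self, name):
        # Called only when normal lookup fails, so try the wrapped object.
        try:
            return getattr(self.engine, name)
        except AttributeError:
            raise AttributeError(name)

car = Car(Engine())
print(car.horsepower)    # not found on car, supplied by the Engine instance
```

As in Listing 7.4, a name that the separate object cannot supply still ends in an AttributeError rather than a silent default.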


As noted earlier, calling the __getattr__ method occurs only after traditional attribute lookup fails. What is the traditional lookup? Listing 7.3 proves that it must obviously include class variables; otherwise, the output would not include 10. The typical scenario is that attribute lookup begins with the instance dictionary, then the instance dictionary of initialized base classes, then the class dictionary, and finally base class dictionaries. Only after those fail does Jython call the __getattr__ method. In Listing 7.3, looking up the attribute a does not go through __getattr__ because a is found in the class __dict__. One catch in instance initialization is that if you define an __init__ method in a subclass, you need to explicitly call the __init__ in Jython superclasses to ensure proper instance lookup. If you do not define an __init__ method, the superclass constructor is automatically called, as demonstrated here:

>>> class A:
...     def __init__(self):
...         self.val = "'val' found in instance of superclass"
...
>>> class B(A):
...     pass
...
>>> c = B()
>>> c.val
"'val' found in instance of superclass"


If you do define an __init__ in a subclass, but fail to call the constructor of the Jython superclass, attributes cannot be resolved in the instance of the superclass:

>>> class B(A): # assume class A is the same as the previous example
...     def __init__(self):
...         pass
...
>>> c = B()
>>> c.val
Traceback (innermost last):
  File "<console>", line 1, in ?
AttributeError: instance of 'B' has no attribute 'val'


This initialization catch does not apply to Java superclasses. If a Java superclass is not explicitly initialized, its empty constructor is called upon completion of the subclass's __init__ method, and instance attribute lookup proceeds as normal.

__setattr__

Adding dynamic attribute assignment to an instance requires a __setattr__ method. This method must have three parameter slots: the first for self, the second for the attribute name, and the third for the value assigned to the attribute. Once defined, the __setattr__ method intercepts all assignments to instance attributes except the implicitly defined __class__ and __dict__. Because of this, you cannot directly set an instance attribute in the __setattr__ method without creating a circular lookup and overflow exception. You can, however, rebind the instance __dict__ without such errors because of its exemption from the __setattr__ hook, and you can access __dict__ directly because of its exemption from __getattr__.

Listing 7.5 demonstrates using the __setattr__ method to restrict field types to integers. Additionally, the use of both __getattr__ and __setattr__ methods allows storage of instance attributes in a data object other than the instance's __dict__. Listing 7.5 instead uses a Hashtable called _data to store any instance fields assigned. The assignment of _data in the class constructor does not use self._data=..., but instead uses the instance __dict__ directly. Why? Instance assignments, including those in the constructor, now all go through __setattr__; however, __setattr__ is expecting a _data key in the instance __dict__. This paradox is avoided by adding _data directly to the instance dictionary with the self.__dict__[key]=value syntax, which bypasses the __setattr__ hook.

Another valuable quality of Listing 7.5 is that the __setattr__ method ensures instance variables are not stored in the instance's __dict__. Why is this good? To understand the value, we must first look at the asymmetry between __setattr__ and __getattr__. The __setattr__ method always intercepts instance attribute assignments, but __getattr__ is called only when normal attribute lookup fails. This is bad if you want each get and set to perform some symmetrical action, such as always accessing and storing values from an external database. Keeping instance values outside the instance __dict__ ensures that normal lookup does not find them, so __getattr__ is called. Instance variables in superclasses and class variables short-circuit this control, so careful planning in subclasses is due. Also note that there is a bit of a performance hit for each __getattr__ call, considering that the normal lookup must complete unsuccessfully before calling the __getattr__ method.

Listing 7.5 Using __setattr__ and __getattr__
# file: setter.py
from types import IntType, LongType
import java

class IntsOnly:
    def __init__(self):
        self.__dict__['_data'] = java.util.Hashtable()

    def __getattr__(self, name):
        if self._data.containsKey(name):
            return self._data.get(name)
        else:
            raise AttributeError, name

    def __setattr__(self, name, value):
        test = lambda x: type(x)==IntType or type(x)==LongType
        assert test(value), "All fields in this class must be integers"
        self._data.put(name, value)

if __name__ == '__main__':
    c = IntsOnly()
    c.a = 1
    print "c.a=", c.a
    c.a = 200L
    print "c.a=", c.a
    c.a = "string"
    print "c.a=", c.a # Shouldn't get here


Results from running jython setter.py:

c.a= 1
c.a= 200
Traceback (innermost last):
  File "setter.py", line 25, in ?
  File "setter.py", line 17, in __setattr__
AssertionError: All fields in this class must be integers


__delattr__

Adding dynamic attribute deletion requires defining the __delattr__ method. This method must have two parameter slots, the first for self, and the second for the attribute name. The __delattr__ method is called when using del object.attribute. The attribute could be a resource requiring flushing/closing before deletion. The attribute could also be part of a persistent resource requiring deletion from a database or file system, or could be an attribute you don't want users of your class to delete. It could also be that attributes are stored in a data object other than the instance __dict__ and require special handling for deletion, as would be required for Listing 7.5. The __delattr__ hook allows the programmer to properly handle these types of situations. Listing 7.6 demonstrates using the __delattr__ hook to prevent deletion of an attribute:

Listing 7.6 Using __delattr__ to Protect an Attribute from Deletion
# file: immortal.py
class A:
    def __init__(self, var):
        self.immortalVar = var

    def __delattr__(self, name):
        assert name!="immortalVar", "Cannot delete- it's immortal"
        del self.__dict__[name]

c = A("some value")
print "The immortalVar=", c.immortalVar
del c.immortalVar


Results from running jython immortal.py

The immortalVar= some value
Traceback (innermost last):
  File "immortal.py", line 12, in ?
  File "immortal.py", line 7, in __delattr__
AssertionError: Cannot delete- it's immortal
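The other case mentioned above, attributes kept in a data object other than the instance __dict__, needs a __delattr__ that deletes from that object. The following is a minimal sketch. The class Record and its _data dictionary are hypothetical stand-ins for the Hashtable-based storage of Listing 7.5.

```python
class Record:
    def __init__(self):
        # Create the external store without triggering __setattr__.
        self.__dict__["_data"] = {}

    def __setattr__(self, name, value):
        self._data[name] = value

    def __getattr__(self, name):
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name)

    def __delattr__(self, name):
        # Attributes live in _data, so deletion must happen there.
        try:
            del self._data[name]
        except KeyError:
            raise AttributeError(name)

r = Record()
r.size = 10
had_size = hasattr(r, "size")
del r.size
has_size = hasattr(r, "size")
print(had_size)
print(has_size)
```

Without the __delattr__ method, del r.size would look for size in the instance __dict__, where it was never stored.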


Callable Hook—__call__

The special method __call__ makes an instance callable. The number of parameters of the call method is not restricted in any way. On a basic level, this makes an instance act like a function. Creating a function-like instance that prints a simple message looks like this:

>>> class hello:
...     def __call__(self):
...         print "Hello"
...
>>> h = hello()
>>> h() # call the instance as if it were a function
Hello


The hello example is a bit misleading in that it disguises the real potential of the __call__ method. Listing 7.7 is a slightly more interesting example that fakes static methods with an inner class that implements the __call__ method. The inner class in Listing 7.7, named _static_runcommand, gets a java.lang.Runtime instance and defines a __call__ method for running a system command and returning its output. What a user would call is the instance of the inner class. Remember, the instance is what is callable because of __call__, and is what becomes the static method. A user of the class would instantiate the outer class, commands (let's call this instance A), and would then call the runcommand instance with A.runcommand(command). The runcommand instance looks and acts like a method. Despite numerous instances of the outer commands class, only a single instance of _static_runcommand, and thus a single instance of java.lang.Runtime, is required (this assumes no synchronization requirements).

Listing 7.7 Faking Static Methods with __call__ and Inner Classes
# file: staticmeth.py
from java import lang, io

class commands:
    class _static_runcommand:
        "inner class whose instance is used to fake a static method"
        rt = lang.Runtime.getRuntime()

        def __call__(self, cmd):
            stream = self.rt.exec(cmd).getInputStream()
            isr = io.InputStreamReader(stream)
            results = []
            ch = isr.read()
            while (ch > -1):
                results.append(chr(ch))
                ch = isr.read()
            return "".join(results)

    runcommand = _static_runcommand() # create instance in class scope

if __name__ == '__main__':
    inst1 = commands()
    inst2 = commands()
    # now make sure runcommand is static (is shared by both instances)
    assert inst1.runcommand is inst2.runcommand, "Not class static"
    # now call the "faked" static method from either instance
    print inst1.runcommand("mem") # for windows users
    #print inst1.runcommand("cat /proc/meminfo") # for linux users


The results from running jython staticmeth.py on Windows 2000 are:

bytes total conventional memory
655360 bytes available to MS-DOS
633024 largest executable program size
1048576 bytes total contiguous extended memory
0 bytes available contiguous extended memory
941056 bytes available XMS memory
MS-DOS resident in High Memory Area
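Because the parameter list of __call__ is unrestricted and the callable is an ordinary instance, it can also accept a variable number of arguments and keep state between calls. The following is a minimal sketch. The class Accumulator is hypothetical and unrelated to Listing 7.7.

```python
class Accumulator:
    """Hypothetical callable that keeps a running total between calls."""
    def __init__(self):
        self.total = 0

    def __call__(self, *amounts):
        # Any number of positional arguments is accepted.
        for amount in amounts:
            self.total = self.total + amount
        return self.total

add = Accumulator()
first = add(5)
second = add(1, 2, 3)
print(first)
print(second)
```

The second call builds on the total left by the first, which is something a plain function could do only with a global variable or a default-argument trick.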


Special Comparison Methods

Comparison is the use of the operators ==, !=, <, <=, >, and >=. This is simple for many objects like integers (5 > 4), but what about user-defined classes? Jython allows the definition of methods that implement such comparisons, but Jython version 2.1 (and Python 2.1) introduced some changes in implementing class comparison methods. This new feature is called rich comparisons. For the sake of contrast, the old comparison method is dubbed poor comparisons.

Three-Way, or Poor, Comparisons

Three-way, or poor, comparisons entail a single special comparison method called __cmp__, which returns -1, 0, or 1 depending on whether self evaluates as less than, equal to, or more than another object. The __cmp__ method must have two parameter slots, the first for self and the second for the other object. The verbose version of what compare should do is this:

def __cmp__(self, other):
    if (self < other): return -1
    if (self == other): return 0
    if (self > other): return 1


Listing 7.8 uses the __cmp__ method and a class attribute, role, to determine sort order for a list of members of a family circle. The built-in cmp() function is used on the appropriate instance value as determined by the role the class is set to. If the other object is not an instance of the family class, the family instance always compares as less, so non-family objects sort last. (Note that the 'eldest' role inverts the age comparison so that less is really more.) Note also that setting a class-level flag, role in this case, is not the preferred way to control sort behavior in more complex classes.

Listing 7.8 Comparison by Role
# file: poorcompare.py
class family:
    role = "familyMember" # default value

    def __init__(self, name, age, relation, communicationSkills):
        self.name = name
        self.age = age
        self.relation = relation
        self.communicationSkills = communicationSkills
        self._roles = {1:"familyMember", 2:"communicator", 3:"eldest"}

    def __cmp__(self, other):
        if other.__class__ != self.__class__:
            return -1 # family instances are always less than non-family objects
        if self.role=="familyMember":
            relations = {"mother":1, "father":1, "aunt":2, "uncle":2,
                         "cousin":3, "unrelated":4}
            return cmp(relations[self.relation], relations[other.relation])
        elif self.role=="communicator":
            return cmp(self.communicationSkills, other.communicationSkills)
        elif self.role=="eldest":
            return cmp(other.age, self.age) # operands swapped so the eldest sorts first

    def __repr__(self):
        # This is an abuse of __repr__- "canonical" data not returned..
        # Included only for sake of example.
        return self.name

if __name__ == '__main__':
    L = [] # add ppl to list
    L.append(family("Fester", 80, "uncle", 2))
    L.append(family("Gomez", 50, "father", 1))
    L.append(family("Lurch", 75, "unrelated", 3))
    L.append(family("Cousin It", 113, "cousin", 4))
    L.append("other data-type")

    # print list sorted by default role
    L.sort()
    print "by relation:", L

    # print list sorted by communication skills:
    family.role = "communicator"
    L.sort()
    print "by communication skills:", L

    # print list eldest to youngest
    family.role = "eldest"
    L.sort()
    print "eldest to youngest:", L


Output from running jython poorcompare.py is:

by relation: [Gomez, Fester, Cousin It, Lurch, 'other data-type']
by communication skills: [Gomez, Fester, Lurch, Cousin It, 'other data-type']
eldest to youngest: [Cousin It, Fester, Lurch, Gomez, 'other data-type']


Rich Comparisons

Rich comparisons appear in the 2.1 versions of Jython and are not restricted to the -1, 0, and 1 return values of __cmp__. When comparing two lists or two matrices, for example, you can return a list or matrix containing the element-wise comparisons. You can also return another object, None, NotImplemented, or a Boolean, or you can raise an exception. The special rich comparison methods are a set of six special methods representing the six comparison operators. Each method requires two parameters: the first for self and the second for other. Table 7.1 lists each operator and its associated rich comparison method.

Table 7.1. Rich Comparison Methods

Operator    Method
<           __lt__(self, other)
<=          __le__(self, other)
==          __eq__(self, other)
!=          __ne__(self, other)
>           __gt__(self, other)
>=          __ge__(self, other)

For objects A and B, the rich comparison of A < B becomes A.__lt__(B). If the comparison is B > A, the method B.__gt__(A) is evaluated. Each operator has a natural complement, but there is no enforcement of an invariant such as A.__lt__(B) == B.__gt__(A). The left-hand object is first searched for an appropriate rich comparison method. If one is not defined, the right-hand object is searched for the complement method. If both sides define an appropriate method, the left-hand object's method is used.

Listing 7.9 defines two classes: A and B. Class A defines all six rich comparison methods such that they compare the name of self's class with the name of the other object's class. Class B defines only one comparison method: greater-than (__gt__). Note that class B's __gt__ method compares instance creation timestamps, which is incongruous with class A's definition of comparability. Such divergent definitions are troublesome and should be avoided without strong cause. The remainder of Listing 7.9 tests the comparison combinations using one instance of each class. You can see from the output how the appropriate comparison methods are resolved.

Listing 7.9 Rich Comparison Methods
# file: rich.py
import time

class A:
    def __init__(self):
        self.timestamp = time.time()
    def __lt__(self, other):
        print "...Using A's __lt__ method...",
        return self.__class__.__name__ < other.__class__.__name__
    def __le__(self, other):
        print "...Using A's __le__ method...",
        return self.__class__.__name__ <= other.__class__.__name__
    def __ne__(self, other):
        print "...using A's __ne__ method...",
        return self.__class__.__name__ != other.__class__.__name__
    def __gt__(self, other):
        print "...Using A's __gt__ method...",
        return self.__class__.__name__ > other.__class__.__name__
    def __ge__(self, other):
        print "...Using A's __ge__ method...",
        return self.__class__.__name__ >= other.__class__.__name__
    def __eq__(self, other):
        print "...Using A's __eq__ method...",
        return self.__class__.__name__ == other.__class__.__name__

class B:
    def __init__(self):
        self.timestamp = time.time()
    def __gt__(self, other):
        print "...Using B's __gt__ method...",
        return self.timestamp > other.timestamp

if __name__ == '__main__':
    inst_b = B()
    inst_a = A()
    print "Is a < b?", inst_a < inst_b
    print "Is b < a?", inst_b < inst_a
    print "Is a <= b?", inst_a <= inst_b
    print "Is b <= a?", inst_b <= inst_a
    print "Is a == b?", inst_a == inst_b
    print "Is b == a?", inst_b == inst_a
    print "Is a != b?", inst_a != inst_b
    print "Is b != a?", inst_b != inst_a
    print "Is a > b?", inst_a > inst_b
    print "Is b > a?", inst_b > inst_a
    print "Is a >= b?", inst_a >= inst_b
    print "Is b >= a?", inst_b >= inst_a


The output from running jython rich.py is:

Is a < b? ...Using A's __lt__ method... 1
Is b < a? ...Using A's __gt__ method... 0
Is a <= b? ...Using A's __le__ method... 1
Is b <= a? ...Using A's __ge__ method... 0
Is a == b? ...Using A's __eq__ method... 0
Is b == a? ...Using A's __eq__ method... 0
Is a != b? ...using A's __ne__ method... 1
Is b != a? ...using A's __ne__ method... 1
Is a > b? ...Using A's __gt__ method... 0
Is b > a? ...Using B's __gt__ method... 0
Is a >= b? ...Using A's __ge__ method... 0
Is b >= a? ...Using A's __le__ method... 1


Listing 7.9 clarifies how the rich comparison methods are resolved but has little practical value. A more plausible usage is the element-wise comparison of list objects. Listing 7.10 defines a class called listemulator, which includes an __lt__ method definition. The listemulator class behaves like a list thanks to the UserList class imported at the beginning of the listing. The details of emulating other types appear later in this chapter, but for now, let's assume an instance of the listemulator class acts exactly like a normal Jython list except for the __lt__ comparison. The __lt__ method in Listing 7.10 does two things. First, it compares the lengths of self and other to ensure the element-wise comparison is legitimate (they must be the same length). Then, it compares each element of self with the corresponding element of other and returns the list of comparison results.

Listing 7.10 Element-Wise Rich Comparison
# file: richlist.py
from UserList import UserList

class listemulator(UserList):
    def __init__(self, list):
        self.data = list
        UserList.__init__(self, self.data)

    def __lt__(self, other):
        if len(self) != len(other):
            raise ValueError, ("Instance of %s differs in size from %s"
                % (self.__class__.__name__, other.__class__.__name__))
        return map(lambda x, y: x < y, self, other)

L = [2,3,4,5]
LC = listemulator([2,3,3,4])
print LC < L


The results from running jython richlist.py are:

[0, 0, 1, 1] 


__hash__

Dictionary operations rely on the hash value of those objects used as keys. Jython objects can determine their own hash value for dictionary key operations and the built-in hash function by defining the special __hash__ instance method. The __hash__ method has only one parameter and it is self, and the return value should be an integer. A restriction in implementing __hash__ is that objects of the same value should return the same hash value. Because dictionary keys must be immutable, objects that define a comparison method but no __hash__ method cannot be used as dictionary keys.
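As a minimal sketch of the rule that objects of the same value must return the same hash value, the hypothetical Point class below (it is not one of this chapter's listings) derives both __eq__ and __hash__ from the same tuple of coordinates. The class and attribute names are assumptions chosen for illustration, and the code avoids Java classes and print statements so it can also be tried in plain Python.

```python
# Hypothetical example: a value object that can serve as a dictionary key.
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Equal when other is a Point with the same coordinates
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        # Hash the same values that __eq__ compares, so that equal
        # points always produce equal hash values
        return hash((self.x, self.y))


labels = {Point(1, 2): "first"}
# A distinct but equal instance finds the same dictionary entry
found = labels[Point(1, 2)]
print(found)
```

Building the hash from exactly the values that the comparison uses is one simple way to keep the two methods consistent. The coordinates should not change while a Point is in use as a key.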

Object "Truth"

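The search for the truth in Jython objects follows these rules:

If the class defines __nonzero__, its return value determines the object's truth.
If __nonzero__ is not defined but __len__ is, the object is true when its length is nonzero.
If neither method is defined, all instances are true.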

Implementing trueness with the __nonzero__ method looks like this:

import random

class gamble:
    def __nonzero__(self):
        return random.choice([0,1])


The __nonzero__ parameter list includes only self, and the return value is 1 for true, and 0 for false.
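The sketch below shows how truth is usually resolved for instances: __nonzero__ is consulted first, __len__ is the fallback, and an object that defines neither is true. The Switch, Basket, and Plain classes and the truth helper are hypothetical names used only for illustration. The __bool__ alias is an assumption added so the same code also runs on newer Python versions, which look for __bool__ rather than __nonzero__.

```python
class Switch:
    """Hypothetical object that decides its own truth value."""
    def __init__(self, state):
        self.state = state

    def __nonzero__(self):
        return self.state == "on"

    # Assumption: newer Python versions call __bool__ instead of
    # __nonzero__, so alias it to keep the sketch portable.
    __bool__ = __nonzero__


class Basket:
    """Hypothetical container that defines __len__ but not __nonzero__."""
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class Plain:
    """Defines neither method, so instances are always true."""
    pass


def truth(obj):
    # Reduce any object to 1 or 0 the way an if statement would
    if obj:
        return 1
    return 0


results = [truth(Switch("on")), truth(Switch("off")),
           truth(Basket([])), truth(Basket(["egg"])), truth(Plain())]
print(results)
```

An empty Basket is false only because its length is zero; the class never mentions truth at all.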

Emulating Built-In Data Objects

Numerous occasions call for creating classes that emulate built-in data objects. Maybe a project requires a Jython dictionary that maintains order, or maybe you need a list with a special lookup. Emulating built-in objects allows you to extend their behavior, add constraints, and instrument object operations with minimal work, yet end up with a familiar interface. Implementing extended behavior behind the familiar interface of a built-in adds nearly zero complexity, which is the primary goal of objects.

Jython's special methods allow user-defined classes to emulate Jython's built-in numeric, sequence, and mapping objects. Truly emulating an object that has associated methods requires implementing those non-special methods as well.

The examples in this section often use a Jython class that wraps a Java object to illustrate this functionality. Wrapping is not always required, because Jython is very good about converting types to meet the situation. The java.util.Vector object already supports the PyList index syntax (v[index]), java.util.Hashtable and java.util.HashMap already support key assignment (h[key]), and numeric objects automatically convert to the appropriate types where needed. With this substantial, intuitive support for Java objects, there is often no need to wrap them in special methods. There are gaps, however, such as java.util.Vector not supporting slice syntax in Jython:

>>> import java
>>> v = java.util.Vector()
>>> v.addElement(10)
>>> v.addElement(20)
>>> v[0] # this works
10
>>> v[0:2]
Traceback (innermost last):
  File "<console>", line 1, in ?
TypeError: only integer keys accepted


Emulating built-in types allows you to specify every behavior so that an object acts indistinguishably from a built-in type. Because Jython and Java are so tightly integrated, passing objects between these languages is pervasive. It is often convenient to make Java objects better emulate Jython built-ins so users of your code need not care which is Java and which isn't. Java objects often already contain methods that do the same thing as a comparable method in a Jython built-in, but are named differently.

Listing 7.11 shows a convenient way to map such Java methods to Jython methods to further ease emulating built-in objects. The HashWrap class in Listing 7.11 is a subclass of java.util.Hashtable that assigns class identifiers to Hashtable methods that already perform the expected behavior. Notice that the HashWrap class doesn't define values(), clear(), or get(). These names already exist in the superclass, java.util.Hashtable, and perform close enough to what is expected. The only catch in Listing 7.11 is that some of the Hashtable methods return unexpected types, such as the Enumeration returned from keys() and elements(). These methods are wrapped in a simple lambda expression that converts the result into a list. Some of Jython's dictionary methods don't have direct parallels in Java's Hashtable, so setdefault(), popitem(), and copy() are defined in the HashWrap class.

Listing 7.11 also contains special methods, those that begin and end with two underscores. The meaning of these special methods might be discernible from the Java methods they are mapped to, but the idea at this point is only to show how to map identifiers to Java methods.

Listing 7.11 Assigning Java Methods to Jython Class Identifiers
# file: hashwrap.py
import java
import copy

class HashWrap(java.util.Hashtable):
    # map jython names to Hashtable names
    has_key = java.util.Hashtable.containsKey
    update = java.util.Hashtable.putAll

    # Hashtable returns an Enumeration for keys() and elements(),
    # so wrap each in a lambda that converts it to a list
    keys = lambda self: map(None, java.util.Hashtable.keys(self))
    # note: items() here returns the values only, not (key, value) pairs
    items = lambda self: map(None, java.util.Hashtable.elements(self))

    # these don't have direct parallels in Hashtable, so define here
    def setdefault(self, key, value):
        if self.containsKey(key):
            return self.get(key)
        else:
            self.put(key, value)
            return value

    def popitem(self):
        return self.remove(self.keys()[0])

    def copy(self):
        return copy.copy(self)

    # These are the special methods introduced in this section.
    # Read on to find out more.
    __getitem__ = java.util.Hashtable.get
    __setitem__ = java.util.Hashtable.put
    __delitem__ = java.util.Hashtable.remove
    __repr__ = java.util.Hashtable.toString
    __len__ = java.util.Hashtable.size

if __name__ == '__main__':
    hw = HashWrap()
    hw["A"] = "Alpha"
    hw["B"] = "Beta"
    print hw
    print hw.setdefault("G", "Gamma")
    print hw.setdefault("D", "Delta")
    print hw["A"]
    print "keys=", hw.keys()
    print "values=", hw.values()
    print "items=", hw.items()


Output from running jython hashwrap.py is:

{A=Alpha, B=Beta}
Gamma
Delta
Alpha
keys= ['A', 'G', 'D', 'B']
values= [Alpha, Gamma, Delta, Beta]
items= ['Alpha', 'Gamma', 'Delta', 'Beta']


Emulating Sequences

Built-in sequences come in two flavors: mutable and immutable. Immutable sequences (PyTuples) have no associated methods, while mutable sequences (PyLists) do; both flavors have similar sequence behaviors such as indexes and slices. Truly emulating a PyList would involve defining its associated methods (append, count, extend, index, insert, pop, remove, reverse, and sort) as well as the special methods associated with sequence length, indexes, and slices. Emulating immutable sequences (PyTuples) requires only a subset of the special methods, because some of those methods implement operations unique to mutable objects (they change object contents).

A user-defined object need not implement all sequence behaviors, so you are free to define only those methods that suit your design. However, defining all sequence methods allows users to be blissfully unaware of inconsequential differences between your class and a built-in data object, which is better for abstraction, reusability, and the holy grail of objects: reduced complexity.

The following subsections delineate sequence behaviors and their associated special methods. The descriptions assume a sequence S, a PyList L, and a PyTuple T. The use of sequence and S indicates special functions applicable to sequences in general. The use of PyTuple and T indicates implementation of an immutable sequence, and the use of PyList and L indicates comments specific to mutable sequences.

__len__

A sequence should have a length equal to len(S). The special method that returns an object's length is __len__(self). Calling len(S) is equivalent to S.__len__(). The __len__(self) method must return an integer >= 0. Note that there are no means of enforcing the accuracy of __len__. A list-like object with ten elements can define a __len__ method that returns 1. Listing 7.12 keeps sequence elements in a java.util.Vector instance. The length method returns the results from the vector's size() method as the object's length:

Listing 7.12 Implementing Sequence Length
# file: seqlen.py
import java

class usrList:
    def __init__(self):
        # use name-mangled, private identifier for vector
        self.__data = java.util.Vector()

    def append(self, o):
        self.__data.add(o)

    def __len__(self):
        return self.__data.size()

if __name__ == '__main__':
    L = usrList()
    L.append("A")
    L.append("B")
    L.append("C")
    print "The length of object L is:", len(L)


The output of jython seqlen.py is:

The length of object L is: 3 


__getitem__

S[i] gets the value at sequence index i. The index i can be a positive integer (counted from the left side of a sequence), a negative integer (counted from the right side), or a slice object. The special method used to return an object designated by a specific index or slice is __getitem__(self, index). Calling S[i] is equivalent to S.__getitem__(i). The __getitem__ method should raise an IndexError exception when the specified index is out of range, and a ValueError for non-supported index types. To truly emulate built-in sequences, you must allow for an index value of a positive integer, negative integer, or slice object.

The __getitem__ method is sometimes confused with the special class attribute method __getattr__, so it's worth noting their differences. The __getitem__ method retrieves what the object defines as list or mapping entries (mapping implementations appear later) instead of attributes of the object itself. Additionally, __getitem__ is called for each item retrieval, unlike __getattr__, which is called only after a normal object attribute lookup fails. Because __getitem__ is always called, it's a good candidate for implementing persistence and other special behavior that requires symmetry between setting and getting objects.

Listing 7.13 is similar to Listing 7.12 in that it is a list-like class wrapped around a java.util.Vector, this time used to illustrate the __getitem__ method. Its __getitem__ implementation allows for positive, negative, and slice indexes by converting the specified index value into a list of positive integers, and it raises the IndexError and ValueError exceptions where appropriate. Remember that negative indexes are counted from the right: -1 is the last sequence item, -2 the penultimate, and so on. Jython slicing conventions assume default values for missing slice elements, so a user-defined, list-like object should also allow for this. For the slice [::3], Jython assumes start of sequence and end of sequence for the first two missing values, then uses the 3 as the step value.

Implementing all this in a user-defined object may sound daunting, but it need not be difficult or convoluted. An important tool in the battle against complexity is leveraging functionality already found in familiar objects. The premise of all this is that __getitem__ should emulate the behavior of a built-in PyList object, and there couldn't be a bigger hint: use a list object in the __getitem__ implementation. If the internal data is in a vector, make a list of the vector's index numbers and apply the index or slice to that list. You've now handled positive, negative, and slice indexes, as well as default index values and the IndexError and ValueError exceptions. That's the trick used in Listing 7.13. Here is an example of just the list trick to make it clearer:

>>> import java
>>> v = java.util.Vector()
>>> map(v.addElement, range(10))
[None, None, None, None, None, None, None, None, None, None]
>>> v # take a peek at the vector
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> # create a slice object
>>> i = slice(2, -3, 4)
>>>
>>> ## Next line handles all slice logic, and appropriate exceptions
>>> indexes = range(v.size())[i]
>>> indexes
[2, 6] # remember- these are indexes of values, not values


Whereas the java.util.Vector object doesn't support slices, a PyList does. Make a PyList containing the vector's range of index numbers and apply the index or slice to that list. The result is either a list of positive vector indexes or a single positive vector index. Using the PyList also ensures that any ValueErrors or IndexErrors are raised as needed. Searching for ways to reuse existing functionality, especially that of built-ins, is vital in battling complexity. Listing 7.13 uses the list trick to acquire either a single positive index or a list of positive indexes. Once the positive index values are determined, the __getitem__ method returns the corresponding value or values.

Listing 7.13 Sequence Item Retrieval with __getitem__
# file: seqget.py
import java
import types

class usrList:
    def __init__(self, initial_values):
        data = java.util.Vector()
        map(data.add, initial_values)
        self.__data = data

    def __getitem__(self, index):
        # the list trick: let a PyList resolve the index or slice
        indexes = range(self.__data.size())[index]
        try:
            if not isinstance(indexes, types.ListType):
                return self.__data.elementAt(indexes)
            else:
                return map(self.__data.elementAt, indexes)
        except java.lang.ArrayIndexOutOfBoundsException:
            raise IndexError, "index out of range: %s" % index

if __name__ == '__main__':
    S = usrList(range(1,10))
    print "S=", S[:]
    print "S[3]=", S[3]
    print "S[-2]=", S[-2]
    print "S[1:7:2]=", S[1:7:2]
    print "S[-5:8]=", S[-5:8]
    print "S[-8:]=", S[-8:]


The output from running jython seqget.py is:

S= [1, 2, 3, 4, 5, 6, 7, 8, 9]
S[3]= 4
S[-2]= 8
S[1:7:2]= [2, 4, 6]
S[-5:8]= [5, 6, 7, 8]
S[-8:]= [2, 3, 4, 5, 6, 7, 8, 9]


Note that in Listing 7.13, the call to the vector's elementAt() method is inside a try/except. If an index is out of range, the more verbose Java exception and traceback is caught and replaced with a Jython IndexError exception. This is only an implementation choice. There is no obligation to wrap Java exceptions, but the IndexError helps the usrList class act more like a built-in. Although Listing 7.13 uses a PyList to do the dirty work, there may be instances requiring explicit handling of types. This introduces type testing. Normally in Jython a variable's type can be tested against a reference type, as in the following three examples:

>>> import types
>>> a = 1024L
>>> if type(a) == types.LongType: "a is a LongType"
...
'a is a LongType'
>>> if type(a) == type(1L): "a is a LongType"
...
'a is a LongType'
>>> if type(a) in [types.IntType, types.LongType]: "a is an ok type"
...
'a is an ok type'


Testing for appropriate types gets more interesting when Java types are involved, because type() does not discriminate between them. For example, consider the following:

>>> import java
>>> v = java.util.Vector()
>>> i = java.lang.Integer(3)
>>> type(v) == type(i)
1


If type() can't tell the difference between an integer and vector, how can you allow for limited Java types? One way to do so is to test an object's class. All Jython objects have a __class__ attribute, including the Java ones, so you could do the following:

>>> import java
>>> i = java.lang.Integer(5)
>>> v = java.util.Vector()
>>> ok_types = [(1).__class__, (1L).__class__, java.lang.Integer, java.lang.Long]
>>> i.__class__ in ok_types
1
>>> v.__class__ in ok_types
0


Another means of confirming the appropriateness of object types is the use of the built-in isinstance function. The isinstance function accepts an object and a class as arguments and returns 1 if the object is an instance of the specified class, 0 otherwise. Using isinstance to check types is preferred because of its appropriateness when working with inheritance hierarchies. Using isinstance to check types would look like this:

>>> import types
>>> a = 1024L
>>> if isinstance(a, types.LongType): "a is a LongType"
...
'a is a LongType'

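To make the inheritance point concrete, here is a small sketch comparing an exact class test with isinstance. Shape and Circle are hypothetical classes invented for this illustration; nothing here depends on Java, so it can be tried in any Python interpreter.

```python
class Shape:
    pass

class Circle(Shape):
    pass

c = Circle()

# An exact class comparison misses subclasses...
exact_match = c.__class__ == Shape
# ...whereas isinstance follows the inheritance chain.
inherited_match = isinstance(c, Shape)

print("exact class match: %s" % exact_match)
print("isinstance match: %s" % inherited_match)
```

Code that tests with isinstance keeps working when someone later passes in a subclass, which is usually what a type check is meant to allow.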

__setitem__

The expression L[i] = object should bind object to index i in list-like object L. This is specific to classes designed to emulate lists, not immutable (PyTuple-like) objects. Only mutable objects should implement this behavior. The special method used to bind an object to a specific index is __setitem__(self, index, value). This method should raise an IndexError exception when the specified index is out of range. The index value could be negative, positive, or a slice object. Raise a ValueError exception for those index values the __setitem__ implementation does not allow for. Assigning to a slice has some special rules, at least for the built-in PyList. You don't have to respect this behavior in user-defined objects, but it is recommended. The step value must be 1, and the assigned value must be a sequence, although it need not be the same length as the slice. First, look at an assignment to a single index:

>>> S = ["a", "b", "c"]
>>> S[1] = [1, 2, 3]
>>> S
['a', [1, 2, 3], 'c']


The value assigned to index 1 is a list, which shows up as a single object in index 1. Assigning to a slice differs:

>>> S = ["a", "b", "c"]
>>> S[3:4] = [1, 2, 3]
>>> S
['a', 'b', 'c', 1, 2, 3]


The slice still starts at a single index (index 3), but assigning to a slice means something different: the right side must be a sequence, and the resulting list is the concatenation:

S[0:slice.start] + values + S[slice.stop:len(S)] 
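The concatenation rule can be checked directly against a built-in list. The following sketch uses made-up sample values and plain variables named start and stop in place of the slice attributes; it compares what the formula predicts with what slice assignment produces.

```python
S = ["a", "b", "c", "d", "e"]
values = [1, 2, 3]
start, stop = 1, 3

# What the concatenation formula predicts
expected = S[0:start] + values + S[stop:len(S)]

# What slice assignment actually produces (on a copy, to keep S intact)
result = S[:]
result[start:stop] = values

print(expected)
print(result)
```

Two items are replaced by three here, which shows that the assigned sequence need not match the length of the slice it replaces.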


Listing 7.14 defines a class that implements the __setitem__ method but imposes two constraints: all values must be strings, and the list has a static size designated by a constructor parameter. Each list index internally represents a line in the file usrList.dat. Setting L[2] = "Some string" changes the third line of the file (index 2) to Some string. Because every instance reads from the same file, the internal list behaves much like a class static variable (although this could be changed with another constructor argument). Opening and closing the file within each method is expensive, so this design really makes sense only if the file is a shared resource. The persistence could otherwise be implemented in the __init__ method and possibly a close method (note that __del__ would work, but is often avoided because exceptions in __del__ are ignored and there's no guarantee of when that method gets called). Listing 7.14 does support assignment to list indexes and slices, but because the lines of the file are stored in a real PyList as an intermediary, this functionality is automatic. Listing 7.14 raises ValueError and IndexError exceptions appropriately, but note that the IndexErrors propagate from normal list operations rather than being caught and re-raised.

Listing 7.14 Adding Persistence with __setitem__
# file: seqset.py
import types
import os

class usrList:
    def __init__(self, size):
        self.__size = size
        self.__file = "usrList.dat"
        if not os.path.isfile(self.__file):
            f = open(self.__file, "w")
            print >> f, " \n" * size
            f.close()

    def __repr__(self):
        f = open(self.__file)
        L = f.readlines()[:self.__size]
        f.close()
        return str(map(lambda x: x[:-1], L))

    def __setitem__(self, index, value):
        f = open(self.__file, "r+")
        L = f.readlines()[:self.__size]
        if isinstance(index, types.SliceType):
            # the list is a static size, so a slice must keep its length
            if len(L[index]) != len(value):
                raise ValueError, "Bad value: %s" % (value,)
            for x in value:
                if not isinstance(x, types.StringType):
                    raise ValueError, "Only String values supported"
            L[index] = map(lambda x: x + "\n", value)
        if (isinstance(index, types.IntType) or
            isinstance(index, types.LongType)):
            if type(value) != types.StringType:
                raise ValueError, "Only String values supported"
            L[index] = value + "\n"
        f.seek(0)
        f.writelines(L)
        f.close()

if __name__ == '__main__':
    S = usrList(10)
    for x in range(10):
        S[x] = str(x)
    print "First List=", S

    S[4:-4] = "four", "five"
    print "Second List =", S

    for x in range(10, 20):
        S[x-10] = str(x)
    print "Last list = ", S


Output from running jython seqset.py:

First List= ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
Second List = ['0', '1', '2', '3', 'four', 'five', '6', '7', '8', '9']
Last list = ['10', '11', '12', '13', '14', '15', '16', '17', '18', '19']


The values in the usrList instance are stored in the usrList.dat file, so future instantiations of the usrList class will start with those values. You can confirm this in the interactive interpreter. Just make sure to start the interpreter from within the same directory as the usrList.dat file (the usrList instance only looks in the current directory):

>>> import seqset
>>> L = seqset.usrList(10)
>>> print L # <- print persistent values
['10', '11', '12', '13', '14', '15', '16', '17', '18', '19']


The convenience of an automatically persistent data type is great, but the performance of Listing 7.14 isn't. Using this same technique with a speedy database helps greatly. Listing 7.14 is instructive, but it shows little about handling the unique constraints of assignments to a slice because the internal data is a PyList. Listing 7.15 uses a java.util.Vector for the internal data, which makes the slice handling in __setitem__ explicit.

Listing 7.15 Wrapping a Java Vector in a List Class
# file: seqset1.py
import java
import types

class usrList:
    def __init__(self):
        self.__data = java.util.Vector()
        map(self.__data.addElement, range(10))

    def __getitem__(self, index):
        indexes = range(self.__data.size())[index]
        if isinstance(index, types.SliceType):
            return map(self.__data.elementAt, indexes)
        else:
            return self.__data.elementAt(indexes)

    def __setitem__(self, index, value):
        if isinstance(index, types.SliceType):
            size = self.__data.size()
            if index.step != 1:
                raise ValueError, "Step size must be 1 for setting list slice"
            newdata = java.util.Vector()
            # copy the existing elements that precede the slice
            map(newdata.addElement,
                map(self.__data.elementAt, range(0, index.start)))
            # add the new values
            map(newdata.addElement, value)
            # copy the existing elements that follow the slice
            map(newdata.addElement,
                map(self.__data.elementAt, range(index.stop, size)))
            self.__data = newdata
        else:
            self.__data.setElementAt(value, index)

    def __delitem__(self, index):
        indexes = range(self.__data.size())[index]
        indexes.reverse() # so we can delete High to Low
        for i in indexes:
            self.__data.removeElementAt(i)

    def __repr__(self):
        return str(map(None, self.__data))

if __name__ == "__main__":
    L = usrList()
    print "L=", L
    print "L[1:7]=", L[1:7]
    print "L[-4:9]=", L[-4:9]
    L[3:6:1] = range(100, 110)
    print L


Output from running jython seqset1.py:

L= [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
L[1:7]= [1, 2, 3, 4, 5, 6]
L[-4:9]= [6, 7, 8]
[0, 1, 2, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 6, 7, 8, 9]


__delitem__

The expression del L[i] should delete an object from list-like object L. This is specific to mutable (PyList-like) objects. The special method used to delete a specific index is __delitem__(self, index). This method should raise an IndexError exception when the specified index is out of range and a ValueError exception when the index type is unsupported. Listing 7.16 defines a class that implements the __delitem__ method. The class holds its list contents in an internal java.util.Vector object, so the __delitem__ method must use the vector's removeElementAt() (or remove()) method for each index deleted. Adding support for slices is familiar from previous examples, but one additional trick is required to delete items from the vector. Deleting a slice becomes consecutive removeElementAt() operations on the vector. If indexes were deleted from low to high, each deletion would shift the remaining items down by one, so the indexes of items still awaiting deletion would no longer be correct. The list of indexes requiring deletion is reversed in Listing 7.16 to ensure that the indexes are in highest to lowest order. This allows deletion without adjusting the indexes of later deletions.

Listing 7.16 Implementing __delitem__
# file: seqdel.py
import java
import types

class userList:
    def __init__(self):
        self.__data = java.util.Vector()

    def append(self, object):
        self.__data.addElement(object)

    def __repr__(self):
        return str(list(self.__data))

    def __delitem__(self, index):
        if isinstance(index, types.SliceType):
            if index.step != 1:
                raise ValueError, "Step size must be 1 for deleting list slice"
            delList = range(self.__data.size())[index]
        else:
            delList = [range(self.__data.size())[index]]
        delList.reverse() # delete from highest index to lowest
        map(self.__data.removeElementAt, delList)

if __name__ == '__main__':
    S = userList()
    map(S.append, range(5, 20, 2))
    print "Before deletes:", S
    del S[3]
    del S[4:6]
    print "After deletes: ", S


Output from running jython seqdel.py:

Before deletes: [5, 7, 9, 11, 13, 15, 17, 19]
After deletes: [5, 7, 9, 13, 19]
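The reason for reversing the index list is easy to see with a plain list. The sketch below is only an illustration: delete_in_order is a hypothetical helper, not part of any listing, and it deletes the same two indexes in both orders.

```python
def delete_in_order(values, indexes):
    """Delete each index from a copy of values, in the order given."""
    data = list(values)
    for i in indexes:
        del data[i]
    return data

values = [5, 7, 9, 11, 13]

# Intended result: remove the items at indexes 1 and 2 (the 7 and the 9)
low_to_high = delete_in_order(values, [1, 2])
high_to_low = delete_in_order(values, [2, 1])

print(low_to_high)
print(high_to_low)
```

Deleting index 1 first shifts every later item down by one, so the second deletion removes the 11 instead of the 9. Deleting from the highest index down leaves the lower indexes untouched.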


Sequence Concatenation and Multiplication

Table 7.2 contains the math operations a sequence-like object should support as well as the special methods associated with that operation.

Table 7.2. Sequence Math Operations

Operation   Description                                                                   Special Method
S1 + S2     The concatenation of two sequences.                                           __add__ and __radd__
L1 += L2    Changing L1 into the concatenation of L1 and L2 with augmented assignment.    __iadd__
S * i       Repeating a sequence for integer i times.                                     __mul__ and __rmul__
L *= i      Changing list L into i number of repetitions of L with augmented assignment.  __imul__

Implementing the concatenation and repetition operations requires implementing the special operator methods for addition and multiplication. Operator methods occur in threes: addition has __add__, __radd__, and __iadd__, whereas multiplication has __mul__, __rmul__, and __imul__. The first of these (__add__, or __mul__) is called when the object defining it is on the left side of an operation. For the sequence S it would mean S + X or S * X are really implemented as S.__add__(X) and S.__mul__(X). The second of these methods (__radd__, and __rmul__) are reflected versions of the first—for when the object is on the right side of the expression. The reflected methods are called only if objects on the left do not define __add__ or __mul__. If S defines __radd__ and __rmul__, but X does not, then X + S and X * S become S.__radd__(X) and S.__rmul__(X). The __iadd__ and __imul__ methods implement augmented assignment, meaning S += X and S *= X are implemented as S.__iadd__(X) and S.__imul__(X). Listing 7.17 implements all six methods listed in Table 7.2. The obligations of these methods are that they raise exceptions for unsupported types, and that only the augmented assignment operations modify self.

Listing 7.17 Sequence Concatenation and Repetition
# file: seqmath.py
import types
import java

class usrList:
    def __init__(self):
        self.__data = java.util.Vector()
        map(self.__data.addElement, range(5,8)) # some default values

    def __add__(self, other):
        if isinstance(other, types.ListType):
            return map(None, self.__data) + other
        else:
            raise TypeError, "__add__ only defined for ListType"

    def __radd__(self, other):
        if isinstance(other, types.ListType):
            return other + map(None, self.__data)
        else:
            raise TypeError, "__radd__ only defined for ListType"

    def __iadd__(self, other):
        # Augmented assignment methods usually modify self, then return self
        if isinstance(other, types.ListType):
            map(self.__data.addElement, other)
            return self
        else:
            raise TypeError, "__iadd__ only defined for ListType"

    def __mul__(self, other):
        if (isinstance(other, types.IntType) or
            isinstance(other, types.LongType)):
            return map(None, self.__data) * other
        else:
            raise TypeError, "Only integers allowed for multiplier"

    def __rmul__(self, other):
        if (isinstance(other, types.IntType) or
            isinstance(other, types.LongType)):
            return map(None, self.__data) * other
        else:
            raise TypeError, "Only integers allowed for multiplier"

    def __imul__(self, other):
        if (isinstance(other, types.IntType) or
            isinstance(other, types.LongType)):
            map(self.__data.addElement,
                [x for x in self.__data] * (other - 1))
            return self
        else:
            raise TypeError, "Only integers allowed for multiplier"

    def __repr__(self):
        return str(map(None, self.__data))

if __name__ == '__main__':
    L = usrList()
    print "start :", L
    print "__add__ :", L + [1,0]
    print "__radd__ :", ["a", "b"] + L
    L += ["-", "-"]
    print "__iadd__ :", L
    print "__mul__ :", L * 2
    try:
        print "__rmul__ :", 2 * L
    except TypeError:
        print "__rmul__ raised a TypeError"
    L *= 2
    print "__imul__ :", L


Output from running jython seqmath.py is:

start : [5, 6, 7]
__add__ : [5, 6, 7, 1, 0]
__radd__ : ['a', 'b', 5, 6, 7]
__iadd__ : [5, 6, 7, '-', '-']
__mul__ : [5, 6, 7, '-', '-', 5, 6, 7, '-', '-']
__rmul__ : [5, 6, 7, '-', '-', 5, 6, 7, '-', '-']
__imul__ : [5, 6, 7, '-', '-', 5, 6, 7, '-', '-']


Slices

Python deprecated the slice-specific methods __getslice__, __setslice__, and __delslice__ in version 2.0. Support for these methods still exists in the Jython code base, so they are mentioned here for completeness. Their inclusion is not meant to encourage their use, but is supplied in case a reader encounters them in legacy code. For new code, use __setitem__, __getitem__, and __delitem__.
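As a rough illustration of the recommended approach, the sketch below handles slices inside __getitem__, __setitem__, and __delitem__ by checking for a slice object. The Window class is hypothetical and not part of this chapter's listings, and the code is written against current Python syntax; behavior on older Jython releases may differ in detail.

```python
class Window:
    """Hypothetical list-like wrapper; slicing goes through the item methods."""

    def __init__(self, data):
        self._data = list(data)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        # A slice expression such as w[0:3] arrives here as a slice object.
        if isinstance(index, slice):
            return Window(self._data[index])
        return self._data[index]

    def __setitem__(self, index, value):
        # The wrapped list accepts both plain indexes and slice objects.
        self._data[index] = value

    def __delitem__(self, index):
        del self._data[index]

    def __repr__(self):
        return "Window(%r)" % (self._data,)


w = Window(range(6))
first_three = w[0:3]     # __getitem__ receives a slice object
w[1:3] = ["a", "b"]      # __setitem__ receives a slice object
del w[4:]                # __delitem__ receives a slice object
```

No slice-specific method is defined, yet all three slice operations work because the item methods receive the slice and pass it on to the wrapped list.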

__contains__

Testing if an object o is a member of sequence S usually loops through S looking for o. Each such test is linear in the length of S, so a design that performs a membership test for every element of another large collection faces a harsh, quadratic performance penalty. You do, however, have the option to optimize this membership test with the special method __contains__. If the __contains__ method is defined, membership tests call S.__contains__(o) rather than looping through S. The __contains__ method has the self parameter and a parameter slot for the item whose membership is in question. The __contains__ method should return 0 (false) if the sequence does not contain the object and 1, or any non-zero value (true), if it does. Listing 7.18 is a class that emulates a list, but also sets members as keys in an internal dictionary. The dictionary value is the number of times the object appears in the list. This allows speedy membership tests by checking if the dictionary has the key rather than looping through the sequence. The tradeoff is increased memory usage and slower setting and deleting of items. Listing 7.18 adds the __setitem__ and __delitem__ methods, as these operations must be intercepted to keep the internal dictionary and list in sync. Two helper methods, __incrementMember and __decrementMember, are defined to help with the syncing process by updating each key's count (value) and deleting or creating the key when necessary.

Listing 7.18 Accelerating Membership Tests with __contains__
# file: seqin.py
import types

class usrList:
    def __init__(self, initialValues):
        self.__data = initialValues
        self.__membership = {}
        # count every occurrence so duplicate members are tracked correctly
        map(self.__incrementMember, self.__data)

    def __contains__(self, item):
        return self.__membership.has_key(item)

    # for __contains__ to work, assignment and deletion must
    # change self.__data and self.__membership
    def __setitem__(self, index, value):
        if isinstance(index, types.SliceType):
            if index.step not in (None, 1):
                raise ValueError, "Assignment to slice requires step=1"
            oldValues = self.__data[index]
            newValues = list(value)
        else:
            oldValues = [self.__data[index]]
            newValues = [value]
        # update self.__data _and_ self.__membership
        self.__data[index] = value
        map(self.__decrementMember, oldValues)
        map(self.__incrementMember, newValues)

    def __delitem__(self, index):
        oldValues = self.__data[index]
        del self.__data[index]
        if isinstance(oldValues, types.ListType):
            map(self.__decrementMember, oldValues)
        else:
            self.__decrementMember(oldValues) # it's really only one value

    def __incrementMember(self, member):
        if self.__membership.has_key(member):
            self.__membership[member] += 1
        else:
            self.__membership[member] = 1

    def __decrementMember(self, member):
        if self.__membership.has_key(member):
            if self.__membership[member] == 1:
                del self.__membership[member]
            else:
                self.__membership[member] -= 1

    def __repr__(self):
        return str(self.__data)

if __name__ == '__main__':
    from time import time
    t1 = time()
    pyList = range(0, 12000, 3)
    print "The PyList took %f seconds to fill." % (time()-t1,)
    t1 = time()
    newList = usrList(range(0, 12000, 3))
    print "The usrList took %f seconds to fill." % (time()-t1,)

    t1 = time()
    count = 0
    for x in range(10, 12000, 7):
        if x in pyList:
            count += 1
    print "Found %i items in pyList in %f seconds" % (count, time()-t1)

    t1 = time()
    count = 0
    for x in range(10, 12000, 7):
        if x in newList:
            count += 1
    print "Found %i items in newList in %f seconds" % (count, time()-t1)


Output from running jython seqin.py is:

The PyList took 0.000000 seconds to fill.
The usrList took 0.110000 seconds to fill.
Found 571 items in pyList in 6.430000 seconds
Found 571 items in newList in 0.220000 seconds


Listing 7.18 has a verbose testing section to illustrate the item setting penalty and the membership test benefit inherent in this approach.
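To isolate the dispatch itself from the timing noise, the following minimal sketch counts how many elements a membership test touches with and without __contains__. The Scanned and Indexed classes and their lookups counter are hypothetical names introduced only for this illustration, and the code assumes current Python syntax rather than any particular Jython release.

```python
class Scanned:
    """Hypothetical sequence without __contains__."""

    def __init__(self, items):
        self._items = list(items)
        self.lookups = 0            # counts element fetches made by 'in'

    def __getitem__(self, index):
        self.lookups += 1
        return self._items[index]   # IndexError past the end stops the scan


class Indexed(Scanned):
    """Same data, plus a dictionary consulted by __contains__."""

    def __init__(self, items):
        Scanned.__init__(self, items)
        self._members = {}
        for item in self._items:
            self._members[item] = 1

    def __contains__(self, item):
        return item in self._members


scanned = Scanned(range(0, 300, 3))
indexed = Indexed(range(0, 300, 3))
found_by_scan = 297 in scanned    # walks the elements through __getitem__
found_by_dict = 297 in indexed    # a single dictionary lookup
```

After these two tests, scanned.lookups reflects one fetch per element examined, while indexed.lookups stays at zero because the test never reaches __getitem__.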

UserList

The Jython (and Python) library contains a module aimed at easing the creation of list-like, user-defined objects. The previous examples used a class called usrList that was intended to foreshadow the introduction of this module without creating naming confusion (note the spelling difference). The UserList module defines one class: UserList. This class optionally uses a PyList object internally to represent the list data, and supplies default methods for working with this data. If you choose not to use a PyList object for the internal data, you need to override all required methods. Listing 7.19 uses the UserList class to keep statistics about how frequently items are requested from the list. The important points of Listing 7.19 are the internal data and the methods defined. The ListStats class in Listing 7.19 chooses to use the PyList object as internal data and passes that object to the UserList constructor. Not all methods require implementing to fully act like a built-in list because UserList handles everything not explicitly defined in the ListStats class of Listing 7.19. If it did not pass a PyList to the UserList superclass, much more work would need to be done to fully act like a list.

Listing 7.19 UserList and List-Like Objects
# file: liststats.py
import UserList

class ListStats(UserList.UserList):
    def __init__(self, data=None):
        if data is None:
            data = [] # avoid sharing one default list between instances
        assert type(data)==type([]), "Constructor arg must be a list"
        # initialize the superclass, which stores the list in self.data
        UserList.UserList.__init__(self, data)
        self.stats = {}
        self.requestCount = 0

    def __getitem__(self, index):
        result = self.data[index]
        items = result
        if type(items) != type([]):
            # make plain integers into a list for convenience
            items = [items]
        for x in items:
            self.requestCount += 1
            self.stats[x] = self.stats.setdefault(x, 0) + 1
        return result # return what a built-in list would return

    def printStats(self):
        for x in self.data:
            use = self.stats.setdefault(x, 0)
            if not use: continue
            print ("%i, %1.3f%% " %
                   (x, float(use)/float(self.requestCount)*100)),
        print # to put prompt on a new line

if __name__ == '__main__':
    import random
    L = ListStats(range(10))
    for x in range(2000):
        L[random.randint(0, 9)] # access a random index
    L.printStats()


The output from running jython liststats.py is:

0, 9.100% 1, 9.600% 2, 11.000% 3, 10.900% 4, 11.350% 5, 8.750% 6, 10.850% 7, 9.500% 8, 9.550% 9, 9.400% 


Emulating Mappings

Emulating a mapping type is extremely similar to emulating a list type, except you work with keys instead of indexes. A built-in mapping object implements the methods clear, copy, get, has_key, items, keys, setdefault, update, and values, so truly emulating a mapping type requires implementing these methods. The special methods that a mapping object should implement are __len__, __getitem__, __setitem__, and __delitem__. These should look familiar from the section on lists.

Because the implementation of these special methods is so familiar from emulating lists, the following code listing should be sufficient to demonstrate these special mapping methods. Listing 7.20 borrows ideas from Listing 7.14 in that it also implements data elements as files, but adds a bit of a twist. The dictionary represents a directory, the keys represent files, and the values represent file contents. Listing 7.20 also allows for numbers and data objects to be stored by using Jython's pickle module. Pickling is one of Jython's serialization mechanisms. The two pickle methods employed are dumps(), which converts from object to string, and loads(), which converts from string to object. To serialize the list in Listing 7.20, we use the following:

string = pickle.dumps(list) 


To restore the list, we use this:

pickle.loads(string) 
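A minimal round trip, assuming only the standard pickle module, might look like the following. The variable names are illustrative; on current Python versions dumps() returns a bytes object rather than a text string, but the pairing with loads() is the same.

```python
import pickle

original = {"primes": [2, 3, 5, 7], "ratio": 0.5}

serialized = pickle.dumps(original)   # object -> serialized string
restored = pickle.loads(serialized)   # serialized string -> new object
```

The restored object compares equal to the original but is a separate copy, which is what allows the serialized form to be written to a file and read back later.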


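There is another trick used for safety's sake in Listing 7.20. A JythonIDfile is added to each directory created for this mapping. There are no checks to guarantee this class created a certain directory, or any file within it, so it must identify directories as special somehow (lest someone try this example with /etc or C:\windows\system). Some of the library functions used have been introduced before, but for clarity, here is a list of the os module functions that appear in the listing and what they do:

os.path.exists(path) returns true if path names an existing file or directory.
os.path.isdir(path) returns true if path is an existing directory.
os.path.isfile(path) returns true if path is an existing regular file.
os.path.join(a, b) joins path components using the platform's separator.
os.mkdir(path) creates the directory path.
os.listdir(path) returns a list of the names of the entries in the directory path.
os.stat(path) returns a tuple of file status values; indexing it with stat.ST_SIZE yields the file size in bytes.
os.remove(path) deletes the file path.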

Listing 7.20 A Persistent Dictionary
# file: specialmap.py
import types
import os
import pickle
from stat import ST_SIZE

class mappingDirectory:
    def __init__(self, directory):
        self.__ID = None
        self.__dir = directory
        if not os.path.exists(directory):
            os.mkdir(directory)
            # mark the new directory as belonging to this mapping
            idfile = os.path.join(directory, "JythonIDfile")
            f = open(idfile, "wb")
            print >> f, str(id(self))
            f.close()
        elif not os.path.isdir(directory):
            raise ValueError, "File %s already exists." % directory
        elif not os.path.isfile(os.path.join(directory, "JythonIDfile")):
            msg = "Directory exists, but it isn't a mapping directory."
            raise ValueError, msg

    def __repr__(self):
        listing = os.listdir(self.__dir)
        results = {}
        for x in listing:
            if x == "JythonIDfile": continue
            size = os.stat(os.path.join(self.__dir, x))[ST_SIZE]
            results[x] = "<datafile: size=%i>" % size
        return str(results)

    def __setitem__(self, key, value):
        self.__testKey(key)
        pathandname = os.path.join(self.__dir, key)
        f = open(pathandname, "w+b")
        print >> f, pickle.dumps(value)
        f.close()

    def __getitem__(self, key):
        self.__testKey(key)
        pathandname = os.path.join(self.__dir, key)
        try:
            f = open(pathandname, "rb")
        except IOError:
            raise KeyError, key
        value = f.read()
        f.close()
        return pickle.loads(value)

    def __delitem__(self, key):
        self.__testKey(key)
        pathandname = os.path.join(self.__dir, key)
        if not os.path.isfile(pathandname):
            raise KeyError, key
        os.remove(pathandname)

    def __testKey(self, key):
        if not isinstance(key, types.StringType):
            raise KeyError, "This mapping restricts keys to strings"
        if key == "JythonIDfile":
            raise KeyError, "The name JythonIDfile is reserved."

if __name__ == '__main__':
    md = mappingDirectory("c:\\windows\\desktop\\jythontestdir")
    md["odd"] = filter(lambda x: x%2, range(10000))
    md["even"] = filter(lambda x: not x%2, range(10000))
    md["prime"] = [2, 3, 5, 7, 11, 13, 17]
    print "Mapping =", md
    print "primes =", md["prime"]
    del md["prime"]
    print "primes deleted"
    print "Mapping =", md


Output from running jython specialmap.py is:

Mapping = {'prime': '<datafile: size=38>', 'odd': '<datafile: size=34452>', 'even': '<datafile: size=34452>'}
primes = [2, 3, 5, 7, 11, 13, 17]
primes deleted
Mapping = {'odd': '<datafile: size=34452>', 'even': '<datafile: size=34452>'}


Emulating Numeric Types

Emulating a numeric type requires defining the special methods for each numeric operation the object should support. The special methods associated with numeric operations are those that implement unary and binary operators, conversion to other types, and coercion. The majority of the special methods are for the binary operators, and these methods appear in triples. For example, implementing addition involves defining the __add__ method for when the object is on the left side of the addition operator, the __radd__ method for when the object is on the right side of the operator, and the __iadd__ method for augmented assignment (+=). These methods are sometimes called, respectively, the normal, reflected, and augmented methods. The augmented assignment methods (__i*__) are unique in that they are expected to modify self in place and then return self, rather than return a new object. However, if a numeric object does not define the augmented method, it can still be used in an augmented assignment. If the object N defines __add__, but not __iadd__, the expression N += N executes the following:

N = N.__add__(N) 
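This fallback can be observed with a small sketch. The Meters class below is hypothetical; it deliberately defines __add__ and omits __iadd__, so the augmented assignment rebinds the name to a new object instead of modifying the old one.

```python
class Meters:
    """Hypothetical numeric wrapper that defines __add__ but not __iadd__."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Return a new object; self is left untouched.
        return Meters(self.value + other.value)


n = Meters(3)
before = n        # keep a second reference to the original object
n += n            # no __iadd__, so this runs n = n.__add__(n)
```

Afterward n refers to a new Meters object holding 6, while before still refers to the original object holding 3. Had the class defined __iadd__ to modify and return self, both names would refer to the same changed object.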


Table 7.3 lists numeric operations and their associated methods. On the left side of Table 7.3 is the operation, with N representing the user-defined, numeric object. The right side of Table 7.3 is the method signature of the associated special method. If a method should return a specific type of object, the return type is noted by --> type. For example, the operation N + 2 translates into N.__add__(2), and the method signature is __add__(self, other). Something that plays an important role in the numeric operations is __coerce__. The __coerce__ method is called whenever the two operands are of differing types. The operation N1 + N2, where N1 and N2 are different types, actually calls N1.__coerce__(N2). The __coerce__ method returns a tuple of N1 and N2 converted to a common type (call them T1 and T2). Then T1.__add__(T2) is called. If the left operand does not have the __coerce__ method, the right operand's __coerce__ method is called.

Table 7.3. Numeric Binary Operators and Their Special Methods

Operators        Methods

N + 2            __add__(self, other)
2 + N            __radd__(self, other)
N += 2           __iadd__(self, other) --> self
N - 2            __sub__(self, other)
2 - N            __rsub__(self, other)
N -= 2           __isub__(self, other) --> self
N * 2            __mul__(self, other)
2 * N            __rmul__(self, other)
N *= 2           __imul__(self, other) --> self
N / 2            __div__(self, other)
2 / N            __rdiv__(self, other)
N /= 2           __idiv__(self, other) --> self
N % 2            __mod__(self, other)
2 % N            __rmod__(self, other)
N %= 2           __imod__(self, other) --> self
divmod(N, 2)     __divmod__(self, other)
divmod(2, N)     __rdivmod__(self, other)
N ** 2           __pow__(self, other)
2 ** N           __rpow__(self, other)
N **= 2          __ipow__(self, other) --> self
pow(N, 2, 2)     __pow__(self, other, mod=1)
pow(2, N, 2)     __rpow__(self, other, mod=1)
N << 2           __lshift__(self, other)
2 << N           __rlshift__(self, other)
N <<= 2          __ilshift__(self, other) --> self
N >> 2           __rshift__(self, other)
2 >> N           __rrshift__(self, other)
N >>= 2          __irshift__(self, other) --> self
N & 2            __and__(self, other)
2 & N            __rand__(self, other)
N &= 2           __iand__(self, other) --> self
N | 2            __or__(self, other)
2 | N            __ror__(self, other)
N |= 2           __ior__(self, other) --> self
N ^ 2            __xor__(self, other)
2 ^ N            __rxor__(self, other)
N ^= 2           __ixor__(self, other) --> self
- N              __neg__(self)
+ N              __pos__(self)
~ N              __invert__(self)
abs(N)           __abs__(self)
coerce(N, x)     __coerce__(self, other) --> some common type or None
complex(N)       __complex__(self) --> PyComplex
float(N)         __float__(self) --> PyFloat
hex(N)           __hex__(self) --> PyString
int(N)           __int__(self) --> PyInteger
long(N)          __long__(self) --> PyLong
oct(N)           __oct__(self) --> PyString
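To tie a few rows of the table together, here is a small sketch of a numeric class that implements the normal, reflected, and augmented forms of addition, two unary operators, and two conversions. The Tally class and its _unwrap helper are hypothetical names chosen for this illustration. The sketch sticks to methods that behave the same way on current Python; it leaves out __div__, __coerce__, and __long__, which are specific to the Python 2 line that Jython follows.

```python
class Tally:
    """Hypothetical integer-like object illustrating rows of Table 7.3."""

    def __init__(self, value=0):
        self.value = int(value)

    def _unwrap(self, other):
        # Accept another Tally or a plain integer; reject everything else.
        if isinstance(other, Tally):
            return other.value
        if isinstance(other, int):
            return other
        raise TypeError("unsupported operand: %r" % (other,))

    def __add__(self, other):         # N + 2
        return Tally(self.value + self._unwrap(other))

    def __radd__(self, other):        # 2 + N
        return Tally(self._unwrap(other) + self.value)

    def __iadd__(self, other):        # N += 2 --> self
        self.value += self._unwrap(other)
        return self

    def __neg__(self):                # - N
        return Tally(-self.value)

    def __abs__(self):                # abs(N)
        return Tally(abs(self.value))

    def __int__(self):                # int(N)
        return self.value

    def __float__(self):              # float(N)
        return float(self.value)

    def __repr__(self):
        return "Tally(%d)" % self.value


n = Tally(5)
total = n + 2          # calls n.__add__(2)
reflected = 2 + n      # the integer cannot add a Tally, so n.__radd__(2) runs
same = n
n += 10                # calls n.__iadd__(10), which returns self
```

Because __iadd__ returns self, the names n and same still refer to one object after the augmented assignment, in contrast to the __add__ fallback described earlier.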

 
