
User-Defined Functions and Variable Scoping

Defining Functions
Namespaces
Recursion
Built-In Functional Programming Tools
Synchronization

Functions are callable objects that can exist outside of a class. Functions also appear in class definitions, where they are made into methods by object attribute lookup. This chapter, however, focuses on function definition outside of classes. Java does not allow such classless entities; Jython does. In Jython, functions are first-class objects, meaning the language allows for their dynamic creation and the ability to pass them around as parameters or return values. Jython functions are instances of the org.python.core.PyFunction class (PyReflectedFunction for built-in functions). Inspecting a Jython function with the built-in function type() reports that it is an instance of the org.python.core.PyFunction class:

>>> def myFunction():
...     pass
...
>>> type(myFunction)
<jclass org.python.core.PyFunction at 6879429>
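To make "first-class" concrete, here is a minimal sketch in which functions are passed as arguments, returned from another function, and stored in a dictionary. The names exclaim, question, apply_twice, and choose are hypothetical and chosen only for illustration. The code is written so that it also runs on a present-day Python interpreter, which is why print appears in call form.

```python
def exclaim(text):
    return text + "!"

def question(text):
    return text + "?"

def apply_twice(func, value):
    """Receive a function as a parameter and call it two times."""
    return func(func(value))

def choose(kind):
    """Return one of the functions above as a value."""
    if kind == "ask":
        return question
    return exclaim

# Functions can also be stored in ordinary data structures.
table = {"loud": exclaim, "ask": question}

print(apply_twice(exclaim, "hi"))
print(choose("ask")("ready"))
print(table["loud"]("go"))
```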

This chapter describes how to define a function: its syntax, documentation, and parameters. Understanding Jython's namespaces is essential to writing functions, so that topic also warrants inclusion here. The latter portion of this chapter addresses the tools available within Jython that are normally associated with functional programming. The detour through functional coding is included to balance the arduous language description with some clues as to what the language can do. Within this chapter's scope, that means first-class functions and efficient functional programming.

Note from Author

This chapter would have been more profound before the addition of inner classes to Java 1.1, which serves the first-class function need, but the conciseness and efficiency of functional coding in Jython is still intriguing. Note that the reference to functional coding does not mean examples throughout the chapter exemplify the functional coding style found in the strictly functional languages Erlang, Haskell, and Clean (as opposed to the imperative style of Ada, C/C++, Java, and Pascal). Examples are very much imperative through most of the chapter, while functional coding examples occur at the end and otherwise where noted.

Defining Functions

The syntax for a Jython function is as follows:

"def" function_name([parameter_list])":" code block

The function name, like all identifiers, can be any alphanumeric string without spaces as long as the first character is an alphabetic character or the underscore _. Names are case sensitive, and the use of the underscore has special meaning, which the chapter "Modules and Packages" explains in detail. The only caution associated with function names is to be wary of accidentally rebinding a name that already refers to another function, such as one of the built-in functions. If you were to define a function named type as demonstrated in Listing 4.1, it would make the built-in type function inaccessible.

Listing 4.1 Defining a Function that Replaces the Built-In type()

>>> S = "A test string"
>>> type(S) # test the built-in type()
<jclass org.python.core.PyString at 6879429>
>>> def type(arg):
...     print "Not the built-in type() function"
...     print "Local variables: ", vars() # look for "arg"
...
>>> type(S)
Not the built-in type() function
Local variables: {'arg': 'A test string'}
>>>
>>> # to restore the builtin "type()" function, do the following:
>>> from org.python.core import __builtin__
>>> type = __builtin__.type

The parameter list is an optional list of names that occur in parentheses following the function name. Variables passed to a function are bound to the supplied parameter names within the local namespace. The function defined in Listing 4.1 accepts one parameter, which is bound to the local name arg. A colon, :, ends the function declaration and designates the beginning of the associated code block.

Indentation

Indentation delimits blocks of code in Jython. A function's associated block of code must start one indention level in from its def statement. Because indention is the only delimiter for code blocks, you cannot have an empty code block; there must be some kind of placeholder within a function. Use the pass statement if you need a placeholder.

Listing 4.2 has functions that demonstrate Jython's relative indention levels. It determines which numbers in a list supplied as a parameter are prime. The test for primality is grossly inefficient, but it is a good demonstration of a function's syntax and structure. The isPrime function simply returns 1, meaning true, for those numbers that are prime, and 0 otherwise.

Because generating primes occurs throughout this chapter, elements of a more efficient prime search should be discussed. Above the number 2, all primes are odd. All non-prime numbers can be created by multiplying prime numbers, so only prime divisors need to be tested. Knowing these things makes it easy to disqualify many candidates and reduce loop iterations when testing for prime. Listing 4.2 is inefficient because it ignores the latter principle and only restricts the divisor search to odd numbers. Other prime number generators in this chapter will contain similar weaknesses due to the preference for simple, clear examples of Jython principles rather than algorithms.

Listing 4.2 Finding Primes with Nested Functions

# file: primes.py
def isPrime(num):
    """Accepts a number and returns true if it's a prime"""
    for base in [2] + range(3, num, 2):
        if (num%base==0):
            return 0
    return 1

def primes(S1):
    """Accepts a list of numbers and returns a list of
    those numbers that are prime numbers"""
    primes = []
    for i in S1:
        if isPrime(i):
            primes.append(i)
    return primes

list1 = range(20000, 20100)
print "primes are:", primes(list1)

Results from running jython primes.py:

primes are: [20011, 20021, 20023, 20029, 20047, 20051, 20063, 20071, 20089]

Because the isPrime function in Listing 4.2 returns values that are Jython's understanding of true and false (for numeric objects, 0=false, non-zero=true), the function can be the conditional portion of an if statement. The if statement in this case discards values for which the isPrime function returns 0.
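The efficiency ideas mentioned earlier (skip even numbers, stop at the first divisor found) can be sketched as follows. This is one possible tightening rather than the chapter's own code: the name is_prime_fast is hypothetical, and the additional cutoff at the square root of the candidate is an assumption beyond what the text states. It keeps the 1/0 return convention of isPrime.

```python
def is_prime_fast(num):
    """Return 1 if num is prime, 0 otherwise (same convention as isPrime)."""
    if num < 2:
        return 0
    if num == 2:
        return 1
    if num % 2 == 0:
        return 0              # every other even number is composite
    base = 3
    # A composite number always has a divisor no larger than its square root.
    while base * base <= num:
        if num % base == 0:
            return 0          # stop at the first contradiction of prime
        base += 2             # test odd divisors only
    return 1

print([n for n in range(20000, 20100) if is_prime_fast(n)])
```

For the range used in Listing 4.2 this selects the same numbers while testing far fewer divisors per candidate.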

Return Values

Listing 4.2 uses the return statement to designate the function's return value. A function returns the result of the expression following the return statement. If no return is supplied, the function returns None. If return occurs with no value following it, None is returned. If more than one value is returned, the values are returned as a tuple. It is worth mentioning that a function exits when a return statement executes. The call-return style of coding assumes that a return statement does just that: it returns the value and program flow to the calling statement.

Listing 4.3 defines a function that determines the relationship of a point to a circle with specified center and radius. The bounds function in this listing explicitly returns 1, 0, or -1 depending on whether the point is inside, on, or outside the circle. The final return statement in Listing 4.3 is reached only when the first if statement evaluates to false. Because of the absence of an expression following that return, it returns the value None. This final return statement could have been left out because without an explicit return, a function returns None anyway.

Listing 4.3 A Circle Inclusion Function

# file: bounds.py
def bounds(p, c, r):
    """Determine the relationship of point "p" to a circle of designated
    center "c" and radius "r". 1 = within circle, 0 = on circle,
    -1 = outside circle"""
    ## Ensure args are not empty nor 0
    if p and c and r:
        ## Find distance from center "dfc"
        dfc = ((p[0]-c[0])**2 + (p[1]-c[1])**2)**.5
        return cmp(r, dfc)
    return  ## reached only when the test above is false; returns None

print bounds((1,2), (3,1), 3)
print bounds((-3, 1), (1, 1), 4)
print bounds((4,3), (5,-2), 1)
print bounds((), (2,2), 4)

Output from running jython bounds.py:

1
0
-1
None
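The return rules described above can be checked with a small sketch. The function names min_max, strip_only, and first_negative are hypothetical and exist only for this illustration.

```python
def min_max(values):
    """Two values after return: the caller receives a single tuple."""
    return min(values), max(values)

def strip_only(message):
    """No return statement at all, so a call evaluates to None."""
    message.strip()

def first_negative(values):
    for v in values:
        if v < 0:
            return v    # the function exits here; the loop does not continue
    return              # bare return, same effect as returning None

low, high = min_max([3, 1, 2])   # the tuple can be unpacked by the caller
print(low)
print(high)
print(strip_only(" text "))
print(first_negative([4, -2, -9]))
```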

Documentation Strings

Listing 4.2 has a documentation string for each of the two functions. Each string begins on the first line of the code block, so it must be one indention level in from the def statement. The content of this string becomes the value of the function's __doc__ attribute (two prefixing and trailing underscores). If we look at a simple function in the interactive interpreter, as is done in Listing 4.4, we see that exploring objects interactively is much more valuable when documentation strings are available.

Listing 4.4 Function Documentation Strings

>>> def getfile(fn):
...     """getfile(fn), Accepts a filename, fn, as a parameter and quietly returns the resulting file object, or None if the file doesn't exist."""
...     try:
...         f = open(fn)
...         return f
...     except:
...         return None
...
>>> print getfile.__doc__
getfile(fn), Accepts a filename, fn, as a parameter and quietly returns the resulting file object, or None if the file doesn't exist.

Numerous modules are thoroughly documented in their __doc__ strings, and it's worth looking for an associated __doc__ string first when encountering troubles with a function, module, or class.

Function Attributes

As of Jython version 2.1, functions can also have arbitrary attributes. These are attributes assigned with function.attribute syntax, and this assignment can be within the function block, or outside of the function. Here's an example of a function with attributes:

>>> def func():
...     func.a = 1
...     func.b = 10
...     print func.a, func.b, func.c
...
>>> func.c = 100
>>> func()
1 10 100
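One plausible use for function attributes is keeping a small piece of state with the function itself. The sketch below counts calls; the function name tracked and the attribute name calls are hypothetical.

```python
def tracked():
    """Count how many times this function has been called."""
    tracked.calls = tracked.calls + 1   # assignment within the function block
    return tracked.calls

tracked.calls = 0   # assignment outside the function, before the first call

tracked()
tracked()
print(tracked.calls)
```

The attribute must exist before the first call, because the function body reads it before assigning to it.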

Parameters

Jython's parameter scheme is very flexible. In Jython, a function's parameters can be an ordered list, the parameters can have default values, and you can even allow for unknown numbers of arguments. Additionally, you can use key-value pairs when calling functions and even allow for unknown numbers of these key-value pairs.

Positional Parameters

Function examples so far have used the simplest form of parameters: a list of names, or positional parameters. In this case, arguments passed to the function are bound to the local names designated in the parameter list in the order they occur. Examining this in the interpreter with the vars() function clarifies this. The vars() function returns a dictionary object, and dictionary objects do not preserve order. This means that the results you see may be unordered but remain correct as long as all the keys and values appear:

>>> def myFunction(param1, param2, param3):
...     print vars()
...
>>> myFunction("a", "b", "c")
{'param1': 'a', 'param2': 'b', 'param3': 'c'}

Default Values

You can assign default values to parameters. To do so, simply assign a value to the parameter in the function definition. If we define a function with three parameters, and two have default values, then the function requires only one argument when called but can be called with one, two, or all three supplied. When using default values, a parameter without a default value may not follow a parameter with a default value. Here is an example of a function with one required parameter followed by two parameters that have default values. This creates flexibility in calling the function:

>>> def myFunction(param1, param2="b", param3="c"):
...     print vars()
...
>>> myFunction("a") # call with the 1, required parameter
{'param1': 'a', 'param2': 'b', 'param3': 'c'}
>>> myFunction("a", "j") # call with 2 parameters
{'param1': 'a', 'param2': 'j', 'param3': 'c'}
>>> myFunction("a", "j", "k") # call with 3 parameters
{'param1': 'a', 'param2': 'j', 'param3': 'k'}

Here is another example, but the function has a non-default parameter following a default parameter. This is not allowed, as evidenced by the SyntaxError exception:

>>> # let's test a non-default arg following a default arg
>>> def myFunction(param1, param2="b", param3):
...     print vars()
...
Traceback (innermost last):
  (no code object) at line 0
  File "<console>", line 0
SyntaxError: non-default argument follows default argument

*params and Handling an Unknown Number of Positional Parameters

If you need to allow for an unknown number of arguments, you can prefix a parameter name with an asterisk *. The asterisk designates a wild card that holds all extra arguments as a tuple. This parameter should appear after the other positional parameters in the function definition. The asterisk is the essential syntax; the name used after it is arbitrary:

>>> def myFunction(param1, param2, *params):
...     print vars()
...
>>> myFunction('a','b','c','d','e','f','g')
{'param1': 'a', 'param2': 'b', 'params': ('c', 'd', 'e', 'f', 'g')}
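As a small practical illustration of the asterisk parameter, the following sketch sums one required argument and any number of extras. The name total and the parameter names first and rest are hypothetical.

```python
def total(first, *rest):
    """Sum one required argument plus any number of extra arguments."""
    result = first
    for value in rest:      # rest is always a tuple; it is empty if no extras
        result = result + value
    return result

print(total(1))
print(total(1, 2, 3))
```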

Keyword Pairs as Parameters

When calling a function, you can also use key-value pairs to designate parameters, also called keyword parameters. The key must be the actual parameter name used in the function's definition. When using only key-value pairs, the order is no longer significant. When mixing key-value pairs with plain, positional arguments, the positional arguments must come first and are bound in order; the key-value pairs that follow the right-most positional argument may appear in any order:

>>> def myFunction(param1, param2, param3="d"):
...     print vars()
...
>>> myFunction(param2="b", param3="c", param1="a")
{'param1': 'a', 'param2': 'b', 'param3': 'c'}
>>> # next let's mix key-value pairs with positional parameters
>>> myFunction("a", param3="c", param2="b")
{'param1': 'a', 'param2': 'b', 'param3': 'c'}

**kw params and Handling Unknown Key-Value Pairs

Just as the asterisk handles an unknown number of plain arguments, the double-asterisk, **, handles unknown numbers of key-value pairs. The name you choose to prefix with the double-asterisk becomes a PyDictionary type containing the unused key-value pairs.

>>> def myFunction(param1, param2, param3, **kw):
...     print vars()
...
>>> myFunction("a", "b", "c", param4="d", param5="e")
{'param1': 'a', 'param2': 'b', 'param3': 'c', 'kw': {'param5': 'e', 'param4': 'd'}}

Note that all types of parameters can occur in the same function definition:

>>> def myFunction(param1, param2="b", param3="c", *params, **kw):
...     print vars()
...
>>> myFunction("a", "b", "j", "k", "z", key1="t", key2="r")
{'param1': 'a', 'params': ('k', 'z'), 'param3': 'j', 'kw': {'key2': 'r', 'key1': 't'}, 'param2': 'b'}
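The binding rules for the different parameter kinds can be checked side by side with a sketch that returns its bindings instead of printing them. The function name bind is hypothetical, and the explicit dictionary stands in for the vars() call used in the sessions above.

```python
def bind(param1, param2="b", *params, **kw):
    """Report how the supplied arguments were bound to parameter names."""
    return {"param1": param1, "param2": param2, "params": params, "kw": kw}

# Only the required argument: the default and the empty collectors apply.
print(bind("a"))

# Extra positional arguments spill into params, unknown keywords into kw.
print(bind("a", "j", "k", "z", key1="t"))

# A keyword naming a declared parameter binds that parameter, not kw.
print(bind(param2="x", param1="a"))
```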

Namespaces

Jython has static and statically nested scoping. The bridge between these two occurs in Jython 2.1. Jython version 2.1 and version 2.0 have static scoping; however, you can optionally use static nested scopes, also called lexical scoping, in version 2.1. Future versions of Jython will use the statically nested scoping (lexical scoping) exclusively. The reason Jython 2.1 bridges both types of scoping rules is that Python developers needed a safe way to introduce new scoping rules while providing time to amend adversely affected legacy code.

Two Static Scopes

Jython's static scoping entails two specific namespaces: globals and locals (three if you include the built-in namespace). When a name binding action occurs, the name appears in either a global or local namespace. Actions that result in a name binding are assignment operations, an import statement, the definition of a function or a class, and assignments derived from statement behavior (such as for x in list). The location of a name binding action determines which namespace will contain that name. For functions, anything bound within the function's code block and all of the function's parameters are local with but one exception—the explicit use of the global statement within the code block. Listing 4.5 contains a function that makes a few assignments and prints its local namespace for confirmation. Listing 4.5 also prints global names to show how global variables are unique and separate from the local variables in function().

Listing 4.5 Global and Local Namespaces

>>> a = "This is the global variable a"
>>> def function(localvar1, localvar2):
...     a = 1
...     b = 2
...     print locals()
...
>>> function(3, 4)
{'b': 2, 'localvar1': 3, 'localvar2': 4, 'a': 1}
>>>
>>> globals()
{'a': 'This is the global variable a', '__doc__': None, 'function': <function function at 5135994>, '__name__': '__main__'}

Because Listing 4.5 uses a local name for the variable a within function(), the global variable a is unaffected. A function can use variables from the global namespace, however. This happens if the name binding action occurs outside of the function, or if the global statement is used. If a name occurs within a function, but the name binding occurred in the global namespace, the name lookup continues beyond the locals and into the global namespace. Here is an example of this use of a global name:

>>> var1 = [100, 200, 300] # This is global
>>> def function():
...     print var1
...     var1[1] += 50
...
>>> function()
[100, 200, 300]
>>> function()
[100, 250, 300]
>>> print var1 #Look at the global var1
[100, 300, 300]

If you try to use a global variable within a function that later binds to the same name, it is an error. The reason is that the name binding designates a variable as local for the entire code block, no matter where that name binding sequentially occurs in the block:

>>> var = 100 # This is global
>>> def function():
...     print var
...     var = 10 # this assignment makes ALL occurrences of var local
...
>>> function()
Traceback (innermost last):
  File "<console>", line 1, in ?
  File "<console>", line 2, in function
UnboundLocalError: local: 'var'

The second way to use a global variable within a function is to declare the variable global with the statement global. Explicitly declaring intentions with the global statement looks like this:

>>> var = 100 # This is global
>>> def function():
...     global var
...     print var
...     var = 10
...
>>> print var # Peek at global var
100
>>> function()
100
>>> print var # Confirm function acted on global var
10
>>>

Statically Nested Scopes

Why change to statically nested scopes? After all, what is wrong with the static scoping that Jython, and Python, have historically used? The problem manifests itself with nested functions and the lambda form (lambda forms are discussed later in this chapter). These two are non-intuitive within the two-namespace, static scoping model. Defining simple nested functions in the interactive console sheds light on this:

>>> a = "BacBb"
>>>
>>> def decode():
...     b = "h"
...     def inner_decode():
...         print a.replace("Bb", b)
...     inner_decode()
...
<console>:1: SyntaxWarning: local name 'b' in 'decode' shadows use as global in nested scopes
>>> decode()
Traceback (innermost last):
  File "<console>", line 1, in ?
  File "<console>", line 5, in decode
  File "<console>", line 4, in inner_decode
NameError: b

Jython version 2.1 is kind enough to give you the console warning about the invisibility of variable b in the inner function. With only local and global namespaces, inner_decode() cannot see variables in its containing function. Overcoming this usually means supplying the inner function with parameters. However, beginning with Jython version 2.1, you can also choose to use statically nested scopes to overcome this. Using statically nested scopes in Jython 2.1 requires the use of the recently added __future__. Python, and thus Jython, have recently added __future__ as a means of migrating to functionality that may be standard in subsequent releases. To import nested scopes from __future__, use the following:

>>> from __future__ import nested_scopes

Currently, nested_scopes is the only futuristic behavior that users can import. Once it is imported, the interpreter obeys the rules of statically nested scopes. If we revisit the decode example with the futuristic nested_scopes imported, we receive a different result:

>>> from __future__ import nested_scopes
>>>
>>> a = "BacBb"
>>>
>>> def decode():
...     b = "h"
...     def inner_decode():
...         print a.replace("Bb", b)
...     inner_decode()
...
>>> decode()
Bach

With lexical scoping, an inner function can hold some additional, non-global information. Suppose you wanted to hide a built-in function under a different name while you used its old name. You could use an inner function as somewhat of a proxy, like that in Listing 4.6.

Listing 4.6 Using nested-scopes to Create a Proxy

# file: typeproxy.py
from __future__ import nested_scopes

def typeProxy():
    _type = type
    def oldtype(object):
        return _type(object)
    return oldtype

f = typeProxy()
type = 10 # rebind name "type"
print f(4)

Results from running jython typeproxy.py:

org.python.core.PyInteger

Because lexical scoping allows nested functions to use variables from their containing function, nested functions can carry non-global data around as baggage. This creates an object of combined data and procedure, much like a class. This is similar to what functional languages call closures.

Listing 4.7 uses nested functions, and the closure-like behavior of the returned inner function, to generate a stream of Fibonacci numbers. The xfib function accepts one argument: the iteration the sequence should begin with. Two assert statements provide some runtime parameter checking to add some robustness, and after the inner function fibgen is defined, it is used to set the data to the designated iteration before it is returned. The ability to return this function is part of the first-class object requirement and is also an important ingredient in functional programming. The script uses the returned function to print the first 20 numbers of the Fibonacci sequence.

Listing 4.7 Nested Functions with Lexical Scoping

# file: fibonacci.py
from __future__ import nested_scopes

def xfib(n=1):
    """Returns a function that generates a stream of fibonacci numbers
    Accepts 1 arg- an integer representing the iteration number of
    the sequence that the stream is to start with."""
    assert n > 0, "The start number must be greater than zero"
    from types import IntType, LongType
    assert type(n)==IntType or type(n)==LongType, \
        "Argument must be an integral number."

    # make cache for penultimate, current and next fib numbers
    fibs = [0L, 1L, 0L]

    # Define the inner function
    def fibgen():
        fibs[2] = fibs[1] + fibs[0]
        fibs[0] = fibs[1]
        fibs[1] = fibs[2]
        return fibs[1]

    # Set state to the iteration designated in parameter n
    for x in range(1, n):
        fibgen()

    # return the primed inner function
    return fibgen

fibIterator = xfib()
for x in range(20):
    print fibIterator(),
print # Print an empty newline for tidyness after execution

Results from running jython fibonacci.py:

1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946

Passing the fibIterator object in Listing 4.7 to other functions or classes is also a likely scenario. The fact that it is the combination of data and procedure makes it useful in many situations much like class instances are useful for their combined methods and data.
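The same data-plus-procedure pattern works for state other than Fibonacci numbers. The sketch below is a running total that uses the same one-list-as-state technique as fibs in Listing 4.7. The names make_accumulator, state, and add are hypothetical. On Jython 2.1 the script would additionally need the nested_scopes import shown earlier; the sketch omits it on the assumption that it runs on an interpreter where lexical scoping is already the default.

```python
def make_accumulator(start=0):
    """Return a function that adds to a running total it carries with it."""
    state = [start]     # a list, so the inner function mutates it in place

    def add(amount):
        state[0] = state[0] + amount
        return state[0]

    return add

acc = make_accumulator(10)
other = make_accumulator()   # a second, independent piece of baggage

print(acc(5))
print(acc(5))
print(other(1))
```

Each call to make_accumulator produces its own state list, so the two returned functions do not interfere with each other, much as two instances of a class would not.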

Special Variables in User-Defined Functions

"Special" variables in Jython are usually those with two preceding and trailing underscores, such as __doc__. However, Jython functions have function-specific special variables that are prefixed with func_. One such special function variable is func_doc, which is a pseudonym for __doc__. As you explore functions with dir() or vars() you will encounter these variables, so they are listed here for reference even though examples within this chapter do not directly use them. Special function variables are as follows:

func_doc is the same as __doc__.
func_name (also __name__) is the name of the function.
func_defaults lists default parameter values.
func_code contains the compiled version of the function.
func_globals is the function's global namespace.
func_dict (also __dict__) is the function's attribute namespace, which holds any arbitrary function attributes.
func_closure is a tuple of bindings for the variables used from a containing function in nested scopes. It is always None if not using nested scopes.
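A quick way to explore these variables is to inspect a small function. The sketch below sticks to the double-underscore spellings from the list above (__name__, __doc__, __dict__), on the assumption that those are the forms most likely to be available across interpreter versions; the func_ spellings are the ones this chapter's Jython release reports. The function area and its units attribute are hypothetical.

```python
def area(width, height=2):
    """Return width * height."""
    return width * height

area.units = "cm"        # an arbitrary function attribute

print(area.__name__)     # the function's name
print(area.__doc__)      # the documentation string
print(area.__dict__)     # where the arbitrary attribute is stored
```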

Recursion

Recursion occurs when a procedure calls itself. Recursion is a powerful control structure that is similar to other loop constructs and is an essential ingredient in functional programming. Recursion occurs with repeating math operations like those found in sequences and graphs. Another common example of recursion is evaluating a factorial. A simplified description of a factorial is the product of all integers between a designated number and 1. This means 5 factorial (written as 5!) is 5 * 4 * 3 * 2 * 1 (if details of the gamma function are ignored). Implementing a recursive function to do this calculation is shown in Listing 4.8.

Listing 4.8 Implementing a Recursive Function

>>> def factorial(number):
...     number = long(number)
...     if number == 0:
...         return 1 # It just is by mathematician's decree
...     return number * factorial(long(number - 1))
...
>>> print factorial(4)
24
>>> print factorial(7)
5040

Recursion has its limits. A function can call itself only so many times before a StackOverflowError occurs. If we extend our interactive example from above, we can try larger and larger numbers to see what this limit is. Note that the actual number will depend on your available memory.

>>> bignum = factorial(1000)
>>> biggernum = factorial(1500)
>>> hugenum = factorial(1452) # The breaking point on my machine
Traceback (innermost last):
...
java.lang.StackOverflowError: java.lang.StackOverflowError
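When the recursion depth is the limiting factor, the same calculation can be expressed as a loop, which does not add a stack frame per step. This is a sketch of that alternative, not a replacement for Listing 4.8; the name factorial_iter is hypothetical.

```python
def factorial_iter(number):
    """Multiply 1 * 2 * ... * number in a loop instead of by recursion."""
    result = 1
    for factor in range(2, number + 1):
        result = result * factor
    return result

print(factorial_iter(4))
print(factorial_iter(7))

# A depth that would be risky for the recursive version is no problem here.
big = factorial_iter(3000)
```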

Built-In Functional Programming Tools

Examples up to this point have mostly been imperative in style. However, many important ingredients of functional coding exist in Jython. Those already noted are first-class functions, recursion, and closures. Additional tools in Jython that aid in functional coding are lambda forms, which provide anonymous functions, built-in functions for list processing, the list generation tool zip(), and list comprehension. This section looks at Jython's functional tools including lambda forms, map, filter, reduce, zip, and list comprehension.

lambda

Creating anonymous functions requires the use of lambda. Functions defined with lambda are lambda forms or lambda expressions. The term lambda expression may be most instructive, as what this type of function does is return the results of an expression. Lambda expressions cannot contain statements. The syntax of a lambda expression is as follows:

"lambda" parameters":" expression

The problem of determining if a number is odd or even creates this lambda form:

>>> odd_or_even = lambda num: num%2 and "odd" or "even"
>>> odd_or_even(5)
'odd'
>>> odd_or_even(6)
'even'

You can see that this is functionally the same as the following:

>>> def odd_or_even(num):
...     return num%2 and "odd" or "even"

A lambda form's parameters are no different than a named function's parameters. They can be positional parameters, have default values, and can use the wild card parameters * and **. Because lambda forms cannot contain statements like if/else, they usually leverage the and/or operators to provide the logic of execution. In functional programming, the evaluation of expressions is preferred over statements. Using expressions in lambda forms is not only required, it is valuable practice in functional programming. Another example of a lambda expression is a rewrite of an earlier factorial evaluator. The control structure employed in this example is recursion: The lambda form calls itself to determine each successive value:

>>> factorial = lambda num: num==1 or num * factorial(num - 1)
>>> factorial(5)
120
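One caveat with and/or logic in lambda forms is worth a sketch: the pattern picks the wrong branch whenever the value meant for the "true" case is itself false (0, an empty string, and so on). Wrapping both candidates in one-element lists is a common workaround. The names pick and safe_pick are hypothetical.

```python
# Plain and/or selection: works only while a is a true value.
pick = lambda flag, a, b: flag and a or b

# Workaround: a one-element list is always true, so the choice depends on
# flag alone; [0] then unwraps the selected value.
safe_pick = lambda flag, a, b: (flag and [a] or [b])[0]

print(pick(1, "yes", "no"))       # flag is true, a is true: a is chosen
print(pick(1, 0, "no"))           # flag is true, but a is false: b leaks out
print(safe_pick(1, 0, "no"))      # the wrapped form still returns a
```

The odd_or_even example above is safe because the string "odd" is always a true value.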

The lambda expression is especially susceptible to the side effects of Jython's scoping rules. In static scoping, it is all too frequent that default parameter values obscure the lambda code. Just a few variables that require default values become unwieldy. Listing 4.9 figures the volume of a cylinder of static height, but variable radius and shows default arguments in a lambda expression.

Listing 4.9 Using Default Arguments in a lambda Form

# file: staticpi.py
# Note "nested_scopes" is not imported
from math import pi

def cylindervolume(height):
    return lambda r, pi=pi, height=height: pi * r**2 * height

vrc = cylindervolume(2) # vrc = variable radius cylinder
print vrc(5)

Results from running jython staticpi.py:

157.079632679

Compare this with a lexically scoped lambda form, which needs no default parameter values. Lambda forms are much more intuitive with lexical scoping, as demonstrated in Listing 4.10.

Listing 4.10 lambda Forms in Lexical Scoping

# file: lexicalpi.py
from __future__ import nested_scopes
from math import pi

def cylindervolume(height):
    return lambda r: pi * r**2 * height

vrc = cylindervolume(2) # vrc = variable radius cylinder
print vrc(5)

Results from running jython lexicalpi.py:

157.079632679
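Beyond readability, the default-argument style has a practical hazard: the smuggled-in values are ordinary parameters, so a caller can override them by accident. The sketch below puts both styles side by side. The names cylinder_defaults and cylinder_lexical are hypothetical, and the code assumes an interpreter where lexical scoping is already active (on Jython 2.1 the second function needs the nested_scopes import).

```python
from math import pi

def cylinder_defaults(height):
    # Static-scoping style: values reach the lambda through default arguments.
    return lambda r, pi=pi, height=height: pi * r**2 * height

def cylinder_lexical(height):
    # Lexical-scoping style: the lambda reads height from the enclosing call.
    return lambda r: pi * r**2 * height

by_defaults = cylinder_defaults(2)
by_scope = cylinder_lexical(2)

print(by_defaults(5) == by_scope(5))   # both compute the same volume
print(by_defaults(5, 3))               # a stray second argument replaces pi
```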

map()

The map function is a list-processing tool. The syntax for map is as follows:

map(function, sequence[, sequence, ...])

The first argument must be a function or None. Map calls the function for each member of the designated sequence with that respective sequence member as an argument. The returned list is the results of each call of the function. To create a list of the squares of integers, you apply map like so:

>>> def square(x):
...     return x**2
...
>>> map(square, range(10))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

A lambda form satisfies the function requirement as well:

>>> map(lambda x: x**2, range(10))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

If a map function is called with multiple sequences, the function receives the same number of parameters as there are sequences. If we assume map was called with a function (or None) and three sequences, each call to the function would include three parameters:

>>> map(None, range(10), range(2,12), range(5,15))
[(0, 2, 5), (1, 3, 6), (2, 4, 7), (3, 5, 8), (4, 6, 9), (5, 7, 10), (6, 8, 11), (7, 9, 12), (8, 10, 13), (9, 11, 14)]

For multiple sequences of differing lengths, map iterates through the longest of the sequences and fills in missing values with None:

>>> map(None, range(5), range(6,20), range(3))
[(0, 6, 0), (1, 7, 1), (2, 8, 2), (3, 9, None), (4, 10, None), (None, 11, None), (None, 12, None), (None, 13, None), (None, 14, None), (None, 15, None), (None, 16, None), (None, 17, None), (None, 18, None), (None, 19, None)]
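The padding behavior for sequences of differing lengths can be spelled out as an explicit loop. The function map_pad below is a hypothetical stand-in written for illustration, not the built-in itself, and it is only meant to mirror the description above for two or more sequences.

```python
def map_pad(func, *sequences):
    """Emulate map over several sequences, padding short ones with None."""
    longest = max([len(seq) for seq in sequences])
    results = []
    for i in range(longest):
        args = []
        for seq in sequences:
            if i < len(seq):
                args.append(seq[i])
            else:
                args.append(None)         # this sequence is exhausted
        if func is None:
            results.append(tuple(args))   # no function: collect the arguments
        else:
            results.append(func(*args))   # one parameter per sequence
    return results

print(map_pad(None, [0, 1, 2], [6, 7]))
print(map_pad(lambda a, b: a + b, [1, 2], [10, 20]))
```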

To compensate for the numeric focus so far, take a look at an example of using map to process a string. This example assumes there is a need to replace all tabs in a string with four spaces. An empty string is used to allow use of the join function, and the lambda expression makes use of Boolean evaluation to return four spaces when a tab is found.

>>> s = 'a\tb\tc\td\te'
>>> "".join(map(lambda char: char=="\t" and "    " or char, s))
'a    b    c    d    e'

filter()

The filter function is similar to the map function in that it requires a function or None as the first parameter and a list as the second. The differences are that filter can only accept one sequence, filter's function is used to determine a true/false value, and the result of filter is a list that contains the original sequence members for which the function evaluated to true. The function filters out those sequence members for which the function returns false—thus the name. The syntax for filter is as follows:

filter(function, sequence)

An example that clarifies filter's behavior is testing for even numbers. The range function has the optional step parameter, so this functionality is trivial, but it makes for a clear example of filter's behavior. The results contain the original sequence's members, unaltered except for the exclusion of those of which the function disapproves:

>>> filter(lambda x: x%2==0, range(10))
[0, 2, 4, 6, 8]

The filter function can also compact code used to search for set intersections. This example iterates through the members of set1 and uses a lambda form to return 1 or 0 depending on that element's inclusion in set2:

>>> set1 = range(0, 200, 7)
>>> set2 = range(0, 200, 3)
>>> filter(lambda x: x in set2, set1)
[0, 21, 42, 63, 84, 105, 126, 147, 168, 189]

If you revisit Listing 4.2 from very early in this chapter, you will notice that it is a very verbose and inefficient primes search. The filter function can decrease the verbosity, and does in Listing 4.11. The efficiency, however, is not improved. Listing 4.11 is likely less efficient because the use of the filter function is not lazy. Lazy means a process only evaluates what it needs to. The isPrime function in Listing 4.11 successively tests each potential divisor. To be lazy, the isPrime function would have to test only those divisors up to the first contradiction of prime. This detrimental effect is burdensome as numbers get larger.

Listing 4.11 A Functional Search for Primes

# file: functionalprimes.py
from __future__ import nested_scopes
def primes(S):
    isDiv = lambda num, x: num<3 or num%x==0
    isPrime = lambda num: filter(lambda x, num=num: isDiv(num, x),
                                 [2] + range(3,num,2))==[]
    return filter(isPrime, S)
print primes(range(4, 18))

Results from running jython functionalprimes.py

[5, 7, 11, 13, 17]
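For contrast, here is a minimal sketch of the lazy alternative described before Listing 4.11: the divisor test stops at the first divisor it finds. The names first_divisor and primes_lazy are hypothetical and are not part of the listing. The list() calls are there only so the sketch also runs on Python 3, where range and filter no longer return lists.

```python
# Sketch of an early-exit ("lazy") divisor test.
# first_divisor and primes_lazy are hypothetical names for illustration.
def first_divisor(num):
    # Same candidate divisors as Listing 4.11: 2, then the odd numbers below num.
    for x in [2] + list(range(3, num, 2)):
        if num % x == 0:
            return x        # stop at the first contradiction of primality
    return None             # no divisor found

def primes_lazy(S):
    # Like the listing, numbers below 3 are never reported.
    return list(filter(lambda num: num > 2 and first_divisor(num) is None, S))

found = primes_lazy(range(4, 18))
print(found)
```

The design difference is confined to first_divisor: a return inside the loop abandons the remaining candidates, whereas filter in the listing always walks the whole divisor list.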

reduce()

The reduce function requires a function and sequence like map and filter. The function supplied to reduce, however, must accept two arguments and cannot be None as it can be in map and filter. The syntax for reduce is as follows:

reduce(function, sequence[, initial])

The result returned from reduce is a single value that represents the cumulative application of the specified function to the supplied sequence. The first two values from the list, or the optional initial value and the first list value, become the function's first pair of arguments. The result of that operation and the next list item are the subsequent arguments to the function, and so on until the list is exhausted. An example follows:

>>> reduce(lambda x, y: x+y, range(5, 0, -1))

The result of this expression is ((((5+4)+3)+2)+1), or 15. Suppose the optional initial value is supplied:

>>> reduce(lambda x, y: x+y, range(5, 0, -1), 10)

The result would be (((((10+5)+4)+3)+2)+1), or 25.
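The pairing of arguments described above can be sketched as a loop. This is an illustration under stated assumptions, not Jython's implementation: my_reduce is a hypothetical name, and it raises IndexError on an empty sequence with no initial value, where the built-in reduce raises TypeError.

```python
# Sketch of how reduce(function, sequence[, initial]) accumulates a result.
# my_reduce is a hypothetical name used only for illustration.
def my_reduce(function, sequence, *initial):
    items = list(sequence)
    if initial:
        result = initial[0]     # the optional initial value starts the chain
    else:
        result = items.pop(0)   # otherwise the first item does
    for item in items:
        result = function(result, item)  # previous result paired with next item
    return result

total = my_reduce(lambda x, y: x + y, range(5, 0, -1))
total_with_initial = my_reduce(lambda x, y: x + y, range(5, 0, -1), 10)
print(total)
print(total_with_initial)
```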

zip()

The zip function combines respective elements of multiple lists as a tuple and returns a list of tuples. The length of the returned list is the same as the shortest sequence supplied. The syntax for zip is as follows:

zip(seq1 [, seq2 [...]])

An example usage looks like this:

>>> zip([1,2,3], [4,5,6])
[(1, 4), (2, 5), (3, 6)]

Another example is using zip to supply key-value pairs in constructing a dictionary. Because zip produces a list of tuples, the lambda form has only one parameter for the single tuple argument:

>>> d = {}
>>> map(lambda b: d.setdefault(b[0],b[1]), zip(range(4), range(4,8)))
[4, 5, 6, 7]
>>> d
{3: 7, 2: 6, 1: 5, 0: 4}
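Two points about zip are easy to check with a short sketch: the result is cut off at the length of the shortest sequence, and the key-value pairs can also be unpacked in an ordinary for loop. The variable names are illustrative only. The list() calls are needed on Python 3, where zip returns an iterator, and are harmless where zip already returns a list.

```python
# Sketch: zip stops at the shortest sequence supplied.
numbers = [1, 2, 3, 4]
letters = ["a", "b"]
pairs = list(zip(numbers, letters))
print(pairs)

# More than two sequences produce longer tuples.
triples = list(zip(range(3), "xyz", [True, False, True]))
print(triples)

# Filling a dictionary from zipped key-value pairs with a plain loop.
d = {}
for key, value in zip(range(4), range(4, 8)):
    d[key] = value
print(d)
```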

List Comprehension

List comprehension is a way of generating lists according to a set of rules. List comprehension syntax is really a shortcut. What it does is possible with other Jython syntax, but list comprehension provides more clarity. The basic syntax is as follows:

[ expression for expr1 in sequence1 for expr2 in sequence2 ... for exprN in sequenceN if condition]

Evaluation of the list comprehension form is equivalent to nested for statements such as the following:

for expr1 in sequence1:
    for expr2 in sequence2:
        for exprN in sequenceN:
            if (condition):
                expression(expr1, expr2, exprN)

The list comprehension used here produces a list of only the elements that evaluate to true:

>>> L = [1, "a", [], 0, "b"]
>>> [ x for x in L if x ]
[1, 'a', 'b']

This could be rewritten without the list comprehension as follows:

>>> L = [1, "a", [], 0, "b"]
>>> results = []
>>> for x in L:
...     if x:
...         results.append(x)
...
>>> results
[1, 'a', 'b']
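The same equivalence holds when there is more than one for clause. The sketch below uses made-up data to show that a comprehension with two for clauses and a condition produces the same list, in the same order, as the nested loops it stands for.

```python
# Sketch: a two-clause list comprehension and its nested-loop equivalent.
pairs = [(x, y) for x in range(3) for y in "ab" if x != 1]

nested = []
for x in range(3):          # the first for clause is the outer loop
    for y in "ab":          # the second for clause is the inner loop
        if x != 1:          # the condition guards the append
            nested.append((x, y))

print(pairs)
print(nested)
```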

The advantage of the list comprehension is its clarity, and this advantage increases as the number of for statements increases. The list comprehension in Listing 4.12 may look a bit confusing at first, but that is only because of the surrounding lambda expression. The actual list comprehension in Listing 4.12 has only four expressions and is equivalent to the following nested form:

for x in [num]:
    for y in ([2] + range(3, num, 2)):
        if x%y==0:
            returnlist.append(x)

The result of this list comprehension is a list of successful divisors of the variable num. The lambda form merely tests to see if this list of divisors is empty, meaning the number must be a prime.

Listing 4.12 Primes Through List Comprehension

# file: primelist.py
def primes(S):
    isPrime = lambda num: [x for x in [num]
                           for y in [2] + range(3,num,2)
                           if x%y==0]==[]
    return filter(isPrime, S)
print primes(range(2,20))
print primes(range(200, 300))

Output from running jython primelist.py:

[3, 5, 7, 11, 13, 17, 19]
[211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293]

Jython file objects have a readlines() method that returns a list of strings representing each line of the file. For another list comprehension example, assume that the file config.cfg is a configuration file in which commented lines start with # and all other non-blank lines are important directives. It is more likely that you would want the parsed results of such a file in a dictionary, but because we are focused on functional coding here, readlines() and a list comprehension are used to parse this file into a list of lists. Another important point about this list comprehension is that the first expression invokes a method. Limiting this first expression to just declaring which value to return seems to be the common usage, but this is certainly not a restriction. It can also invoke an object's methods or call a function.

# file: configtool.py
def parseConfig(file):
    return [ x.strip().replace(" ", "").split("=")
             for x in open(file).readlines()
             if not x.lstrip().startswith("#") and len(x.strip())]
print parseConfig("config.cfg")

The content of the config.cfg file used is:

a = b
c = d
# comment
dog = jack
cat = fritz

Results from running jython configtool.py:

[['a', 'b'], ['c', 'd'], ['dog', 'jack'], ['cat', 'fritz']]

The preceding list comprehension makes for clean and compact syntax as well as helping make the example more of a functional style program.
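As noted above, a dictionary is often the more convenient result for such a file. The following is a rough sketch of that variant, written imperatively for clarity. The function name parse_config_lines is hypothetical, and the sketch takes a list of lines rather than a file name so that it is self-contained. Splitting on only the first = sign is an added assumption that the original listing does not make.

```python
# Sketch: parse configuration lines into a dictionary instead of a list of lists.
# parse_config_lines is a hypothetical helper, not part of configtool.py.
def parse_config_lines(lines):
    settings = {}
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue                      # skip blank and commented lines
        # Assumption: only the first "=" separates the key from the value.
        key, value = stripped.replace(" ", "").split("=", 1)
        settings[key] = value
    return settings

sample = ["a = b\n", "c = d\n", "# comment\n", "\n", "dog = jack\n", "cat = fritz\n"]
config = parse_config_lines(sample)
print(config)
```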

The Functional Programming Idiom

Defining and using functions does not qualify as functional programming. Functional coding is much more. It emphasizes list processing, as map, filter, and reduce do. It also requires functions that are first-class objects, employs closures, and uses expressions like list comprehension. Other important features of functional coding are the discouragement of rebinding names and the emphasis on expressions instead of typical compound statement control structures. Most Jython coding is imperative and object oriented, but we can see from the description of functional coding that Jython is capable of a functional style as well.

What is the advantage of functional coding in Jython? There is the potential for increased robustness, and there may be situations where this style increases code clarity. If you go to the effort to eliminate rebinding of variables, you've eliminated the opportunity for a certain variable to end up with an unexpected value. If everything is evaluated as an expression, there is little opportunity for a misplaced statement. Take a look at some imperative code:

>>> def odd_or_even(num):
...     var = 9 # potential side effect
...     if num%2: # statement instead of expression
...         return "odd"
...     return "even"
...
>>> odd_or_even(10)
'even'
>>> odd_or_even(7)
'odd'

This uses the compound if statement. Functional programming discourages statements and encourages expressions. So, to convert the above code into something more functional, use the and and or operators:

>>> def odd_or_even(num):
...     return num%2 and "odd" or "even"
...
>>> odd_or_even(20)
'even'
>>> odd_or_even(33)
'odd'
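One caveat about the and/or form is worth a short sketch: it yields the wrong branch whenever the value meant for the true case is itself false (an empty string, 0, or an empty list). The function names pick and pick_safe are hypothetical. Wrapping both values in one-element lists is a common workaround, because a one-element list is always true.

```python
# Sketch: where "condition and A or B" goes wrong, and one workaround.
# pick and pick_safe are hypothetical names used only for illustration.
def pick(flag):
    # Intended: "" when flag is true, "fallback" otherwise.
    # Because "" is false, the or clause always takes over.
    return flag and "" or "fallback"

def pick_safe(flag):
    # A one-element list is always true, so the and/or selection is reliable;
    # the chosen value is then taken back out of its list.
    return (flag and [""] or ["fallback"])[0]

print(repr(pick(True)))
print(repr(pick_safe(True)))
print(repr(pick_safe(False)))
```

The odd_or_even function above is unaffected, since "odd" is a non-empty string and therefore always true.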

In addition to avoiding statements, minimizing rebinding of names is important in functional programming. Rebinding is what is normally done in imperative programming; there are even shortcut operators such as += to make it more convenient. Functional coding considers such rebinding to increase the risk of side effects. Those who have suffered at the hands of Perl's debugger before learning the value of my know better than most about unexpected and deleterious side effects from variable rebinding. If you look back at Listing 4.12, you will see that list comprehension enabled the elimination of many name bindings in that function, so side effects would be unlikely. The same is true for Listing 4.11; names are bound to lambda forms and the parameter, but no extraneous name rebindings exist.

What is exciting is that all of Jython's list-processing tools work on any sequence: not just Jython sequences, but Java sequences as well. To iterate through a java.util.Vector object, designate it as the sequence in a map function:

>>> import java
>>> v = java.util.Vector()
>>> map(v.add, range(10)) # fill the vector with PyIntegers
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
>>> filter(lambda x: x%2==0, v)
[0, 2, 4, 6, 8]

Java objects were not emphasized in this chapter for fear of their impact on example clarity, but an instructive next step is to experiment with incorporating Java objects into some functional style scripts. Purely functional code is uncommon once Java is included, but the extent to which you can use Java objects in list processing and lambda forms is impressive.

Synchronization

Java and Jython allow programs with multiple threads of execution. Python's thread and threading modules are the means of creating multithreaded programs in Jython and CPython. What is unique to Jython is the synchronize module, which simplifies synchronization much like Java's synchronized keyword. This chapter on functions introduces synchronization because Jython functions are callable objects, and multiple threads of control often require synchronization of callable objects.

Java programmers are accustomed to synchronized methods, but there is no such keyword in the Python language specification. This raises the question of how you actually synchronize functions or methods in Jython. You could use tools from the threading module to acquire and release locks when entering and leaving callable objects, but Jython makes it much easier than that. Jython includes a synchronize module with two functions that allow users to synchronize callable objects: make_synchronized and apply_synchronized. Listing 4.13 defines a count function that is started in two separate threads. The count function merely prints a sequential count alongside the name of the thread responsible for producing the number. This makes for a clear example because output is interleaved without synchronization, but sequential when synchronized.

Listing 4.13 Synchronizing a Function

# file: sync.py
import thread
import synchronize
import sys
threadID = 0
def count(out):
    global threadID
    thisid = threadID
    threadID += 1
    threadnames = ["first thread", "second thread"]
    for i in range(5):
        print threadnames[thisid], "counted to", i
count = synchronize.make_synchronized(count)
thread.start_new_thread(count, (sys.stdout,))
thread.start_new_thread(count, (sys.stdout,))

The function thread.start_new_thread() starts two threads in Listing 4.13, and it has the following syntax:

thread.start_new_thread(callable_object, args[, kwargs])

The first parameter is the name of the callable object that is to run in the thread. The second parameter is a tuple of the arguments to pass to the callable object designated in the first parameter. The optional third parameter consists of keyword arguments also passed to the callable object. Listing 4.13 implements synchronization with the make_synchronized function. make_synchronized accepts a callable object and returns a new callable object. This new object is synchronized on the first argument supplied when the object is called. The syntax is as follows:

make_synchronized(callableObject) -> callableObject

The fact that the object is synchronized on the first argument means that the callable object used must accept at least one argument. Listing 4.13 uses sys.stdout as that object to ensure that output is sequential. Even though Jython methods have not been discussed yet, it is valuable to note that they are similar to Jython functions; however, they receive an instance reference called self as their first argument. Therefore, when make_synchronized is used on a method, that method becomes synchronized on the object self. If you execute the sync.py file from Listing 4.13, you should see output like the following:

prompt>jython sync.py
first thread counted to 0
first thread counted to 1
first thread counted to 2
first thread counted to 3
first thread counted to 4
second thread counted to 0
second thread counted to 1
second thread counted to 2
second thread counted to 3
second thread counted to 4

For comparison's sake, try commenting out the synchronization line like so:

# count = synchronize.make_synchronized(count)

Next, run the sync.py file again to see how the output is interleaved without the synchronization. A sample of the non-synchronized output appears as follows:

prompt>jython sync.py second thread counted to 0 second thread counted tofirst thread 1 counted tosecond thread 0 counted tofirst thread 2 counted tosecond thread 1 counted tofirst thread 3 second thread counted to 4 counted to 2 first thread counted to 3 first thread counted to 4

You could alternatively use the synchronize.apply_synchronized function to synchronize callable objects. The synchronize.apply_synchronized function reproduces the functionality of the built-in apply function with an additional argument that specifies an object to synchronize on. Syntax for this function is as follows:

apply_synchronized(syncObject, callableObject, args, keywords={})

The operation performed in the apply_synchronized function is synchronized on the first argument (syncObject).
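For comparison, the manual approach mentioned at the start of this section, acquiring and releasing a lock around a callable with the threading module, might look roughly like the sketch below. It is not the synchronize module's implementation. The name make_locked is hypothetical, and the wrapper guards the call with one shared lock rather than synchronizing on the first argument. The sketch collects lines in a list instead of printing, so the ordering can be inspected afterwards.

```python
# Sketch: a lock-based wrapper built from the threading module.
# make_locked is a hypothetical helper, not part of Jython's synchronize module.
import threading

def make_locked(func, lock=None):
    if lock is None:
        lock = threading.RLock()
    def wrapper(*args, **kwargs):
        lock.acquire()              # only one thread at a time gets past here
        try:
            return func(*args, **kwargs)
        finally:
            lock.release()          # released even if func raises
    return wrapper

lines = []

def count(name):
    for i in range(5):
        lines.append("%s counted to %d" % (name, i))

locked_count = make_locked(count)
threads = [threading.Thread(target=locked_count, args=(name,))
           for name in ("first thread", "second thread")]
for t in threads:
    t.start()
for t in threads:
    t.join()                        # wait for both threads before inspecting
print("\n".join(lines))
```

Because the whole call runs under the lock, each thread's five lines appear as one uninterrupted run, although which thread goes first is not guaranteed.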
