Template-Driven Code Generation

Contents:
On Code Generation
Jeeves Example
Jeeves Overview
Jeeves Implementation
Sample Specification Parser
Resources

I'd rather write programs to write programs than write programs.

- Developing Pearls, Communications of the ACM, Sept. 1985

This chapter builds a template-driven code generator, an indispensable tool in a C, C++, or Java developer's toolbox. The chapter has two objectives: to make the case for code generation as a method of code reuse and to present a small but nontrivial problem that can exercise all the Perl concepts you learned in the first half of the tutorial: complex data structures, modules, objects, and eval. Enjoy!

On Code Generation

Developers create and use tiny specification languages all the time. Database schemas, resources (rc files in Unix such as .mwmrc and .openwinrc), user interface specifications (Motif UIL files), network interface specifications (RPC or CORBA IDL files), and so on are all examples of such languages. These languages enable you to state your requirements in a high-level, compact, and declarative format; for example, in Motif's UIL (User Interface Language), you can simply state that you want two buttons inside a form and spare yourself the effort of writing 20 or so statements in C to achieve the same effect.

The semantic gap between these specification languages and conventional systems-developing languages such as C or C++ can be bridged in one of two ways. The first approach is for the C application to treat the specification as meta-data; that is, the application embeds the specification parser and exchanges information with it using C data structures and an internal API. The second approach is to have a standalone compiler to translate the specification to C, which in turn is linked to the application. RPC systems and CASE tools prefer this approach.

In the following pages, we will study the second alternative and build ourselves a configurable code generation framework called Jeeves.[1]

[1] Jeeves is the efficient butler in P.G. Wodehouse's novels, who does all the work for his bumbling master with at most a twitch of an eyebrow.

The code generators we mentioned previously are clearly domain-specific. In practice, I have also found most of them to be needlessly specific in their output capabilities. Consider the following examples:

NOTE: The Remote Procedure Call facility allows you to call a procedure in a different address space, possibly on a different machine. You specify a list of procedures that you wish to export in an Interface Definition Language (IDL) and feed it to an IDL compiler, which produces some C code for the client and server ends. Link these pieces of code to your application, and voilĂ , you have network transparency.

Most commercial IDL compilers are remarkably inflexible about changing their output code. They make it hard for you to insert probes for monitoring network performance or auditing data flowing across the network. If you want to transparently encrypt the data before it is put "on the wire," you are often out of luck. Sure, you can change the C code output by the IDL compiler, but your changes will get overwritten the next time you run the IDL compiler.

In most of these cases, the demand for different types of output typically exceeds the number of changes made to the input specification format. We can make two observations as a consequence. First, parsing the input and producing the final output are related but separate tasks. Second, the output needs to be configurable. This can be arranged either by having one parameterizable output generator or by having a number of output generators that can be used interchangeably with the input parser. In my experience, the first option is not often practical. For example, it is pointless to write one output generator in the POD case, which can output HTML or ASCII or RTF just by tweaking a few parameters; they are very different sets of outputs.

The Jeeves framework goes for the second option. It helps you write a configurable translator by supplying a template-driven code-generating back end. This module allows you to write configurable templates with loops, if/then conditions, variables, and bits of Perl code, so it is no ordinary cookie-cutter cookie-cutter (otherwise, it might have been called yacccc).

An example serves to better explain this framework.