|
|
||||||||||||||||||
|
|
||||||||||||||||||
![]() |
![]() |
Issue 5 - Revision 7 / April 20, 2004
|
|||
|
Simple Code Generation - with Python - - - - - - - - - - - - By Ollie Rutherfurd | February 3, 2003 Introduction When programmers discover Python, they often fall in love. They find that they can accomplish more with less effort and less time. "How can I convince my boss that we should be using Python for our next big project", is a frequently asked question on the newsgroup comp.lang.python. Sadly, this article won't answer that question, but it will demonstrate using Python scripts for generating code in other languages. None of the projects I've worked on during the past year have been developed in Python. However, I've used Python scripts to generate code, both Java and Visual Basic, for many of them. This article walks the reader through versions of these scripts to illustrate how Python can be used to automate the generation of simple and repetitive code. What Code Are You Generating?In one standard approach to developing data-centric applications, one uses classes to represent data entities and a relational database to persist data. For example, in a sales contact system, you might have a Contact class, with attributes for id, name, address, phone number, e-mail, etc... which are stored in a database table named Contacts. Each Contact would be stored in a separate row in the Contacts table, and there would be a corresponding column for each attribute. On the whole, the look and feel of data object classes is very similar, regardless of the language. Class attributes, or properties, correspond to table columns. The main difference is how attributes are accessed and modified in different languages. For example, Python development practices do not mandate that access to class attributes should be via “get” or “set” methods, but both Java and Visual Basic encourage this practice. As as result, data classes written in languages such as Java and Visual Basic typically contain a lot of repetitive code for accessing class attributes. In many cases this code is so repetitive, it's easier to write a script to write it for you. Here's where Python fits in. With excellent text manipulation capabilities, triple-quoted strings for templates, high-level datatypes, and concise syntax, Python is an ideal candidate for the task. In this article, we'll examine two code generation scripts. One generates Java classes by reading field names and types from a database table, and the other generates Visual Basic code from a text file describing both the table and the class. Generating Java BeansIn the first example, we'll generate a Java Bean by reading column names and type from an SQL Server database, using adodbapi The example scripts accompanying this article are modified versions of scripts I use on a regular basis, with some extended option handling and similar features removed. The spirit is the same, the code's just a little shorter. What is a Java Bean?First, some terminology. A Java Bean is Java class containing attributes which follow a naming convention. Usually, a Bean doesn't actually do anything. Bean properties, or attributes, are accessed via getters and setters or, if you're feeling highfalutin, accessors and mutators. Here's an example Java Bean class:
public class Point{
public Point(){
this.x = 0;
this.y = 0;
}
public Point(int x, int y){
this.x = x;
this.y = y;
}
public int getX(){
return this.x;
}
public void setX(int x){
this.x = x;
}
public int getY(){
return this.y;
}
public void setY(int y){
this.y = y;
}
private int x;
private int y;
}
For comparison, the Python equivlent might be:
class Point:
def __init__(self, x=0 ,y=0):
self.x = x
self.y = y
The Java class contains a bunch of repetitive code the generation of which should be amenable to automatization. This is our first example. The StepsBefore we actually look at the first script, a quick plan of attack. Here are the steps we'll take to generate Java Beans corresponding to database tables.
Before we actually do anything, we define the code templates used later on in the script. Triple-quoted strings are great for embedding code templates directly into a script, as strings can span multiple lines, are easily readable, and can contain variable placeholders. Variable declaration:
DECLARATION = """\
private %(type)s %(attribute)s;"""
Getter and Setter pair:
GETTER_AND_SETTER = """\
public %(type)s get%(name)s(){
return %(attribute)s;
}
public void set%(name)s(%(type)s %(attribute)s){
this.%(attribute)s = %(attribute)s;
}"""
Class template:
CLASS = """\
public class %(class)s{
%(accessors)s
%(declarations)s
}"""
About now, you might be wondering about the indentation of DECLARATION and GETTER_AND_SETTER. They're indented to the desired level for the generated code. Step 1Retrieving column names and types from the database is going to be specific to each DBMS. I've done most of this work on Windows 2000, and have used both ADO, via win32com, and adodbapi. If you're not using an SQL Server on Windows, this code would have to be adapted as the Python DB-API does not specify how this information should be made available.
# See http://www.connectionstrings.com/
# for a good reference on connection strings.
CONN_STR = """\
Driver={SQL Server};\
Server=%(server)s;\
Database=%(database)s;\
Uid=%(username)s;\
Pwd=%(password)s"""
def get_cols(server,database,username,password,table):
conn = adodbapi.connect(CONN_STR % locals())
cursor = conn.cursor()
# returns a table with detailed column information
cursor.execute("sp_columns @table_name='%(table)s'" % locals())
cols = []
for row in cursor.fetchall():
colname,coltype = row[3],row[5].upper()
cols.append((colname,coltype))
return cols
Step 2
Step 2 is the heart of the script. This is where we generate all the code we didn't want to write by hand. For each attribute, we'll need to determine the appropriate variable type from the column type. This is easily done using a dict:
TYPES = {
'CHAR': 'String',
'DATETIME': 'java.sql.Timestamp',
'INT': 'int',
'NCHAR': 'String',
'NVARCHAR': 'String',
'VARCHAR': 'String',
# and any others...
}
For each column, we'll determine the type and the variable name: type = TYPES[coltype] attribute = col2attr(colname) With type and attribute names we can create declarations by replacing placeholders in the templates defined earlier with values for each attribute:
declarations.append(DECLARATION % {
'type': type,
'attribute': attribute})
accessors.append(GETTER_AND_SETTER % {
'type': type,
'name': attribute[0].upper() + attribute[1:],
'attribute': attribute})
Step 3
Now that we have 2 lists which contain all variable declarations and accessor methods, we'll generate the class by inserting the declarations, accessors, and class name into the class template:
klass = CLASS % {
'accessors': '\n'.join(accessors),
'declarations': '\n'.join(declarations),
'class': name
}
# this script uses spaces, expand to tabs
klass = klass.replace(' ' * 4, '\t')
Step 4
Finally, we write the generated code to a file: f = open(class_name + '.java', 'w') f.write(klass) f.close()
You can view the script in it's entirety here. Generating Vlsual Basic ClassesIn the second script, we generate Visual Basic code, but use a different approach for obtaining column names and types. The approach taken in the first script works well if you've got column names which are well-named and easy to understand. However, on older DMBS's, in which column name lengths were limited, this may not be the case. A little over a year ago, I began work on a workflow management system written in Visual Basic. We were replacing a green-screen application which ran on an AS/400 using DB2 as the database. There were oodles of tables with piles of fields, many of which looked like randomly generated sequences of characters. As is sometimes the case, the system wasn't terribly well documented and the best source of information was those who had worked on it or used it. For each table we needed to access or modify, I'd create a tab-delimited text file with the following 4 columns:
The first two columns were populated from the database table itself, and the last two by asking the project manager, the definitive source of information, what the column name was and what it was for, how it was used, etc... The result was a text file that looked like this: #column type property description LID VARCHAR LocationId Location Id (Primary Key) LCTY VARCHAR County County Name LREGN VARCHAR RegionCode Region Code LTYPE CHAR LocationType H=High School,4=Middle School, etc... LSECTR CHAR Sector JR=Junior, T=Technical, etc... Armed with that file, I could use a script to generate a Visual Basic class, with meaningful names for the data attributes, a description for each, and a mapping of property names to column names. Generating the Vlsual Basic ClassThis example is slightly more involved than the previous one, as there's more boiler-plate Visual Basic code. To generate the class, for each attribute we'll declare a private variable, property accessors, and code for getting and setting the property value from an ADODB Recordset. Here's the class template: CLASS = """\ %(header)s %(variables)s %(accessors)s %(rsfuncs)s """ Here's what the template values represent:
In Visual Basic one usually names a variable with a 3 letter prefix indicating the type of the variable. Private member variables are often further prefixed with m_. Here's an example variable declaration: Private m_strCounty As String To determine the type to use for a variable we'll use a dict, just as in the first script:
TYPES = {
'INT': 'Long',
'CHAR': 'String',
'DATETIME': 'Date',
'MONEY': 'Currency',
'VARCHAR': 'String',
}
Most 3-letter prefixes are the first 3 letters of the variable type, but that's not always the case. For the exceptions, we'll use a dict as well:
PREFIXES = {
'Long': 'lng',
}
Here's an example property declaration:
' County Name
Public Property Get County() As String
Count = m_strCounty
End Property
Public Property Let County(strCounty As String)
m_strCounty = strCounty
End Property
and here's the example template:
VARIABLE = """\
Private m_%(prefix)s%(name)s As %(type)s"""
ACCESSOR = """\
' %(description)s
Public Property Get %(name)s() As %(type)s
Attribute %(name)s.VB_Description = "%(description)s"
%(name)s = m_%(prefix)s%(name)s
End Property
Public Property Let %(name)s(%(prefix)s%(name)s As %(type)s)
m_%(prefix)s%(name)s = %(prefix)s%(name)s
End Property
"""
Get name of file containing table and class information, and the name to use for the class:
try:
table_file,class_name = sys.argv[1:]
except (IndexError,ValueError),e:
usage()
For each line in the file (ignoring comment lines), create a variable declaration, property get and let methods, and code to read and write the property value to and from an ADODB.Recordset.:
accessors = []
from_rs = []
to_rs = []
variables = []
for line in file(table_file):
# allow for comments in file
if line.startswith('#'):
continue
# for tab separated fields for each column
colname,coltype,propname,description = line[:-1].split('\t')
type = TYPES[coltype]
prefix = PREFIX.get(type,type[:3].lower())
accessors.append(ACCESSOR % {
'name': propname,
'type': type,
'prefix': prefix,
'description': description})
variables.append(VARIABLE % {
'prefix': prefix,
'name': propname,
'type': type})
from_rs.append(COPY_FROM_FIELD % {
'name': propname,
'colname': colname})
to_rs.append(COPY_TO_FIELD % {
'name': propname,
'colname': colname})
rsfuncs = RS_FUNCS % {
'read_from_fields': '\n'.join(from_rs),
'put_into_fields': '\n'.join(to_rs)
}
props = {
'header': HEADER % {'class': class_name},
'variables': '\n'.join(variables),
'accessors': '\n'.join(accessors),
'rsfuncs': rsfuncs,
}
f = file(class_name + '.cls', 'w')
print >> f, CLASS % props
f.close()
The templates for the recordset functions are not included here, as they don't demonstrate anything new, but the entire script can be viewed here. Testing ScriptsIt's important to test your generated code to ensure both that it works and that it's actually what you wanted. I once whipped out a script that generated multiple output files. One of these files was an XML file containing metadata for the generated Java code, and though it was well-formed an attribute name had an extra "s" on it. I'd like to say I found the bug in no time, but it took me the better part of an hour to locate the problem. It was a perfect example of a case where it would have taken less time overall to have tested the script and verified it generated correct output than it did to find and fix the problem after the fact. The easiest way I've found to test a script is to create a file containing your desired output. As you develop the script, run it and compare the output against the file you created. When they match, you're done. Template StylesThough Python's triple-quoted strings often work well for templates, “%(name)s” interpolation can be a little hard to read at times; furthermore, it's all too easy to forget the trailing s. As an alternative, one could easily write a class which could use shell-style $variable or ${variable} substitution. Here's how's the template for the Java Bean “get” and “set” methods would look:
GETTER_AND_SETTER = """\
public $type get${name}(){
return $attribute;
}
public void set${name}($type $attribute){
this.$attribute = $attribute;
}"""
Here's an example implementation, based on Barry Warsaw's reference implementation for PEP 292:
import sys
import re
class Template(str):
def sub(self, mapping=None):
# Default mapping is locals/globals of caller
if mapping is None:
frame = sys._getframe(1)
mapping = frame.f_globals.copy()
mapping.update(frame.f_locals)
def repl(m):
return mapping[m.group(m.lastindex)]
return re.sub(r'\$(?:([_A-z]\w*)|\{([_A-z]\w*)\})', repl, self)
Example usage:
>>> t = Template("""\
... private $type $name;""")
>>> print t.sub({'type': 'String', 'name': 'name'})
private String name;
>>>
With this class, using shell-style substitution would be easy to do, and is left as an exercise to the reader. ConclusionPython is an ideal language for code generation and using scripts such as those presented in this article can be a big time saver. They're easy to write, read, and test and you can add functionality or features as requirements change. Another benefit of using Python, which might not initially be apparent, is that you aren't tied to any specific IDE or platform. Code generation does have it drawbacks, however. It's so easy to do, one should be aware of what code actually needs to be repeated and what could be refactored into common functions or classes. You should also know what it is you want to generate before you begin. Lastly, have fun -- and enjoy coding in Python so Python can do a little of your coding for you! For further Reference:
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Py is committed to bringing you great Python Articles. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
![]() |
Reproduction of material from any of PyZine's pages without prior written permission is strictly prohibited. Copyright 2003 - 2005 PyZine |
|