PyZine
 


Issue 1 - Revision 1  /   Published in 2002 


 

 
 
 
     

Illustration by Lia Avant

Simple CGI Template Processing
- with Python
- - - - - - - - - - - -

By Mike Soulier | Originally published in Py Issue 1


Like many scripting languages, Python is very good at parsing strings, reading environment variables, and putting together dynamic output. These attributes make it a good language for CGI scripting, a world dominated by Perl at this time.

Like Perl, Python has a cgi module to make the handling of the input to a CGI easier to manage. As a programmer, you will not have to concern yourself with whether the user used a GET or POST method to send input to the CGI, or how to handle special characters. The cgi module does it all.
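To make concrete what the module is hiding, here is a minimal sketch of handling GET and POST by hand. It is written for current Python 3 (where the cgi module was removed in 3.13) and swaps in urllib.parse.parse_qs rather than cgi. The read_form name and its arguments are hypothetical, and it only covers URL-encoded forms, not file uploads.

```python
from urllib.parse import parse_qs


def read_form(environ, body=""):
    """Return submitted fields as a dict of lists, for GET or POST alike.

    `environ` stands in for os.environ and `body` for the text read from
    stdin; both are passed in here so the sketch is easy to exercise.
    (Hypothetical helper, not part of any library.)
    """
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        raw = body                               # POST: fields are in the body
    else:
        raw = environ.get("QUERY_STRING", "")    # GET: fields are in the URL
    # parse_qs also decodes %-escapes and '+' back into plain characters.
    return parse_qs(raw)


get_form = read_form({"REQUEST_METHOD": "GET",
                      "QUERY_STRING": "address=me%40example.com"})
post_form = read_form({"REQUEST_METHOD": "POST"},
                      body="address=me%40example.com")
```

Both calls yield the same dictionary, which is the convenience the cgi module gives you for free.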

For example, on the OPAG (http://opag.ca) site, I wrote a subscription CGI so that visitors could subscribe to our mailing list by putting their email address in the small text field on the right side and clicking 'Subscribe'. The only value my CGI required was the contents of the text field, called "address". Confirming that value was present, and retrieving that value, was dead simple thanks to the cgi module.

import cgi

# Parse the submitted form, whether it arrived via GET or POST.
form = cgi.FieldStorage()
if form.has_key("address"):
    address = form['address'].value
    subscribe_address(address)
else:
    errmessage = """
ERROR: The email address is not set.
Please go back and fill in
the email address to subscribe."""
    raise AssertionError, errmessage

The exception is caught and sent back to the browser with a proper MIME header, of course. In the above case, the except block that follows my try sends a clean error message to the user. For other exceptions, though, I personally prefer to handle them as follows:

import sys
import traceback

# Send anything written to stderr to the browser as well.
sys.stderr = sys.stdout

if __name__ == '__main__':
    try:
        main()
    except Exception, errmessage:
        (tbtype, value, tb) = sys.exc_info()
        # If this is a standard system exit,
        # then let it pass.
        if tbtype == SystemExit:
            raise tbtype, value
        else:
            print "Content-Type: text/html\n"
            print "<pre>"
            traceback.print_tb(tb)
            print tbtype, value
            print "</pre>"
            del tb
            sys.exit(1)

In this way, I catch all exceptions from the code, including a standard exit, which I simply re-raise. All stderr output is redirected to stdout, and any exception not caught by my own code ends up as a traceback displayed in the browser. This is nice in the case of the OPAG site, as I'm not the sysadmin on the website, just the webmaster. I need those tracebacks to diagnose any problems. I could just as easily have sent an email and shown the user a standard error message, which I will probably do in the future.
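For readers on a current interpreter, the same idea can be sketched in Python 3 syntax. This is an illustration rather than the OPAG code: run_cgi and broken_main are hypothetical names, and the traceback is HTML-escaped (an addition of this sketch) so that angle brackets in it display correctly.

```python
import html
import traceback


def run_cgi(main):
    """Call main() and return the text the CGI should print."""
    try:
        return main()
    except Exception:
        # In Python 3, SystemExit does not derive from Exception, so a
        # normal sys.exit() passes through without any special case.
        trace = html.escape(traceback.format_exc())
        return "Content-Type: text/html\n\n<pre>" + trace + "</pre>"


def broken_main():
    # Hypothetical failing CGI body, used only to trigger the handler.
    raise ValueError("no <address> field")


page = run_cgi(broken_main)
```

Returning the page instead of printing it keeps the sketch easy to check; a real CGI would print the result.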

As you can see, Python has a way of making CGI scripts very clean and easy to maintain. Small scripts like the subscription CGI are trivial, but what about large CGIs that you want to appear as part of your website's look and feel? The OPAG site has a very definite look to it, and the last thing you want is to lose that when you code CGIs, falling back to some horrible black-on-white default scheme.

Hardcoding the look and feel would seem to be an option, but once you work on a software project for a while, you start to realize just how bad an idea hard-coding anything is. Maintainability goes right out the window, and maintaining code is a very big part of software development. It would be a shame to take a wonderfully maintainable language like Python and muck it all up with hard-coded output that's impossible to maintain past 100 lines.

So what do we do? We want a standard look-and-feel for the website, but we also want to be able to evolve it from time to time. Wouldn't it be nice if we could change a single file and have all of our CGIs “automagickally” change their look? As you've already guessed, we can. Let's face it, most web pages have common elements if they are to maintain a standard look and feel. You might put a common menu on each page, or a standard table structure to section the page, or any number of common elements where the structure is unchanging but the contents of that structure change.

Templates to the Rescue

A very clean solution to this problem is to create an HTML page with all of the common elements in it, and embed some special characters in that page wherever we have dynamic content. Then, all we do is read in the page in every CGI program, and replace the special characters with our dynamic content. A typical way to delimit template tags is with double percent signs (this is a common Perl solution as well). For example, if you wanted to have a block showing the page owner and email address, you might put something like this in the template:

<hr>
<p>Last Updated: %%last_updated%%</p>
<p>Page Maintained By:
<a href="mailto:%%maintained_email%%">
%%maintained_name%%</a></p>

The framework is now in place for your page footer. You just need to replace three template variables: %%last_updated%%, %%maintained_email%% and %%maintained_name%%. Obviously, this is going to be a search and replace operation over the text in the template, and the only element that's changing is the name of the variable, so it should be simple to code this generically and repeat the operation for every variable in the template.

Very powerful search and replace operations can be done across a buffer of characters with Python's re module, specifically the re.sub() function. What we need here is a way to map template variables to values. A dictionary seems the appropriate type to use, and since re.sub() accepts a function in place of a replacement string, that would appear to be our way to control the search and replace operation.
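As a small illustration of passing a function to re.sub(), here is a sketch in Python 3 syntax. The shout function is hypothetical and does nothing useful beyond showing what the callback receives and returns.

```python
import re


def shout(matchobj):
    # re.sub() calls this once per match, passing only the match object.
    # group(1) is the text captured by the parentheses: the variable name.
    return "[" + matchobj.group(1).upper() + "]"


template = "Updated %%last_updated%% by %%maintained_name%%"
result = re.sub(r"%%(\w+)%%", shout, template)
# result is "Updated [LAST_UPDATED] by [MAINTAINED_NAME]"
```

Whatever string the function returns is substituted for the whole match, including the percent signs.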

There's one hitch though. According to our handy Python docs, the function takes a single match object as a parameter. That means we can't pass the dictionary we want the function to use to map the variable names. At first, that would suggest we need to use a global variable, but thanks to the friendly folks at comp.lang.python, I was shown a much cooler way to manage a unique dictionary to be used for each use of the function. This is quite necessary for scalability of this solution, or we'll find ourselves unable to concurrently do replacement with the same function. Don't use globals if you can avoid it, especially if you're playing with threads.

A bound method is a method that carries its object along with it: once we hold the method, we can call it without holding a separate reference to the object it belongs to. Behind the method call is an entire object that we cannot see, with its own namespace for additional variables and methods. This is also an effective way of creating a closure, a function which contains its own unique namespace. If we can put the dictionary that maps the template variables into such a bound method, then we can pass it to re.sub() as a replacement function taking a single parameter, and it can still reach its own reference to our dictionary. This class should fit the bill:

class Replacer:
    """This class is a utility class used to provide a bound method to the
    re.sub() function."""
    def __init__(self, dict):
        """The constructor. Its only duty is to populate itself with the
        replacement dictionary passed."""
        self.dict = dict

    def replace(self, matchobj):
        """The replacement method.
        This is passed a match object by re.sub(),
        which it uses to index the replacement
        dictionary and find the
        replacement string."""
        key = matchobj.group(1)
        if self.dict.has_key(key):
            return self.dict[key]
        else:
            return ''

So, it takes our dictionary in its constructor, and then the replace method expects the match object that re.sub() will pass to it. The replace method in our Replacer looks up the first match group as a key in our dictionary. It then returns the corresponding text in the dictionary, or the empty string if the key didn't exist. If you'd prefer it to raise an exception in that latter case, the modification should be trivial. Our replacer can be used then as follows:

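import re

# 'template' is the path to the template file and 'dict' is the
# replacement dictionary; both are assumed to be defined already.
# Open the template file for reading.
template_file = open(template, "r")
# Create our replacer bound method.
replacer = Replacer(dict).replace
# Read in the entire template.
buffer = template_file.read()
template_file.close()
# Replace all template variables in the buffer.
replaced = re.sub(r"%%(\w+)%%", replacer, buffer)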

Our replaced buffer is now the entire web page that we want to display, with all of our template variables replaced with whatever dynamic contents we want to fill them with for this CGI. If we change the basic structure of the page by changing the template, all of our CGIs will change their look and feel. If we add or remove template variables, we can change a simple dictionary in our CGIs and we're good to go.
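The closure mentioned earlier can also be written directly as a nested function, and that makes a convenient place to try the "raise an exception for unknown variables" variant. What follows is a sketch in Python 3 syntax, not the OPAG code; make_replacer and its strict flag are hypothetical names.

```python
import re


def make_replacer(mapping, strict=False):
    """Return a one-argument function suitable for re.sub()."""
    def replace(matchobj):
        # 'mapping' and 'strict' are remembered from the enclosing call,
        # so no global variable is needed.
        key = matchobj.group(1)
        if key in mapping:
            return mapping[key]
        if strict:
            raise KeyError("no value for template variable %r" % key)
        return ""
    return replace


text = "<p>%%greeting%%, %%name%%!%%unknown%%</p>"
values = {"greeting": "Hello", "name": "OPAG"}
lenient = re.sub(r"%%(\w+)%%", make_replacer(values), text)
# lenient is "<p>Hello, OPAG!</p>"; with strict=True the same call
# raises KeyError because %%unknown%% has no value.
```

Each call to make_replacer produces an independent function with its own mapping, which is the same property the bound method gives us.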

We're not quite done yet though. Python is an object-oriented language, and while we're making use of objects in our Replacer class, the responsibility of the parser demands yet another. As the purpose of the class I was implementing for OPAG was to manage OPAG CGIs, I used the wonderfully imaginative name of OpagCGI for the class responsible for CGI functionality. Every instance of a CGI will require a corresponding template, so the constructor for the class ends up being quite simple.

def __init__(self, template=site_template):
    """OpagCGI(template) -> OpagCGI object
    The class constructor, taking the path to the template to use, using
    the site template as default."""
    self.template = template
    self.template_file = None
    # Fail early if the template is missing (requires "import os").
    if not os.path.exists(self.template):
        raise OpagMissingPrecondition, \
            "%s does not exist" % self.template

Note that the default template is controlled by the site_template variable, and if the specified template doesn't exist, we raise a custom exception. I chose the exception names for readability. The classes add no new functionality; their only job is their place in the class hierarchy, which lets us tell our custom exceptions apart from the standard Python exceptions. Their declaration is quite simple.

class OpagRuntimeError(RuntimeError):
    """The purpose of this class is to act as the base class for all
    runtime errors in OPAG CGI code. More specific Exceptions should
    subclass this if they happen at runtime. We might want to get more
    specific than this in the future, and introduce subclasses for IO
    errors, type errors and such, but this will do for now."""

class OpagMissingPrecondition(OpagRuntimeError):
    """The purpose of this class is to represent all problems with
    missing preconditions in OPAG code, such as a file that is supposed
    to exist, but does not."""
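To show what that place in the hierarchy buys us, here is a sketch in Python 3 syntax. The two classes are repeated so the block stands alone; classify and the site.tmpl file name are hypothetical, used only for illustration.

```python
class OpagRuntimeError(RuntimeError):
    """Base class for runtime errors raised by OPAG CGI code."""


class OpagMissingPrecondition(OpagRuntimeError):
    """A required precondition, such as an existing file, was not met."""


def classify(exc):
    """Report which handler an exception would reach."""
    try:
        raise exc
    except OpagRuntimeError as err:
        # Reached for our own exceptions and any of their subclasses.
        return "opag: %s" % err
    except Exception as err:
        # Everything else falls through to the generic handler.
        return "other: %s" % err


ours = classify(OpagMissingPrecondition("site.tmpl does not exist"))
theirs = classify(ValueError("bad input"))
# ours is "opag: site.tmpl does not exist"; theirs is "other: bad input"
```

A CGI could use the first handler to show a friendly message and reserve the traceback page for the second.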

The only other method currently implemented is the parse() method, which receives a replacement dictionary and does the work of parsing the template. It's basically just a formalization of our code above, with the added parameter of header, which controls whether a MIME header is prefixed to the output. This last feature is necessary if we want to embed templates within templates; we only want a MIME header at the beginning of the compiled output (i.e. "Content-Type: text/html\n\n"). Our final parse() method ends up like this:

def parse(self, dict, header=TRUE):
    """parse(dict) -> string
    This method parses the template file, replacing any keys found
    using the replacement dictionary passed."""
    # Requires "import re" and "import types" at the top of the module.
    if type(dict) != types.DictType:
        raise TypeError, "Second argument must be a dict"
    if not self.template:
        raise OpagMissingPrecondition, "path not set"
    # Open the file if it's not already open.
    # If it is, seek to the beginning of the file.
    if not self.template_file:
        self.template_file = open(self.template, "r")
    else:
        self.template_file.seek(0)
    # Instantiate new bound method
    # to do the replacement.
    replacer = Replacer(dict).replace
    # Read in the entire template into memory.
    # I guess we'd better keep
    # the templates a reasonable size
    # if we're going to keep doing this.
    buffer = self.template_file.read()
    replaced = ""
    if header:
        replaced = "Content-Type: text/html\n\n"
    replaced = replaced + re.sub(r"%%(\w+)%%", replacer, buffer)
    return replaced
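The header flag is easiest to appreciate with a template nested inside another. The following is a sketch in Python 3 syntax that works on strings instead of files. The render function is a hypothetical stand-in for parse(), and the template text is made up for the example.

```python
import re


def render(template_text, values, header=True):
    """Fill %%name%% variables; optionally prefix the MIME header."""
    body = re.sub(r"%%(\w+)%%",
                  lambda m: values.get(m.group(1), ""),
                  template_text)
    if header:
        return "Content-Type: text/html\n\n" + body
    return body


# The inner fragment is rendered without a header...
inner = render("<li>%%item%%</li>", {"item": "Meeting notes"}, header=False)
# ...so that only the outer page carries one.
page = render("<ul>%%pagecontent%%</ul>", {"pagecontent": inner})
# page is "Content-Type: text/html\n\n<ul><li>Meeting notes</li></ul>"
```

Had the inner call kept its header, the Content-Type line would have appeared in the middle of the page body.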
____
 
 
Sidebar - Boolean Types in Python

While I used the values of TRUE and FALSE throughout my code, Python does not support a Boolean type, per se. I just used one of the many methods of simulating Booleans in C; I used integers.

TRUE = 1

FALSE = 0

Putting this at the top of my code allows for added readability. It's too bad Python doesn't have real constants with the usual compiler optimization gained by using them. Was that a hint to the developers?

 
____

The point behind encapsulating all this Pythony goodness into a couple of classes is, among other things, to ease our use of this library in implementing CGIs for our websites. So, if it's not easy to use, we've defeated the purpose. We'd better make sure.

First, we need the most important part of processing any template, being the replacement dictionary. Declaring dictionaries is simple:

replacement_dictionary = {
    'titleimage':       'images/title.png',
    # Double quotes, because the text itself contains an apostrophe.
    'titlealt':         "The Ottawa Python Author's Group",
    'maintained_email': 'msoulier@mcss.mcmaster.ca',
    'maintained_name':  'Michael P. Soulier',
    'pagecontent':      pagecontent
    }

For this simple example, we're only replacing five template variables, and the pagecontent variable contains the full page to be displayed inside the template (this can be generated with additional templates, or whatever means works for you). Now, we need an instance of the CGI processing class, and to output the page. As it turns out, that's only two additional lines of code.

cgiprocessor = OpagCGI()
print cgiprocessor.parse(replacement_dictionary)

That's it! All the work is handled by the class, as it should be. Note that we're using the standard site template, since we didn't pass a different one to the OpagCGI constructor.

While these techniques won't make it any easier to write the "meat" of your CGIs, they will help you format the output so that it looks like any other page on your website. Templates reduce the maintenance work of your pages greatly, allowing you to change the look and feel of your website without changing all of the code in your CGIs. They are a very important part of any web developer's arsenal.



For Further Reference:

Ottawa Python Authors Group (OPAG) http://opag.ca
A group devoted to learning, using and providing resources for the programming language Python. OPAG also serves as a general gathering place for Python programmers from the Ottawa region and beyond.

Opagcgilib Source Code http://opag.ca/resources/code/opagcgilib.py
The source code for the module outlined in this article.

Source for all OPAG Modules http://opag.ca/opag_modules.shtml

This particular article is Copyright © 2002 Mike Soulier. All Rights Reserved.
Mike Soulier is a software designer for Nortel Networks, and spends his "free time" promoting Linux, Python and other free software solutions, when he's not cycling, at the gym or going on yet another adventure in shopping with his lovely wife Maria.



Reproduction of material from any of PyZine's pages without prior written permission is strictly prohibited. Copyright 2003 - 2005 PyZine Zope/Plone hosting by Nidelven IT