PyZine
 


Issue 2 - Revision 1  /   Published in 2002 


 
Illustration by Lia Avant
Py Archive Article
Extending Python with C: Part II


- - - - - - - - - - - -

By Alex Martelli  | Originally published in Py Issue 1


The most frequent source of errors in C-coded Python extensions is the issue of reference counting, which is amply covered in the Python docs online. Python objects always live “on the heap,” i.e., you always access them via PyObject*, never directly. Nobody “owns” a Python object, but at any time there is a number of extant “references” to the object, each of which is “owned” by one of the PyObject*s pointing to that object; the object keeps track of the number of extant references to itself (its “reference count”), but you must help it do so, by using the macros Py_INCREF(x) and Py_DECREF(x) respectively to add and remove a reference you own.
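The bookkeeping described above can be watched from Python itself. The following sketch is not part of the original article: it assumes CPython (reference counts are an implementation detail) and uses current Python 3 syntax. It reports only differences between counts, because the absolute value returned by sys.getrefcount includes temporary references held by the call itself. The names Thing, obj and holder are invented for illustration.

```python
import sys

class Thing:
    """A hypothetical throwaway class, used only for this illustration."""

obj = Thing()
base = sys.getrefcount(obj)      # absolute value is not meaningful; deltas are

holder = [obj]                   # the list now owns one more reference to obj
after_add = sys.getrefcount(obj) - base

del holder                       # the list is freed and releases its reference
after_del = sys.getrefcount(obj) - base

print(after_add, after_del)
```

In C, the equivalent of the list taking and releasing its reference is exactly what Py_INCREF and Py_DECREF do by hand.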

Some Python C API functions return “borrowed” references, which you may use only for a limited span of time, during which no alteration can occur to the object you borrowed from (or you may turn a borrowed reference into one you own with Py_INCREF). You need to learn which functions behave this way (the most important group is the GetItem functions of tuples, lists, and dictionaries; the generic PySequence_GetItem, by contrast, returns a reference you own), but most functions that return objects are transferring ownership. When you pass or receive a PyObject* as an argument, in most cases it's just being borrowed: the main exceptions are the SetItem functions of tuples and lists, which take over ownership of the item passed to them (the SetItem functions of generic sequences and of dictionaries do not). When your C-coded function returns a PyObject*, this is also an ownership transfer (so you must own the reference you are returning, and must not Py_DECREF it).

All of these PyObject* pointers must be non-0, except that functions returning pointers return 0 to indicate an error (a Python exception). If you don't know whether a pointer is or isn't null, you may play it safe by using macros Py_XINCREF(x) and Py_XDECREF(x), which do check if x==0 and are innocuous in that case -- this is most often advisable, since the extra cost is typically just one machine instruction, and the alternative, when you erroneously pass a null pointer to Py_INCREF etc, is a total crash, core dump, and so on.

Most often, you want to recode some working Python function into a C-coded extension for speed purposes; therefore, C-coded extensions are often concerned with loops, lists, and so on. Let's conclude with an example of such an extension. Suppose that you have a Python function:

def mapMethod_1(methodName, sequence):
   return map(lambda x: getattr(x, methodName)(), sequence)

or, equivalently,

def mapMethod_2(methodName, sequence):
   return [getattr(x, methodName)() for x in sequence]

i.e., rather similar to a normal Python map, but needing to apply to each object one of its methods, without arguments, rather than using each object as argument to a function. You have carefully profiled your program and have determined that this is indeed an important performance bottleneck, so you want to recode it in C, looking for any speed gain you can.
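As a quick check that the two formulations really agree, here is a sketch in current Python 3, where map returns a lazy iterator and must be wrapped in list(). It is not from the original article. The third variant, built on operator.methodcaller (a later addition to the standard library), is included only as a point of comparison; the function names and sample data are made up for this example.

```python
import operator

def map_method_1(method_name, sequence):
    # In Python 3, map returns a lazy iterator, so wrap it in list().
    return list(map(lambda x: getattr(x, method_name)(), sequence))

def map_method_2(method_name, sequence):
    return [getattr(x, method_name)() for x in sequence]

def map_method_mc(method_name, sequence):
    # operator.methodcaller builds a callable that invokes the named method.
    return list(map(operator.methodcaller(method_name), sequence))

words = ["spam", "eggs", "ham"]
results = [f("upper", words)
           for f in (map_method_1, map_method_2, map_method_mc)]
print(results[0])
```

All three return the same list, so any of them can serve as the reference behavior that a C-coded replacement must reproduce.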

Make a new directory acceldir. This time, we’ll start by coding a unit-test, as suggested by the “test-first coding” approach. Since we know that reference count errors are common, we will check carefully on reference counts, via the standard Python function sys.getrefcount. In file testaccel.py, we therefore write:

import sys
import gc
gc.disable()

class wu:
    def __init__(self, st='up'):
        self.st=st
        print 'ini',self,id(self)
    def __del__(self):
        print 'del',self,id(self)
    def upper(self):
        return wu(self.st.upper())
    def __repr__(self):
        return 'wu(%s)'%self.st

import accel

a = [wu('a%d'%i) for i in range(3)]
print 'a:',
for x in a:
    print x, sys.getrefcount(x),
print
del x

b = accel.methodMap('upper', a)
print 'a:',
for x in a:
    print x, sys.getrefcount(x),
print
del x
print 'b:',
for x in b:
    print x, sys.getrefcount(x),
print
del x
del b

try:
    b = accel.methodMap('upper', a+[0])
except AttributeError:
    print 'methodMap diagnosed error, as expected:'
    print sys.exc_info()[0], sys.exc_info()[1]
else:
    print 'methodMap did not diagnose error'
    print 'b:',
    for x in b:
        print x, sys.getrefcount(x),
    print
    del x
    del b

print 'a:',
for x in a:
    print x, sys.getrefcount(x),
print
del x

del a
print 'finis'

Note that we code both a case we expect to work, and one we expect to fail in a controlled way. We are also careful to del each variable as soon as we're done with it, so that reference count updates happen at predictable points; it's easy to forget to do so for the iteration variables of for loops, like x in this example, and then spend time puzzling over why some object disappears later than we expected.
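The point about loop variables is easy to demonstrate. The sketch below is an addition in current Python 3, assuming CPython; Item, items and last are hypothetical names. It shows that after a for loop ends, the iteration variable still holds a reference to the last item until it is deleted.

```python
import sys

class Item:
    """Hypothetical placeholder class for this illustration."""

items = [Item() for _ in range(3)]
last = items[-1]
base = sys.getrefcount(last)

for x in items:
    pass

# The loop is over, but x is still bound to the last item.
while_bound = sys.getrefcount(last) - base
del x
after_del = sys.getrefcount(last) - base

print(while_bound, after_del)
```

Without the del, the extra reference would keep the last item alive and inflate its count in any later check.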

We also write a setup.py file, very similar to the previous one:

from distutils.core import setup, Extension

setup(name = "accel",
  version = "1.0",
  description = "Accelerate some operations",
  maintainer = "Alex Martelli",
  maintainer_email = "aleaxit@yahoo.com",

  ext_modules = [ Extension('accel', sources=['accel.c']) ]
)

And finally, of course, the C source file for our extension, accel.c:

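   1 #include <Python.h>
   2 
   3 static PyObject*
   4 methodMap(PyObject* self, PyObject* args)
   5 {
   6    char *methodName;
   7    PyObject *sequence, *listresult;
   8    int listlength, i;
   9 
  10    if(!PyArg_ParseTuple(args, "sO",
           &methodName, &sequence))
  11        return 0;
  12    listlength = PySequence_Length(sequence);
  13    if(listlength < 0) {
  14        PyErr_SetString(PyExc_TypeError,
        "2nd arg must be a sequence");
  15        return 0;
  16    }
  17    listresult = PyList_New(listlength);
  18    if(!listresult)
  19        return 0;
  20    for(i=0; i<listlength; i++) {
  21        PyObject *item, *result;
  22        item = PySequence_GetItem(sequence, i);
  23        if(item) {
  24            result = PyObject_CallMethod(item, methodName, 0);
  25            Py_DECREF(item);
  26        }
  27        if(!item || !result) {
  28            Py_DECREF(listresult);
  29            return 0;
  30        }
  31        PyList_SET_ITEM(listresult, i, result);
  32    }
  33    return listresult;
  34 }
     /* The listing breaks off partway through line 20 in this copy. Lines 20-34
        are reconstructed from the line-by-line description in the text below;
        the module's method table and init function are not shown. */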

I'm not going to examine all of this in detail, but it's well worth studying, particularly for the careful handling of reference ownership in the body of the for loop at lines 20-32. PySequence_GetItem transfers ownership, so item must be decref'd when it's not needed any more (I do this in line 25, right after item's only use). PyObject_CallMethod also transfers ownership of its result to us, but PyList_SET_ITEM (a rare case!) takes over ownership of the reference it is given, so there is no need to adjust the reference count of result. If any error occurs, either by PySequence_GetItem returning 0 at line 22, or by PyObject_CallMethod returning 0 at line 24, it's important to decrement the reference count of listresult, as done at line 28, before returning 0 to indicate an error (at line 29); otherwise a “memory leak” would result.
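For readers more at home in Python, the control flow of the C loop can be mirrored as below. This is only an illustrative model in current Python 3, not the article's code, and the name method_map_model is invented here. Python manages references automatically, so the comments mark the places where the C version has to do that work by hand.

```python
def method_map_model(method_name, sequence):
    """Pure-Python model of the C function's control flow (illustration only)."""
    try:
        length = len(sequence)            # C: PySequence_Length
    except TypeError:
        raise TypeError("2nd arg must be a sequence") from None

    result = [None] * length              # C: PyList_New(listlength)
    for i in range(length):
        item = sequence[i]                # C: PySequence_GetItem, we own item
        # C: PyObject_CallMethod gives us a reference to the value, which
        # PyList_SET_ITEM then takes over; item is decref'd after the call.
        result[i] = getattr(item, method_name)()
        # If either step raises, the partly filled list is simply discarded
        # here; in C that is the explicit Py_DECREF(listresult).
    return result

print(method_map_model("upper", ["a", "b"]))
```

Calling it with a list containing an object that lacks the method, such as `["a", 0]`, raises AttributeError and no partial result escapes, which is the behavior the unit test expects from the C version.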

Now, with the usual python setup.py install, we can build and install the new module accel, and run python testaccel.py, observing some output such as:

ini wu(a0) 135301868
ini wu(a1) 135399564
ini wu(a2) 135399604
a: wu(a0) 3 wu(a1) 3 wu(a2) 3
ini wu(A0) 135576260
ini wu(A1) 135574996
ini wu(A2) 135430756
a: wu(a0) 3 wu(a1) 3 wu(a2) 3
b: wu(A0) 3 wu(A1) 3 wu(A2) 3
del wu(A2) 135430756
del wu(A1) 135574996
del wu(A0) 135576260
ini wu(A0) 135576260
ini wu(A1) 135574996
ini wu(A2) 135433452
del wu(A2) 135433452
del wu(A1) 135574996
del wu(A0) 135576260
methodMap diagnosed error, as expected:
exceptions.AttributeError upper
a: wu(a0) 3 wu(a1) 3 wu(a2) 3
del wu(a2) 135399604
del wu(a1) 135399564
del wu(a0) 135301868
finis

At long last, we can check if we have indeed gained any performance from all of this work. We code a file timeaccel.py:

import time, accel, string

def mapMethod_1(methodName, sequence):
  return map(lambda x: getattr(x, methodName)(), sequence)

def mapMethod_2(methodName, sequence):
  return [getattr(x, methodName)() for x in sequence]

mapMethod_3 = accel.methodMap

bigsequence = list(string.letters)*100

def timefunc(func, sequence):
  start = time.clock()
  for i in range(100):
      result = func('upper', sequence)
  stend = time.clock()
  return stend-start

for func in mapMethod_1, mapMethod_2, mapMethod_3:
  runtime = timefunc(func, bigsequence)
  print "%12s %.2f" % (func.__name__, runtime)

This is a typical idiom for checking the running times of Python constructs, and it applies just as well to checking those of C-coded extensions, since, from Python, they're used the same way.
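The same idiom carries over to current Python with small changes. The sketch below is an adaptation, not the article's code: time.clock was removed in Python 3.8, so it uses time.perf_counter, and string.letters is replaced by string.ascii_letters. Only the pure-Python list comprehension is timed, since the accel module is assumed not to be built; the names map_method and big_sequence are hypothetical.

```python
import string
import time

def map_method(method_name, sequence):
    return [getattr(x, method_name)() for x in sequence]

def timefunc(func, sequence, repeats=100):
    # Wall-clock time for `repeats` calls of func('upper', sequence).
    start = time.perf_counter()
    for _ in range(repeats):
        func("upper", sequence)
    return time.perf_counter() - start

big_sequence = list(string.ascii_letters) * 100   # 5,200 one-character strings

runtime = timefunc(map_method, big_sequence)
print("%12s %.2f" % (map_method.__name__, runtime))
```

For finer-grained measurements the standard timeit module is another option, but the hand-rolled loop keeps the structure of the original script visible.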

So, at last, we can do our measurements, running this a couple of times:

[alex@lancelot tiny]$ python -O timeaccel.py
 mapMethod_1 1.50
 mapMethod_2 1.38
   methodMap 1.21
[alex@lancelot tiny]$ python -O timeaccel.py
 mapMethod_1 1.50
 mapMethod_2 1.39
   methodMap 1.20

Your machine could give different results, so check for yourself. Roughly speaking, however, we can see that we have made some gain, but only a small one: over 100 executions on lists of 5,200 items, 520,000 method calls in all, we save about 0.18 seconds overall compared with the list comprehension -- not quite 0.35 microseconds per method call, and only a bit more than we already gained by going from a lambda inside a map to a list comprehension. Still, this is better than nothing -- and it’s roughly typical of the gains you can expect by recoding in C some construct where most of the overhead remains the same (here, in getattr or the C equivalent PyObject_CallMethod).
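The arithmetic behind these figures can be spelled out in a few lines. This is only a worked check of the numbers quoted above, written in current Python 3; the 0.18-second figure is the approximate gain read off the timings shown, and the variable names are made up here.

```python
import string

items_per_list = len(string.ascii_letters) * 100   # 52 letters * 100
calls = items_per_list * 100                        # 100 timed executions
saved_seconds = 0.18                                # approximate gain quoted above
per_call_us = saved_seconds / calls * 1e6           # microseconds per method call

print(items_per_list, calls, round(per_call_us, 3))
```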

So, remember: while recoding some parts of your program in C may be quite some fun, don't necessarily expect huge performance gains from this: the really huge performance gains can only come from tinkering with your system's overall architecture (which is easier when the system stays in Python) or drastic changes in your data structures and algorithms (possibly including the use of some highly optimized existing library for your purposes: coding C extensions works quite well for this, but don't forget to consider SWIG as well).

This particular article is Copyright © 2002 Alex Martelli. All Rights Reserved.
Alex Martelli lives in Italy and is a System Developer for AB Strakt, Sweden. He co-edited the "Python Cookbook" and is writing the forthcoming book "Python in a Nutshell."



Reproduction of material from any of PyZine's pages without prior written permission is strictly prohibited. Copyright 2003 - 2005 PyZine Zope/Plone hosting by Nidelven IT