PyZine
 


Issue 5 - Revision 7  /   April 20, 2004 


 

 
 
 
     

Illustration by Lia Avant

Pyro
- The Fireworks Of Python Remote Objects
- - - - - - - - - - - -

By Irmen de Jong | March 16, 2004

Introduction

What started as a small hack has grown into something wonderful. This process took a long time, but it was worth it. In this technical article we talk about the use of Pyro for distributed computing, we discuss some of the more important implementation issues of Pyro and you're given a glimpse at future developments. But first, let me tell you how things came to be. If you want to dive straight into the technical details you can skip this section, but perhaps you're interested in what the story behind Pyro is.

I first learned about Python while doing an assignment during my studies at university. We had to develop a prototype of a campus information system where students could track their college rosters, assignments, grades, finals and so on. Due to the dynamic requirements we decided to build a web interface powered by CGI scripts, written in Python! I had never seen such a clean, concise and powerful scripting language before and was instantly hooked. At home I had an Amiga 4000 computer and wanted Python on that, so I created AmigaPython (a clean Python port to AmigaDOS, including some Amiga-specific extensions).

During the last year of my study I had become very interested in distributed objects and after that for my job I had been working with CORBA and ObjectSpace Voyager® (now Recursion Software, Inc.). On a rainy afternoon in the summer of 1999 I decided that I wanted to be able to do stuff like that in Python too: I wanted to be able to create mobile agent objects in Python and send them out in the world to perform their tasks, and I needed a technology for that. So I decided to write my own Python "ORB", because I would also learn about the gory details in the process. I came up with a nice name: "Pyro", from PYthon Remote Objects. Thinking that I could pull this off in a few weeks of spare-time-hacking, I never suspected that the little example (Pyro V0.1) shown to my colleagues the next week would take the next three years to grow into what Pyro is now, and still be actively used and developed.

I do the development in my spare time, when I feel like it, so it proceeds slowly perhaps (but steadily). At the time Pyro reached version 2.0, I created a project on SourceForge along with a Pyro mailing list. This boosted interest enormously. The continued interest and positive reactions of people using Pyro are very nice and keep me motivated to improve Pyro further.

If you're interested in reading more about Pyro applications and developing programs that use Pyro, David Mertz has written a good article on this for Intel Developer Connection (see the references at the end of this article). I've used a few ideas from it for this article too. Although it's from a while ago and covers an old version of Pyro, it is still a good read. For a complete reference, the Pyro manual is also available online.

DISTRIBUTED SYSTEMS

Well, what are we actually talking about? What does the "Remote Objects" part of the name "Pyro" mean? It all has to do with a concept called "distributed systems", which are computer systems that aren't built as a whole, but consist of various parts that are somehow separated from each other but work together. The "working together" part isn't straightforward. On the contrary, it's actually extremely difficult! But let us first look at the underlying principles and what it means to "distribute computing".

Distributed computing

When we talk about distributed computing, the most obvious thing being distributed is hardware resources (disk space, printer capacity). But at a more abstract level, much more interesting things can be distributed within a networked system: data, information, program logic, "objects", and ultimately, responsibilities. There are essentially three categories of resources that can be distributed:

  • Computational (hardware) resources.
    Think about CPU power (computation capacity, or free cycles), memory, disk space and perhaps there are certain nodes in your distributed system that have specialized hardware such as printers or CD-burners.
  • Informational resources.
    Some computers may have privileged access to certain data. A particular machine might be the actual originating source of data, perhaps because it is attached to an automated data collector like a scientific instrument or because it is a terminal into which users enter data. A database might be local to a privileged computer, or at least to a limited group that the machine/account belongs to. Non-privileged computers might nonetheless have reason to have access to certain data derived from the database.
  • Business logic expertise.
    Within any organization, or between organizations, certain parties (individuals, departments, and so on) have the capacity and responsibility to set the decision rules in certain domains. For example, the Payroll Department might determine (and sometimes modify) the business logic concerning sick days and bonuses. Or the Database Administrator might have the responsibility to determine the most efficient way to extract a particular datum from complex relational tables.

We need something that "glues" all the different parts together, and makes them work together. This means that the different parts must at least be able to locate each other, and talk to each other. Ultimately, at the lowest level there is a network protocol such as TCP/IP that carries the exchange of data, but on a higher level we can see that the "glue" has to have a few properties:

  • It must define a common language (or "protocol") that all parties understand
  • It must provide a means of naming and locating entities within the system
  • It must define some common rules that all parts of the system have to conform to; this tells you how to actually build the system.
  • It may provide a toolkit, framework, or even architecture on which to build the components within the system.

Pyro is a Python solution to all but the last item above. It may be a toolkit but it doesn't really force you to use a specific framework or even architecture, apart from the fact that your software will be based on distributed objects with associated services.

Python Remote Objects

Usually when you have to deal with communication between parts of your system (or components in a distributed system) you have to create explicit communication functions to do this. Wouldn't it be much simpler if you could just write your system as usual, without being concerned about which component is separated from the others and where it's located? Just write your software as if it were a single system, and after that, deal with the distribution of the components without having to change your software. For Python this means that we need a transparent, high-level, dynamic means of communicating between the various components. With minimal changes compared to a regular Python program, you'll write your distributed system.

Pyro makes this possible. It provides you with a means of enabling your Python objects to be distributed over a networked environment, without having to write tedious communication code. A Pyro distributed object (or Pyro object in short) is just a regular Python object, which can be called normally from other normal Python code, but which can be located miles away on a totally different machine. With very little extra code your Python program will automatically use Pyro to locate and access the distributed objects in your networked system. The remote object's method invocations, including handling the parameters, return values and exceptions are taken care of transparently.

THE MAGIC TRICK

One of the most interesting issues in Pyro is how it actually does the transparent object trick. A remote method call goes like this:

client program --> proxy --> protocol adapter --> network --> protocol adapter/daemon --> remote object base --> remote object

The result of the call travels the same way back to the client program. When you're using a static proxy generated by the Pyro proxy compiler, the method calls on the proxy are directly translated into the corresponding protocol adapter call, remoteInvocation, with the arguments: method name, method flags, and the list of method arguments. When you're using the dynamic proxy however (almost everybody does, because it is so convenient), Pyro has to do some trickery. The dynamic proxy intercepts the __getattr__ that Python executes to locate the particular method that is invoked. It stores the name Python was looking for in an internal list, and returns the bound method self.__invokePYRO__. Python thinks "ah, that must be the method I was looking for" and actually calls it, with the same arguments supplied to the original method! Next, __invokePYRO__ pops the method name that was stored before, and calls the protocol adapter with the correct arguments, just like the static proxy would. Below you'll find a piece of the dynamic proxy code we talked about.

# module Pyro.core
class DynamicProxy:
...
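   def __getattr__(self, name):
     if name!="__getinitargs__": # allows it to be safely pickled
       # remember which method Python was looking for...
       self._name.append(name)
       # ...and hand back the generic invoker in its place
       return self.__invokePYRO__
     raise AttributeError()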


   def __invokePYRO__(self, *vargs, **kargs):
     if not self.adapter.connected():
       self.adapter.bindToURI(self.URI)
     return self.adapter.remoteInvocation(self._name.pop(),
       constants.RIF_Varargs|constants.RIF_Keywords, vargs, kargs)

Remote attributes work much the same way (but require some extra trickery). Pyro first determines the type of the requested object member (is it an attribute, or a method?) and if it's an attribute, uses special remote functions to get or set the attribute in the remote object.

The protocol adapter's role is to convert the method call information to a message that can be sent over the wire to the other side. By default, Pyro uses the standard pickle mechanism for this, but you can choose to use XML serialization instead (using Gnosis XML pickling, or the de facto standard PyXML toolkit). XML pickling is safer, but much slower than the default pickle. The adapter sends the message via a socket connection to the adapter on the other side, which unpickles the data. It then looks up the required object in the daemon's list of connected objects, and invokes the special Pyro_dyncall method that is defined in ObjBase (that is why all Pyro objects have to be derived from ObjBase, or use delegation with ObjBase). This method calls the actual method on the object and returns the results back the way the method call came from: back to the client. Here's what Pyro_dyncall looks like:

# module Pyro.core
class ObjBase:
...
   def Pyro_dyncall(self, method, flags, args):
     # find the method in this object, and call it with the supplied args.
     keywords={}
     if flags & constants.RIF_Keywords:
       keywords=args[-1]
       args=args[:-1]
     if flags & constants.RIF_Varargs:
       args=args[:-1]+args[-1]
     if method in dir(ObjBase):
       return apply(getattr(self,method),args,keywords)
     else:
       return apply(getattr(self.delegate or self,method),args,keywords)

You can see that it reconstructs the actual method call using apply based on the method name, method flags (whether it has keyword args and so on), and the arguments themselves. An extra check has to be done to see if the required method is in ObjBase itself or in the Pyro object.
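
The round trip described above (pickle the call on one side, unpickle and rebuild it on the other) can be sketched in a few lines of present-day Python. This is only an illustration under assumptions: pack_call, dispatch and the flag values are made up for the example (Pyro keeps its real flags in Pyro.constants), the message layout is simplified, and the call is rebuilt with *args/**keywords instead of apply.

```python
import pickle

# Hypothetical flag values for this sketch only.
RIF_VARARGS = 1
RIF_KEYWORDS = 2


def pack_call(object_id, method, vargs, kwargs):
    """Client side: flatten a method call into bytes."""
    flags = RIF_VARARGS | RIF_KEYWORDS
    # Layout mirrors what Pyro_dyncall unpacks: (vargs tuple, kwargs dict).
    return pickle.dumps((object_id, method, flags, (vargs, kwargs)))


def dispatch(message, objects):
    """Server side: unpickle, look up the object, rebuild and make the call."""
    object_id, method, flags, args = pickle.loads(message)
    keywords = {}
    if flags & RIF_KEYWORDS:
        keywords = args[-1]          # last element holds the keyword dict
        args = args[:-1]
    if flags & RIF_VARARGS:
        args = args[:-1] + args[-1]  # splice the varargs tuple back in
    return getattr(objects[object_id], method)(*args, **keywords)


class Greeter:
    def greet(self, name, punctuation="."):
        return "Hello " + name + punctuation


objects = {"greeter-1": Greeter()}  # plays the daemon's object table
wire = pack_call("greeter-1", "greet", ("Irmen",), {"punctuation": "!"})
print(dispatch(wire, objects))  # Hello Irmen!
```

The sketch also shows why the choice of serializer matters: unpickling data from an untrusted peer can execute arbitrary code, so a dispatcher like this should only ever be fed messages from parties you trust.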

WHAT'S IN A NAME?

Another very important concept in Pyro is the way Pyro objects are identified. There are several different parts in this puzzle: names, URIs, GUIDs and the thing that stores them for you: the Name Server.

Usually you assign a human-readable name to your Pyro objects. This is the name you supply to the connect call when you connect your object to the daemon. People can read it and it gives them a hint of what the object does or what its responsibilities are. It is essentially a convenient label to distinguish the various Pyro objects in your system, but you need more than the name to find an object: you also need its location.

That's what the URI (Uniform Resource Identifier) is for. This is a special string that contains information on the protocol, the network location, and the object identification, like this:

   PYRO://192.168.1.50:9090/c0a801320630667f416d4c8dfdb4796e

"What is that big number, and where is my object name?" you might ask. Well, your human-readable object name is not used in URIs, and the big number is what is used instead: the Pyro object GUID.

The GUID (Globally Unique IDentifier) is generated by Pyro and you will probably never have to deal with it directly. But it is very important: it uniquely identifies every Pyro object that exists, everywhere, anytime (your own readable object names are not guaranteed to be unique). Every Pyro object has a GUID. It is actually a big hexadecimal number constructed from various sources to achieve uniqueness (well, almost. Read the source in the Pyro.util module for more info on this).
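
As a rough illustration of what such a URI carries, the following sketch splits the example URI into its parts. The parse_pyro_uri helper and the PyroURI tuple are hypothetical names for this article, not Pyro's own parser, and the sketch assumes the host:port/guid layout shown above with an explicit port.

```python
from collections import namedtuple

PyroURI = namedtuple("PyroURI", "protocol host port object_id")


def parse_pyro_uri(uri):
    """Split 'PYRO://host:port/guid' into its parts (simplified sketch)."""
    protocol, sep, rest = uri.partition("://")
    if not sep:
        raise ValueError("not a URI: %r" % uri)
    location, _, object_id = rest.partition("/")
    host, _, port = location.partition(":")
    int(object_id, 16)  # raises ValueError if the GUID is not hexadecimal
    return PyroURI(protocol, host, int(port), object_id)


uri = parse_pyro_uri("PYRO://192.168.1.50:9090/c0a801320630667f416d4c8dfdb4796e")
print(uri.host, uri.port)              # 192.168.1.50 9090
print(len(uri.object_id) * 4, "bits")  # 128 bits

# Observation about this one example only: the first eight hex digits
# decode to the same address that appears in the URI.
print(".".join(str(b) for b in bytes.fromhex(uri.object_id[:8])))  # 192.168.1.50
```

That last line suggests the network address is one of the "various sources" that go into the number, but the source in Pyro.util remains the authority on how a GUID is really put together.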

Now what about your object names then? What ties everything together? The answer is the Name Server. This is a registry of object names and their associated URIs. It is much like a regular telephone book: you search for a name and you'll find the details on how to contact the particular person (the phone number). The Name Server can even group various objects together because it supports a hierarchical namespace (like the folder and file tree structure on your hard disk). Usually Pyro server objects are registered in the Name Server, and Pyro clients look in the Name Server to get the URI for a particular named object. The URI can then be used to contact that object because it contains all information needed to locate and access that unique object in the system. Interestingly enough, the Name Server is not strictly necessary: it is also possible to contact an object directly, but then you must know its URI beforehand. The Name Server is just a convenient place to store and find names and URIs. Pyro goes to great lengths to make using the Name Server as simple as possible, for example through automatic discovery. Also see the next section for another example.
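
The telephone-book idea fits in a few lines. The sketch below is a toy, in-memory registry and not the real Name Server: ToyNameServer and its methods are hypothetical, the URIs are made-up placeholders, and it assumes dotted names under a ":" root such as ":Test.MyObject".

```python
class ToyNameServer:
    """In-memory name -> URI registry with dotted, hierarchical names."""

    def __init__(self):
        self._entries = {}

    def register(self, name, uri):
        if name in self._entries:
            raise KeyError("name already registered: " + name)
        self._entries[name] = uri

    def resolve(self, name):
        # Raises KeyError for unknown names, like a failed lookup.
        return self._entries[name]

    def list_group(self, group):
        """Return all names below a group, sorted."""
        prefix = group + "."
        return sorted(n for n in self._entries if n.startswith(prefix))


ns = ToyNameServer()
ns.register(":Test.MyObject", "PYRO://host-a:9090/guid-1")  # placeholder URIs
ns.register(":Test.Other", "PYRO://host-a:9090/guid-2")
ns.register(":Prod.jokegen", "PYRO://host-b:9090/guid-3")

print(ns.resolve(":Prod.jokegen"))  # PYRO://host-b:9090/guid-3
print(ns.list_group(":Test"))       # [':Test.MyObject', ':Test.Other']
```

A client that already holds the URI can skip resolve entirely, which is exactly why the Name Server is a convenience rather than a requirement.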

EXAMPLE

Well, we've talked a lot about the development of Pyro and what it's good for, but how does it work in practice? Let us look at a very simple example now. You will see that it is strikingly easy to use Pyro in your applications. The example is taken from the Pyro manual.

First, let us write the server part. That is: a Python object that we will access remotely (thus, making it a Pyro object).

   import Pyro.core
   import Pyro.naming

   class JokeGen(Pyro.core.ObjBase):
     def __init__(self):
       Pyro.core.ObjBase.__init__(self)
     def joke(self, name):
       return "Sorry "+name+", I don't know any jokes."

   Pyro.core.initServer()
   ns=Pyro.naming.NameServerLocator().getNS()
   daemon=Pyro.core.Daemon()
   daemon.useNameServer(ns)
   uri=daemon.connect(JokeGen(),"jokegen")
   daemon.requestLoop()

JokeGen is the Pyro object (with a single method "joke") and you can see that after some initialization an instance is created with the name "jokegen". After that we just tell Pyro to sit and wait for any remote calls.

To start our server, first launch Pyro's Name Server (ns). After it's running we can start the server program.

The client program that we use to access the joke server is even simpler:

   import Pyro.core

   Pyro.core.initClient()
   jokes = Pyro.core.getProxyForURI("PYRONAME://jokegen")
   print jokes.joke("Irmen")

We request the Pyro object with the name "jokegen" (it doesn't matter where it's running, Pyro will locate it for us). After that we can just call the joke method as if the object were a regular Python object (but it isn't: it may be running on a different machine). Sadly, our server doesn't know any jokes, as you will see by running the client program.

Many more examples are available in the Pyro distribution itself. Some are very simple, others show quite complex aspects of Pyro, but all of them are useful to look at to see how things can be done.

INTERESTING PYROTECHNICS

Pyro also provides an Event Server. It is a high-level communication mechanism, based on events, topics, publishers and subscribers. The subscribers are fully unaware of the publishers, and the other way round. This allows for very loose coupling. Events are published on certain channels (that have a topic) and any subscribers to these topics will receive these events. It is extremely easy to build an IRC-like chat system with the Event Server. It may be just what you need when you want this kind of communication and loose coupling between your components.
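
The publish/subscribe pattern itself can be shown without any networking. The following is an in-process sketch of the idea only; ToyEventServer, its methods and the topic names are hypothetical and do not reflect the Event Server's actual API.

```python
class ToyEventServer:
    """Routes published messages to whoever subscribed to the topic."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        # Publishers never see the subscribers; they only name a topic.
        callbacks = list(self._subscribers.get(topic, ()))
        for callback in callbacks:
            callback(topic, message)
        return len(callbacks)


server = ToyEventServer()
alice_inbox, bob_inbox = [], []
server.subscribe("chat.python", lambda topic, msg: alice_inbox.append(msg))
server.subscribe("chat.python", lambda topic, msg: bob_inbox.append(msg))
server.subscribe("chat.amiga", lambda topic, msg: bob_inbox.append(msg))

server.publish("chat.python", "hello")
server.publish("chat.amiga", "anyone here?")
print(alice_inbox)  # ['hello']
print(bob_inbox)    # ['hello', 'anyone here?']
```

Publishing to a topic nobody listens to is not an error here; the message simply reaches no one, which is part of what keeps the two sides decoupled.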

For very convenient object access, two special URI formats are available. Inspired by a similar development in CORBA, you can connect to the object of your choice like so:

   PYRONAME://nshostname:port/objectname

This URI can be used to automatically look up an object in the Name Server. For instance, Pyro.core.getProxyForURI('PYRONAME://:Test.MyObject') gets you a proxy for the object named ":Test.MyObject"; no explicit Name Server lookup is required.

   PYROLOC://hostname:port/objectname

This URI can be used to directly connect to an object on a specified host (for which you don't have a URI), bypassing the Name Server.
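
To make the difference between the two formats concrete, here is a small sketch that takes them apart. It is an assumption about how such strings can be read, not Pyro's own parser: parse_lookup_uri is a hypothetical helper, and the host names and port in the examples are made up.

```python
def parse_lookup_uri(uri, default_port=None):
    """Split PYRONAME:// or PYROLOC:// URIs into (kind, host, port, name)."""
    scheme, sep, rest = uri.partition("://")
    scheme = scheme.upper()
    if not sep or scheme not in ("PYRONAME", "PYROLOC"):
        raise ValueError("unsupported URI: %r" % uri)
    location, slash, name = rest.partition("/")
    if not slash:
        # No host part at all, as in PYRONAME://jokegen
        location, name = "", rest
    host, _, port = location.partition(":")
    return scheme, host or None, int(port) if port else default_port, name


def needs_name_server(uri):
    """PYRONAME goes through the Name Server, PYROLOC bypasses it."""
    return parse_lookup_uri(uri)[0] == "PYRONAME"


print(parse_lookup_uri("PYRONAME://jokegen"))
# ('PYRONAME', None, None, 'jokegen')
print(parse_lookup_uri("PYRONAME://:Test.MyObject"))
# ('PYRONAME', None, None, ':Test.MyObject')
print(parse_lookup_uri("PYROLOC://somehost:9090/jokegen"))
# ('PYROLOC', 'somehost', 9090, 'jokegen')
```

In the first two cases no host is given, so the Name Server itself still has to be found first; in the last case everything needed to reach the object's daemon is in the string.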

Pyro also contains a flexible, extensible authentication mechanism that you can adapt to almost any authentication logic that your specific situation requires. For security, communication over SSL is supported.

The feedback of the Pyro community has been (and still is) very important: useful feature requests will always be considered and bug reports are very valuable to improve the quality of the software.

APPLICATIONS

Pyro is used by many people. Many of them use it for testing purposes or small-scale projects, but there are a few really interesting applications. You can read about them in this section.

Racemi: Their product DynaCenter manages server images, and manages the systems that use the images. They use Pyro as the core for their distributed service agent architecture. All of the load management and server image management is handled via classes that are running in a Pyro server. Pyro allows them to easily implement fault tolerance by having multiple servers available in the event that one fails to respond. It also allows them to implement their load management architecture in a platform-independent way via Python, given that Pyro is pure Python.

Cenix is a young biotech company located in Dresden, Germany. They have developed a Laboratory Information Management System (LIMS) over the last two years, which they use to collect data from microscopic experiments in the lab, and to put everything into a big database. The main goal is to identify genes involved in cell division, so that they can inhibit them without heavy side effects for the rest of the organism. In the system Pyro is used to connect clients on more and more workstations (Mac, Linux and Windows machines) to a central server. They've developed a library on top of Pyro with very high-level objects and factories for remote data access with a focus on high availability. It works really well, and they found the simplicity and elegance of Pyro especially convincing.

Pyro also played a key infrastructure role in the processing of GOES-12 SXI (Solar X-ray Imager) data, which is freely available to the public. GOES-12 is a satellite in orbit around the earth. It has a camera that photographs the Sun in the X-ray part of the spectrum. As images are captured by the satellite camera, they are relayed to an antenna. The unprocessed satellite images are then processed in various ways, archived, and made accessible on two web sites in as close to real time as possible (within one minute of receipt from the satellite). Some components of this processing/archiving/dissemination system were written in Python. Each of these Python components was a separate process. Pyro provided the communication mechanism for these (distributed) components, which are producers and/or consumers of Pyro events (e.g., the arrival of new data). Pyro was not used to transport products (which typically range from 0.5 MB to 1 MB), primarily because some non-Python components were still file-based. The actual system has since moved to Spread, because of high performance and multi-language requirements.

And for one of the largest banks in Spain, Pyro is at the center of a distributed credit derivatives system consisting of a dozen financial components running on a Sun Enterprise 10000 server, and about twenty desktop workstations. Pyro connects the financial components on the servers with each other and also with the desktop applications on the workstations. Pyro is also used to connect calculation worksheets in Excel to the servers, using a COM proxy written in Python that makes it possible to speak Pyro from Excel macros. This is a complex project, involving 150,000 lines of code in C++, Python and VB, and Pyro is the glue that holds it all together. They have chosen Pyro because of its performance (it beats comparable solutions in terms of speed and ease of use), its open-source character (which helps in debugging), and the rapid development that Pyro allows them to do (new server methods are available to everyone straight away, no need for proxies, compilation, ...). Pyro's Event Server is used to publish financial data and it has been running stably for months without problems. Carlos Cabrera (developer) loves Pyro's simple design, the nice documentation and the examples. He thinks configuring commercial middleware can sometimes be overwhelming, and that Pyro makes things easy. (Due to marketing regulations I cannot make the name of the bank public, sorry.)

CONCLUSION

Pyro has come a long way since the first version that I showed to my colleagues in 1999. A lot of bugs have been removed and many exciting features have been added, many of them thanks to the support and interest of many users and contributors. Without them Pyro would probably have withered a long time ago. Looking at the current stable version (3.3 at the time of writing) I'm still wondering how such a small library (about 5800 lines, 190 kilobytes of source code) can be so powerful and enable Python to shine at developing distributed object systems. It's very nice to see that some people have put Pyro to excellent use within their projects or companies. I hope that this interest in Pyro will stay and that Pyro can continue to evolve and improve in the future, so that it may become your platform of choice, too, for developing distributed Python programs.

For further reference:

AmigaPython: http://irmen.razorvine.net/amigapython/

CORBA: http://www.corba.org

Voyager ORB: http://www.recursionsw.com/products/voyager/orbpro.asp

Pyro project page on SourceForge: http://www.sourceforge.net/projects/pyro

Pyro mailing list: http://lists.sourceforge.net/lists/listinfo/pyro-core

xml.pickle from Gnosis XML utils: http://freshmeat.net/projects/gnosisxml/

Racemi DynaCenter: http://www.racemi.com

Cenix BioScience GmbH: http://www.cenix-bioscience.com

Space Environment Center, GOES Solar X-ray Imager: http://www.sec.noaa.gov/sxi/

Logilab's Narval: http://www.logilab.org/narval/

David Mertz, "Introduction to Python Remote Objects (Pyro)", article for Intel Developer Connection (PDF)

Pyro manual: http://pyro.sourceforge.net/manual/PyroManual.html


Irmen de Jong

Irmen de Jong, age 29, lives in the Netherlands. Having been introduced to Python during his Computer Science studies, he now uses Python for many hobby projects and occasionally on the job.



Reproduction of material from any of PyZine's pages without prior written permission is strictly prohibited. Copyright 2003 - 2005 PyZine Zope/Plone hosting by Nidelven IT