Issue 4 - Revision 2 / January 28, 2003
What's new in Python 2.3

By Andrew M. Kuchling | published November 17, 2003

Abstract

Every Python release has had a different complexion, ranging from radical to conservative. For example, Python 2.0 was a radical release: it added Unicode, string methods, cyclic garbage collection, and new syntax (print >>, f(*args, **keyworddict)). Python 2.1 was middle-of-the-road: it had one radical change, the introduction of nested scopes, and a number of less noticeable changes and new modules. Python 2.2 was another radical release, adding new-style classes alongside the existing object model and new language features, most notably generators and iterators.

Python 2.3 should be released some time during the summer of 2003; the second alpha is already available. So what's Python 2.3 like? It leans more toward the conservative side, emphasizing useful new modules more than changes to the Python language. There certainly are some changes to the language itself, but there aren't many of them, and none of them are major.

Generators

Generators are now fully part of Python. They were first introduced in 2.2, where a from __future__ import generators directive was required to enable the yield keyword needed to use them. In 2.3 yield is always a keyword; no __future__ directive is required.

In Python, C, and most other languages, a stack frame is created to hold the local variables whenever a function is entered. This stack frame is destroyed when the function is exited, whether it's by hitting a return statement or falling off the end of the function. Generators change this model by not destroying the local stack frame, instead keeping it around so it can be reused. Consider the simplest generator function:
def g():
    for i in range(4):
        yield i + 1
Calling g() creates an instance of the generator and returns it. The instance has its own stack frame for local variables, so it has its own private value for the variable i. The returned generator behaves like an iterator.

Let's detour momentarily for a quick refresher on iterators, another feature introduced in Python 2.2. To be an iterator over some sequence, an object simply needs a next() method that returns the next item in the sequence and raises the StopIteration exception when there are no more items left. As of Python 2.2 the for statement always uses an iterator; default iterators are provided for lists and strings, so existing code -- for example, for i in [1,2,3] -- continues to work.

When the generator's next() method is called, execution of the body begins and continues until it hits a yield statement. yield i + 1 evaluates the expression i + 1, and the resulting value (1 on the first call) is returned by next(). The subsequent call to next() picks up after the yield, so the for loop goes around again and this time i + 1 evaluates to 2. And so it goes, until the loop is over and the subsequent next() call falls off the bottom of the generator, causing StopIteration to be raised.

Generators are most useful when you'll be iterating over a very large collection of elements, a collection so large that you don't want it to be created as an actual list in memory. For example, if you wanted to loop over all of the files on a disk, the resulting list might well be too large to fit in memory. A generator can return filenames one-by-one, reducing the memory required for such a traversal.
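As a rough illustration of the generator protocol described in this section, the sketch below drives g() by hand until StopIteration is raised, and then uses a second generator to walk a directory tree lazily. The helper names drain() and iter_files() are made up for this example, and the getattr() fallback is an assumption added so the code also runs on later Pythons, which spell the method __next__() rather than next().

```python
import os


def g():
    # The generator from the article: yields 1, 2, 3, 4.
    for i in range(4):
        yield i + 1


def drain(iterator):
    """Step an iterator by hand, the way a for loop does internally."""
    # Python 2.3 calls the method next(); later versions renamed it
    # __next__(). Looking up both is an assumption made for portability.
    step = getattr(iterator, "next", None)
    if step is None:
        step = iterator.__next__
    items = []
    while True:
        try:
            items.append(step())   # runs the body up to the next yield
        except StopIteration:      # raised when the body falls off the end
            break
    return items


def iter_files(root):
    """Yield one file path at a time instead of building a full list."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


print(drain(g()))  # [1, 2, 3, 4]
```

A caller would normally just write a for loop over iter_files(root); the traversal then only needs one directory's listing in memory at a time rather than the whole tree.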
Booleans

The most visible change in Python 2.3 is probably the addition of Booleans. The Boolean is the only new builtin type, though a number of other data types -- including date/time types, sets, and heaps -- were added to the standard library.

The Python 2.2.1 bugfix release prepared the way for the Boolean type by adding True and False as builtin names. In 2.2.1 they're just integers, as if you'd assigned:
True = 1
False = 0
In 2.2.1, if you print the value of True, you just get 1. Python 2.3 keeps the names True and False, but they're now unique instances of a new Boolean type, bool.

Most of Python 2.3's operators, builtin functions, and library modules have been changed to return True or False where appropriate; for example, the in operator and the isinstance() builtin now return True and False.

A Builtin Sequence Enumerator

A new builtin function, enumerate(), was added to simplify the common Python idiom for looping over a list and doing something to each entry. The idiomatic code usually looked something like:
for i in range(len(L)):
    item = L[i]
    ...
    L[i] = item
You could also use a list comprehension or a map() call, but the idiomatic for loop was often easier to read, especially when its body was longer than a line or two. The range(len(L)) is the most inelegant part of the idiom; sometimes I typed range(L), which is incorrect.

enumerate() makes this idiom a bit tidier. enumerate(iterator) returns an iterator that produces the pairs (0, iterator[0]), (1, iterator[1]), ..., (N, iterator[N]). The idiomatic loop can now be rewritten as:
for i, item in enumerate(L):
    ...
    L[i] = item
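To make the comparison concrete, here is a small sketch that fills in the elided loop body with an arbitrary operation (upper-casing strings, chosen only for illustration) and runs both versions side by side. The function names are hypothetical.

```python
def upper_old(L):
    # Pre-2.3 idiom: build the indexes, then fetch each item by hand.
    for i in range(len(L)):
        item = L[i]
        L[i] = item.upper()


def upper_new(L):
    # With enumerate(), the index and the item arrive together.
    for i, item in enumerate(L):
        L[i] = item.upper()


a = ["spam", "eggs", "ham"]
b = list(a)
upper_old(a)
upper_new(b)
print(a == b)                  # True
print(list(enumerate("abc")))  # [(0, 'a'), (1, 'b'), (2, 'c')]
```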
New Modules
The bulk of the additions in Python 2.3 are handy new modules for one task or another. The most generally useful new module may be the logging package, a set of flexible and highly customizable classes for recording log messages from a program's various subsystems. For the simplest uses, you can just import the logging module and call the right function:
import logging
logging.debug("Starting program")
logging.warning("Config file %s not found", "/etc/application.conf")
logging.critical("Disk full")
More complex software may be divided into subsystems. For example, a program for data analysis might have a user interface subsystem that displays a GUI, a network subsystem that handles retrieving data from remote servers, and a computational component that does some work with the data. Each subsystem can have its own log, a separation which lets you look at debugging messages for the computational subsystem without drowning in messages from the network component. To implement this you just have to retrieve a particular log with the getLogger() function:
net_log = logging.getLogger("network") # subsystem name
net_log.debug("Starting DNS lookup")
comp_log = logging.getLogger("compute")
comp_log.error("Matrix is not diagonalizable")
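The benefit of separate logs shows up once each one gets its own threshold. The following is a minimal sketch, written for a current Python 3 interpreter rather than 2.3; the in-memory stream, the format string, and the extra log messages are choices made for this example and do not come from the article.

```python
import io
import logging

# Collect output in memory so the result is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))

net_log = logging.getLogger("network")
comp_log = logging.getLogger("compute")
for log in (net_log, comp_log):
    log.addHandler(handler)
    log.propagate = False      # keep records away from the root logger

net_log.setLevel(logging.WARNING)   # quiet subsystem
comp_log.setLevel(logging.DEBUG)    # subsystem being debugged

net_log.debug("Starting DNS lookup")            # below WARNING: dropped
net_log.warning("DNS lookup timed out")
comp_log.debug("Building matrix")
comp_log.error("Matrix is not diagonalizable")

print(stream.getvalue())
```

Only the network warning and the two compute messages reach the stream; the network debug message is filtered out by that logger's level.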
There are also hooks for implementing custom handlers and log records. With a bit of work you can build a logging scheme closely tailored to your application and your debugging needs.

Dates and Times

Several types were added to represent times, all contained in the datetime module. There's a date class, instances of which have year, month, and day attributes; a time class with hour, minute, second, and microsecond attributes; and a datetime stamp that has all of them. These classes are an upgrade from the functionality of the 9-tuples used by the time module, but they are a step below mxDateTime, the most common extension used for date handling. datetime's types are easier to work with than 9-tuples, but mx.DateTime also has functions for parsing strings in various date and time formats, as well as support for dates in the distant past or future. If you're already using mxDateTime, there's not much reason to switch to the new 2.3 types.

A new sets module contains two data types for representing mathematical sets, that is, unordered collections of elements with no duplicates. It's always been possible to use dictionaries to get the semantics of a set, but doing so meant you had to implement the intersection and union operations yourself, a simple but annoying task. sets contains two set classes: Set, whose instances can have elements added and removed at any time, and ImmutableSet, whose instances can't be modified and can therefore be used as elements of other sets.

Using the set classes is straightforward. The constructors can take any Python sequence to populate the set, and then you can perform intersection and union operations on sets.
>>> import sets
>>> s = sets.Set('abc')
>>> s
Set(['a', 'c', 'b'])
>>> s2 = sets.Set(['c', 'd', 'e'])
>>> s.intersection(s2)
Set(['c'])
Instances of the mutable Set class can also be updated in place:
>>> s.union_update(s2)
>>> s
Set(['a', 'c', 'b', 'e', 'd'])
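For comparison, the dictionary-based approach that the sets module replaces might look roughly like the sketch below. The helper names make_set(), intersection(), and union() are hypothetical; they only illustrate the kind of code you previously had to write yourself.

```python
def make_set(items):
    # Use dictionary keys as the set's elements; the values are ignored.
    d = {}
    for item in items:
        d[item] = 1
    return d


def intersection(a, b):
    # Keep only the keys that appear in both dictionaries.
    result = {}
    for key in a:
        if key in b:
            result[key] = 1
    return result


def union(a, b):
    # Copy one dictionary, then merge in the other's keys.
    result = a.copy()
    result.update(b)
    return result


s = make_set("abc")
s2 = make_set(["c", "d", "e"])
print(sorted(intersection(s, s2)))  # ['c']
print(sorted(union(s, s2)))         # ['a', 'b', 'c', 'd', 'e']
```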
Having sets in the Python standard library isn't a huge leap forward, but it is a pleasant convenience.

bsddb

The code for the old bsddb module has been replaced by the third-party PyBSDDB package. A compatibility interface is provided so that programs using bsddb will continue to work; the new interfaces provide access to the transactional features in current versions of BerkeleyDB. This is probably my favorite enhancement because all of the database modules previously included with Python were rather feeble. Thus, Python 2.3 will be a significant leap forward, making it possible to write fancier, more robust data-handling applications with an out-of-the-box installation of Python.

Miscellany

Other notable new modules include:
One significant module has effectively been removed: rexec is still present but now always raises an error on being imported. It's been unceremoniously disabled because it has been essentially unmaintained for the last few versions. Occasionally bugs would be found and someone would fix them, but no one has been carefully updating the module in light of recent changes to Python, so no one is quite sure if it's still safe. Python 2.2, for example, changed the object model in many ways -- it made types callable, among other things -- and possibly introduced new ways to break out of restricted execution. The safest and most honest course was to disable rexec.

Other New Things

Most of the other new features are less significant, though some of them are awfully cute:
There are other language-level changes, most of them so esoteric that the majority of Python programmers won't notice them.
Reproduction of material from any of PyZine's pages without prior written permission is strictly prohibited. Copyright 2003 - 2005 PyZine