|
|
||||||||||||||||||
|
|
||||||||||||||||||
![]() |
![]() |
Issue 1 - Revision 1 / Published in 2002
|
|||
|
Scientific Python - Py in Science - - - - - - - - - - - - By Eric Jones | Originally published in Py Issue 1 The Numeric module adds multi-dimensional array types to Python and forms the heart of its scientific computing capabilities. Numeric is the platform upon which higher-level libraries such as Scientific and SciPy are built. Its usefulness extends beyond the scientific community. Anyone crunching numbers with Python can benefit from its simplicity and speed. Numeric was originally developed by Jim Hugunin, also of Jython fame and is currently maintained by Paul Dubois of Lawrence Livermore National Labs along with a loose team of developers. Numeric is hosted on Sourceforge and distributed in multiple formats including source tar balls, click-install MS Windows installers, and RPM’s for both Python 2.1 and Python 2.2. The fundamental idea behind Numeric is that of array arithmetic. To illustrate, consider two lists of numbers, a and b. In standard Python, the individual elements in the lists might be added together using the following loop: >>> a = [1,2,3] >>> b = [3,4,5] >>> c = [0,0,0] >>> for i in range(len(a)): ... c[i] = a[i] + b[i] >>> print c [4, 6, 8] Using Numeric arrays, the same operation reduces to adding a and b together. >>> from Numeric import * >>> a = array([1,2,3]) >>> b = array([3,4,5]) >>> c = a + b >>> print c [4 6 8] Other standard numeric operators are supported, as well, and are always done on an element-by-element basis. Numeric supplies functions for operations such as matrix-vector and matrix-matrix multiplies. The benefits of array operations are more than aesthetic. There is also a significant speed benefit. In the first example, the looping is done in Python code and, even though all the elements of a and b are integers, the interpreter doesn’t have this information and spends time discovering element types to issue the proper addition operation. In contrast, the loop executed under the covers of c = a + b is done in C and has the benefit of knowing that all the operands are integers. In most cases, the speed improvement is one or two orders of magnitude. For this simple example, my windows laptop provides a factor of 50 improvement for 10000 element sequences. Note that using Numeric arrays as a straight replacements for lists in the first example will not produce any speed improvements. The entire overhead is in the interpreter, not the data structure. You must use array arithmetic to see speed improvements. Numeric also provides array equivalents for most of the functions in the standard math library so that expressions with trigonometric functions (as well as others) are possible: >>> from Numeric import * >>> a = array([0,pi/4,pi/2]) >>> c = sin(a)**2 + cos(a)**2 >>> print c array([ 1., 1., 1.]) Multi-dimensional Arrays The array constructor function takes a sequence as its first argument. If it is a sequence of numbers, as in the previous examples, a one-dimensional array is created. If it is a sequence of sequences, a multi-dimensional array is created. Here, a two dimensional array is created: >>> from Numeric import * >>> a = array( [[ 0, 1, 2], ... [10, 20, 30]]) >>> a array([[ 0, 1, 2], [10, 20, 30]]) Array arithmetic works identically on multi-dimensional arrays as it does on one-dimensional arrays. Numeric Types in ArraysUsually all the values of an array have the same numerical type. During array creation, the list of numbers is scanned to determine the appropriate numerical type, and the typecode for the array is set: >>> from Numeric import * >>> # Integer (Long) Arrays >>> a = array([1,2,3]) >>> print a, '--> typecode:', a.typecode() [1 2 3] --> typecode: l >>> #Floating point (double precision) array >>> a = array([1,2,3.0]) >>> print a, '--> typecode:', a.typecode() [ 1. 2. 3.] --> typecode: d >>> # Complex ( double precision) array >>> a = array([1,2,3+2j]) >>> print a, '--> typecode:', a.typecode() [ 1.+0.j 2.+0.j 3.+2.j] --> typecode: D By default, Numeric uses the double precision versions of floating point and complex numbers because these are the types that Python itself supports. However, many other numeric types are also supported. To specify explicitly that an array is single precision, the array constructor uses keyword arguments: >>> from Numeric import * >>> a = array([1,2,3],typecode=Float32) >>> print a, '--> typecode:', a.typecode() [ 1. 2. 3.] --> typecode: f The available types are as follows:
Note that there are more types than a machine can actually handle. For example, it is impossible to represent a floating point number with 0 bits which is exactly what Float0 asks Numeric to do. Float0 is still a useful identifier to have, however, because it tells Numeric to use the smallest floating point representation available on the host machine for the given array. Indexing and Slicing ArraysArray indexing follows the conventions used in other Python sequences, allowing both positive and negative values. >>> from Numeric import * >>> a = array([1,2,3]) >>> a[0] 1 >>> a[-1] 3 Array slicing is also similar to normal sequence slicing: >>> from Numeric import * >>> a = array([1,2,3,4,5,6]) >>> a[2:-2] array([3, 4]) >>> a[:3] array([1, 2, 3]) >>> a[3:] array([4, 5, 6]) Slicing syntax for arrays also has some additions. Numeric array slicing has a third field to specify the “stride” between elements in a slice. The following extracts the even elements in an array: >>> from Numeric import * >>> a = array([0,1,2,3,4,5]) >>> a[::2] array([0, 2, 4]) Here the ‘start’ and ‘stop’ fields have been omitted and default to the beginning and end of the array respectively. Multi-dimensional arrays are indexed by specifying multiple slice arguments separated by commas:
>>> from Numeric import *
>>> a = array([[ 0, 1, 2, 3, 4],
... [10,11,12,13,14],
... [20,21,22,23,24]])
>>> a[:2,1:3]
array([[ 1, 2],
[11, 12]])
The first slice argument specifies the rows and the second specifies the columns of the resulting slice. Array slices also vary from other sequence slices in an important way. Slicing standard sequences always produces a copy of the original slice. Changes to the newly created sequence do not affect values in the original sequence. In contrast, array slices are references or views into the original array. Slicing a Numeric array and assigning it to a variable will result in two variables referencing the same memory. The following example illustrates this. The first set of commands demonstrates slicing a list: >>> a = [0,1,2,3,4,5] >>> b = a[1:3] >>> b [1, 2] # Now assign a new value to an element in b. >>> b[0] = 10 >>> b [10, 2] # The original list, a, remains unchanged. >>> a [0, 1, 2, 3, 4, 5] Now try the same commands using Numeric arrays: >>> a = array([0,1,2,3,4,5]) >>> b = a[1:3] >>> b array([1, 2]) # Now assign a new value to an element in b. >>> b[0] = 10 >>> b array([10, 2]) # The original array has also changed! >>> a array([ 0, 10, 2, 3, 4, 5]) Numeric’s “slices as references” behavior is sometimes questioned, and it does require some programming care to prevent unexpectedly overwriting data. But the feature is also very important for efficient numeric computations. Arrays often contain many millions of elements. Making a copy every time a slice is made would be computationally expensive and could lead to “out of memory” errors in cases where arrays fill most of the available memory. Consider a simple finite difference calculation on a two dimensional grid: >>> b = (a[1:,1:-1] – a[:-1:1:-1]) / dx If slices created copies, this calculation would require approximately three times the memory used by the original a matrix. In cases where a copy is preferable, the copy method will do the trick: >>> a = array([0,1,2,3,4,5]) >>> b = a[1:3].copy() >>> b[0] = 10 >>> a array([0, 1, 2, 3, 4, 5]) It’s worth noting that arrays can have typecode = PyObject and hold arbitrary collections of Python objects. The augmented slicing or referencing behavior can be beneficial outside of Numeric computations. Also, if the objects in the arrays have Numeric operators defined for them, array arithmetic will produce expected results:
>>> a =
… array(('abc',[1,2,3,4]),typecode=PyObject)
>>> a
array([abc , [1, 2, 3, 4] ],'O')
>>> a+a
array([abcabc , [1, 2, 3, 4, 1, 2, 3, 4] ],'O')
Conclusion
This short introduction has covered array construction, typecodes, indexing, and slicing of Numeric arrays. Numeric has a number of other features, methods, and modules for working with arrays. In the next column, I will discuss array broadcasting and several of the methods for selecting array elements based on some criteria or condition. Future columns will discuss optimization and efficiency issues as well as tools for accelerating computations such as the weave module. For further Reference:Numeric http://sourceforge.net/projects/numpy/ SciPy http://www.scipy.org Scientific http://starship.python.net/crew/hinsen/scientific.html Weave http://www.scipy.org/site_content/weave
|
|||||||||||||||||||||||||||||||||||||||||||||
|
Py is committed to bringing you great Python Articles. | ||||||||||||||||||||||||||||||||||||||||||||||
![]() |
Reproduction of material from any of PyZine's pages without prior written permission is strictly prohibited. Copyright 2003 - 2005 PyZine |
|