Sunday, March 30, 2008

Py-Lib 0.9.1 released

The Py-Lib 0.9.1 release is out! The Py-Lib is a very important support library that PyPy uses for a lot of things – most importantly it contains py.test, which PyPy uses for testing.

This is mostly a bugfix release, with a couple of new features sneaked in. The most important changes:

  • some new functionality (authentication, export, locking) in py.path's Subversion APIs
  • numerous small fixes in py.test's rsession (experimental pluggable session) and generative test features
  • some fixes in the py.test core

Download/Install: http://codespeak.net/py/0.9.1/download.html

Documentation/API: http://codespeak.net/py/0.9.1/index.html

UPDATE: the py-lib is now easy-installable with:

easy_install py

Friday, March 28, 2008

PyPy Summer of Code Participation

As in previous years, PyPy will again participate in Google's Summer of Code program under the umbrella of the Python Software Foundation. Unfortunately we were a bit disorganized this year, so our project ideas have only just been put up. The list of PyPy project ideas can be found here.

Any interested student should write to our mailing list or just come to the #pypy channel on irc.freenode.net to discuss things.

Monday, March 17, 2008

ctypes configuration tool

As part of implementing ctypes, we decided to make coding with ctypes better in its own right (regardless of which Python interpreter you use). The concrete problem we're trying to solve is making ctypes code more platform-independent than it is. Say you want to create a ctypes type for size_t: ctypes itself provides no mechanism for doing that, so you need to use a concrete integer type (c_int, c_long, c_short etc.). Your code either becomes platform-dependent if you pick one of them, or is littered with conditionals for all sorts of platforms. We created a small library called ctypes_configure (which is actually a variation of something we use in the PyPy source tree) that tries to resolve some platform dependencies by compiling and running small chunks of C code through a C compiler. It's sort of like configure in the Linux world, except for Python code using ctypes.
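To make the idea concrete, here is a minimal sketch of the probe-and-map technique described above, written for a current Python 3. It does not use the ctypes_configure API: the helper names (probe_source, pick_int_type, probe) are hypothetical, and probe assumes a C compiler is reachable as cc on the PATH.

```python
import ctypes
import os
import subprocess
import tempfile

# Candidate ctypes integer types, smallest first (illustrative helper tables).
SIGNED = [ctypes.c_byte, ctypes.c_short, ctypes.c_int,
          ctypes.c_long, ctypes.c_longlong]
UNSIGNED = [ctypes.c_ubyte, ctypes.c_ushort, ctypes.c_uint,
            ctypes.c_ulong, ctypes.c_ulonglong]


def probe_source(c_type, headers=("stddef.h",)):
    """Return C source that prints '<size> <is_signed>' for c_type."""
    includes = "".join("#include <%s>\n" % h
                       for h in ("stdio.h",) + tuple(headers))
    return includes + (
        "int main(void) {\n"
        '    printf("%%d %%d\\n", (int)sizeof(%s), ((%s)-1) < 0);\n'
        "    return 0;\n"
        "}\n" % (c_type, c_type)
    )


def pick_int_type(size, signed):
    """Map a byte size and signedness to a matching ctypes integer type."""
    for ctype in (SIGNED if signed else UNSIGNED):
        if ctypes.sizeof(ctype) == size:
            return ctype
    raise ValueError("no ctypes integer of %d bytes" % size)


def probe(c_type, headers=("stddef.h",), compiler="cc"):
    """Compile and run the probe. Assumes `compiler` exists on this machine."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        exe = os.path.join(tmp, "probe")
        with open(src, "w") as f:
            f.write(probe_source(c_type, headers))
        subprocess.check_call([compiler, src, "-o", exe])
        out = subprocess.check_output([exe]).decode()
    size, signed = out.split()
    return pick_int_type(int(size), signed == "1")
```

With a compiler available, probe("size_t") would return whichever unsigned ctypes type has the size the compiler reports, so the calling code needs no per-platform conditionals.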

To install the library, you can just type easy_install ctypes_configure. The code is in an svn repository on codespeak and there is even some documentation and sample code. Also, even though the code lives in the pypy repository, it depends only on pylib, not on the whole of pypy.

The library is in its early infancy (but we think it is already rather useful). In the future we could add extra features. For example, it might be possible to check whether the argtypes attached to external functions are consistent with what is in the C headers, so that the following code wouldn't segfault but would give a nice error instead:

  import ctypes

  libc = ctypes.CDLL("libc.so")
  time = libc.time
  time.argtypes = [ctypes.c_double, ctypes.c_double]
  time(0.0, 0.0)

Also, we plan to add a way to install a package that uses ctypes_configure in such a way that the installed library doesn't need to call the C compiler any more later.
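For contrast with the time example above, here is a small sketch of what happens today when the argtypes are written by hand and do match the C prototype: ctypes rejects a bad argument with an exception before the call is made. The sketch assumes a POSIX system (where ctypes.CDLL(None) exposes the C library's symbols) and assumes that time_t fits in a C long. The consistency check against the headers is still left to the programmer here.

```python
import ctypes

# POSIX-only assumption: CDLL(None) exposes the symbols already loaded
# into the process, which includes the C library.
libc = ctypes.CDLL(None)

time = libc.time
# Hand-written declaration for `time_t time(time_t *)`.
# Assumption: time_t fits in a C long on this platform.
time.argtypes = [ctypes.c_void_p]
time.restype = ctypes.c_long

now = time(None)  # None is passed as a NULL pointer

error = None
try:
    time(0.0)  # a float is not acceptable where a pointer is declared
except ctypes.ArgumentError as exc:
    error = str(exc)  # ctypes refuses the call instead of crashing
```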

Bittorrent on PyPy

Hi all,

Bittorrent now runs on PyPy! I tried the no-GUI BitTornado version (btdownloadheadless.py). It behaves correctly, and I fixed the last few obvious places that caused noticeable pauses. (However, we know that there are I/O performance issues left: we make too many internal copies of the data, e.g. in a file.read() or os.read().)

We are interested in people trying out other real-world applications that, like the GUI-less Bittorrent, don't have many external dependencies on C extension modules. Please report all the issues to us!

The current magic command line for creating a pypy-c executable with as many of CPython's modules as possible is:

  cd pypy/translator/goal
  ./translate.py --thread targetpypystandalone.py --allworkingmodules --withmod-_rawffi --faassen

(This gives you a thread-aware pypy-c, which requires the Boehm gc library. The _rawffi module gives you ctypes support but is only tested for Linux at the moment.)

Tuesday, March 4, 2008

As fast as CPython (for carefully taken benchmarks)

Good news, everyone. A tuned PyPy compiled to C is nowadays as fast as CPython on the richards benchmark and slightly faster on the gcbench benchmark.
IMPORTANT: These are very carefully taken benchmarks where we expect pypy to be fast! PyPy is still quite a bit slower than CPython on other benchmarks and on real-world applications (but we're working on it). The point of this post is just that for the first time (not counting JIT experiments) we are faster than CPython on *one* example :-)
The exact times as measured on my notebook (which is a Core Duo machine) are here:
Compiled pypy with options:
./translate.py --gcrootfinder=asmgcc --gc=generation targetpypystandalone.py --allworkingmodules --withmod-_rawffi --faassen (allworkingmodules and withmod-_rawffi are very likely irrelevant to those benchmarks)
CPython version 2.5.1, release.
  • richards 800ms pypy-c vs 809ms cpython (1% difference)
  • gcbench 53700ms pypy-c vs 60215ms cpython (11% difference)
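The percentages in the list above can be reproduced with a few lines of arithmetic. This sketch assumes the difference is taken relative to the CPython time, which is the reading that matches both quoted figures; the function name is made up for illustration.

```python
def percent_faster(pypy_ms, cpython_ms):
    """Time saved by pypy-c, as a percentage of the CPython time."""
    return (cpython_ms - pypy_ms) / cpython_ms * 100.0


richards = percent_faster(800, 809)      # about 1.1
gcbench = percent_faster(53700, 60215)   # about 10.8

print("richards: %.1f%%" % richards)
print("gcbench:  %.1f%%" % gcbench)
```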
PyPy shines on gcbench, which is mostly just about allocating and freeing many objects. Our gc is simply better than refcounting, even though we've got shortcomings in other places.
About richards, there is a catch. We use a method cache optimization, and have another optimization which avoids creating a bound method each time a method is called. Together these speed up the benchmark by about 20%. Although a method cache was also implemented for CPython, it didn't make its way into the core because some C modules directly modify the dictionary of new-style classes. In PyPy, the greater level of abstraction means that this operation is simply illegal.
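A toy model may help show why direct dictionary modification is a problem for a method cache. The sketch below is not PyPy's implementation; ToyClass, set_attr and lookup are invented names. Each class carries a version number that is bumped on every attribute write, and the cache key includes that version, so stale entries can never be hit as long as all writes go through set_attr.

```python
class ToyClass:
    """Toy class object whose attribute writes bump a version tag."""

    def __init__(self, name, base=None, **attrs):
        self.name = name
        self.base = base
        self.subclasses = []
        self._attrs = dict(attrs)
        self.version = 0
        if base is not None:
            base.subclasses.append(self)

    def set_attr(self, attr, value):
        # The only sanctioned way to modify the class: it invalidates
        # cached lookups for this class and every subclass.
        self._attrs[attr] = value
        self._bump()

    def _bump(self):
        self.version += 1
        for sub in self.subclasses:
            sub._bump()

    def lookup_slow(self, attr):
        # Walk the (single-inheritance) chain of base classes.
        cls = self
        while cls is not None:
            if attr in cls._attrs:
                return cls._attrs[attr]
            cls = cls.base
        raise AttributeError(attr)


_cache = {}
stats = {"hits": 0, "misses": 0}


def lookup(cls, attr):
    """Cached lookup keyed on (class identity, class version, name)."""
    key = (id(cls), cls.version, attr)
    try:
        value = _cache[key]
    except KeyError:
        stats["misses"] += 1
        value = _cache[key] = cls.lookup_slow(attr)
    else:
        stats["hits"] += 1
    return value


base = ToyClass("Base", run=lambda: "base")
child = ToyClass("Child", base)

first = lookup(child, "run")()    # miss: walks the base-class chain
second = lookup(child, "run")()   # hit: served from the cache
base.set_attr("run", lambda: "patched")
third = lookup(child, "run")()    # miss again: the version changed
```

Code that wrote to base._attrs directly would skip the version bump, and the cache would keep returning the old function. That is the situation the post describes for C modules that modify a class dictionary behind the interpreter's back.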