Archive for Python Core Development

Activity on distutils-sig

A long time ago (I don’t remember exactly: years) I subscribed to the Python distutils-sig mailing list. And then I unsubscribed because it was noisy and not terribly fruitful.

Now, chasing down a Windows-related pip issue, I’ve come across it again and discovered that there’s a shed-load of useful work going on there. I had no idea that distribute (a fork of setuptools) & setuptools had [re]merged as of setuptools v0.7, and I’d lost sight of the many PEPs on naming, versioning, distribution formats and the like. I still haven’t worked out which is which, but at least certain of them seem to have reached the stage where they’re the point of reference for other discussions — not discussion points in their own right. There’s an initiative to get pip into the main Python distribution — which I also had no idea about.

I’m especially happy to see Paul Moore holding up the Windows end of things in discussions — thanks, Paul! Despite our both being UK-based [*] and Windows types and long-term Python users, we’ve never actually met AFAIK.

I’ve resubscribed now and I hope to be able to contribute in some small way.

[*] I’m fairly sure — and he did recently make a reference to Gunga Din, which is something I’ve never heard outside this country.

How Does Python Handle Signals (on Windows)?

(Background: I’ve recently and less recently worked through a couple of issues with Python’s Ctrl-C handling under Windows. These required me to dig into the corners of Python’s signal-handling mechanism as interpreted on Windows. This post is something of an aide memoire for myself for the next time I have to dig.)

Signals are a Posix mechanism whereby User-space code can be called by Kernel-space code as a result of some event (which might itself have been initiated by other User-space code). At least, that’s the understanding of this ignorant Windows-based developer.

For practical purposes, it means you can set up your code to be called asynchronously by installing a signal handler for a particular signal. Lots more information, of course, over at Wikipedia. Windows (which is not short of native IPC mechanisms, asynchronous and otherwise) offers an emulation of the Posix signals via the C runtime library, and this is what Python mostly uses.

However, as you’ll see from the Python docs, Python doesn’t allow arbitrary code to be called directly by the OS. Instead, it keeps track of the handlers you’ve set up via the signal module and then calls them when it’s got a moment. “When it’s got a moment” means, essentially, that Modules/signalmodule.c:PyErr_CheckSignals is called all over the place, and in particular via the eval loop’s pending calls mechanism.

So what does this mean in terms of Python’s codebase?

* The heart of the signal handling mechanism is in Modules/signalmodule.c

* The signal module keeps track in a Handlers structure of the Python handlers registered via the signal.signal function. When the mechanism fires up, it pulls the appropriate function out of that structure and calls it.

* Python registers Modules/signalmodule.c:signal_handler with the OS as a global signal handler which, when fired by the OS, calls Modules/signalmodule.c:trip_signal which indicates that the corresponding Python signal handler should be called at the next available point.

* The signal can be delivered by the OS (to the internal signal_handler function) at any point but the registered Python handler will only be run when PyErr_CheckSignals is run. This means that, at the very least, the Python signal handlers will not be run while a system call is blocking. It may be that whatever caused the signal will have caused the kernel to abort the blocking call, at which point Python takes over again and can check the signals. (This is what happens at points in the IO read/write loops). But if some uninterruptible device read hangs then Python will not regain control and no signal handler will execute.

* The main eval loop will check for raised signals via its pending calls mechanism, a C-level queue from which a function call can be taken every so often around the loop. The trip_signal function (called by the global signal_handler) adds to that queue a wrapped call to PyErr_CheckSignals. This should result in the signals being checked a few moments later during the eval loop.
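As a rough illustration of the points above, here is a minimal sketch of the Python-level side of the mechanism. The names on_sigint and received are made up for the example, and signal.raise_signal only exists from Python 3.8 onwards, so it postdates this post. The point is simply that the handler registered via signal.signal is an ordinary Python function which the interpreter calls in the main thread once it gets round to checking for tripped signals.

```python
import signal

received = []  # hypothetical name: records the signals our handler saw


def on_sigint(signum, frame):
    # Called by the interpreter (not directly by the OS) in the main thread,
    # once it notices that the C-level handler has tripped the signal.
    received.append(signum)


# Register the Python-level handler, remembering whatever was there before.
previous = signal.signal(signal.SIGINT, on_sigint)
try:
    # Ask the C runtime to raise SIGINT in this process (Python 3.8+).
    signal.raise_signal(signal.SIGINT)
finally:
    # Put the original handler back so Ctrl-C behaves normally afterwards.
    signal.signal(signal.SIGINT, previous)

print(received)
```

Because the handler is only run when the interpreter checks, nothing here would fire while the process was stuck inside an uninterruptible blocking call.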

OK; so much for the whistlestop tour. How about Windows?

Well, for the most part, Windows operates just the same way courtesy of the C runtime library. But the signals which are raised and trapped are limited. And they probably resolve to the more Windows-y Ctrl-C and Ctrl-Break. I’m not going to touch on Ctrl-Break here, but the default Ctrl-C handling in Python is a bit of a mixed bag. We currently have a mixture of three things interacting with each other: the signal handling described above (where the default SIGINT handler raises KeyboardInterrupt); the internal wrapper around the C runtime’s implementation of fgets, which returns specific error codes if the line-read was interrupted; and some recently-added Windows event-handling which makes it easier to interrupt sleeps and waits on other kernel objects from within Python.
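The first of those three pieces is visible from Python itself. The sketch below (the function name default_handler_raises is invented for the example) calls signal.default_int_handler directly, which is the handler Python normally installs for SIGINT, and shows that all it does is raise KeyboardInterrupt. It says nothing about the fgets wrapper or the Windows event handling, which live at the C level.

```python
import signal


def default_handler_raises():
    """Return True if the stock SIGINT handler raises KeyboardInterrupt."""
    try:
        # Call the default handler by hand, as the interpreter would when
        # a SIGINT has been tripped; the frame argument is not needed here.
        signal.default_int_handler(signal.SIGINT, None)
    except KeyboardInterrupt:
        return True
    return False


print(default_handler_raises())
```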

That really was quick and I’ve brushed over a whole load of details; as I say, it’s more to remind me the next time I look at a related issue. But, hopefully it’ll give other interested people a headstart if they want to see how Python does things.

Just in case you thought it was easy…

From time to time, the idea of a standard Python “Enum” object is raised on the Python lists. You know the kind of thing: a lightweight means of mapping numbers to labels so you can pass a readable name to set_colour without having a long module of manifest constants or magic numbers lying around all over your codebase.

It all sounds very straightforward, and Barry Warsaw had an existing module which seemed like a fairly good starting point, so PEP 435 was started and it all looked like it was just a formality.

Now, literally *hundreds* of mailing list posts and endless, endless threads later, GvR has just pronounced his approval of the PEP and it’s good to go.

If you, like me, thought “this one won’t be controversial”, then just point your search engine of choice at and look for “enum” or “435”, or just look at the archive for May alone (which only represents the final few days of details being thrashed out) to realise just how much discussion and work is involved in what appears to be quite a simple thing.

Of course, part of the problem is precisely the fact that the idea is so simple. I’m sure most people have rolled their own version of something like this. I know I have. You can get up and running with a simple “bunch” class, possibly throw in a few convenience methods to map values to names and then just get on with life. But when something’s got to go into the stdlib then it all becomes a lot more difficult, because everyone has slightly (or very) different needs; and everyone has slightly (or very) different ideas about what constitutes the most convenient interface.
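To make the contrast concrete, here is a small sketch of the roll-your-own approach next to the enum module which eventually shipped in Python 3.4 as a result of PEP 435. The Bunch class, its name_of method and the Colour names are all invented for illustration; they aren’t anybody’s actual proposal from the threads.

```python
from enum import Enum


class Bunch:
    """Hand-rolled version: names become attributes holding plain ints."""

    def __init__(self, **names):
        self.__dict__.update(names)
        # Convenience reverse mapping from value back to name.
        self._by_value = {value: name for name, value in names.items()}

    def name_of(self, value):
        return self._by_value[value]


colours = Bunch(red=1, green=2, blue=3)


class Colour(Enum):
    """Standard library version: members are distinct objects, not ints."""

    red = 1
    green = 2
    blue = 3


print(colours.name_of(2), Colour(2).name)
```

One of the design questions this exposes: colours.red is just the integer 1, whereas Colour.red is its own object which doesn’t compare equal to 1. That is exactly the sort of detail people turn out to have strong and differing opinions about.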

And there’s always the danger of the “bikeshed” effect. If a PEP is proposing something perhaps quite fundamental but outside most people’s experience, then only people with sufficient interest and knowledge are likely to contribute. (Or argue). But an enum: everyone’s done that, and everyone’s got an interest, and an idea about how it should be done.

But, bikesheds aside, I’m glad that the Python community is prepared to refine its ideas to get the best possible solution into the standard library. As a developer, one naturally feels that one’s own ideas and needs represent everyone else’s. It’s only when you expose your ideas to the sometimes harsh winds of community critique that you discover just how many different angles there are to something you thought was simple.

Thankfully, we have a BDFL (or, sometimes, a PEP Czar) to make the final decision. And, ultimately, that means that some people won’t see their needs being served in the way they want. But I think that that’s far preferable to a design-by-committee solution which tries to please everybody and ends up being cluttered.

Aide-memoire for Python hg clones

(This is because I use mercurial rarely enough and commit to Python even more rarely; so I always forget what incantations I used last time…)

hg clone
hg clone issue1234
cd issue1234

hg up 3.3
hg import --no-commit http://.../fixedit.patch

# Do whatever
# Edit Misc/NEWS

hg commit -m "... (Patch by ...)"
hg up default
hg merge 3.3

# Stuff happens including, probably, Misc/NEWS conflicting
# Copy Misc/NEWS.orig back to Misc/NEWS and re-edit

hg resolve -m Misc/NEWS

# Do whatever

hg commit -m "... (Patch by ...)"

(Watch out for push races if other devs have committed…)

hg push ssh://

Python 3.3 is on the way

You’ve probably seen the mailing list announcements and the tweets which declare that Python 3.3 is entering its beta stage. For core developers this means that no more features can be introduced into this version of Python (at least not without the connivance of the Release Manager: Georg Brandl). Bug fixes are still allowed — which sometimes leads to rather creative labelling of changesets.

Have a look at the preliminary “What’s New?” doc.

Up to now, Python core developers have been championing Python 3 over Python 2 largely on the basis of a significant number of cleanups, rationalisation and cruft-clearing. Now all this is very good, but mostly pleases people who have to work with the underlying Python code & data or who are fond of housekeeping. Clearly there have been other additions. (In fact the more I read through the “What’s New?” documents the more I realise that we’ve been doing something of a poor job of advertising our own achievements). But there hasn’t perhaps been anything which is really structurally groundbreaking.

Now we’ve got built-in virtual environment support, proper namespace packages, a much clearer OS exceptions hierarchy, and — at long last — a way of yielding from other iterators. (That thing where you had to loop over an iterator yielding its values: you can now just “yield from”). Any one of those alone is surely worth the entrance price. It’s actually starting to become attractive to use Python 3.x not just because it’s cleaned-up (on the moral highground but without much of a change of scenery) but because it’s got cool new features that I actually need (wow! look at that view).
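For anyone who hasn’t met it yet, here is a minimal sketch of what “yield from” saves you. The two function names are made up for the example; the first is the loop you had to write before 3.3 and the second is the 3.3 spelling, which delegates to each iterable in turn.

```python
def chain_old(*iterables):
    # Pre-3.3 style: loop over each iterable and re-yield its items by hand.
    for iterable in iterables:
        for item in iterable:
            yield item


def chain_new(*iterables):
    # 3.3 style: delegate to each iterable directly.
    for iterable in iterables:
        yield from iterable


print(list(chain_old([1, 2], "ab")) == list(chain_new([1, 2], "ab")))
```

For plain iteration the two behave the same. Where “yield from” really earns its keep is when the inner thing is itself a generator, since sent values and the generator’s return value are passed through as well.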

To be sure there are literally hundreds of other small changes, ranging from one-off bugfixes to large-scale rewrites: bz2 has been rewritten from scratch; Brett Cannon’s finally achieved his objective of reimplementing the import mechanism in Python; Unicode support under the covers is cleaner and faster; the useful-but-slower decimal module is now useful-but-faster thanks to a version written in C, the indefatigable Stefan Krah getting the credit there. And after much bouncing around, Peter Moody’s ipaddress module is now integrated in the stdlib.

For Windows users, two significant changes have made it in (although they haven’t reached the What’s New? docs yet). One is the implementation of PEP 397 — a Python launcher for Windows, conceived by Mark Hammond, with implementation work by Vinay Sajip and worked up into its final form by Martin von Loewis. The other, implemented by Brian Curtin, is the addition of python.exe to the system PATH. This one is a long-standing gripe especially for novice users who can now just type “python” at the command prompt and get the latest version up. In fact, the PEP 397 launcher makes this perhaps less important but it’s still good to have the option there in the installer.

I’m trying to help along a couple of changes to the Win32 implementation of some os pieces under the guise of bugfixes, but even if I don’t get them past the eagle eye of the release manager, this’ll still be a version to look forward to.