How Does Python Handle Signals (on Windows)?

(Background: I’ve recently and less recently worked through a couple of issues with Python’s Ctrl-C handling under Windows. These required me to dig into the corners of Python’s signal-handling mechanism as interpreted on Windows. This post is something of an aide memoire for myself for the next time I have to dig.)

Signals are a POSIX mechanism whereby user-space code can be called by kernel-space code as a result of some event (which might itself have been initiated by other user-space code). At least, that’s the understanding of this ignorant Windows-based developer.

For practical purposes, it means you can set up your code to be called asynchronously by installing a signal handler for a particular signal. Lots more information, of course, over at Wikipedia. Windows (which is not short of native IPC mechanisms, asynchronous and otherwise) offers an emulation of POSIX signals via the C runtime library, and this is what Python mostly uses.
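As a minimal sketch of what “installing a signal handler” looks like from the Python side: the snippet below registers a handler for SIGINT and then raises the signal against its own process. It assumes Python 3.8 or later (for signal.raise_signal) and that it runs in the main thread; the names on_sigint, received and demo are mine, purely for illustration.

```python
import signal

received = []  # signal numbers seen by our handler


def on_sigint(signum, frame):
    # Called by the interpreter (not directly by the OS) in the main thread.
    received.append(signum)


def demo():
    # Install our handler, remembering whatever was there before.
    previous = signal.signal(signal.SIGINT, on_sigint)
    try:
        # Deliver SIGINT to this process, much as Ctrl-C would.
        signal.raise_signal(signal.SIGINT)
    finally:
        # Put the original handler back so Ctrl-C behaves normally again.
        signal.signal(signal.SIGINT, previous)
    return list(received)


if __name__ == "__main__":
    print(demo())
```

Restoring the previous handler in the finally block keeps the experiment from leaving the process deaf to a real Ctrl-C.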

However, as you’ll see from the Python docs, Python doesn’t allow arbitrary code to be called directly by the OS. Instead, it keeps track of what handlers you’ve set up via the signal module and then calls them when it’s got a moment. The “when it’s got a moment” means, essentially, that Modules/signalmodule.c:PyErr_CheckSignals is called all over the place, but especially via the eval-loop’s pending calls mechanism.

So what does this mean in terms of Python’s codebase?

* The heart of the signal handling mechanism is in Modules/signalmodule.c

* The signal module keeps track in a Handlers structure of the Python handlers registered via the signal.signal function. When the mechanism fires up, it pulls the appropriate function out of that structure and calls it.

* Python registers Modules/signalmodule.c:signal_handler with the OS as a global signal handler. When fired by the OS, it calls Modules/signalmodule.c:trip_signal, which flags that the corresponding Python signal handler should be called at the next available point.

* The signal can be delivered by the OS (to the internal signal_handler function) at any point, but the registered Python handler will only be run when PyErr_CheckSignals is run. This means that, at the very least, the Python signal handlers will not be run while a system call is blocking. It may be that whatever caused the signal will have caused the kernel to abort the blocking call, at which point Python takes over again and can check the signals. (This is what happens at points in the IO read/write loops.) But if some uninterruptible device read hangs, then Python will not regain control and no signal handler will execute.

* The main eval loop will check for raised signals via its pending calls mechanism, a C-level queue from which a function call can be popped every so often around the loop. The trip_signal function (called by the global signal_handler) adds to the queue of pending functions a wrapped call to PyErr_CheckSignals. This should result in the signals being checked a few moments later during the eval loop.
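The flow in the bullets above can be sketched as a toy model in pure Python. To be clear, this is not CPython’s code: SignalModel, eval_loop_tick and demo_model are hypothetical names of my own, and the real thing is C with rather more care over threads and re-entrancy. The method names signal_handler, trip_signal and check_signals are only there to mirror the functions mentioned above.

```python
from collections import deque


class SignalModel:
    """Toy model of the trip-then-check flow; not CPython's real code."""

    def __init__(self):
        self.handlers = {}            # signum -> Python callable ("Handlers")
        self.tripped = set()          # signals delivered but not yet handled
        self.pending_calls = deque()  # stand-in for the pending-calls queue

    def signal(self, signum, func):
        """Like signal.signal: remember the Python-level handler."""
        self.handlers[signum] = func

    def signal_handler(self, signum):
        """What the OS would call: do almost nothing, just trip the signal."""
        self.trip_signal(signum)

    def trip_signal(self, signum):
        """Note the signal and queue a signal check for later."""
        self.tripped.add(signum)
        self.pending_calls.append(self.check_signals)

    def check_signals(self):
        """Like PyErr_CheckSignals: run the handler for each tripped signal."""
        while self.tripped:
            signum = self.tripped.pop()
            handler = self.handlers.get(signum)
            if handler is not None:
                handler(signum, None)  # (signum, frame); no frame in this toy

    def eval_loop_tick(self):
        """One pass round the eval loop: drain the pending calls."""
        while self.pending_calls:
            self.pending_calls.popleft()()


def demo_model():
    seen = []
    model = SignalModel()
    model.signal(2, lambda signum, frame: seen.append(signum))
    model.signal_handler(2)      # the "OS" delivers the signal
    before = list(seen)          # the Python handler has not run yet
    model.eval_loop_tick()       # the interpreter gets a moment
    return before, list(seen)


if __name__ == "__main__":
    print(demo_model())
```

The point of the model is the gap between delivery and handling: nothing the Python handler does is visible until the loop next drains its pending calls.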

OK; so much for the whistlestop tour. How about Windows?

Well, for the most part, Windows operates just the same way courtesy of the C runtime library. But the signals which are raised and trapped are limited. And they probably resolve to the more Windows-y Ctrl-C and Ctrl-Break. I’m not going to touch on Ctrl-Break here, but the default Ctrl-C handling in Python is a bit of a mixed bag. We currently have a mixture of three things interacting with each other: the signal handling described above (where the default SIGINT handler raises KeyboardInterrupt, i.e. PyExc_KeyboardInterrupt at the C level); the internal wrapper around the C runtime’s implementation of fgets, which returns specific error codes if the line-read was interrupted; and some recently-added Windows event-handling which makes it easier to interrupt sleeps and waits on other kernel objects from within Python.
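A small sketch of the first of those three pieces, as seen from Python. It assumes Python 3.8 or later (for signal.valid_signals and signal.raise_signal) and the main thread; the two function names are mine. The first function lists the signals the platform lets you handle, which on Windows is a much shorter list than on POSIX. The second explicitly installs the standard signal.default_int_handler and shows a SIGINT turning into KeyboardInterrupt.

```python
import signal


def supported_signal_names():
    """Names of the signals this platform lets Python handle."""
    return sorted(
        sig.name
        for sig in signal.valid_signals()
        if isinstance(sig, signal.Signals)  # skip unnamed, plain-int signals
    )


def sigint_becomes_keyboard_interrupt():
    """Raise SIGINT against ourselves and see what the default handler does."""
    previous = signal.signal(signal.SIGINT, signal.default_int_handler)
    try:
        signal.raise_signal(signal.SIGINT)
    except KeyboardInterrupt:
        return True
    finally:
        # Restore whatever handler was in place before the experiment.
        signal.signal(signal.SIGINT, previous)
    return False


if __name__ == "__main__":
    print(supported_signal_names())
    print(sigint_becomes_keyboard_interrupt())
```

This only exercises the signal-module path; the fgets wrapper and the Windows event are not touched by it.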

That really was quick and I’ve brushed over a whole load of details; as I say, it’s more to remind me the next time I look at a related issue. But, hopefully it’ll give other interested people a headstart if they want to see how Python does things.

5 Comments so far »

  1. John M. Camara said,

    Wrote on May 31, 2013 @ 8:08 pm

    fyi - You pasted the contents of the blog post twice

  2. tim said,

    Wrote on May 31, 2013 @ 8:13 pm

    @John M. Camara: thanks; fixed. (I thought it looked too long!)

  3. eryksun said,

    Wrote on June 13, 2013 @ 2:51 pm

    A few months ago I looked into signal handling on Windows in order to answer the following question on Stack Overflow:

    http://stackoverflow.com/a/15472811/205580

    I’ll be the first to admit that my answer for this Ctrl-C problem with Intel’s Fortran runtime is just plain ugly. Notably, it bypasses the time module’s Ctrl-C handler unless you can delay importing the module. For that I had to resort to using ctypes.

    In 3.3 this hack no longer works. The fix is to have PyErr_SetInterrupt in signalmodule.c call SetEvent(sigint_event). Setting this event is required in order to interrupt time.sleep, and it’s critical for my_fgets in Parser/myreadline.c. If the event isn’t set and my_fgets sees ERROR_OPERATION_ABORTED, then it assumes Ctrl-Z was pressed. In interactive mode this quits the interpreter.

    I did a quick check using ctypes to confirm that setting the event solves the problem. I added calls to pythonapi._PyOS_SigintEvent and windll.kernel32.SetEvent in the Ctrl-C handler before calling _thread.interrupt_main. It works, but I think this should already be handled in PyErr_SetInterrupt to properly handle a simulated SIGINT in general. Perhaps it’s just an oversight because this is a little-used API, but it is used: IDLE calls _thread.interrupt_main several times in run.py.

    Also, in the process of looking into this I stumbled on a problem with my_fgets. If sigint_event is already set (e.g. time.sleep was interrupted), then a Ctrl-C in fgets immediately resets the event because WaitForSingleObject immediately returns WAIT_OBJECT_0. Then my_fgets returns 1, and PyOS_StdioReadline returns NULL. When signal_handler finally runs in another thread, it sets sigint_event. Thus the scenario repeats until something breaks the cycle, such as using Ctrl-Z with input() (misinterpreted as Ctrl-C, so it raises KeyboardInterrupt) or using ctypes to call ResetEvent(sigint_event).

    In 3.3.2 I’ve noted 3 bugs related to this scenario, all after an interrupted sleep(). (1) Type Ctrl-C and then Ctrl-Z; instead of exiting the interpreter it raises a KeyboardInterrupt. (2) Type Ctrl-C in an input(); the exception isn’t displayed in the traceback. (3) Repeat the latter but in a try/except that ignores the KeyboardInterrupt; pressing enter at the following prompt raises a SyntaxError: unknown decode error.

    It seems to me the solution is to do the ResetEvent before fgets is called, and not worry about resetting the event after calling WaitForSingleObject. That’s basically how timemodule.c and _multiprocessing/semaphore.c are using the event.

  4. tim said,

    Wrote on June 13, 2013 @ 6:33 pm

    @eryksun: thanks for the comment. Frankly, the Ctrl-C handling is a bit of a ‘mare: there are too many levels and, as you note, too many subtle interactions and possible races.

    Have you raised issues at bugs.python.org for the problem you identify above? When I was looking at issue18040 — which I eventually decided was a “won’t fix” — I tried to do a sweep of existing issues but I don’t remember turning up the ones you mention?

    I’d like to nail as many as I can while the code is still a bit fresh in my mind.

  5. eryksun said,

    Wrote on June 14, 2013 @ 6:55 am

    I haven’t filed an issue at bugs.python.org. Your post reminded me of the workaround I’d devised for the problem with the Fortran runtime DLL, and inspired me to look into the new implementation in 3.3. I figured you might look into the two issues and pronounce whether or not they’re “won’t fix”. Every ‘fix’ can possibly break existing code or introduce new bugs. I was half expecting a reply along the lines of “yeah, but…”.

    I forgot to mention that since IDLE uses _thread.interrupt_main, adding the call to SetEvent(sigint_event) in PyErr_SetInterrupt would have the added benefit of letting an IDLE user interrupt time.sleep() with Ctrl-C. That wasn’t even a possibility before unifying around a common sigint_event — which was a clever idea, IMO.

    Speaking of unfiled bug reports, I also noticed an issue the other day while reading importlib/_bootstrap.py, but it isn’t urgent. _make_relax_case looks for b'PYTHONCASEOK' in _os.environ (that’s the built-in os, i.e. posixmodule.c). This is correct for Darwin (posix.environ), but not for Windows (nt.environ), which uses Unicode.

    http://hg.python.org/cpython/file/d047928ae3f6/Lib/importlib/_bootstrap.py#l29

    The test has the same problem:

    http://hg.python.org/cpython/file/d047928ae3f6/Lib/test/test_importlib/source/test_case_sensitivity.py#l52

    Again, it’s looking for b'PYTHONCASEOK', but also it’s setting os.environ and looking for the change to be reflected in nt.environ. That can’t happen since os uses a copy of nt.environ with upper-cased keys.

    The test is skipped when I run the following:

    py -3.3 -m test.test_importlib.source.test_case_sensitivity

    It doesn’t matter whether or not I initially set PYTHONCASEOK=1.
