Python, Win32 & codecs: the joys of open source

As you may have realised, Windows is my primary development platform. (Truth be told, my sole development platform unless something unexpected comes along.) Although I can, technically, write code in C etc., I’ve been spoilt by the sheer quantity and, for the most part, quality of Python libraries which a generous army of itch-scratching developers has seen fit to unleash on the world in the form of precompiled Win32 binaries. So aside from a bit of dabbling, I’ve never really had to compile a Python extension in earnest on Windows.

Until now. I’ve been tasked with producing a program which will scan a batch of media (basically video) files to ascertain whether they meet an agreed reference standard. I’m not a digital media expert as such, but one of our guys who is wrote down the specs I needed to match against, and I went scouting on suitable websites to find the information I needed. Now the first thing I realised is that in the wonderful world of multimedia, one person’s “bitrate” is another person’s “multiplex bitrate” and yet another’s “sample rate”. I’m not saying that these and other terms are interchangeable strictly speaking, merely that they are sometimes used as if they were.
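To give a flavour of the kind of check involved, here is a minimal sketch, assuming the properties of each file have already been extracted into a dictionary by whatever library ends up doing the job. The property names, the values in `REFERENCE` and the function `check_against_reference` are all hypothetical, made up for illustration; they are not the spec I was actually given.

```python
# Hypothetical reference standard: an exact value, or a (low, high) range.
REFERENCE = {
    "video_codec": "mpeg2video",
    "width": 720,
    "height": 576,
    "bit_rate": (4_000_000, 8_000_000),  # bits per second, inclusive range
}


def check_against_reference(props, reference=REFERENCE):
    """Return a list of failure messages; an empty list means the file conforms."""
    failures = []
    for name, expected in reference.items():
        if name not in props:
            failures.append(f"{name}: missing")
            continue
        actual = props[name]
        if isinstance(expected, tuple):
            # A tuple is treated as an inclusive (low, high) range.
            low, high = expected
            if not (low <= actual <= high):
                failures.append(f"{name}: {actual} outside {low}..{high}")
        elif actual != expected:
            failures.append(f"{name}: expected {expected}, got {actual}")
    return failures
```

The hard part, as the rest of this post makes clear, is not the comparison but getting trustworthy values into `props` in the first place, and knowing which “bitrate” each one really is.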

My starting point, given that we’re using an embedded Windows Media Player to play the files out, was to automate WMP (via win32com and MPlayer.OCX): since it could play the files in question it would, ipso facto, be able to tell me whatever I wanted to know about them. At the same time, I assumed that the Property-sheet properties of any of the same files would come from the same source and that, as a simple workaround, my users could right-click files to get the information I needed. Wrong on both counts. (Don’t ask why; I’ve no idea. I’m moving too fast at this point to do more than look through the frankly scant information available in the Windows Media SDK.)

Absent my primary point of information, my mind moved towards open source libraries. And, obviously, to ffmpeg, the factotum of the open source media world. Naturally, I knew that pymedia was the usual answer to “How do I…?” questions of this sort. Unfortunately, the only Python 2.5 build I could find (from the mailing list) failed to recognise most of the files I had. I did consider building it, but decided to look for a closer-to-the-metal interface to ffmpeg itself.

Which brought me, by way of Pyrex and its fork Cython (I’d no idea how good those were; must try them out as soon as) to AVBin & Pyglet. Together, these came tantalisingly close to giving me what I wanted, but lacked just a few of the attributes I was after. So… a search through the ffmpeg source code and examples later, I had a couple of patches ready for AVBin and Pyglet. [*]

I admit I didn’t go searching the web for COM components or DLLs which promised to return details of media formats. That’s partly because I’d have had to do some plumbing anyway to use them, and partly because, assuming they were closed source, I might well have ended up with the same mixture of information I’d started off with. The open source advantage is that I could go back and back through the code bases of the various projects to find out exactly what information was being returned in each field. The disadvantage for me, as a Windows developer who doesn’t usually build Python extensions, is that I had to go through the more than slight pain of setting up an environment in which I could in fact build the extensions (involving MinGW, MSYS and various build scripts). But I’ll keep that story for another day.


[*] Which were quite rightly rejected by the project maintainer as I hadn’t followed the project standards: let this be a lesson to all you young coders out there!

4 Comments so far »

  1. Steve said,

    Wrote on November 5, 2007 @ 3:04 pm

    I’ve used the MediaInfo dll via ctypes for this kind of stuff. If you’re interested I could email you my simple wrapper that I wrote - about 160 LOC.

  2. mickey mouse said,

    Wrote on November 5, 2007 @ 10:21 pm

    You might check out the Enthought python distribution. It’s targeted at scientific applications but it comes with a working mingw build environment out-the-box (as well as lots of other good stuff). This makes building win32 extensions quite painless. N.B. Their all-in-one Sumo-distribution only goes up to Python-2.4, but their egg repositories and “Enstaller” mean you can get the same tools installed with Python-2.5 and leave out all the ones you don’t need.

  3. tim said,

    Wrote on November 6, 2007 @ 9:31 am

    Thanks for the suggestion “mickey” (which I suspect may not be your real name :). I have tried Enthought in the past, for no reason other than curiosity. I hadn’t realised that it included a working MinGW setup. I’ll have to have a look again.

  4. Vlada said,

    Wrote on November 21, 2007 @ 8:08 am

    Hi Steve,

    I’m interested in your wrapper for mediainfo.dll, could you please contact me at [address removed]? Thanks a lot.
