Archive for May, 2011

Too much code, too few releases…

(or: release late, release seldom).

I don’t know about you, but I have sort of one-man sprints on some of my modules when for days on end I take my laptop on trains and squeeze in time at lunch or whenever to refactor or enhance modules. The only trouble is that I’m very (read: too) particular about the quality of the code I release to the public. I want to make a good impression. I want it to have good test coverage. (Or at least some test coverage). I want it to have good docs, with an overview, an API guide, and a cookbook. I want it to have a roadmap so people can see what to expect. (And what not to expect).

But of course all that takes time and energy from the limited pool of time and energy at my disposal. And I have several projects, all competing for that same time and energy, not all of them related to Python or programming at all. Like most developers, I suspect, I’d far rather be writing new code, juicing up existing code, refactoring code for the new generation of language features, breaking code out into nicely layered, suitably decoupled modules. It’s not that I don’t value docs, tests, etc. I do. But someone’s got to write them. And that’s me.

Also I don’t want to go public with an underwhelming feature-set. I don’t want people saying, “Oh good… Oh no! Where’s….?” So I go on extending the code forever and never actually releasing. The Perfect is the proverbial enemy of the Good.

The case in point at the moment is the rewrite of my active_directory module (cunningly entitled active_directory2). It’s more or less of a ground-up rewrite with a very similar API but more structure and layering etc. etc. The rewrite’s been on the go for well over a year in bits & pieces. The trouble is that the original module works fine for most of what I want. Very occasionally I need something more sophisticated and I can usually cobble something suitable together for the occasion. So there’s only a low incentive level.

Testing it’s a bit of a ‘mare as I need a test AD rig. Obviously. So, courtesy of the PSF MSDN Subscription (which, it seems to me, was created for just this kind of situation) I’ve installed a VM containing Windows 2003 Small Business Server. This is about the smallest modern server which will host Active Directory. This setup has the slight advantage that it forces me to test AD authentication since, in an inversion of the norm, I’ve already logged onto my laptop before firing up the AD server so it can’t be the authenticating agent for my account.

I’ve got the top two layers working and I’m looking at tests: at the very least, exercise tests, i.e. those which simply call every function just to make sure it doesn’t completely fall over. Then I’ll probably release it as a work in progress while working on the more sophisticated extra layer which does some transparent transformations to and from Python types based on the schema definition of the AD object and which handles things like memberOf entries in a more Pythonic way.
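For what it’s worth, an exercise test in this sense might look something like the sketch below. It is only an illustration: find_user and search are hypothetical stand-ins, not the active_directory2 API, and the table of arguments is an assumption about how one might organise such tests. Each function is simply called once; the test fails only if a call raises.

```python
import unittest


# Hypothetical stand-ins for the functions under test. These are NOT
# the active_directory2 API; they exist only to make the sketch runnable.
def find_user(name):
    return {"name": name}


def search(**filters):
    return [filters]


# One entry per public function: (callable, positional args, keyword args).
EXERCISES = [
    (find_user, ("tim",), {}),
    (search, (), {"objectClass": "user"}),
]


class ExerciseTests(unittest.TestCase):
    def test_every_function_runs(self):
        for func, args, kwargs in EXERCISES:
            # subTest reports each function separately if several fail.
            with self.subTest(func=func.__name__):
                # No assertion on the result: the call must merely not raise.
                func(*args, **kwargs)


def run_exercises():
    """Run the exercise tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExerciseTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_exercises()
```

Tests like this say nothing about whether the results are right, only that nothing falls over, which is about the least one can ask before a work-in-progress release.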

Watch this space…

Projects I Wish I Had A Reason To Use

I keep half an eye on various of the channels through which new or updated Python projects are announced: planet.python.org, the PyPI feed, and c.l.py.announce. While I am occasionally irritated by the noise which PyPI produces (nested lists, anyone?) it’s helpful to see projects go past, just to remind myself what’s out there.

One phenomenon I’ve experienced is that I’ll not even have *heard* of some underlying library or web service until someone writes a Python interface to it. Glancing randomly at the current PyPI front page, I see “Python yubico client” which refers to some kind of one-key logon service which I’d never heard of. Then there’s “sunburnt - a Python interface to Solr” which, it happens, I have heard of. And a project which refers to the “Comprehensive Knowledge Archive Network” which, even after I’ve visited the website, I’m still not very clear why I might want to use it. But I’m sure it’s really good for the people who do.

This is great news for two reasons at least: people are using Python for all sorts of things, and not just the big-named everyone-does-that stuff; and there’s an inadvertent advertising effect when they publish their work, which brings a possibly interesting project to the attention of a wider audience.

Which brings me to the main point of this post: from time to time I see a project go past which really grabs the programmer in me, and I just wish I had an excuse to use it. Occasionally I actually get to fulfil this wish, but usually I just look wistfully on and check as each new update is released to see if there’s any way I can squeeze it into whatever I am paid to do (or have time to do otherwise).

  • Kamaelia - I’ve followed this one since its owner (Michael Sparks) presented it at a London Python meetup some years ago. I love the idea of simple objects and network pipelines between them. (Who wouldn’t?). I recall a thread a few years ago where Michael wondered why more people hadn’t taken it up (and why they’d gone for Twisted instead which seemed so much less intuitive).
  • Celery - obviously not unrelated to the previous entry. My interest in lightweight distributed message passing probably goes back to my days using Modula-2, the language I used at University and which led to my discovering Python. (Modula-2 has built-in support for coroutines).
  • requests - bit of a newcomer but it exemplifies one of the things I admire most about Python code: it makes it really easy to do something which is conceptually simple. In this case, making HTTP requests. I think this is a key differentiator when comparing Python to other useful and successful languages: it just doesn’t take much code to get something done. It’s something I try for in my own libraries, and is, sadly, one of the reasons why the built-in logging module gets such a bad press. (In spite of its author’s ready willingness to answer questions, address issues, improve things, and update docs).
  • Whoosh - a pure Python search engine. For no very good reason, the idea of doing this in pure Python appeals to me. As it happens I’ve rolled my own search facilities for several small projects. (And who hasn’t?). Unfortunately, the cognitive overhead of replacing those by Whoosh is just too great at the moment to justify the small risk in changing. One day…
  • PyMongo (or any other of the several interfaces to MongoDB). As a daytime database developer, I’m as intrigued as the next man by the various schemaless / distributed databases currently going under the “NoSQL” umbrella. I’ve played with CouchDB (very RESTful interface but map-reduce all the way down) and MongoDB (less pure but more practical) and I even toyed with re-implementing our Helpdesk system using one of them, but as ever the risks were too great and there’s no real benefit to the business. (Only to me :) ).

There are loads more; these are just some that occur to me. Others include the kind of popup notifiers inspired by Growl, lkcl’s recently-discussed pyjamas and even some Windows-specific functionality within the pywin32 modules.

If only time and opportunity allowed…