CS and the City

Gripe: XML in Python

Sean Lynch | July 18, 2008

I hadn’t even finished writing my post announcing my new love of Python when I stumbled into one of its skeleton-filled closets: XML.

The Python core libraries include six different methods for parsing and creating XML, none of which feel particularly Pythonic (here I am, three weeks into developing with Python and already I’m calling out core libraries as not being Python-y enough).  I missed the low overhead methods I had used in other languages. Particularly for parsing XML, PHP’s simplexml is hard to beat, and for building, it’s hands-down Ruby’s XML Builder.  Off I went, hunting for Python ports.

Warning: The following is a tangent.

This may be unfair, but I get the impression that there’s a slight “Not built here” bias in the Python world.  Python has a substantial amount of best-of-breed functionality, both core and third-party, but my initial impression is that the community is a little reluctant to adopt solutions championed by other languages.  Example: Where’s the Python equivalent to CPAN or gems?

Here’s my point:  I found Python ports, but they lacked the very qualities I loved about Python: maturity and active development.

First on my list was simplexml for PHP.  It lets the developer reach attributes and text through a combination of variable and list access.  To get similar functionality in Python, I found handyxml. Handyxml allows equally brief tree traversal and iteration over multiple items.  Unfortunately, it hasn’t been updated since early 2004, and a lot of its dependencies have moved or are gone completely.  As such, it required some modifications just to get it into a functional state.  Not ideal.
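To make the appeal concrete, here is a minimal sketch of that access style built on the standard library's xml.etree.ElementTree. The Node class, the parse helper, and the sample document are hypothetical names made up for illustration; this is not handyxml's actual API, and it ignores namespaces and child tags that collide with the wrapper's own names.

```python
import xml.etree.ElementTree as ET


class Node(object):
    """Hypothetical wrapper giving simplexml-style access to an element."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # Only called when normal lookup fails, so child tags read like
        # attributes. Private names are refused to avoid recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        child = self._element.find(name)
        if child is None:
            raise AttributeError(name)
        return Node(child)

    def __getitem__(self, key):
        # Square brackets read an XML attribute of this element.
        return self._element.attrib[key]

    def all(self, name):
        # Every direct child with the given tag, ready for iteration.
        return [Node(child) for child in self._element.findall(name)]

    @property
    def text(self):
        return self._element.text


def parse(xml_string):
    # Hypothetical helper: parse a string and wrap the root element.
    return Node(ET.fromstring(xml_string))


doc = parse(
    '<feed><title lang="en">CS and the City</title>'
    "<entry>one</entry><entry>two</entry></feed>"
)
title = doc.title.text
lang = doc.title["lang"]
entries = [entry.text for entry in doc.all("entry")]
```

The whole trick is __getattr__: because Python only calls it when ordinary attribute lookup fails, a thin wrapper can turn child tags into attributes without any schema or code generation.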

The other functionality I missed was XML Builder in Ruby.  XML Builder takes full advantage of blocks in Ruby to allow nested XML element creation, which makes the structure of the resulting document blindingly obvious in the code. This is in stark contrast to the Java-esque series of createNode, appendNode calls that Python (and Java and Objective-C) love. I managed to dig up a recent port of XML Builder for Python by Jonas Galvez.  He took advantage of the upcoming ‘with’ statement in Python 2.6 to achieve the same effect.  Though it had some problems handling Unicode characters (remind me to submit a patch on GitHub) and the documentation is minimal, I was able to get it up and running very quickly.  Better.
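For a sense of how the ‘with’ statement can stand in for Ruby's blocks, here is a minimal sketch of the idea. The Builder class and its element and leaf methods are hypothetical names invented for this example; it is not Jonas Galvez's port, whose API may look quite different.

```python
from contextlib import contextmanager
from xml.sax.saxutils import escape, quoteattr


class Builder(object):
    """Hypothetical builder: nested 'with' blocks mirror nested elements."""

    def __init__(self):
        self._lines = []
        self._depth = 0

    def _open_tag(self, name, attrs):
        # quoteattr adds the surrounding quotes and escapes the value.
        rendered = "".join(
            " %s=%s" % (key, quoteattr(str(value)))
            for key, value in sorted(attrs.items())
        )
        return "<%s%s>" % (name, rendered)

    @contextmanager
    def element(self, name, **attrs):
        # Emit the opening tag, indent everything inside the block,
        # then emit the closing tag when the block exits.
        self._lines.append("  " * self._depth + self._open_tag(name, attrs))
        self._depth += 1
        try:
            yield self
        finally:
            self._depth -= 1
            self._lines.append("  " * self._depth + "</%s>" % name)

    def leaf(self, name, text, **attrs):
        # An element that holds only escaped text, written on one line.
        self._lines.append(
            "  " * self._depth
            + self._open_tag(name, attrs)
            + escape(str(text))
            + "</%s>" % name
        )

    def __str__(self):
        return "\n".join(self._lines)


doc = Builder()
with doc.element("feed", lang="en"):
    doc.leaf("title", "Gripe: XML in Python")
    with doc.element("entry"):
        doc.leaf("author", "Sean Lynch")
        doc.leaf("summary", "Parsing & building")

result = str(doc)
```

The indentation of the code now matches the indentation of the document it produces, which is exactly what the createNode and appendNode style loses.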

I know from my digging in recent weeks that there’s been some talk about refactoring Python’s urllib/urllib2 code for Python 3 to simplify the module and remove duplication.  I sincerely hope the XML libraries fall under the same knife, and that the solutions from Ruby and PHP are considered for a graft.


Confirming everything that’s ever been said about Python

Sean Lynch | July 16, 2008

As mentioned earlier, I’ve been working on a medium-scale project written in Python. It’s the first time I’ve used Python beyond a few lines in a script. After a few weeks of working in Python, I returned over the weekend to an application I had been developing for the Mac in Objective-C. It wasn’t until I made the switch back to a “traditional” statically typed language that the sheer beauty of Python struck in full force.

It’s really not fair to pick on Objective-C here. A number of languages could sit in its place. Nonetheless, after working with Python for only two weeks, coding in Objective-C felt like being stuck in the middle of a traffic jam where all the other cars are driven by lobotomized chimps.

Every one of my intentions had to be slowly and laboriously explained in great detail to the computer lest I cause a massive digital pile-up. To avoid one, I had to make several trips to Apple’s mediocre documentation before I assembled enough square brackets to build the Eiffel Tower. Even once the appropriate method was called, trying to break down the over-abstracted object return types to get the simple data I wanted resulted in so many code-compile-crash loops it hurt.

I wondered how much of my life I had already wasted prefixing all the class names with “NS”, or how many more times I would have to chase down some arcane memory error only to find that I had forgotten to put a @ before my string.

Don’t get me started on strings in Objective-C either. I don’t know how the language designers at Apple can respect themselves when it takes almost 50 characters to do a replace (actual method signature: stringByReplacingOccurrencesOfString:withString:). Oh, and you have to give up backwards compatibility with 10.4 if you want to use it.

This only showed me just how much I loved Python. String slicing. Unicode strings. Dictionaries and Lists everywhere. Generator Functions. List comprehension and filters. Easy to understand/parse syntax. Lots of Third-party modules. With simple APIs. That are open-source. And are actively maintained. Mature. And well documented.
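For anyone who hasn't seen these features in action, here is a small sketch of a few of them. The sample values are made up for illustration, and the snippet assumes nothing beyond the standard built-ins.

```python
title = u"Gripe: XML in Python"

# String slicing: the last six characters.
language = title[-6:]

# Replacing a substring is a single short method call.
renamed = title.replace(u"Python", u"Ruby")

# A list comprehension with a filter.
languages = ["Python", "Ruby", "PHP", "Objective-C", "Java"]
short_names = [name.lower() for name in languages if len(name) <= 4]

# A dictionary built from a generator expression.
lengths = dict((name, len(name)) for name in languages)


# A generator function: yields one cleaned-up word at a time.
def words(text):
    for word in text.split():
        yield word.strip(":")


title_words = list(words(title))
```

Here language comes out as "Python", short_names as ["ruby", "php", "java"], and title_words as ["Gripe", "XML", "in", "Python"].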

I found myself writing far less code and accomplishing much more. In fact, I distinctly remember feeling excited about just how much functionality I had built in so little time (Django is partly to blame for this).

I had used Ruby a reasonable amount before, but I didn’t fall head over heels, largely because of its lack of maturity. The maturity of the documentation of both the core library and third-party modules is one of the most important features of a language for me, and something I missed tremendously coming from a Java background. In fact, I’ve rewritten the Ruby script that powered my Pitchfork Reviews gadget in Python. It took me about 30 minutes.

I’m in love.
