Andy Reitz (blog)

 

 

Integrating del.icio.us and Movable Type, in code


Previously, I mentioned that I had been working on some automation to link my del.icio.us account with my blog. Since that posting, you've probably noticed a lot of posts showing up here, containing links. I have been experimenting further with this code, and so far, I like how the link blogging is going. I have spent the last several nights cleaning things up, and now I have something that, while it isn't the cleanest code that I have ever written, is suitable to show the world: delish2mt. After the jump, I'll go into some more detail about the code that I have posted.

Overall structure:

The source code directory contains four files:

  • delish2mt.py - The main script. This is what you execute to run the automation.
  • delish2mt.cfg - The configuration file. Put all of the login and other details about your del.icio.us and MT blog here.
  • DeliciousWS.py - A helper module. Contains all of the code for interacting with del.icio.us.
  • MTBlog.py - Another helper module. Contains all of the code for interacting with Movable Type (refactored from my earlier sample code)

Executing the automation:

In order to actually run this thing, you'll want to download all of the above files to a directory on a machine that has network connectivity to both del.icio.us and your blog. Next, modify the configuration file, delish2mt.cfg. The settings that you need to change should be fairly apparent, and I have included comments in the file to describe them.
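To give a feel for what tailoring the configuration involves, here is a minimal sketch of an INI-style file being read with Python's standard configparser module. The section names, option names and values are all hypothetical placeholders that I made up for illustration; the real delish2mt.cfg documents its own settings in its comments, and may be laid out differently.

```python
import configparser

# Hypothetical layout: every section and option name below is an
# illustrative assumption, not the actual contents of delish2mt.cfg.
SAMPLE_CFG = """
[delicious]
username = example_user
password = example_password
tag = blog

[movabletype]
xmlrpc_url = http://example.com/mt/mt-xmlrpc.cgi
username = example_author
password = example_password
blog_id = 1
"""


def load_settings(text):
    """Parse INI-style configuration text into a dict of dicts."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


settings = load_settings(SAMPLE_CFG)
print(settings["delicious"]["tag"])
```

Note that configparser hands every value back as a string, so something like a blog ID would need an explicit int() before use.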

After you have tailored the configuration file to suit your environment, it is time to execute delish2mt.py. Simply executing it at the command line should cause the script to fetch the specified links from your del.icio.us account and post them to your blog. There are a few command-line options that you can specify:

usage: delish2mt.py --cfgfile=FILE [options]

options:
  -h, --help            show this help message and exit
  -f FILE, --cfgfile=FILE
                        Configuration file (default: 'delish2mt.cfg')
  -q, --quiet           don't print status messages to stdout
  -d, --debug           Print debugging messages to STDOUT.

If delish2mt.py isn't working properly, you can try enabling the '--debug' option. I haven't really worked on cleaning up the debug output - but combined with the source code, the debug output should help you to figure out what is going on.

The '--quiet' option might be useful if you are running this script from a cron job. In that mode, it should only print something if an exceptional condition occurs.
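For the curious, the options above could be declared roughly like this with Python's standard optparse module. This is only a sketch: the flags, defaults and help strings are copied from the usage text, but the use of optparse, the build_parser() helper and the dest names are my assumptions here, not necessarily how delish2mt.py is actually written.

```python
from optparse import OptionParser


def build_parser():
    """Build a parser mirroring the usage text shown above (a sketch)."""
    parser = OptionParser(usage="usage: %prog --cfgfile=FILE [options]")
    parser.add_option("-f", "--cfgfile", dest="cfgfile", metavar="FILE",
                      default="delish2mt.cfg",
                      help="Configuration file (default: 'delish2mt.cfg')")
    parser.add_option("-q", "--quiet", dest="quiet", action="store_true",
                      default=False,
                      help="don't print status messages to stdout")
    parser.add_option("-d", "--debug", dest="debug", action="store_true",
                      default=False,
                      help="Print debugging messages to STDOUT.")
    return parser


# Parse an explicit argument list instead of sys.argv, so that the
# example behaves the same way wherever it is run.
options, args = build_parser().parse_args(["--cfgfile=my.cfg", "--quiet"])
print(options.cfgfile, options.quiet, options.debug)
```

Passing an explicit list to parse_args() is just for demonstration; a real script would call it with no arguments and let it read the command line.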

A tangent into Unicode:

By far, the most difficult problem that I encountered when writing this program was dealing with Unicode. When cobbling this thing together, one of the initial links that I chose was a page on Brent Simmons' blog. If you look carefully at the title of that blog post, you'll notice that it contains curly quotes around the actual title of the post. Well, when posting this link to del.icio.us, NetNewsWire helpfully encoded those characters into their Unicode equivalents. Even more helpfully, del.icio.us has been programmed to accept (and display) Unicode characters. Even more helpfully, del.icio.us will send those same Unicode characters to you, URI-encoded, when you make a query through the API.

To Python's credit, I didn't even know that I was receiving Unicode characters back from del.icio.us at first. Somewhere along the way, Python was taking the URI-encoded Unicode characters, and converting them to actual, factual Unicode characters in memory. This was all great, until I got to the part of my code where I wanted to write this data back to del.icio.us. In order to do so, I needed to re-URI-encode these characters, so that they could go into the HTTP GET request back to del.icio.us. This is where things broke down for me, because none of the URI encoding routines that I found in the Python standard library could handle being passed Unicode characters.

Once this happens, and you begin to grasp your situation, you then realize that you are in "Unicode hell". In my case, I spent a few hours in this special hell, just trying to figure out how Python handles Unicode, so that I could figure my way out of it. As it turns out, what I found is that on Mac OS X, Unicode characters were converted to UTF-8, and the individual bytes were URI-encoded. I wasn't sure at the time whether this was the standard way to URI-encode Unicode, but it does match what RFC 3986 recommends: encode the characters as UTF-8, then percent-encode each byte. The next thing that I had to realize was that internally, Python was representing the Unicode characters that I got from del.icio.us as UTF-16. Once I realized this, I came up with a way to convert the UTF-16 characters to UTF-8. And from there, I was able to URI-encode each byte, and assemble my outgoing HTTP GET properly.

Whew!

The routine that I came up with can be found in DeliciousWS.py - look for a method called 'quoteUriUTF16()'.
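To make the idea concrete, here is a minimal sketch of the "encode to UTF-8, then percent-encode each byte" approach. It is written for Python 3 and is not the actual quoteUriUTF16() routine: the quote_utf8() name and the choice of which characters to leave unescaped (the "unreserved" set from RFC 3986) are my own assumptions for illustration.

```python
import string
from urllib.parse import quote

# Characters that RFC 3986 calls "unreserved" and that never need escaping.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def quote_utf8(text):
    """Percent-encode text: convert to UTF-8, then escape each byte.

    Hypothetical helper for illustration; not the original routine.
    """
    pieces = []
    for byte in text.encode("utf-8"):
        char = chr(byte)
        if char in UNRESERVED:
            pieces.append(char)                 # safe ASCII passes through
        else:
            pieces.append("%{:02X}".format(byte))  # everything else: %XX
    return "".join(pieces)


title = "\u201cHi\u201d"  # "Hi" wrapped in curly quotes
print(quote_utf8(title))  # -> %E2%80%9CHi%E2%80%9D
```

In modern Python none of this is necessary by hand: urllib.parse.quote(title, safe="") accepts Unicode strings directly, encodes them as UTF-8 by default, and should produce the same result.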

A few words about error handling (or the lack thereof):

There are probably more than a few rough areas in this code base. One of the roughest is going to be around error handling - I didn't do a lot of negative testing - as in, testing this code when things go wrong. As such, it will probably just barf up Python exceptions when something bad happens. I know that this is pretty ugly, but for right now it works.
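If the raw tracebacks bother you, one option is a thin wrapper around the main routine that turns any exception into a one-line message and a non-zero exit status, which also plays nicely with cron. This is only a sketch of the idea, not something that is in the posted code: run_guarded() and the stand-in task functions are hypothetical names.

```python
import sys
import traceback


def run_guarded(task, debug=False):
    """Run task(); return 0 on success, 1 if it raised an exception.

    Hypothetical wrapper for illustration; not part of delish2mt.
    """
    try:
        task()
    except Exception as exc:
        # One short line on stderr, so that cron has something to mail.
        print("delish2mt failed: %s" % exc, file=sys.stderr)
        if debug:
            traceback.print_exc()  # full traceback only when asked for
        return 1
    return 0


def working_task():
    """Stand-in for a run where nothing goes wrong."""
    pass


def failing_task():
    """Stand-in for a run that hits a network or login problem."""
    raise RuntimeError("could not reach del.icio.us")


ok_status = run_guarded(working_task)
bad_status = run_guarded(failing_task)
```

A real script would finish with something like sys.exit(run_guarded(main)), so that the exit status reaches the shell.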

Conclusion:

I think that's it! As with all software projects, it took almost as much time to polish this code base into something that I felt comfortable putting online as it took me to get the basic functionality working. But now that this is done, I can move on to my next project (oh, and add this thing to cron, too).

-Andy.
