As I have mentioned on Twitter, I have been re-watching the entirety of the TV show "Angel". And aside from taking joy in sucking up all of Kevin's free time (because I totally got him into this show), I have once again endeavored to get every episode of Angel ripped, encoded, and put into iTunes. And once again, this involves getting the proper metadata into each file. So like I did for "The Simpsons", I have once again scraped Wikipedia, in order to automate adding metadata to H.264 files:
The prerequisites, instructions for use, notes, etc. for these programs are pretty much the same as for the version that I wrote for "The Simpsons". So if you're interested in running this, I'll refer you back to there.
Unique to this implementation, however, is the fact that when I started, the "List of Angel episodes" page on Wikipedia didn't contain all of the information that I needed in one handy place. Instead, I was forced to write two parsers -- one to parse the listing page and harvest the links to each individual episode, and a second to parse each episode page and pull out the bulk of the metadata that I required. Because of this, I thought that there was going to be another aspect to this project -- I figured that I would need to write a program that would take the metadata that I harvested, and write it back out as wiki-formatted text, suitable for fixing the "List of Angel episodes" page. However, between the time I started on this project and now, the good denizens of Wikipedia have largely fixed the page.
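For the curious, here is a rough sketch of what that two-stage scrape can look like, using only Python's standard library. To be clear, this is not the code I actually ran: the class and function names are made up for illustration, and the assumption that episode links sit inside a table and that the metadata lives in simple header/value table rows is a simplification of the real Wikipedia markup.

```python
from html.parser import HTMLParser


class EpisodeLinkParser(HTMLParser):
    """Stage 1 (hypothetical): collect /wiki/ links found inside tables."""

    def __init__(self):
        super().__init__()
        self.table_depth = 0   # how many <table> elements we are inside
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_depth += 1
        elif tag == "a" and self.table_depth:
            href = dict(attrs).get("href") or ""
            # Keep internal article links only, and skip duplicates.
            if href.startswith("/wiki/") and href not in self.links:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "table" and self.table_depth:
            self.table_depth -= 1


class InfoboxParser(HTMLParser):
    """Stage 2 (hypothetical): turn <th>label</th><td>value</td> rows
    into a dict. Assumes one label and one value per row."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._cell = None      # "th" or "td" while inside such a cell
        self._label = ""
        self._value = ""

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._label, self._value = "", ""
        elif tag in ("th", "td"):
            self._cell = tag

    def handle_data(self, data):
        if self._cell == "th":
            self._label += data
        elif self._cell == "td":
            self._value += data

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._cell = None
        elif tag == "tr" and self._label.strip():
            self.fields[self._label.strip()] = self._value.strip()


def harvest_links(listing_html):
    """Return the episode links found in the listing page HTML."""
    parser = EpisodeLinkParser()
    parser.feed(listing_html)
    return parser.links


def parse_infobox(episode_html):
    """Return a label -> value dict from an episode page's HTML."""
    parser = InfoboxParser()
    parser.feed(episode_html)
    return parser.fields
```

The idea is to call harvest_links() once on the listing page, fetch each link it returns, and hand every fetched page to parse_infobox(). Keeping the two stages separate means that when one kind of page changes its layout, only one parser needs fixing.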
You can go back in time and compare the old version of the page to what is there now -- the new stuff is much better, and much more consistent. That being said, I did uncover a few minor data inconsistencies -- a few "Original Airdates" were in the wrong format on some of the episode pages, and some of the "production numbers" were wrong. I have since fixed these issues. It's amazing how much human effort goes into keeping data in sync between Wikipedia pages. I hope that the Wikimedia Foundation builds better tools for this in the future. For example, the data "info box" on each episode should be linked to the master table of all the episodes. If data is updated in an info box, it should automatically flow into the master table.
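Inconsistent air dates are the kind of thing a scraper can flag on its own. Here is a minimal sketch of that check, again not my actual code: the function name and the list of accepted formats are assumptions for illustration, not the exact formats I ran into, and the month-name parsing assumes an English locale.

```python
from datetime import datetime

# Hypothetical list of date layouts to accept; extend as needed.
KNOWN_FORMATS = ("%B %d, %Y", "%d %B %Y", "%Y-%m-%d")


def normalize_airdate(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    cleaned = " ".join(text.split())   # collapse stray whitespace
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    return None
```

Anything that comes back as None is a candidate for a manual look (and possibly a Wikipedia edit), and comparing the normalized date from an episode page against the one in the master table catches the two drifting apart.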
Tangent: What about the show, anyway?
To be honest, when I started the re-watching part of this project, I remembered that Angel was mostly good (especially Season 5), but on the whole, that the show wasn't as good as "Buffy". I couldn't have been more wrong. This show is crazy-good. It's a bit different than Buffy in several ways, however. I'd say that it's a lot more "soapy". Each episode ends on a mini-cliffhanger that makes you want to jump into the next episode to see what happens. And let me tell you, when you don't have to wait a week, or deal with DVD menus, but can just double-click on the next MPEG file? You're going to be watching a lot more "Angel", and probably miss a few bedtimes by a wide margin.
My second notion, that Season 5 was the only good season, was also wrong. I found that Seasons 1 through 3 are all very good, each easily recommended. I will admit that Season 4 is fairly weak -- but as I recall, Joss was overseeing 3 shows at this point ("Buffy", "Angel", and the beloved "Firefly"), and I think there was an issue finding a proper show-runner for "Angel". Nevertheless, there are still some gems in Season 4, especially episode 6, "Spin the Bottle". It's amazing that Joss found time in his schedule to write and direct an episode of "Angel" (sadly, something that he did rarely). What's even more amazing is how clearly the dialog was stepped up a notch. Even without seeing the credits, you can tell that Joss was writing it.
But what was I saying? Oh yes, great show, please watch. I have even made it (slightly) easier to get your legally-purchased DVDs into iTunes, so what is your excuse?
So, after all is said and done, what does it look like when 110 episodes of "Angel" land in your iTunes library? It looks a little something like this:
You can't really tell from the picture (well, maybe if you view the full size image), but I also went the extra mile and found high-quality cover art for all of my "Angel" episodes. Yet another disappointment in the current state of the DVD-ripping art, along with the lack of a CDDB equivalent, is the lack of good cover art. I went ahead and just scanned my DVD covers, cropped them down, fixed a few blemishes, and then pasted the resulting images into iTunes.
Kind of a lot of work for such a one-off (even though things do look quite nice in Front Row). But as a special bonus to you, dear reader, I am putting the images that I made online:
Anyways, I think that about does it for this post. I think I'm spending entirely too much time grooming my media files. Although, I am having fun mucking with Wikipedia, and I continue to be impressed by what a great resource it is. If I do this again, I'm going to have to spend some time seeing if there are better tools for working with the content stored in Wikipedia.