Monday, May 18th, 2009>p>The Crap I missed it! crew took on the task of dealing with importing your iTunes XML file, and wanting to give you responsive feedback on the items as they come in. The usual tactic would be to suck in the entire file, and then process it.
Michael Baldwin did more, and here he tells us more:
We wanted to allow users to upload their iTunes library files, so that we could extract artist names (to let users sign up for new album and concert notifications). The problem was that these .xml library files can easily run up to 20MB in size.
Which means 1) long, boring downloads without feedback that it’s really working, 2) huge space requirements on our servers to support lots of concurrent uploads, and 3) big memory requirements to process the XML files.
What we did instead was to write a bare-bones custom web server just for this task in PHP (yes, PHP) which analyzes the file as it streams in (storing nothing on disk, and using negligible memory), gradually puts artist statistics into shared memory, and then responds to new AJAX requests every ten seconds which retrieve and remove the artist statistics from shared memory.
The end result for users: as users upload their library file over the course of several minutes, they get to watch their web page fill up with their list of artists at the same time, sorted and even animated. If the connection breaks, they can even choose to continue with just the artists that made it.
The end result for us: we can deal with gigabytes of uploads while using trivial computing resources — just a couple KB of incoming buffer space and a couple KB of outgoing buffer space.
You can check it out yourself, where it will look a bit like this (but constantly updating!)