Friday, September 14th, 2007
The Digg Oracle: Data mining on the client
Brian Shaler noticed that almost a year ago, Digg removed the “search your own Diggs” feature, to the dismay of thousands of Digg users. To explain
why the feature had not yet returned, they cited hardware and software
solutions as being very complicated and expensive.
Brian decided to re-implement the feature himself using the Digg APIs, and the result is The Digg Oracle:
Because the dataset is relatively small and user-specific, performing
tasks like searching/filtering and sorting can easily be done on the
client, using Google Gears. The tool downloads the selected user’s
entire voting history, indexes the stories in the local DB, then does
all the sorting/searching without connecting to Digg’s servers.
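
The download, index, and local-search flow described above can be sketched roughly as follows. This is only an illustration, not Brian's code: the real tool runs in the browser on Gears' SQLite-backed database, whereas this sketch uses Python's `sqlite3` module as a stand-in. The sample stories, the `diggs` table layout, and the `build_local_index` and `search` helpers are all hypothetical, and the Digg API download step is not modelled.

```python
import sqlite3

# Hypothetical sample of a user's dugg stories. The real tool would fetch
# these from the Digg API, which is not modelled here.
SAMPLE_DIGGS = [
    {"id": 1, "title": "Google Gears brings offline storage to the browser",
     "topic": "programming", "dugg_at": 1189700000},
    {"id": 2, "title": "Ten tips for faster SQLite queries",
     "topic": "programming", "dugg_at": 1189600000},
    {"id": 3, "title": "New photos from the Mars rover",
     "topic": "space", "dugg_at": 1189500000},
]


def build_local_index(stories):
    """Load the downloaded stories into a local (in-memory) SQLite database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE diggs ("
        "id INTEGER PRIMARY KEY, title TEXT, topic TEXT, dugg_at INTEGER)"
    )
    db.executemany(
        "INSERT INTO diggs VALUES (:id, :title, :topic, :dugg_at)", stories
    )
    # Index the column used for sorting so ordering stays cheap as history grows.
    db.execute("CREATE INDEX idx_diggs_dugg_at ON diggs (dugg_at)")
    db.commit()
    return db


def search(db, term, order_by="dugg_at"):
    """Filter by a title substring and sort, using only the local copy."""
    # Column names cannot be bound as parameters, so whitelist them instead.
    if order_by not in ("dugg_at", "title"):
        raise ValueError("unsupported sort column")
    rows = db.execute(
        f"SELECT title FROM diggs WHERE title LIKE ? ORDER BY {order_by} DESC",
        (f"%{term}%",),
    )
    return [title for (title,) in rows]


if __name__ == "__main__":
    local_db = build_local_index(SAMPLE_DIGGS)
    for title in search(local_db, "sqlite"):
        print(title)
```

Once the history is in the local table, every search or re-sort is just another query against that copy, which is why no further round trips to Digg's servers are needed.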
After an initial query, the application starts to download the user's usage data.

When the data is loaded, searching and filtering the data is extremely fast, even if you use Kevin Rose as your sample :) This is a great non-offline example of using the database and workerpool components.

Bah — crashed my copy of Firefox.
Very awesome