Wednesday, November 21st, 2007

Wikipedia Offline with GearsMonkey

Category: Articles, Gears, Google, Offline, Showcase


Working on the Gears team, we also run across applications that we would love to take offline. Many of these applications aren’t Google’s, so we thought it would be nice to be able to take third-party apps offline. This also makes sense since Aaron Boodman (Mr. Greasemonkey) is co-tech lead on Gears itself!

Ben Lisbakken has written up his work combining Gears and Greasemonkey to make this happen. He details the real example of taking Wikipedia offline, which has one piece of gnarly code: iframe injection, needed to store data for third-party domains (e.g. the media site vs. the main Wikipedia site):

  1. Initialize Gears on page
    • Check if site has been allowed, if not, trigger allow dialog
  2. Insert iFrame
  3. Initialize Gears on iFrame
    • Check if site has been allowed, if not, trigger allow dialog
  4. If Gears is initialized on both, insert Cache Page link (unless page is cached)
  5. When user clicks Cache Page:
    • Capture the HTML and CSS of the Main Page
    • Store the URLs of all links to HTML, CSS, and media files in the Gears database (so we can remove them from the ResourceStore later, if needed)
    • Create an iFrame whose src is in the domain of upload.wikimedia.org. Pass all media file URLs to the iFrame in the src URL after the hash, e.g. src="http://upload.wikimedia.org/#thisimgloc.jpg||anotherimgloc.jpg||lastimgloc.jpg"
    • Initialize Gears in iFrame
    • Capture all URLs from the iFrame’s href hash.
  6. When user clicks [x] to remove an article from cache:
    • Grab all URLs from the Gears database that correspond to that article
    • Remove all URLs from the ResourceStore of the Main Page that contain the string “en.wikipedia.org”
    • Remove all URLs from the Gears database that correspond to that article
    • Create an iFrame whose src is in the domain of upload.wikimedia.org. Pass all media file URLs to the iFrame in the src URL after the hash, e.g. src="http://upload.wikimedia.org/#thisimgloc.jpg||anotherimgloc.jpg||lastimgloc.jpg||remove||"
    • Initialize Gears in iFrame
    • Remove all URLs from the ResourceStore that are listed in the iFrame’s href hash.
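
The hash hand-off in steps 5 and 6 can be sketched roughly as follows. This is a minimal illustration, not Ben's actual script: the helper names buildFrameSrc and parseFrameHash and the FrameRequest type are hypothetical, and the sketch assumes that no media URL contains the "||" separator. The main page builds the iFrame src, and the code running inside the upload.wikimedia.org iFrame reads the list back from its own location hash, treating a trailing "||remove||" as the signal to remove rather than capture.

```typescript
// Separator and removal marker, as described in the steps above.
const SEPARATOR = "||";
const REMOVE_MARKER = "||remove||";

// What the iFrame is being asked to do (hypothetical type for this sketch).
interface FrameRequest {
  action: "capture" | "remove";
  urls: string[];
}

// Hypothetical helper for the main page: packs the media URLs into the
// iFrame src after the hash, appending the marker for a removal request.
function buildFrameSrc(base: string, urls: string[], remove: boolean): string {
  return base + "#" + urls.join(SEPARATOR) + (remove ? REMOVE_MARKER : "");
}

// Hypothetical helper for the iFrame: unpacks its own location hash.
function parseFrameHash(hash: string): FrameRequest {
  // Drop the leading "#" if present.
  let body = hash.charAt(0) === "#" ? hash.slice(1) : hash;
  let action: "capture" | "remove" = "capture";

  // A trailing "||remove||" switches the request from capture to remove.
  if (body.slice(-REMOVE_MARKER.length) === REMOVE_MARKER) {
    action = "remove";
    body = body.slice(0, body.length - REMOVE_MARKER.length);
  }

  // Split on the separator and discard any empty entries.
  const urls = body.split(SEPARATOR).filter((u) => u.length > 0);
  return { action, urls };
}
```

In the iFrame, the parsed list would then be handed to the Gears ResourceStore to capture or remove, depending on the action. Passing the list in the hash is what lets the main page talk to a frame on a different domain without a direct script call.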

Here’s the script in action as I save pages away. This is just the beginning. Ideally, the code would automatically save content that you have been to, and do smart spidering to get more on the subject too. We will also work on making GearsMonkey scripts even easier to write.


Posted by Dion Almaer at 9:24 am

3.4 rating from 269 votes

10 Comments


Why is the rating so low?

Comment by Rizqi Ahmad — November 21, 2007

Hmm, isn’t this implementation of Gears just the browser caching wheel reinvented?

pun intended :P

Comment by mhr — November 21, 2007

That’s an awesome logo!! Who made that???

p.s. nice work!!

Comment by Pamela Fox — November 21, 2007

Very impressive. I’d like to see more of this.

Comment by Annabelle — November 21, 2007

The rating is so low because it uses cookies.
If someone turns off cookies, they can vote as many times as they want.
Whoever rated so many times only had to turn off cookies and refresh the page.

Comment by m — November 21, 2007

…that is freakin sweet. Thanks guys. (and cool logo)

Comment by Mark Holton — November 21, 2007

You can find a lot of stuff like this at userscripts.org

The thing I like most in Greasemonkey is cross-server Ajax requests!!

Comment by Bollysite — November 21, 2007

Q: Did you have a character from Star Fox commentate the video? I can’t understand a word being said.

Comment by Pete F. — November 22, 2007

I’d love this script to automatically store EVERY page on wikipedia I visit, so when I accidentally go offline, I can call up that page I saw last week. Hmmm.

Comment by Paul Irish — November 26, 2007

Yes, I agree with the last comment; it’s scary and overwhelming.

Comment by Aphrodisiac — July 31, 2008
