Saturday, January 24th, 2009

Captcha cracking in JavaScript with Canvas and neural nets

Category: Canvas, Security

Everybody’s favourite glass shield to protect web apps is the CAPTCHA: the distorted characters displayed on a page that a user has to enter before gaining access or sending off a form. CAPTCHAs annoy normal users, are largely inaccessible to blind or dyslexic users, and are not as safe as we think they are. PWNtcha continually reports successful cracks of various CAPTCHAs on the web using OCR algorithms and backend systems.

What is pretty amazing, though, is that you can now even crack the images using JavaScript and Canvas. ShaunF wrote a Greasemonkey script that automatically solves the CAPTCHAs of the file hosting site Megaupload. There’s a demo of it available.

As John Resig explains in his analysis of the script, there’s some pretty nifty work going on:

  1. The HTML 5 Canvas getImageData API is used to get at the pixel data from the Captcha image. Canvas gives you the ability to embed an image into a canvas (from which you can later extract the pixel data back out again).
  2. The script includes an implementation of a neural network, written in pure JavaScript.
  3. The pixel data, extracted from the image using Canvas, is fed into the neural network in an attempt to divine the exact characters being used – in a sort of crude form of Optical Character Recognition (OCR).
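
To make those three steps a bit more concrete, here is a minimal sketch of the general idea. It is not ShaunF’s code: the function names (`toInputVector`, `forward`, `classify`), the brightness threshold of 128, the single-layer network and the hand-picked weights are all assumptions made for illustration. A real cracker would use weights trained on sample CAPTCHAs and would first split the image into individual characters.

```javascript
// Sketch only: names, threshold and network shape are illustrative
// assumptions, not taken from the Greasemonkey script.

// In a browser, the pixel data would come from something like:
//   const ctx = canvas.getContext("2d");
//   ctx.drawImage(captchaImage, 0, 0);
//   const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
// imageData.data is a flat array of RGBA bytes, four per pixel.

// Step 1: reduce RGBA pixels to one 0/1 input per pixel (1 = dark "ink").
function toInputVector(imageData, threshold = 128) {
  const { data, width, height } = imageData;
  const inputs = new Array(width * height);
  for (let i = 0; i < width * height; i++) {
    const brightness = (data[i * 4] + data[i * 4 + 1] + data[i * 4 + 2]) / 3;
    inputs[i] = brightness < threshold ? 1 : 0;
  }
  return inputs;
}

function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Step 2: one forward pass through a single fully connected layer.
// weights[j] holds the input weights of output neuron j.
function forward(inputs, weights, biases) {
  return weights.map((row, j) => {
    const sum = row.reduce((acc, w, i) => acc + w * inputs[i], biases[j]);
    return sigmoid(sum);
  });
}

// Step 3: pick the label whose output neuron fires most strongly.
function classify(inputs, weights, biases, labels) {
  const outputs = forward(inputs, weights, biases);
  let best = 0;
  for (let j = 1; j < outputs.length; j++) {
    if (outputs[j] > outputs[best]) best = j;
  }
  return labels[best];
}

// Tiny demo: a hand-made 2x2 "image" (an ImageData-like object) and
// hand-picked, untrained weights for two made-up classes.
const fakeImage = {
  width: 2,
  height: 2,
  data: new Uint8ClampedArray([
    0, 0, 0, 255,        255, 255, 255, 255,
    255, 255, 255, 255,  0, 0, 0, 255,
  ]),
};
const weights = [
  [1, -1, -1, 1], // responds to ink on the top-left/bottom-right diagonal
  [-1, 1, 1, -1], // responds to ink on the other diagonal
];
const biases = [0, 0];
const labels = ["backslash", "slash"];

console.log(classify(toInputVector(fakeImage), weights, biases, labels));
```

Because the functions only need an object with `data`, `width` and `height`, the same code works on a real `ImageData` from a canvas and on a hand-built test object like the one above.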

True, Megaupload’s CAPTCHA is pretty basic, but it is still very impressive that you can use JavaScript to crack it. Seems like the getImageData API is something to have a closer look at.

Posted by Chris Heilmann at 5:42 am


Excellent read! But it raises the question: are there systems that could crack a form that is *activated* with JavaScript (so the source doesn’t represent the true form)? I’m not up-to-date with the modern spiders…

Comment by oopstudios — January 24, 2009

lucky ajaxian login goes again :)

This is a bit complex, but I finally understood it… great…
Does anybody know something, or can suggest a link, about MooTools+CANVAS+[IPHONE|ANDROID]?
Do the browsers on those devices support it?

Comment by nunziofiore — January 24, 2009

Wow, it’s even better than the best PHP captchas! They should put it in one of those annual best captcha competitions. It’s inevitable that neural networks would one day surpass human perception.

Comment by Jordan1 — January 24, 2009

Though why not use Linear Algebra instead of Neural Networks…?
I assume NNs have way higher processing demands and still fewer correct hits…?

Comment by ThomasHansen — January 25, 2009

Because pattern recognition is what neural networks do well, and they are easy to implement. I have no idea how anyone could use “linear algebra” instead – I suppose it would be insanely complex.

Comment by ffreak — January 25, 2009

Agreed with ffreak, there is currently no better option than NNs for pattern recognition.
Still, it would be interesting to see how this works with reCAPTCHA, which is a project that uses the captchas entered by many distributed users to digitize old books.

Comment by Snyke — January 25, 2009

@Snyke and ffreak
You divide the image into n matrices, where n is the number of unique elements (characters) you find in it. Then you treat every matrix as a vector and check its “direction” against your precompiled “solution”…
Normally *WAY* faster than NNs, and it doesn’t lock up or have any of the other “great problems” that NNs come with…
As far as I know, it’s quite common for OCR and similar things…
Not sure how well it would work for captchas though, since they tend to “mess” with the surface, making it more difficult to recognize the pattern…
Don’t get me wrong, I think NNs in JavaScript sound hilariously cool, though I think the Linear Algebra solution would perform *seriously* faster (if possible)
Besides, it was a *question*, not an attack … ;)

Comment by ThomasHansen — January 26, 2009
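
One possible reading of the vector comparison ThomasHansen describes is plain template matching with cosine similarity: flatten each segmented character into a vector and pick the stored template whose vector points in the most similar direction. The sketch below assumes that reading. The names `cosineSimilarity` and `matchCharacter`, the 3x3 glyphs and the two-letter alphabet are all made up for illustration, and segmentation of the image into characters is assumed to have happened already.

```javascript
// Sketch only: the template format and function names are illustrative
// assumptions, not something taken from the comment or from any library.

// Cosine similarity: 1 means both vectors point the same way,
// 0 means they are orthogonal (no ink in common).
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // an all-zero vector has no direction
  return dot / Math.sqrt(normA * normB);
}

// Compare one segmented character against every stored template
// and return the best-scoring label together with its score.
function matchCharacter(glyph, templates) {
  let best = { label: null, score: -Infinity };
  for (const [label, template] of Object.entries(templates)) {
    const score = cosineSimilarity(glyph, template);
    if (score > best.score) best = { label, score };
  }
  return best;
}

// 3x3 glyphs flattened row by row (1 = ink); a hypothetical alphabet.
const templates = {
  I: [0, 1, 0,  0, 1, 0,  0, 1, 0],
  L: [1, 0, 0,  1, 0, 0,  1, 1, 1],
};

// An "L" with one pixel missing, standing in for captcha noise.
const noisyL = [1, 0, 0,  1, 0, 0,  1, 1, 0];

console.log(matchCharacter(noisyL, templates).label);
```

This is a single pass of multiplications and additions per template, which is where the speed claim comes from. As the comment itself notes, heavier distortion would shift the vectors enough to make this kind of direct comparison unreliable.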

I guess this marks the end of captchas!

Welp, let me just finish my post here, by answering this Spam Question located at the bottom of this form…

Comment by ilazarte — January 26, 2009

Why complicate things when there’s a ready-made Firefox plugin for this? Use Google and spend your time on smarter things.

Comment by Bigbx — August 14, 2009

The “spam question” is far worse at stopping spam than captcha :)

Most of the effective spammers use low-wage labor to post something kind of on topic, like “nice post, I agree with so and so… link”

Captchas have been dead for a long long time

Comment by sourceRoot — June 24, 2010
