Tuesday, October 21st, 2008

Peppy: New CSS 3 selector engine

Category: CSS, JavaScript, Library

James Donaghue has released the first version of Peppy, his CSS 3 compliant selector engine, which runs independently of any particular library (and can thus be used with any of them).

He has some bold claims on speed:

As it stands now Peppy is faster than all other major JavaScript libraries with DOM querying capabilities (Prototype 1.6.0.3, JQuery 1.2.6, MooTools 1.2.1, EXT 2.2, DoJo 1.2.0, YUI 2.6.0). It is faster than Sizzle by John Resig and it also is cross browser (IE included). Take a look for yourselves, I have a profiling page set up here.

At 10k it is an ideal replacement for other excellent but bulkier libraries (whose feature sets span beyond DOM querying) when features additional to DOM querying are not needed in your web application. If you are designing your own JavaScript library or want to replace your existing library’s selector engine, then Peppy is an ideal candidate.

Comments, both positive and negative, are most welcomed and desired. I want to improve this thing! Please take a look here.

John Resig has also discussed a new selector engine that he is working on that is performing very well, and there has been much talk about having engines that can be shared resources for the Ajax libraries.

Posted by Dion Almaer at 8:44 am
45 Comments

4 rating from 597 votes

Great news. That profiling page shows how badly some libraries need to update their selector engine.

Comment by staaky — October 21, 2008

Looks great.

Comment by V1 — October 21, 2008

Interesting that Dojo 1.2 is faster than any of the other major toolkits listed. Peppy and Sizzle do look like great performance boosts.

What’s the license for Peppy?

One question for you… have you tried running the tests in a different order? In my experience with SlickSpeed, there seemed to be an advantage for whichever toolkit was listed first. Not that it should make a 2-3X difference, just curious if SlickSpeed is statistically accurate since it only runs each test once, etc.

Comment by Dylan Schiemann — October 21, 2008
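
One way to probe the ordering concern raised above is to run every engine several times, rotate which one goes first, and compare mean times. The following is only a sketch: `benchmark` and the shape of the `engines` map are hypothetical and are not part of SlickSpeed.

```javascript
// Hypothetical harness: `engines` maps a name to a query function
// that takes a selector string.
function benchmark(engines, selectors, runs) {
  const names = Object.keys(engines);
  const totals = {};
  names.forEach((name) => { totals[name] = 0; });

  for (let run = 0; run < runs; run++) {
    // Rotate the starting engine each run so no engine always goes first.
    const shift = run % names.length;
    const order = names.slice(shift).concat(names.slice(0, shift));

    for (const name of order) {
      const start = Date.now();
      for (const selector of selectors) {
        engines[name](selector);
      }
      totals[name] += Date.now() - start;
    }
  }

  // Report the mean time per run, in milliseconds.
  const means = {};
  names.forEach((name) => { means[name] = totals[name] / runs; });
  return means;
}
```

With only a handful of runs the means will still be noisy, so this addresses ordering bias rather than statistical significance.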

Looks great — I can’t wait to give it a run.

Comment by davidwalsh83 — October 21, 2008

Dion? Did you read the story you posted? ;-)

Story says…
>>It is faster than Sizzle by John Resig and it also is cross browser (IE included).

You say…
>>John Resig has also discussed a new selector engine

Or does John Resig have an even newer engine than Sizzle?

Comment by Nosredna — October 21, 2008

James, I wonder when your implementation invalidates the cache in IE, which supports none of the DOM mutation events, or in Safari, which supports them only partially. How do you get notified of DOM changes in these browsers?

Comment by caston — October 21, 2008

>>Interesting that Dojo 1.2 is faster than any of the other major toolkits listed.

I’m sure it depends on OS and browser.

On my new Vista laptop in Chrome, and on my old XP desktop with FF3, Dojo 1.2 was well down the list.

NOTHING is close to Peppy in IE6, and that’s where you might actually notice the gain. I’m sorely tempted to drop this in everywhere.

In IE6, Peppy is 188 ms, Dojo is 1509. Prototype 5233!

Amazing.

Comment by Nosredna — October 21, 2008

>>James, I wonder when your implementation invalidates the cache in IE, which supports none of the DOM mutation events, or in Safari, which supports them only partially. How do you get notified of DOM changes in these browsers?

That’s a great question. I noticed that the source code has a section you can uncomment if you want to have dynamic DOM.

Well, who doesn’t have a dynamic DOM nowadays? Can you discuss this?

Comment by Nosredna — October 21, 2008

Three big problems that I see:
1) He indiscriminately caches in all browsers – including IE – but *never* invalidates the cache. Since the DOM mutation events don’t fire in Internet Explorer all queries will forever be cached and not update on a requery. This is very, very, bad and will break lots of applications.
2) He copied parts of Sizzle and integrated them into his code without including the original copyright notice (in fact, it appears to have no license information, at all).
3) Where’s the test suite? How do you know that it “works in IE”? Sizzle is currently being run against the MochiKit and jQuery test suites so you can be sure in knowing that the code quality will stand up.

@Dylan Schiemann: The entirety of Sizzle was developed from the ground up, without using any code from other sources (developed 100% by myself), and is distributed under an MIT license. I’m already working with the MochiKit team on integrating it into their selector engine (along with jQuery) and would be happy to help integrate it into Dojo, as well. You can be sure that everything was developed by me and I’ll be happy to sign a CLA to make it possible to land.

Comment by JohnResig — October 21, 2008

John,

How’s the IE support coming along in Sizzle? You must be close if you’re working with the MochiKit team and prepping it for jQuery.

Comment by Nosredna — October 21, 2008

@Nosredna: It’s coming along well. I’m definitely planning on having a release before the end of the month. It’s been great getting to work with other teams on the integration – helps to round out a lot more of the edge cases (which is where 90% of the development time is spent).

Comment by JohnResig — October 21, 2008

John,

That’s great.

Does anyone ever look into those cases where SlickSpeed reports that different engines returned different results? Are those bugs or disagreements about the interpretation of the query? Do the browsers all agree on how to style those cases, or are there CSS selector disagreements in the browsers?

Comment by Nosredna — October 21, 2008

@Nosredna: Most of the time it’s disagreement on points that haven’t been standardized (like :contains – which was removed from the CSS spec). The introduction of querySelectorAll has helped to force everyone to fix their bugs (since they want to take advantage of the fast selector method – they have to standardize their test suite to match its results).

This disagreement is one point where having a unified selector engine would be really useful (since everyone would, then, support the same selectors).

Comment by JohnResig — October 21, 2008
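
The pattern described above, using querySelectorAll when the browser provides it and falling back to a library engine otherwise, can be sketched roughly as follows. `select` and `fallbackEngine` are hypothetical names; this is not how Sizzle or Peppy is actually structured.

```javascript
// `fallbackEngine` is a hypothetical function (selector, root) -> array.
function select(selector, root, fallbackEngine) {
  if (typeof root.querySelectorAll === 'function') {
    try {
      // Native path: copy the returned list into a plain array.
      return Array.prototype.slice.call(root.querySelectorAll(selector));
    } catch (e) {
      // The browser rejected the selector (for example a non-standard
      // one such as :contains), so fall through to the library engine.
    }
  }
  return fallbackEngine(selector, root);
}
```

Because both paths can be hit by the same application, the fallback engine has to return the same elements as the native method for every standard selector, which is the pressure toward standardization mentioned in the comment.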

Peppy: 19ms
Sizzle: 26ms

Well, who is hunting for these 7ms?

(FF/Windows XP)

Comment by Aimos — October 21, 2008

I’m wondering if “div[class^=exa][class$=mple]”
will make it into Sizzle or even makes sense at all?

Comment by Thasmo — October 21, 2008
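
For reference, the selector above chains two CSS 3 attribute selectors: `[class^=exa]` requires the whole attribute value to begin with "exa" and `[class$=mple]` requires it to end with "mple". A minimal sketch of that matching rule, with a hypothetical helper name:

```javascript
// Returns true when `value` satisfies both [attr^=prefix] and [attr$=suffix].
// Per the CSS 3 Selectors spec, an empty prefix or suffix matches nothing.
function matchesPrefixAndSuffix(value, prefix, suffix) {
  if (typeof value !== 'string' || prefix === '' || suffix === '') {
    return false;
  }
  return value.startsWith(prefix) && value.endsWith(suffix);
}

matchesPrefixAndSuffix('example', 'exa', 'mple'); // true
matchesPrefixAndSuffix('exam', 'exa', 'mple');    // false
```

Note that the test applies to the full attribute string, not to individual class names within it.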

@Thasmo: Good catch, I just landed a fix for that:
http://github.com/jeresig/sizzle/commit/f74de835e1cd349bd4a044a3756b2f02ee96929a

Comment by JohnResig — October 21, 2008

@Dylan Schiemann: Yes, I have run these in different orders with the same results. I have packaged up everything in a zip file available here: http://jamesdonaghue.com/static/peppy/Peppy.zip. I have noticed that results in SlickSpeed can vary and I have tried to account for this in the numbers on my post by running multiple times in each browser and taking the average for all runs.

@caston and @Nosredna: There was a bug in the first release, which I have already fixed. I hope that #1 below in my response to JohnResig answers your question. If not, please let me know.

@JohnResig:
I am a true greenhorn when it comes to releasing code, this is my first time. I would like to begin by saying that I have the utmost respect for you and your work. I certainly didn’t mean to step on any toes. I just wanted to share something that I am excited about in hope to spur more innovation in the community.

In answer to your three concerns:

1. I have found a fix and it is now available here http://jamesdonaghue.com/static/peppy/. To remedy this problem I simply record document.getElementsByTagName("*").length when the script is first called (only in browsers that don’t support DOM mutation events). On every subsequent call to Peppy (again, only in those browsers) we check whether document.getElementsByTagName("*").length has changed. If it has, we clear the cache. This still maintains a significant speed gain over the other libraries while managing to use caching.

2. License: This is my first public release of source code, and I apologize for the lack of a license. I want this to be completely open for use in any way that one can imagine it useful. I will choose a license by the end of the day today.

Copying code: That is a pretty harsh statement. I wrote 100% of Peppy and I most certainly did not copy any other source code. In fact my library is structured quite differently from any of the other libraries (yours included), and this is where considerable portions of the speed gains were achieved. I am curious which sections specifically you are talking about. Are you referring to the usage of .nodeIndex in the :nth-child implementation? This is also found in EXT, … and that is where the similarity ends. My calculation for determining nth validity is completely different from any other solution out there, as is the rest of the implementation. Perhaps you are referring to the cache invalidation code? This is similar (not identical), and I reference you in the code, but how many different ways are there to write this?

Sizzle is a great library and the advances that it has made are fantastic and have most certainly inspired me, as many of your other projects also have. I was hoping for friendly competition, undoubtedly expecting you (and other library authors) to make more advances, forcing me in turn to advance mine further, leaving the community as a whole with faster selector engines.

3. For quality testing of Peppy I mostly used a modified version of the jQuery test suite. I would be happy to make this available (it passes 129 tests in all browsers); find it here http://jamesdonaghue.com/static/peppy/unit/.

Comment by jamesdonaghue — October 21, 2008
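
The element-count check described in point 1 of the comment above can be sketched as follows. This is not Peppy's source: `createCountCheckedCache` and `engine` are hypothetical names, and the sketch only assumes a document-like object exposing `getElementsByTagName`.

```javascript
// Hypothetical cache wrapper; `engine` is any function (selector, doc) -> array.
function createCountCheckedCache(engine, doc) {
  let cache = {};
  let lastCount = doc.getElementsByTagName('*').length;

  return function query(selector) {
    const count = doc.getElementsByTagName('*').length;
    if (count !== lastCount) {
      // The element total changed, so assume the DOM changed and start over.
      cache = {};
      lastCount = count;
    }
    if (!Object.prototype.hasOwnProperty.call(cache, selector)) {
      cache[selector] = engine(selector, doc);
    }
    return cache[selector];
  };
}
```

The check costs one `getElementsByTagName("*")` call per query, and it only notices changes that alter the total number of elements.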

I’ve just run the SlickSpeed test on five major browsers (IE 7.0.5730.13, FF 3.0.3, Chrome 0.2.149.30, Safari 3.1.1 build 525.17, Opera 9.52 build 10108). The OS is Windows XP SP3. The results suggest that Peppy is the winner.

Final time in ms (lower is better)

        Peppy  Sizzle  EXT  Dojo  JQuery  MooTools  Prototype   YUI
FF         44      42  130   146     192       162        326   395
IE        113     350  306   731     450       641       1887  1244
Safari     18      89   85   175     140       110         86   581
Chrome     21      69   57   115     121        86         77   334
Opera      27      18   79    72     124       165        230   437
Total     223     568  711  1239    1027      1164       2606  2991

On IE, Sizzle fails in four cases (div ~ p, h1[id]:contains(Selectors), a[href][lang][class], div[class]).

Comment by TanNhu — October 21, 2008

I just wanted to let everyone know that I have chosen the FreeBSD license (http://www.freebsd.org/copyright/freebsd-license.html).

Comment by jamesdonaghue — October 21, 2008

Btw, WHY does everything have to have a license these days, even if it is free? FreeBSD, Apache, GPL2, GPL3, LGPL, CC, ….

20 years ago, someone would just declare his work “public domain” and everything was fine. We need more of that, when the author really wants to set his source free into the wild and IF he really doesn’t care what will be done with it.

ps: I do understand the intention of the GPL. The whole get something and give something back thing (which is great).

Comment by Aimos — October 22, 2008

        Peppy  Sizzle  EXT  Dojo  JQuery  MooTools  Prototype   YUI
IE6        84     653  425   968     650       667       2926  3239

Something DOES make peppy very fast.

@JamesDonaghue, @JohnResig:

You both probably should think of some kind of interface as a layer between JavaScript frameworks and selector engines, so that in the future every JSFW can switch to the next big thing if needed.

And stop using spaces before and after your brackets. :-p

Comment by Aimos — October 22, 2008

So the hack using ‘document.getElementsByTagName("*").length’ completely works? It seems a bit imprecise, because no one else has come up with this idea yet?!

Comment by Thasmo — October 22, 2008

If you remove 2 DOM elements and add 2 new elements in their place, or just change attributes, then the document.getElementsByTagName('*').length check would fail.

Or am I missing the “bigger” picture here.

Comment by V1 — October 22, 2008

Thought of that too … that’s why I’m asking.
Better way would be calculating a hash value (md5) or something to really check if something changed, but I guess that isn’t really fast or doable at all.

Comment by Thasmo — October 22, 2008

I just downloaded sizzle out of curiosity and don’t actually see any copyright notice, nor any indication of any license at all :-)

Comment by heswell — October 22, 2008

Am I the only one who’s thinking that these selector engines don’t need any more optimizing? What I mean is: who here really concretely notices an immediate and clear improvement in real applications when they switch selector engines to a faster version?

Comment by Joeri — October 22, 2008

@Joeri, it just depends on what you use your selectors for. They are really handy for building unobtrusive JavaScript. Instead of spamming a shit load of ids on your elements, you just run the selector query and gather the elements.

Comment by V1 — October 22, 2008

@V1, I understand the point of selectors, I’m just wondering whether they really are the bottleneck in real-world web applications. Wouldn’t it be better, for example, to invest that effort in finding optimized ways of doing layout in javascript?

Comment by Joeri — October 22, 2008

I guess it’s both?!

Comment by Thasmo — October 22, 2008

@Aimos: You both probably should think of some kind of interface as a layer between JavaScript frameworks and selector engines, so that in the future every JSFW can switch to the next big thing if needed.
- This interface does exist (Selectors API, http://www.w3.org/TR/selectors-api/). Please respect it.
@Joeri: Am I the only one who’s thinking that these selector engines don’t need any more optimizing?
- I share your concern. There is absolutely no need to take these selector engines to a new level with respect to their speed. Instead, start writing efficient code (scripts written against jQuery quite often lack efficiency; go to StackOverflow to check my words).

Comment by SergeyIlinsky — October 22, 2008

@jamesdonaghue: Glad to hear that you took a lot of inspiration from my code – that makes me feel good :-) Just to respond to your couple points:

1) Unfortunately that doesn’t work as you would expect it to. Imagine a div that contains a single bold element – then you do this: div.innerHTML = "<i>foo</i>"; (that is, an italic element). This won’t affect the * length. Currently there is no way to construct a reliable caching behavior in IE. If someone finds a technique then I’d love to hear about it.

2) That’s fine – glad to see that you’ve picked the BSD license, as well! I wasn’t trying to be nitpicky, but it’s something good to be aware of when writing and releasing open source code. (As far as the specific code goes I was referring to the two portions at the bottom of the code base.)

3) That’s great to hear about the test suite, as well! A quick run shows it passing in Firefox 3 but failures popping up in Safari 3.1, WebKit nightly, and in Opera 9.5 (just the browsers that I had on hand).

@heswell: Well, Sizzle isn’t released yet. Hence the large “—- It’s a work in progress! Not ready for use yet! —-” notice – although I’ll be releasing it under the MIT license once it’s complete.

Comment by JohnResig — October 22, 2008
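
The failure case in point 1 above can be reproduced without a browser. The sketch below uses a plain array of tag names as a stand-in for a document; all names in it are hypothetical.

```javascript
// Stand-in for a document: the list of tag names currently in the tree.
let tags = ['div', 'b'];                     // <div><b>foo</b></div>
const countAll = () => tags.length;          // getElementsByTagName("*").length
const queryTag = (name) => tags.filter((t) => t === name);

// Cache a result together with the element count seen at that moment.
const cached = { result: queryTag('b'), count: countAll() };

tags = ['div', 'i'];                         // div.innerHTML = "<i>foo</i>"

// Still two elements, so a count-based check sees no change...
const looksUnchanged = countAll() === cached.count;
// ...yet a fresh query no longer finds the bold element.
const fresh = queryTag('b');
const isStale = cached.result.length !== fresh.length;
```

Here `looksUnchanged` and `isStale` are both true: the count check reports no change while the cached result is out of date.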

Maybe a new approach is needed. There are applications that need to know every element at every possible time, and some that don’t. I guess developers will accept manually firing events for IE6 :D

Comment by Aimos — October 22, 2008

Or you could implement an option that lets users choose IF they want to cache the constructor. As users of the selector, they must surely know IF the selectors will be affected by updates that they are planning on the client side.

Comment by V1 — October 22, 2008
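
The two suggestions above, opt-in caching and manual invalidation, could be combined along these lines. This is a sketch under assumed names (`createSelector`, a `cache` option, an `invalidate` method), none of which come from Peppy or Sizzle.

```javascript
// Hypothetical wrapper; `engine` is any function (selector) -> array.
function createSelector(engine, options) {
  const useCache = Boolean(options && options.cache);
  let cache = new Map();

  function query(selector) {
    if (!useCache) return engine(selector);
    if (!cache.has(selector)) cache.set(selector, engine(selector));
    return cache.get(selector);
  }

  // Callers invoke this after changing the DOM, since the wrapper
  // makes no attempt to detect changes on its own.
  query.invalidate = function () {
    cache = new Map();
  };

  return query;
}
```

The trade-off is that correctness now depends on the caller remembering to call `invalidate()` after every DOM update that could affect a cached selector.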

“- This interface does exist (Selectors API, http://www.w3.org/TR/selectors-api/). Please respect it.”

Either this document has some issues with the English language, or it’s just as stupid as it reads.

Comment by Mikael Bergkvist — October 22, 2008

OK. This thing had something like 4 stars, then all of a sudden a flood of what must have been single-star votes.

Childish mechanical voting detected.

This guy put some work into speeding up the web for everyone. He deserves some praise for that. If you have specific problems with his implementation, tell him. I bet he’ll fix the problems.

Really, people. Shameful. At the very least he has everyone wanting more. Is that so bad?

Comment by Nosredna — October 22, 2008

On a side note, it seems my comment over on James’ blog got deleted for some reason.

As for the cache invalidation, there was a question on StackOverflow related to it and one of the ideas that popped up was to check the onpropertychange event in IE. Have you guys (John/James) or anyone else tried playing with that?

Comment by LeoHorie — October 22, 2008

Another idea that was presented for cache invalidation was to check for innerHTML, which is nowhere near ideal, but that apparently works.

Comment by LeoHorie — October 22, 2008

I think it could be solved by comparing the (filtered) innerHTML value.
you should cache the innerHTML like this
cache = innerHTML.replace(/\>[^\<]*\<’).replace(/\]*\>/g, ”).replace(/^\s*(.*)\s*$/, ‘$1′)

Comment by rizqi — October 23, 2008
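
The expression in the comment above did not survive intact, so the following is not a reconstruction of it. It is a minimal sketch of the general idea, assuming the intent was to compare a "tag skeleton" of the markup with text content stripped out; `markupSignature` and `hasStructureChanged` are hypothetical names.

```javascript
// Assumed intent: reduce markup to its tag skeleton so text edits are ignored.
function markupSignature(html) {
  return html
    .replace(/>[^<]*</g, '><') // drop any text between one tag and the next
    .trim();
}

// Compare the current markup against a previously stored signature.
function hasStructureChanged(previousSignature, html) {
  return markupSignature(html) !== previousSignature;
}
```

A signature like this deliberately ignores text changes, so selectors that depend on text (such as :contains) would still see stale results, and reading innerHTML on every query has its own cost.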

Have you considered the time it takes to call innerHTML? It’s not as fast as you think.

Comment by TNO — October 23, 2008

@Thasmo, @V1, @JohnResig:
Yes, the solution that I chose (for DOM change/caching in IE) will not work. It only took into account additions/removals of DOM elements that resulted in a different total element count, and did not take into account DOM changes that leave the element count unchanged (which is most likely the more common case). It was a poor solution and will be removed. Also, I will disable caching altogether in IE and continue the desperate search for an alternative solution, although as @JohnResig points out there probably isn’t a performant one.

@LeoHorie:
Sorry about that. I noticed that I had no comments there, just thought they were all coming here:) It is a brand new blog, I will look into this and fix it. Thanks for your comments though!

@LeoHorie, @rizqi, @TNO:
Great thought; however, @TNO is right: innerHTML is not very performant and unfortunately takes so much time that it would be faster (in most cases) to not use caching at all.

@JohnResig:
response to #2: Thank you for making me aware of this. I obviously was a bit naive with respect to licensing and how this works in the Open Source community. Since I had modified the code and provided a reference to your original source, I incorrectly thought that was enough. I will add your copyright to these two sections of code. Is this ok?

response to #3: I made some last minute optimizations, just prior to releasing, and apparently this broke some things in Opera and Safari. Currently the release fails 9 tests in Safari and 3 tests in Opera. I am planning another release on Saturday or Sunday that will address these bugs as well as the caching issue for IE.

Takeaways for me are:
- That I should never release optimizations or other changes without testing again in all browsers (I already know this, however I was in a whirlwind of excitement to release and unfortunately compromised this important development principle).

- The test suite should contain some dynamic DOM changes, this would have uncovered the IE caching issue for me earlier.

@Everyone: Thanks for all of the posts and interest! Already the feedback has been more than I had hoped for and Peppy will be much better for it.

Comment by jamesdonaghue — October 23, 2008

With respect to cache invalidation, simply adding event listeners for mutation events can slow down the entire page:
http://www.oxymoronical.com/blog/2008/10/How-extensions-can-slow-down-Firefox-my-dirty-little-secret

Is maintaining a cache worth the reduction in DOM mutation speed?

Comment by jkd — October 24, 2008

I have fixed the unit test failures in both Safari and Opera. Both were quick fixes and they can be found in the latest maintenance release of Peppy version 0.1.2 which can be found here http://jamesdonaghue.com/static/peppy/. So Peppy now passes 131 unit tests in all of these browsers!

I have also disabled caching in IE.

Comment by jamesdonaghue — October 26, 2008

John Resig says this of Sizzle…

“Currently this engine is expected to become the new default selector engine of jQuery, MochiKit, Prototype, and Dojo.
Please contact me if you’re interested in working on integrating Sizzle into your library.”

Any way you can get it into YUI, John? Their selectors are the all-time worst.

Comment by Nosredna — October 28, 2008

Neither Peppy nor Sizzle returns “document ordered” result sets; this is one of the reasons they are fast. Slicing down the DOM tree is faster but will yield results that are not compatible with either XPath or the newer Selectors API (querySelectorAll).

This is a problem, for example, when trying to collect text from different elements. The texts will not be in the original document order; they will be randomly ordered. It will also be difficult to build unit tests.

Another reason they are fast is that the checks they do are really minimal. For example, “nodeName” could be lower or upper case, “className” can contain white-space (LF and CR too), and element IDs in FORMS may be easily overwritten by bad HTML code.

However, I would very much like to have James lend me his optimizations. :-)

Good job James, keep improving your code.

Comment by dperini — November 6, 2008
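
To illustrate the document-order point above: a result set can be put back into document order by numbering nodes in a depth-first walk and sorting on that number. This is a sketch under assumptions, not code from either engine; `sortInDocumentOrder` is a hypothetical name, and every node in `nodes` is assumed to be `root` or one of its descendants.

```javascript
// Works on any tree whose nodes expose an indexable `childNodes` collection.
function sortInDocumentOrder(root, nodes) {
  const position = new Map();
  let counter = 0;

  // Depth-first, pre-order walk: the order in which nodes appear in the source.
  (function walk(node) {
    position.set(node, counter++);
    const children = node.childNodes || [];
    for (let i = 0; i < children.length; i++) {
      walk(children[i]);
    }
  })(root);

  // Sort a copy so the caller's array is left untouched.
  return nodes.slice().sort((a, b) => position.get(a) - position.get(b));
}
```

The walk visits the whole tree, which is part of the cost an engine avoids by skipping ordering. In browsers that implement it, the DOM's own `compareDocumentPosition` method can serve as the comparison instead of a full walk.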

Wow…. I need a cup of Herbal tea and a little lie down after reading all that. Very useful – I think???

Comment by Remedies — November 19, 2008
