Monday, May 12th, 2008

Everything you wanted to know about String performance

Category: JavaScript, Performance

Tom Trenka has detailed a great analysis of JavaScript performance across the various browsers. This is important work, and it reminded me of the JVM days where people tried to use pools and such… to find out that they were bad for performance as newer VM technology came out. When you try to be too tricky you can end up in a bad state as new versions try to optimize the common task, and not your trick.

Eugene Lazutkin had a great explanation on Strings in languages:

Many modern languages (especially functional languages) employ “the immutable object” paradigm, which solves a lot of problems including the memory conservation, the cache localization, addressing concurrency concerns, and so on. The idea is simple: if object is immutable, it can be represented by a reference (a pointer, a handle), and passing a reference is the same as passing an object by value — if object is immutable, its value cannot be changed => a pointer pointing to this object can be dereferenced producing the same original value. It means we can replicate pointers without replicating objects. And all of them would point to the same object. What do we do when we need to change the object? One popular solution is to use Copy-on-write (COW). Under COW principle we have a pointer to the object (a reference, a handle), we clone the object it points to, we change the value of the pointer so now it points to the cloned object, and proceed with our mutating algorithm. All other references are still the same.

JavaScript performs all of “the immutable object” things for strings, but it doesn’t do COW on strings because it uses even simpler trick: there is no way to modify a string. All strings are immutable by virtue of the language specification and the standard library it comes with. For example: str.replace(”a”, “b”) doesn’t modify str at all, it returns a new string with the replacement executed. Or not. If the pattern was not found I suspect it’ll return a pointer to the original string. Every “mutating” operation for strings actually returns a new object leaving the original string unchanged: replace, concat, slice, substr, split, and even exotic anchor, big, bold, and so on.

Tom then went on to detail the rounds that he went through:

  • Round 1: Measuring Native Operations.
  • Round 2: Comparing types of buffering techniques.
  • Round 3: Applying Results to dojox.string.Builder.

There are a few surprises here, and Tom later concludes:

  • Native string operations in all browsers have been optimized to the point where borrowing techniques from other languages (such as passing around a single buffer for use by many methods) is for the most part unneeded.
  • Array.join still seems to be the fastest method with Internet Explorer; either += or String.prototype.concat.apply(””, arguments) work best for all other browsers.
  • Firefox has definite issues with accessing argument members via dynamic/variables

Erik Arvidsson reminds us of the reason to use push(): IE6 and it’s really bad GC.

I look forward to the IE 8 / FF 3 results too.

Posted by Dion Almaer at 8:34 am
Comment here

4.3 rating from 18 votes

Comments Here »

Comments feed TrackBack URI

Leave a comment

You must be logged in to post a comment.