Wednesday, December 31st, 2008

Why Load Testing Ajax is Hard

Category: Testing

Today we are fortunate to have a guest post by Patrick Lightbody, most recently of BrowserMob fame (and previously known for his work on Selenium, OpenQA, WebWork, and more). Let’s hear what he has to say about load testing, and let him know your thoughts in the comments below:

I’ve been developing and testing complex web apps for a long time. I was the co-creator of WebWork (now Struts 2.0) and an early champion of DWR, writing one of the first AJAX form validation frameworks for Java web apps. But over the years, I noticed that as our web technologies and techniques got more sophisticated, our testing techniques were not keeping up.

That was why I founded OpenQA and helped grow Selenium into the popular testing tool it is today. Selenium helps with functional testing of complex AJAX apps, but there isn’t an equivalent for load testing, which is why I started BrowserMob, a new type of load testing service.

Traditional load testing

In order to achieve high levels of concurrency, traditional load testing tools (both open source and commercial) simulate many concurrent users by sending large numbers of HTTP requests at your web application. They record the traffic from a browser session and then require the load tester to tweak the generated script so that it works properly when played back X times concurrently.

A common problem is that the initial recording embeds cookie values tied to an individual session. Additional unique state might be encoded in hidden form fields, all of which requires some fine-tuning after the fact. If you’ve ever tried to run a load test, this is probably a very familiar process. It has worked reasonably well until recent years, but AJAX has made it even more difficult.
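
To make that fix-up concrete, here is a minimal Node.js sketch of one recorded step being replayed with a fresh per-user session cookie; the host, paths, and form fields are hypothetical, and no particular tool’s API is being shown:

    // A minimal sketch of the post-recording fix-up described above. The host,
    // paths, and form fields are hypothetical; the point is that the cookie
    // captured during recording cannot simply be replayed as-is.
    const https = require('https');

    // Issue one HTTPS request and resolve with its headers and body.
    function send(options, body) {
      return new Promise((resolve, reject) => {
        const req = https.request(options, (res) => {
          let data = '';
          res.on('data', (chunk) => (data += chunk));
          res.on('end', () => resolve({ headers: res.headers, body: data }));
        });
        req.on('error', reject);
        if (body) req.write(body);
        req.end();
      });
    }

    // One virtual user: log in to get its own session cookie, then replay the
    // recorded page request with that cookie substituted for the recorded one.
    async function virtualUser(id) {
      const login = await send({
        host: 'app.example.com', path: '/login', method: 'POST',
        headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
      }, 'user=loadtest' + id + '&password=secret');

      const cookie = (login.headers['set-cookie'] || [])
        .map((c) => c.split(';')[0])
        .join('; ');

      return send({
        host: 'app.example.com', path: '/account/summary', method: 'GET',
        headers: { Cookie: cookie },
      });
    }

    // Play the tweaked script back X times concurrently.
    Promise.all(Array.from({ length: 50 }, (_, i) => virtualUser(i)))
      .then(() => console.log('done'), (err) => console.error(err));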

Ajax + load testing = hard

The reason Ajax has complicated things is that it encourages more logic and state to run inside the browser session. This means that just watching the traffic across the wire doesn’t necessarily tell the full story. The richer an app gets, the more difficult it gets to simulate the exact effects of hundreds or thousands of users hitting your site.

This is the problem I decided to solve when I started BrowserMob. It’s on-demand, low-cost and uses real browsers to completely change the way load testing is recorded and played back.

Do real browsers really matter?

Real browsers absolutely matter. There are two major reasons:

  1. It simplifies the script creation process by letting you avoid the complexities and hacks required by traditional load testing tools.
  2. It ensures that you’ll see 100% of the traffic and load against your site that a real user would cause.

We’ll look at each of these in depth to see how using real browsers helps and how a service like BrowserMob compares to existing load testing technologies.

Simplifies script creation

AJAX is just about everywhere in modern web applications. We’re not necessarily talking about super-rich applications like Google Maps or Yahoo Mail; even simple sites like google.com now use advanced AJAX techniques. Google’s auto-complete is a good real-world example.

When the user types into the search box, the web browser executes JavaScript logic that in turn makes AJAX calls to Google’s search engine, asking for suggestions to display. It does this on every keystroke. This is the standard auto-complete control that most Ajaxian readers are very familiar with.
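
To make the behavior concrete, here is a rough sketch of the kind of browser-side logic involved; the element IDs and the endpoint path are assumptions for illustration, not Google’s actual code:

    // Sketch of the browser-side auto-complete logic only; element IDs and the
    // endpoint path are assumptions, not Google's actual code.
    var input = document.getElementById('search-box');
    var suggestions = document.getElementById('suggestions');

    input.addEventListener('keyup', function () {
      var term = input.value;
      if (!term) { return; }

      // One Ajax request per keystroke, asking for suggestions that match the
      // current prefix ("b", "ba", "ban", ...).
      var xhr = new XMLHttpRequest();
      xhr.open('GET', '/complete/search?hl=en&gl=us&q=' + encodeURIComponent(term));
      xhr.onload = function () {
        suggestions.textContent = xhr.responseText; // rendering omitted for brevity
      };
      xhr.send();
    });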

When recording a script with a traditional load testing tool, one of two things may happen here:

  • The recorder will see the AJAX traffic and capture it for playback in the load test
  • The recorder will not see the AJAX traffic and will only capture the request made when the user clicks the “submit” button

Obviously these Ajax requests cause real load, so we want to make sure they get played back in a load test. Let’s assume you’re using a tool, such as JMeter, that does capture the AJAX traffic.

The recorded traffic is effectively:

    http://clients1.google.com/complete/search?hl=en&gl=us&q=b
    http://clients1.google.com/complete/search?hl=en&gl=us&q=ba
    http://clients1.google.com/complete/search?hl=en&gl=us&q=ban
    http://clients1.google.com/complete/search?hl=en&gl=us&q=bana
    http://clients1.google.com/complete/search?hl=en&gl=us&q=banan
    http://clients1.google.com/complete/search?hl=en&gl=us&q=banana

Each keystroke adds one more character to the search term sent in the next request. For the moment, let’s ignore the requirement of validating the results that come back from the AJAX requests (they are usually in JSON or XML format and difficult to validate with most tools). Instead, let’s add a twist to the load test requirement for doing searches: the test must search for 100 different search terms.

Parameterization is a very common requirement, since it ensures that the load is realistic and that responses don’t get cached in any unnatural way. This means that in addition to searching for the term “banana”, we’re also searching for “apple” and “orange”, among others.

However, this also means your script can’t just blindly replay the previously recorded URLs, since those were tied to the “banana” term. Instead, it must request each successive prefix of the chosen search term, such as:

    http://clients1.google.com/complete/search?hl=en&gl=us&q=a
    http://clients1.google.com/complete/search?hl=en&gl=us&q=ap
    http://clients1.google.com/complete/search?hl=en&gl=us&q=app
    http://clients1.google.com/complete/search?hl=en&gl=us&q=appl
    http://clients1.google.com/complete/search?hl=en&gl=us&q=apple

Unfortunately, this is where even the best traditional load testing tools fall down. They don’t provide any help here, so it’s up to you to write complex scripting logic (if the tool even allows it) that breaks the randomly selected search term into successive prefixes and then issues an Ajax request for each one.
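
As a rough sketch of the logic you end up re-implementing (the term list is invented, and the requests are only logged here rather than issued through any particular tool’s API):

    // Rebuild, by hand, the request sequence the browser would have produced.
    var terms = ['banana', 'apple', 'orange']; // ...imagine 100 parameterized terms

    // Every successive prefix of a term maps to one suggestion request.
    function prefixUrls(term) {
      var urls = [];
      for (var i = 1; i <= term.length; i++) {
        urls.push('http://clients1.google.com/complete/search?hl=en&gl=us&q=' +
                  encodeURIComponent(term.substring(0, i)));
      }
      return urls;
    }

    // One virtual-user iteration: pick a random term and walk its prefixes.
    var term = terms[Math.floor(Math.random() * terms.length)];
    prefixUrls(term).forEach(function (url) {
      // In a real script this would be the tool's HTTP request call.
      console.log(url);
    });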

At this point, you’re basically rewriting the same logic that the web app developer wrote originally. If you’re a QA engineer, this may be difficult because you don’t know all the internal AJAX logic coded into the application. If you’re the developer, it’s still annoying because it’s tedious and likely done in a language other than the JavaScript you originally wrote the code in.

So how do real browsers help?

Because BrowserMob uses real browsers to both record and play back load, you don’t have to worry about simulating the logic inside the web browser. Instead, all you have to do is record the human interaction with the browser, such as typing in a randomly selected search term. BrowserMob then passes those instructions on to the hundreds or thousands of browsers participating in the load test, and those browsers in turn “do the right thing” and issue the proper AJAX requests.
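
The article doesn’t show BrowserMob’s own scripting interface, but the general idea of driving a real browser instead of replaying raw HTTP can be sketched with the selenium-webdriver package; the search-term list is an assumption:

    // Sketch only: drive a real browser and let the page's own JavaScript issue
    // the suggestion requests; the script never needs to know their URL pattern.
    const { Builder, By } = require('selenium-webdriver');

    const terms = ['banana', 'apple', 'orange']; // parameterized search terms

    async function run() {
      const driver = await new Builder().forBrowser('firefox').build();
      try {
        const term = terms[Math.floor(Math.random() * terms.length)];
        await driver.get('https://www.google.com');
        // Typing the term fires the page's keystroke handlers, which in turn
        // make the Ajax call for each prefix.
        await driver.findElement(By.name('q')).sendKeys(term);
      } finally {
        await driver.quit();
      }
    }

    run();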

And what if the underlying logic, such as the request URL pattern for those AJAX requests, changes? With traditional load testing it’s up to you to detect and fix the problem. If your test uses real browsers to play back the traffic, your script won’t need to change one bit: the new AJAX logic will be run by the browser in real time.

Ensuring realistic playback

We’ve seen how real browsers help with script creation, but what about playback? As we just learned, using real browsers simplifies recording and shrinks the amount of behavior coded into the script itself. This means we’re letting the real browser, the same type of program your end users will use, make the decisions about what requests to make.

For example, visit http://ebay.com and look at the featured items on the home page. Then reload the page and look again.

Notice a difference? The upper-right section displays a completely different set of images. That’s because eBay’s home page chooses what to display based on complex, multivariate logic evaluated at runtime. It is effectively impossible for a load tester to know in advance which images will be requested on any given page load.

It’s true that some load testing tools will try to parse pages in real time and figure out which images should be requested, but that’s hardly comforting when, as we just saw, they can’t handle even the simplest Ajax components. And as most AJAX developers know, resources such as images and stylesheets are increasingly likely to be loaded by complex JavaScript logic rather than referenced statically in the HTML page.
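
Here is a contrived example of that kind of runtime resource selection; the URLs and the selection rule are invented purely for illustration:

    // Which promo image gets requested is decided by script at runtime, so it
    // never appears as a static <img> tag for a wire-level tool to discover.
    var promos = ['/img/promo-shoes.jpg', '/img/promo-watches.jpg', '/img/promo-phones.jpg'];

    var img = document.createElement('img');
    img.src = promos[Math.floor(Math.random() * promos.length)];
    document.getElementById('featured').appendChild(img);

    // Stylesheets can be chosen the same way.
    var link = document.createElement('link');
    link.rel = 'stylesheet';
    link.href = Math.random() < 0.5 ? '/css/theme-a.css' : '/css/theme-b.css';
    document.getElementsByTagName('head')[0].appendChild(link);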

Instead, the only way to guarantee that every single object (image, JavaScript file, AJAX request, advertisement from an ad partner, etc.) gets requested is to use a real web browser during playback. While this is much more resource intensive, it is also a major time saver on both the front end, because scripts are much simpler to write, and the back end, because you can be confident that the most realistic possible load was produced.

So the next time you hear of load testing happening on one of your Ajax apps, make sure the people doing the testing understand the complexities and difficulties involved in testing a complex web app. Help them be on the lookout for the issues highlighted here.

Thanks to Patrick for writing this. Do you have something important to say? If so, contact us with your idea!


Posted by Dion Almaer at 9:01 am

7 Comments

I’ve had great success using a modified version of the powerful HTTPERF command-line program in load testing several AJAX applications I’ve built. I made a simple patch to the HTTPERF program to allow for the appropriate AJAX header per request as well as more custom headers:
http://www.overset.com/2008/03/27/load-test-ajax-applications-with-httperf/

My writeup still needs some serious work, but it shows the core of how easy it is to write a test case script to load test your AJAX application. Test case script creation is as simple as cutting and pasting URL requests, headers, and POST data from Firebug into a sequential flat file. And the scripts are so straightforward and simple to read that it’s easy to build a template script and simply edit the POST arguments to create another test case.

As for whether the browser matters or not: it doesn’t. You can script an HTTP test case “session” to download assets on demand. You can even add randomized delays per request. You can simulate a completely non-caching browser or a browser that caches certain assets on first load. The browser matters for unit testing, regression testing, and so on, but not for load testing, except as an aid in building test case scripts.

To simulate a non-serial, high-traffic environment, it couldn’t be easier with HTTPERF. I usually enlist several simple Unix workstations on several networks to run my tests concurrently. This even works great in a Unix/Xen “cloud” environment. For my last quick load test I simply instantiated several Amazon EC2 instances, loaded HTTPERF and all the test scenarios, and let her rip. This is of course biased toward a single connection, but it took all of 10 minutes to set up 3 virtual machines and begin load testing my application. Much more time was spent building the test cases, of course.

I’ll polish my horrible blog post a little more to help make this process easier.

Comment by num — December 31, 2008

Num,
Thanks for leaving a comment. I agree that testing AJAX sites with traditional load testing tools such as JMeter, HTTPERF, Apache ab, LoadRunner, etc. is certainly possible.

However, I disagree that it’s “easy” for most people. Based on your blog post, you have very deep technical knowledge, going so far as to patch a C program to achieve your goals. It’s fantastic that you have those skills, but I’m not sure the average developer wants to dedicate that level of effort to something as mundane as testing :)

While it certainly is possible to skip the browser, doing so introduces additional complexity that could be avoided if a real browser were used in the first place. What I’ve found is that the people responsible for performance testing (web developers, QA engineers, etc.) often have a very hard time simulating browser session traffic, especially when that traffic depends on complex state held in the browser. This article outlines those cases, which a tool like HTTPERF would still need complex scripting to handle.

It becomes especially difficult, as I show in the Google Suggest scenario, when there is a need to provide random/parameterized data to each simulated virtual user. If you aren’t running a browser (or at least a browser emulator, such as HtmlUnit), then your load test scripts will need to duplicate much of the same AJAX logic that affects state.

For those who don’t have the system resources necessary to run real browsers and aren’t interested in my service, HtmlUnit (http://htmlunit.sourceforge.net/) is a decent alternative. It doesn’t consume as many resources, so many more instances can run on a single machine. However, it also isn’t a perfect emulator and often chokes on complex JavaScript libraries such as Dojo and jQuery. The team is always working to improve this, so check with them for the latest status.

Thanks again for checking out the article!

Patrick

Comment by plightbo — January 8, 2009

I’ve done lots of load testing for a variety of companies and I will say that one of the biggest factors that determines a successful test is the skill of the tester(s).

Countless times I have seen projects where the testing team used a variety of self-service tools and either had problems getting the test to execute at all or executed the test incorrectly (problems setting up dynamic data sets, etc.).

For this reason, I think testing is best left to seasoned industry experts that know how to design, build and execute a successful test.

There are lots of self-service tools out there now but they are less expensive for a reason. They don’t come with the expertise and knowledge of how to build and execute a good test.

Comment by loadtester — January 15, 2009

I agree that my post is by no means an easy solution to implement; it’s really a kick start for hacking one together. I also fully agree that, from a cost-benefit standpoint, getting an expert is best for the majority of companies, considering how niche this type of testing is. I’ve had great luck contracting out to certain companies to do full-on, non-serial Ajax load testing for a great price. I really see my approach as more of a test suite that should always be part of a project prior to a full “good” test, possibly outsourced. I treat it as something almost as important as building unit tests alongside development. It’s important to have a good idea of how your application is going to handle load so that you don’t waste time and money on a good test; it’s always important to have a control in your experiment. Even without a good test, just having a rough idea of what to expect from your application, to aid in setting service standards, will sometimes suffice as a baseline.

I’d be interested to see what self-service products you have seen in this realm…

Comment by num — January 29, 2009

When I do performance testing, I’m not so much interested in “realistic” loads or timings as in finding out where the bottlenecks are. Does this approach help with that? For instance, am I seeing a 10-second delay on login because nothing was cached from the last session, because there are 10,000 elements in the DOM, because the web server is still running in debug mode, and so on?

Rather than run an elaborate load test simulation to see if a web app is performing badly, then pick through the code to find out why, I’d rather assume it’s going to be bad (because it always is) and go straight to reading the code…

Comment by NickTulett — February 13, 2009

Great post topic – very relevant! In my observation it may be more appropriate to state that performance testing Ajax applications is DIFFERENT – which means some people think it is hard. Instead, the equation could be stated as:

AJAX + LoadTesting = Change

Rather than testing being hard, it is CHANGE that is hard. This is especially obvious with individuals who have become comfortable with their traditional testing tools. I was one of these engineers who had been testing for so long in 2-tier and 3-tier Web 1.0 architectures. When I first started seeing new client architectures, web services, AJAX and n-tier systems – it was definitely a new challenge, and it meant that I needed to CHANGE. But this is exciting change – because with change comes learning and growth.

I do disagree with the point that “…even the best traditional load testing tools fall down,” because it is simply not true. It’s not accurate to suggest that existing testing tools are falling down just because they have virtual user or driver technologies for Web 1.0 architectures. We do have those older solutions, of course, but we also have new ones. For instance, LoadRunner has had support for AJAX testing for nearly 2 years, with a new driver and new scripting capabilities that seriously innovate the process. The “traditional” Web 1.0 solutions in our tool are maintained because they are still useful for older Web architectures. Even the traditional testing tools have new innovations available for customers who are ready for change.

Comment by mtomlins — March 12, 2009

mtomlins, I absolutely agree that change is part of the problem. It’s also getting tougher for non-engineers (those who didn’t _write_ the Ajax apps) to understand how to test these Ajax apps. There’s definitely a growing gap in education.

But I disagree that traditional load testing support for Ajax has been good. That’s not because they are bad products. I think it’s just because they are the _wrong_ products for this problem.

At the end of the day, LoadRunner can never provide great Ajax support because it doesn’t run a real browser in real time. Today’s Ajax apps are just too complex to get into the game of simulating protocol-level traffic. Sure, it can be done with some amount of work, but I just don’t believe it’s worth it.

Comment by plightbo — March 23, 2009
