I’ve told two people about this trick in the last few days, so it is worth writing it up.
It is hard to get a good distribution of requests from an HTTP load tester: it usually requires knowledge of the valid keys or users, plus a model of the distribution of accesses. All of that can be built into the load tester, but it seems to be a big barrier; I’ve rarely seen it happen. I’ve certainly never done it, and I’ve needed it several times.
The easy way to do this is to add a “random choice” parameter to the app. The app already knows the legal set of keys or users and can quickly make a choice. You already know the language and the code in the app, and the changes are localized to the URL parameter parsing. Let’s say you have a back-end server that returns records.
http://example.com/getRecords?key=12345&key=67890
http://example.com/getRecords?key=random&key=random
An HTTP load tester can access the single randomizing URL over and over again, and fetch different records each time. This is a trivial load test script. In Jakarta JMeter, it is one of the samples.
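The client side really can be this trivial. As a rough sketch (the URL is the hypothetical example above; nothing here is specific to JMeter), a loop in Python would do:

```python
import urllib.request

# The randomizing endpoint from the example above (hypothetical host).
URL = "http://example.com/getRecords?key=random"

def hammer(requests=1000):
    """Fetch the random-record URL over and over.
    Each response is a different record, chosen by the server."""
    for _ in range(requests):
        with urllib.request.urlopen(URL) as resp:
            resp.read()  # discard the body; we only care about the load
```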
This is really very easy to write inside the server. Getting a random key looks something like this, assuming we already have an instance of java.util.Random initialized and ready to go. (A Set has no positional get, so copy the keys into a List first.)

List<Object> keys = new ArrayList<>(cache.keySet());
key = keys.get(random.nextInt(keys.size()));
In Python, you can use the default, shared instance of the random source and the choice() convenience method (wrapping the keys in a list, since choice() needs a sequence):

key = random.choice(list(cache.keys()))
This can all be done in the code that parses the URL parameters. Once you have a random key, the remainder of the app executes with no changes.
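A minimal sketch of that parameter-parsing hook, assuming the cache is a dict of records keyed by ID (the names here are hypothetical):

```python
import random

cache = {"12345": "record A", "67890": "record B"}  # hypothetical record store

def resolve_key(param):
    """Map the 'key' URL parameter to a real cache key.

    'random' is replaced by a uniformly chosen existing key;
    anything else passes through unchanged."""
    if param == "random":
        return random.choice(list(cache))
    return param

def get_record(param):
    # The rest of the app runs exactly as before, on a real key.
    return cache[resolve_key(param)]
```

Everything downstream of resolve_key() is the unmodified application.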
If the app should be tested with a non-uniform distribution of accesses, that is also easy to do. Python’s random.paretovariate() looks especially good for Zipf (80/20, or “long tail”) distributions. Or you could duplicate that code in your favorite language:
def paretovariate(self, alpha):
    """Pareto distribution.  alpha is the shape parameter."""
    # Jain, pg. 495
    u = 1.0 - self.random()
    return 1.0 / pow(u, 1.0/alpha)
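One way to turn a Pareto sample into a skewed choice over keys (my own sketch, not from the original) is to clamp the variate into a list index, so the low indices, the “head,” get hit far more often than the tail:

```python
import random

def skewed_choice(keys, alpha=1.0):
    """Pick a key with Pareto-skewed probability: index 0 is the
    most popular, and the tail is rarely touched."""
    # paretovariate(alpha) is always >= 1.0, so subtracting 1 gives an
    # index >= 0 whose distribution is heavily weighted toward 0.
    index = int(random.paretovariate(alpha)) - 1
    return keys[min(index, len(keys) - 1)]
```

With alpha=1.0, index 0 is chosen about half the time, which gives a plausible long-tail access pattern.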
For user logins, add an option to masquerade as a random user, or even a random user from certain classes (big profile, frequent login, new user …).
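A sketch of that masquerade option, with hypothetical user pools grouped by class:

```python
import random

# Hypothetical user pools, grouped by class.
users_by_class = {
    "big_profile": ["alice", "bob"],
    "frequent_login": ["carol", "dave"],
    "new_user": ["erin"],
}

def random_user(user_class=None):
    """Pick a random user, optionally restricted to one class."""
    if user_class is None:
        pool = [u for users in users_by_class.values() for u in users]
    else:
        pool = users_by_class[user_class]
    return random.choice(pool)
```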
For testing search, I once made an especially fancy tester that would access a log of queries in order, but start at a different place for each client. This preserves the time locality of queries while giving each client a different set. I used a cookie to hold the per-client state, so that each client would access the queries in order from their starting place. It went roughly like this:
- If the client did not send a cookie, choose a random index in the log.
- Otherwise, read the cookie to get an index.
- Set the cookie to the next index.
- Wrap the index, modulo the log size.
- Run the search with the query at that index.
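The steps above can be sketched as a small function; the cookie machinery is omitted, and only the index bookkeeping is shown (names are my own):

```python
import random

def next_query(query_log, cookie_index=None):
    """Return (query, new_cookie_index) for one client request.

    No cookie: start at a random position in the log.  Otherwise use
    the cookie's index, wrapping modulo the log size, and hand back
    the next index for the client to store."""
    if cookie_index is None:
        cookie_index = random.randrange(len(query_log))
    index = cookie_index % len(query_log)
    return query_log[index], index + 1
```

Each client walks the log in order from its own random starting point, so time locality is preserved while the clients stay spread out.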
Now go test your software. I might need to use it someday.