Progressive Reranking

Occasionally, people want to rescore search results and rerank them after they are returned from the search engine. The usual answer is that it is very slow because you need to fetch the full list of results, score them, then sort by the new score.

You don’t really want to rerank the full list. If your search engine is returning very useful results at the end of the list, you have big problems, bigger than you can fix with reranking. The normal case is that you are moving some results up and down a few positions, and users will only look at the first ten results. So let’s look at how to do that efficiently.

A Pull Model for Search Results

The Ultraseek XPA Java library has classes to combine results from multiple engines and rerank them by global tf.idf scores. This class can be used for other kinds of reranking. You can use write your own code, of course, but I’ll use the XPA classes for this example.

CompositeSearchable (and CompositeSearchResultList) reads results into a priority queue, ordered by score. When client code requests a result from CompositeSearchable, that is removed from the queue, added to a “frozen” list of scored results, the PQ is replenished with one new result, that result is scored and inserted into the queue (positioned according to its score), then the requested result is returned.

The frozen list allows downstream searchables to treat the already accessed portion of the search results as read-only and unchanging.

This is a lot faster than it would seem, because the next result is usually already locally cached in the local UltraseekServer class because it was part of the chunk of results requested from the server.

If a CompositeSearchable is merging results from different sources, the replacement result always comes from the same source as the one moved to the frozen list. If all the good results are from one collection, this replacement policy guarantees that they will all be shown.

Applying Score Adjustments

Assume we have a score for each document that represents a priori goodness. Perhaps it is a measure of how long it would take monkeys to type that document (MonkeyRank). This score ranges from 0..1 with 1 meaning good.

SearchResult already has a document quality measure which matches this, with a range of -1.0..+1.0. We can use the default result scorer, but replace (or modify) Ultraseek’s document quality with our own MonkeyRank.

We do this with SearchResultWrapper, which delegates all methods to a wrapped SearchResult. We’ll scale the MonkeyRank value to a range of -0.2..+0.2. This is a bit more powerful than the Ultraseek quality score, which goes from -0.16..+0.15.

class MonkeySearchResult extends SearchResultWrapper {
public float getQuality() {
float mr = MonkeyRank.getRank(baseResult.getURL());
float scaledMR = (mr * 0.4) - 0.2;
return scaledMR + baseResult.getQuality();
}
}

Then we make a MonkeyResultScorer class. It wraps the search result, then uses DefaultResultScorer.

class MonkeyResultScorer implements ResultScorer {
public float score(SearchResult sr, SearchResultList srl) {
return CompositeSearchable.RESULT_SCORER.score(
new MonkeySearchResult(sr),
srl);
}
}

The only thing left is to figure out the maximum number of rank positions that a score boost can move a result up the list. That determines the read-ahead level. We only care about movement to a more relevant rank (earlier in the list). Pushing results down the list is much easier.

Now you need to set the read-ahead. If the read-ahead is too small, then some good results will not be re-ranked as high as they should be. If the read-ahead is too big, performance will suffer because you will need to fetch more results from the engine before the first one can be shown.

Here are a few ways to guess at the read-ahead size. You’ll probably need to do these for quite a few queries in order to get good numbers. I’d use the top 100 single-word and top 100 multi-word queries. Those are scored somewhat differently in the engine, and multi-word queries are underrepresented in the top 100. Overall, they are more than half of queries.

  1. Look at the maximum score modification (+/- 0.2) and then at the scoring for result lists. Count how far a document would move if it was scored 0.2 higher and everything above it was scored 0.2 lower. This is the worst case read-ahead value for your data and queries. It will be much too large.
  2. Compare a raw list and a re-ranked list and look at the biggest movements up the list. This will be the typical read-ahead value for your data and queries, probably a much smaller number than the above.
  3. Do either of the above, but only count results that move into the top three slots (“above the fold”) or the top ten slots (first page). This number is more practical, especially for scores based on link-graph data, like PageRank. Those tend to make a big difference for a few documents, thanks to the power-law distribution in the web link graph. Also, if your scoring doesn’t move hits above the fold, it is a waste of time and you don’t need to implement any of this!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.