Spock Challenge Has Some Nasty Problems

The Spock people search engine is running a competition similar to the Netflix Prize. The Spock Challenge started at 9 AM this morning (April 16th), runs for four months, and has a grand prize of $50,000.

Unfortunately, the criteria for winning is not clear, so you don’t have any way to tell whether your code is getting better, or even what “better” means. Here is the explanation of “How Winner Will be Determined” from the rules.

The winner will be be the Software Submission, that in the discretion of the judges selected by Spock, most elegantly and efficiently resolves the problem of conflation in data collected for search applications such as Spock’s.

Looks mysterious to me. Is there only one solution and they are just shooting for clean fast code? They can’t write clean fast code themselves?

If you are lucky enough to win with the unclear goal, you lose control of your source code and even your algorithms and patents.

Upon acceptance of the prize, the winning Software Submissions and all source code and algorithms related thereto becomes the sole and exclusive property of Spock. You agree to take such actions as are desirable to Spock to vest such ownership interest in Spock. Spock may use, reproduce, display, offer for sale, sublicense and otherwise exploit the winning Software Submissions and their source code as it sees fit in its sole discretion.

Compare that to the terms in the Netflix Prize rules, especially the “non-exclusive license”:

After qualifying for either the Grand or Progress Prize and being verified by the Contest judges, as a condition to receiving either Prize, the winning individual and/or team must grant to Netflix (including its affiliates and subsidiaries, employees, agents, and contractors), an irrevocable, royalty free, fully paid up, worldwide non-exclusive license under the Participants’ copyrights, patents or other intellectual property rights in the winning algorithm (“Winning Algorithm”) to reproduce, distribute, display, and create derivative works from the Winning Algorithm and also to make, have made, use, sell, offer for sale, and import products that would otherwise infringe the Winning Algorithm. Except as encompassed in the concept of “have made”, this license will not include the right to grant further licenses or sublicenses.

Let’s see, a non-exclusive license for $1M or total control for $50K. That just sounds greedy. Why would anyone enter this? With the Netflix Prize, you could start a business. With the Spock Challenge, you don’t even get a down payment for a condo. Remember, you’ll be spending a chunk of it on patent searches to make sure you own those algorithms before selling them to Spock.

Spock’s lawyers need to go back to the drawing board on this one. Put together a clean, fair contest with a prize that matters.

Disclaimer: I work at Netflix.

Update: I found the criteria for qualifying on Spock’s learn more page. The criteria for winning are still a beauty contest.

A comment in the discussion board has pointed out that this “sole and exclusive property” requirement makes it hard (impossible?) to use open source code in your submission.

I Made a Second Grade Girl Cry

I made a second grade girl cry. It wasn’t pleasant, but I was following the lesson plan. I was helping with Ability Awareness Week, a program for elementary schools to make it really clear that people have different abilities. This exercise was threading Cheerios on a string while wearing socks over your hands, with an adult (me) urging you to hurry up. Urging in a pleasant, encouraging tone, but without leaving enough time to finish the job. It was a painful experience for both of us, but it was worth it to experience what it is like to have a disability.

Ability Awareness Week teaches elementary school kids about varying abilities by experiencing a disability, then following up with classroom discussion. The simulations are tuned to grade level. It is a great program, and not too hard to implement. You need a commitment from the principal and teachers, plus a few parent volunteers. The volunteers are easier to get than you might think. You’ll need a few supplies — Cheerios, socks, button-up shirts, some cardboard and mirrors for the dislexia simulation — nothing too expensive or difficult. Our school disctrict’s supplies were assembled as an Eagle Scout project.

Try it at your school. There is a wonderful manual for implementing it. We are all temporarily abled. If we’re lucky, we’ll live long enough to join the disabled.

A Crippling Calm?

On a backpacking trip to the Pecos Wilderness in the 70’s, my Dad and I stopped to chat with a wilderness ranger. We swapped stories of course, and he told us about a hiker who had taken off his boots and socks for an afternoon nap. While the hiker slept, the shade moved, and when he woke, the high-altitude sun had burned his feet so severely that he could not hike back out. The ranger had helped evacuate him.

So you can imagine my horror when I read this on the back of a package of Tazo Calm Herbal Infusion

A single cup of Tazo Calm has been known to have the same effect as sitting for 45 minutes in a mountain meadow on a sunny day with your shoes off.

No thank you. I prefer to be able to walk after I drink a cup of tea.

7th Grade Reading

I’ve been having a fun time sharing books with my 7th grade son.

In 8th grade, I was a library aide. It was at the lunch period, so I’d shelve all the books returned that morning (all of them, because the one I had returned before school was usually on the bottom of the pile), go to lunch between rushes when the line was short, then come back to the library to choose a book and start reading. I’d finish it that night and return it the next morning before school.

As a result, I have a very good grounding in juvenile literature (through 1970, I’m catching up), and I was ready when my son reached 7th grade. Luckily, he’s a team guy and likes having other people recommend books. I’ve been re-reading some and hunting down new ones.

As you can see by the list below, I want to make sure he has a good grounding in the classics.

Ralph 124C 41+, Hugo Gernsback. Michael loved this book, even though it was written nearly a hundred years ago and that shows in the style and vocabulary. Gernsback was totally caught up in the wonders of the year 2660 and that somehow connected. I have a soft spot for visions of the future written in the past, and this one is from 1911, so it is even more fun. It is mostly a travelogue of the future, but there is enough plot to keep it moving.

Be sure to get the edition from the Bison Books Frontiers of Imagination series because it has the cool illustrations. Sigh, that web site is a disaster, but the books are really nice. If they could reissue the catalog of Sam Moskowitz’s Hyperion Press, I wouldn’t care if they wrote their whole site in PDF.

Have Space Suit, Will Travel, Robert Heinlein. You think that all those SF juveniles are the same, then you find one that is just better. This one has it all — tinkering engineers, good aliens, bad aliens, a prison on Pluto, a galactic tribunal on the worthiness of the human race, and a strong female character who’s good at math. Michael ate it up, but found the 1950’s small town scenes somewhat stranger than being imprisoned on Pluto. Maybe we should go watch a bunch of The Andy Griffith Show or Happy Days episodes to get a proper grounding in 50’s stereotypes. Or maybe not.

Space Cadet, Robert Heinlein. You’d probably pass this one up because of the title, but you’d be wrong. Yes, a lot of the plot is predictable, but it there is something interesting going on besides the regular academy and coming-of-age stuff. The Space Patrol is in charge of a global deterrent, orbiting nuclear weapons. The folk on the ground are so used to peace that even talking about the bombs is impolite. Could we make a lasting peace out of Mutually Assured Destruction? What kind of guardians would we need to make that work? The chill of the cold war spawns a bit of hope.

Heinlein’s Space Patrol has a lot in common with Doc Smith’s Galactic Patrol, but without the all-knowning Arisians to keep them on course. This time, it is all up to the humans.

Of course, Ender’s Game is the best space cadet novel of all time, but I think it is a lot stronger if you know which direction a space cadet story is supposed to go. There are always a couple of cadets who don’t make the grade because they aren’t moral enough, but we don’t expect them to be psychopaths. Space Cadet stands on its own, but if you haven’t read Ender’s Game, you now have another reason to read Heinlein first.

So Yesterday, by Scott Westerfeld. Set the time machine for today! The main character is a Cool Hunter on the watch for emerging fashions. He blows apart a marketing session by inviting an Innovator, a girl who starts fashions instead of following them. Then someone disappears and fashion gets deadly.

I really like how the plot charges ahead while peeling back the facade of marketing and fashion. The language has a now, post-modern shine (is post-modern already passé?) decorated with brand names. Even the cool hunting protagonist is nearly a brand name, Hunter Braque. He makes an aside early on about mentioning brands when he avoids saying “Google” because it is just too common.

It’s good this book is short, because both my son and my wife had to finish it in one sitting. Westerfeld writes longer stuff, too. He has a trilogy on another set of themes that hit home with teens. Uglies, Pretties, and Specials is set in a future where everyone is forced to get surgery and mods to be pretty and happy at age 16. Well, almost everyone. What did they give up to be pretty and happy? Was it worth it? What would you choose?

The King in the Window, by Adam Gopnik. This one isn’t science fiction. Well, there is some weird quantum physics stuff at the end, but that is more fantasy than SF. It is there for narrative effect not intellectual effect (but that is a different blog post). The wonderful part about this book is the feel of Paris and the presence of the past in the present. Racine, Molière, and Richelieu (still adjusting his mayonnaise) are here, and Versailles really is a portal to a different world. Unlike the other books on this list, this book is more about place and character than plot. The plot is fine, but what I remember is Paris, the dinner with Mrs. Pearson, the clochards, and all the windows.

I think the first half of the book was more satisfying and that it loses itself a bit when the American startup guy enters the story. Maybe New York authors just can’t write convincing Silicon Valley stereotypes. But that is a nit on a fun story with a nice bit of depth. My son didn’t see anything wrong with it. For me, catching myself reflected in the café window isn’t quite the same anymore.

When Worlds Collide, by Philip Wylie and Edwin Balmer. When a book gives away the ending in the title, you know the authors are betting everything on the ride. Imagine a mystery titled “The Butler Did It”. This edition combines When Worlds Collide and After Worlds Collide. That’s the sequel, but I bet you figured that out. This is another early SF novel, but from 1932 this time. That is a long twenty years from 1911 — world war, depression, and an influenza pandemic. In When Worlds Collide the destruction is not limited to govenrments, economies, or populations, the entire Earth is destroyed and it is done with convincing detail: huge tides, monster storms blotting out the sun, mass panic, and a final desperate dash to another planet. Once there, the meteorological and geological scares are over, but the sociological and political problems are just as serious.

Progressive Reranking

Occasionally, people want to rescore search results and rerank them after they are returned from the search engine. The usual answer is that it is very slow because you need to fetch the full list of results, score them, then sort by the new score.

You don’t really want to rerank the full list. If your search engine is returning very useful results at the end of the list, you have big problems, bigger than you can fix with reranking. The normal case is that you are moving some results up and down a few positions, and users will only look at the first ten results. So let’s look at how to do that efficiently.

A Pull Model for Search Results

The Ultraseek XPA Java library has classes to combine results from multiple engines and rerank them by global tf.idf scores. This class can be used for other kinds of reranking. You can use write your own code, of course, but I’ll use the XPA classes for this example.

CompositeSearchable (and CompositeSearchResultList) reads results into a priority queue, ordered by score. When client code requests a result from CompositeSearchable, that is removed from the queue, added to a “frozen” list of scored results, the PQ is replenished with one new result, that result is scored and inserted into the queue (positioned according to its score), then the requested result is returned.

The frozen list allows downstream searchables to treat the already accessed portion of the search results as read-only and unchanging.

This is a lot faster than it would seem, because the next result is usually already locally cached in the local UltraseekServer class because it was part of the chunk of results requested from the server.

If a CompositeSearchable is merging results from different sources, the replacement result always comes from the same source as the one moved to the frozen list. If all the good results are from one collection, this replacement policy guarantees that they will all be shown.

Applying Score Adjustments

Assume we have a score for each document that represents a priori goodness. Perhaps it is a measure of how long it would take monkeys to type that document (MonkeyRank). This score ranges from 0..1 with 1 meaning good.

SearchResult already has a document quality measure which matches this, with a range of -1.0..+1.0. We can use the default result scorer, but replace (or modify) Ultraseek’s document quality with our own MonkeyRank.

We do this with SearchResultWrapper, which delegates all methods to a wrapped SearchResult. We’ll scale the MonkeyRank value to a range of -0.2..+0.2. This is a bit more powerful than the Ultraseek quality score, which goes from -0.16..+0.15.

class MonkeySearchResult extends SearchResultWrapper {
public float getQuality() {
float mr = MonkeyRank.getRank(baseResult.getURL());
float scaledMR = (mr * 0.4) - 0.2;
return scaledMR + baseResult.getQuality();
}
}

Then we make a MonkeyResultScorer class. It wraps the search result, then uses DefaultResultScorer.

class MonkeyResultScorer implements ResultScorer {
public float score(SearchResult sr, SearchResultList srl) {
return CompositeSearchable.RESULT_SCORER.score(
new MonkeySearchResult(sr),
srl);
}
}

The only thing left is to figure out the maximum number of rank positions that a score boost can move a result up the list. That determines the read-ahead level. We only care about movement to a more relevant rank (earlier in the list). Pushing results down the list is much easier.

Now you need to set the read-ahead. If the read-ahead is too small, then some good results will not be re-ranked as high as they should be. If the read-ahead is too big, performance will suffer because you will need to fetch more results from the engine before the first one can be shown.

Here are a few ways to guess at the read-ahead size. You’ll probably need to do these for quite a few queries in order to get good numbers. I’d use the top 100 single-word and top 100 multi-word queries. Those are scored somewhat differently in the engine, and multi-word queries are underrepresented in the top 100. Overall, they are more than half of queries.

  1. Look at the maximum score modification (+/- 0.2) and then at the scoring for result lists. Count how far a document would move if it was scored 0.2 higher and everything above it was scored 0.2 lower. This is the worst case read-ahead value for your data and queries. It will be much too large.
  2. Compare a raw list and a re-ranked list and look at the biggest movements up the list. This will be the typical read-ahead value for your data and queries, probably a much smaller number than the above.
  3. Do either of the above, but only count results that move into the top three slots (“above the fold”) or the top ten slots (first page). This number is more practical, especially for scores based on link-graph data, like PageRank. Those tend to make a big difference for a few documents, thanks to the power-law distribution in the web link graph. Also, if your scoring doesn’t move hits above the fold, it is a waste of time and you don’t need to implement any of this!