Measuring Search Relevance with MRR

At Chegg, we test the relevance of our search engine using customer data. We extract anonymous information about queries and clicks, then use that to automatically test improvements to search. When our search engine provides results that better match what our customers are choosing, we call that an improvement.

The most basic measurement of search quality is clickthrough rate. For each search page that is shown, how often is at least one search result clicked on? This fits well with overall website conversion, showing how well search does at getting a visitor to a successful result.

We can take a more critical look at our search page by not counting all clicks equally. If we count less for clicks farther down the search result list, we can evaluate the search ranking. When visitors click on the top hit, count a full click. If they click farther down, count less than a full click. We penalize ourselves for not putting the most-desired item at the top of the list.

We do this with mean reciprocal rank (MRR). We weight each click with the reciprocal of the search result’s rank: rank one (the top result) is scored as 1, rank two as 1/2, rank three as 1/3, rank four as 1/4, and so on. Then we add up those weighted clicks and divide by the total number of clicks to get the mean, hence mean reciprocal rank. You can also think of this as a weighted clickthrough rate, where clicks farther down the result list are given lower scores. MRR will always be equal to or lower than clickthrough rate.

When evaluating our search engine in test, we score the clicks we have collected from the website against the engine’s ranking and divide by the total number of clicks. A perfect score would be 1.0, but only if there were one correct answer for each query.

Though each customer’s search has one right answer, different customers have different right answers. The same course at different universities uses different textbooks, sometimes with the same name. Even if they use the same textbook, they may use different editions. Users who typed “financial accounting” at Chegg last winter clicked on 43 different books. That is a lot more than fit on the first page of ten hits.

Some books were more popular than others, so we can use that information to test our search engine. The most popular book should be the first result, the second most popular the second hit, and so on. Of course, we have to do this without causing problems for any other search result, like queries for “introduction to financial accounting” or “financial and managerial accounting”. Here are the top five textbooks clicked on for “financial accounting” last January, with the number of times each one was clicked on.

Label  Clicks  Book
A      145     Financial Accounting, 8th Edition, Harrison et al
B      130     Financial Accounting, 2nd Edition, Spiceland et al
C      119     Financial Accounting, 7th Edition, Weygandt et al
D      106     Financial Accounting, 6th Edition, Kimmel et al
E       80     Financial Accounting, 7th Edition, Libby et al

Now let’s calculate the MRR for the query “financial accounting”, assuming that the search engine returns the books in the ideal order: ABCDE. We multiply the number of clicks for each rank by the reciprocal rank, add them up, and divide by the total number of clicks. This is the optimal ranking, so it is the highest possible MRR value for this query.

(145 * 1 +
130 * 1/2 +
119 * 1/3 +
106 * 1/4 +
80 * 1/5) / 580 = 0.504

Real search engines are not perfect, so let’s assume our engine shows the second most popular book first, then a non-relevant book, then the rest in order. Using an ‘x’ for the non-relevant book, the results order is BxACDE.

(130 * 1 +
145 * 1/3 +
119 * 1/4 +
106 * 1/5 +
80 * 1/6) / 580 = 0.418
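
This arithmetic is easy to script. Here is a minimal sketch in Python that reproduces both numbers above; the click counts and labels come from the table, but the function name and the ‘x’ placeholder for the non-relevant book are mine for illustration, not Chegg’s actual tooling.

# Minimal sketch of click-weighted MRR for a single query.
# Click counts are the ones from the table above.

def click_weighted_mrr(results, clicks):
    # Weight each collected click by the reciprocal of the rank where its
    # book appears, then divide by the total number of clicks.
    total_clicks = sum(clicks.values())
    score = sum(clicks.get(book, 0) / rank          # non-relevant results score 0
                for rank, book in enumerate(results, start=1))
    return score / total_clicks

clicks = {"A": 145, "B": 130, "C": 119, "D": 106, "E": 80}

print(click_weighted_mrr(["A", "B", "C", "D", "E"], clicks))       # ideal order: ~0.504
print(click_weighted_mrr(["B", "x", "A", "C", "D", "E"], clicks))  # BxACDE: ~0.418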

Now we can build a repeatable test for relevance. We collect a set of queries with click counts. A script runs each query, then scores the returned results against the click data. At the end of the run, it calculates MRR for the whole set. The final MRR is reported as an overall metric, and the MRR for each query is reported for diagnostics. The script also calculates the ideal MRR for the whole set and for each query. This runs quickly; even a few thousand queries take only a few minutes.
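
As a rough sketch of what such a script might look like (not Chegg’s actual code): assume a search(query) function stands in for the engine under test, the collected click data is a dict per query, and the set-level MRR is taken as the mean of the per-query MRRs, which is one reasonable choice of aggregation.

# Sketch of the repeatable relevance test. Assumes search(query) returns an
# ordered list of result ids and queries maps each query string to its
# {result id: click count} data collected from the website.

def mrr(results, clicks):
    total = sum(clicks.values())
    return sum(clicks.get(doc, 0) / rank
               for rank, doc in enumerate(results, start=1)) / total

def ideal_mrr(clicks):
    # Best possible score: most-clicked result first, and so on down the list.
    return mrr(sorted(clicks, key=clicks.get, reverse=True), clicks)

def run_relevance_test(queries, search):
    per_query = {q: (mrr(search(q), clicks), ideal_mrr(clicks))
                 for q, clicks in queries.items()}
    overall = sum(actual for actual, _ in per_query.values()) / len(per_query)
    ideal = sum(best for _, best in per_query.values()) / len(per_query)
    return overall, ideal, per_query   # overall metric plus per-query diagnostics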

The first time we run the test, the result is saved as the baseline MRR. When we change the search algorithm, we can test to see if it is better or worse than the baseline MRR. When the search algorithm is improved, we set a new, higher baseline MRR.

We can also look at individual queries where the MRR is much worse than ideal to find bugs in the search engine. When we first got our MRR testing working, I found several bugs that were easy to fix. For example, we added synonyms for “intro” and “introduction” and also made “&” synonymous with “and”.

MRR is not the only relevance metric. It is best suited when there is one correct answer. At Chegg, there is one correct textbook for each person’s search. This is a known-item search, or navigational search. Other metrics are better for informational searches, where partial information is gathered from multiple results. If I am planning a long weekend hike in the Sierras, the metric needs to count more than one search click, because I am gathering information from many sources. Some evaluation metrics for informational searches are nDCG, RBP, and ERR.
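
As an illustration of how those metrics differ from MRR, here is a small nDCG sketch. It assumes graded relevance judgments for each returned result rather than a single correct answer, which is exactly the extra information an informational search needs. The numbers are hypothetical, and this is not part of the Chegg test described above.

from math import log2

def dcg(relevances):
    # Graded relevance, discounted by the log of the rank.
    return sum(rel / log2(rank + 1) for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    # Normalize by the DCG of the ideal (best-first) ordering.
    best = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / best if best else 0.0

print(ndcg([3, 2, 3, 0, 1]))   # hypothetical judgments for the top five results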

Regardless of which metric you choose, start gathering search and click data right now. I was surprised at how much data I needed to improve individual search queries. Twenty million clicks is a good start!

Originally published on the Chegg tech blog in December 2012.

Ultraseek vs. Google Search Appliance

On the occasion of the Googlebox end-of-life news, it is time to talk about what a weak product it really was.

Sandia Labs was an Ultraseek customer and ran a relevance experiment where Ultraseek trounced the Google Search Appliance. But first some history.

Many of the US national laboratories used Ultraseek. I don’t remember how it started, but I was invited to give talks about search at two of their IT conferences, one at Oak Ridge National Laboratory (in an auditorium named after Nobel laureate Eugene Wigner) and one at SLAC, the Stanford Linear Accelerator Center (home of the first website outside Europe).

The labs were quite happy with Ultraseek, but at Sandia, the search team was asked to evaluate the Google Search Appliance. Like good scientists, they set up an experiment. They formatted the results from the two engines in similar, anonymous styles. They set up a table in the cafeteria, offering free cookies for people to try searches and choose the best results from the two engines.

This is a simple but very effective evaluation technique. I like it because it judges the whole result set, both ranking and presentation. It isn’t good for diagnostics, but it is great for customer satisfaction. I call this approach “Kitten War”.

Ultraseek won the experiment, 75% to 25%. That is a three-to-one preference. I’ve never seen that magnitude in a search experiment. In search, we break out the champagne when we get a one percentage point improvement in clickthrough. I’m not kidding. This is beyond massive.

Whoever was pushing Google at Sandia asked them to re-run the experiment with the logos. With that change, Google won 55% to 45%.

Also, performance? Ultraseek was spec’ed for 15 queries/sec and surpassed that spec. The first release of the Googlebox was spec’ed at 30 queries/min, thirty times slower. They later increased that to 60 queries/min. That is one query per second.

Ultraseek actually ran at around 25+ qps, though some new features dropped us closer to 15 qps.

We were the public search engine for irs.gov through Andersen Consulting. Instead of quoting the specs, Andersen promised the throughput they had measured, then complained to us. They were massive a-holes about it, even after I made it very clear that it was their fault. But we made Ultraseek even faster, because who wants the IRS search to be slow? irs.gov ran a cluster of fifteen Ultraseek servers. I would not want to try to hit that rate with Googleboxes.

Sadly, the relevance test was the point when Ultraseek should have just given away the source code and gone home. The Google logo was enough to sell a massively inferior product. There was nothing we could do in engineering, sales, whatever, to compete with the Google logo.

Sandia Labs did stay with Ultraseek and we continued on for a number of years, but the writing was on the wall.

What is like Dersu Uzala?

Tom Mangan recommended Dersu Uzala, so I added it to our Netflix Queue.

Funny, Netflix isn’t quite sure what other movies are like Dersu Uzala. I don’t really blame them — what is like a masterpiece?

I guess the list illuminates aspects of the film. These are the films it showed after I added Dersu Uzala to our queue:

Kurosawa, Tarkovsky, Fritz Lang, and Burt Reynolds, no place but Netflix.

Most Popular Netflix Movie in Palo Alto

Kinda recursive, but true:

[Netflix screenshot: “Palo Alto, CA”]

Check out Palo Alto, CA at Netflix or at the official site. The writer, director, and producer are 2005 graduates of Palo Alto High School and the movie was shot on location. Big surprise. Here is a story about it from their high school newspaper, The Paly Voice.

Palo Alto hasn’t been this famous since The Donnas and Chelsea Clinton.

Check out your own local favorites at Netflix.

Query Box as Confessional Box

We have a very small number of very long queries that are timing out in the search engine, so I was digging through the logs looking at long queries. I found this.

“something that i think will never happen just did and i dont like it one single bit, no not even a drop, and i wish it never did because it just ruined my life and i just want to watch indianna jones”

199 characters. I think I’ll set the limit somewhere over 200 characters. I’d hate to make their day worse.

Search Evaluation by Kitten War

On a search engine mailing list, the topic of simple A/B testing between search engines came up. This can be between different implementations, different tunings, or different UI presentations. The key thing is that users are offered two alternatives and asked which one they like better. One bit of information, this one or that one. If you’ve been to the Kitten War site, you’ll understand why I call it “kitten war testing”. Others may call it a “beauty contest”. They are wrong, of course.

During the years I worked on Ultraseek, surprisingly few customers had the spare time to run serious tests. One national laboratory ran tests as part of their evaluation and later ran larger tests on their intranet design. Another ran regular tests on all changes to their intranet search, presentation or ranking. These were the exception. We had at least 10,000 customers over nine years and only a handful ran serious tests.

Where I work now, we have a few million queries per day, so we can easily put a few tens of thousands of users into test and control cells. We do that for all changes, religiously. Most people don’t have that luxury, but you can run a kitten war test and rise above the superstitious masses on a wave of real data.

Kitten war testing can be very effective, but it is very, very easy to mess it up. Here are some things to watch out for.

Cuteness Counts: Just like with kittens, the prettiest search results page will almost always win. If there are two engines, they must be configured to look as similar as possible. Really. It is OK if they are both ugly, just make them the same. Double-blind is even better; at a minimum, the cuteness judge must not know which one is which.

Longhairs are Cuter: Watch out for visible differences which are not essential to the engines. The length of summaries is one of those. On the other hand, some differences may be intrinsic to the different engines. For a while, I have felt that Yahoo’s snippets are slightly better than Google’s. Snippets are really hard and a reasonable part of a kitten war test.

Google is Cuter: Or, “brand names work”. One of our Ultraseek customers did a blind kitten war test between Ultraseek and Google. Ultraseek was preferred 75% of the time. Some executive found this hard to believe and asked that they try it again with the logos attached to the pages. The second time, people preferred Google over half the time.

5% Doesn’t Matter: Unless you can get lots of data points, you’ll have very noisy results. 75% is a strong result, but a 48/52 split is not good enough. We run on-line tests with less than 1% difference, but it takes about 100K samples for those numbers to settle down. Find someone who got A’s in statistics and ask them to help (or see the quick sketch after this list).

Will Search for Food: If you can’t get a thousand searches, run it as a qualitative test. Set up a table in the lunch room, hand out cookies, and ask people to run a few pre-determined searches and the last few things they personally looked for on your intranet. Talk to them about why they like one or the other. Ask them what they expected to find. Have an observer take notes, lots of notes. You should still shoot for more than 50 users. 200 would be better. Could be a long week in the lunch room.

Cute Overload: Allow for “can’t decide”. Sometimes, both kittens are equally cute.
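
Here is the quick sketch promised under “5% Doesn’t Matter”: a back-of-the-envelope significance check using a normal approximation, which shows why a 48/52 split from a lunchtime sample means nothing while the same split over 100K samples is overwhelming. The vote counts below are made up for illustration.

from math import sqrt, erf

def p_value_vs_coin_flip(wins, total):
    # Two-sided test of whether a wins/total preference split differs from 50/50.
    z = (wins / total - 0.5) / sqrt(0.25 / total)
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

print(p_value_vs_coin_flip(39, 75))        # 52/48 on 75 votes: p ~ 0.73, pure noise
print(p_value_vs_coin_flip(52000, 100000)) # 52/48 on 100K votes: p effectively 0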

Let’s take a break and recommend a couple of slightly more serious posts on A/B testing:

Fundamentally, Kitten War testing gets close to the truth — which engine makes your users happiest. You might argue for the shortest task completion time, but happy users are a very fine thing.

Kitten war testing is not usability testing. If you are trying to improve usability, do real usability testing. That isn’t really harder than kitten war testing, but it is a different kind of test. For a quick intro, see Jakob Nielsen on usability testing with five users.

Odd Cataloging Decisions at Palo Alto Library

I really wonder about the cataloging at my local library. I was looking for books by Jo Walton and I noticed that a series by her was spread across two areas, both arguably wrong. First, Ha’penny is a sequel to Farthing, so they really should be shelved in the same section. Second, they are both alternate history novels from a fantasy author, and I wouldn’t look in either Mystery or Fiction for them.

Check out this screenshot from their search on July 2nd.

[Screenshot: Palo Alto Library web catalog results]

Big hint, Tor has been a major SF&F imprint for over 25 years.

I’m looking forward to Palo Alto’s choice for Half a Crown, the next book in the series. Maybe DDC 737 (Numismatics)?

I reported this to the reference desk at Main. Let’s hope they fix it.

The fun doesn’t stop there. I’m currently reading The Fall of the Kings. That was shelved in YA Fiction, where it doesn’t even belong. I read a fair mix of books, from Westerfeld to Dostoevsky, with plenty of YA, and this just doesn’t fit in the Teen collection. It is long (476 pages of small print), there are no teenage characters, nearly every chapter has sex and/or violence, it is quite slow moving, and it helps if you care about university politics. I read Valiant immediately before, and that book has half the word count with double the action and four times the dialogue, plus teens, fairies, drugs, NYC, and a massive betrayal by mom. Valiant belongs in the Teen section. Dreamhunter belongs there. The Fall of the Kings does not.

I thought that maybe, just maybe, they put it in YA because the most recent book in the series, The Privilege of the Sword, has a 15 year old girl as the main character and can easily be considered YA, so they decided to keep them together. Sorry, they shelved that one in Science Fiction.

I know that strictly defining Science Fiction (or Fantasy) is nearly impossible, but they must be able to avoid howlers like this. Yes, Michael Chabon has written fantasy (Summerland) and SF (The Yiddish Policemen’s Union) but it might as well be shelved in the mainstream section because that is where people will look for him. On the other hand, Jo Walton has written a sword and sorcery trilogy and a book set in Victorian England where the nobility are dragons. Where would you look? Heck, ask Jo Walton. Her answer to the FAQ “What genre is Farthing?” reads “It’s an alternate history mystery. I think that makes it SF.”

Hmm, Palo Alto also shelves The Lord of the Rings and Jacqueline Carey’s Kushiel series in mainstream Fiction. Bizarre. The Kushiel books are also published by Tor. Can we just shelve all the Tor in SF, as a stopgap?