On the occasion of the Googlebox end-of-life news, it is time to talk about what a weak product it really was.
Sandia Labs was an Ultraseek customer and ran a relevance experiment where Ultraseek trounced the Google Search Appliance. But first some history.
Many of the US national laboratories used Ultraseek. I don’t remember how it started, but I was invited to give talks about search at two of their IT conferences, one at Oak Ridge National Laboratory (in an auditorium named after Nobel laureate Eugene Wigner) and one at SLAC, the Stanford Linear Accelerator Center (home of the first website outside Europe).
The labs were quite happy with Ultraseek, but at Sandia, the search team was asked to evaluate the Google Search Appliance. Like good scientists, they set up an experiment. They formatted the results from the two engines in similar, anonymous styles. They set up a table in the cafeteria, offering free cookies for people to try searches and choose the best results from the two engines.
This is a simple but very effective evaluation technique. I like it because it judges the whole result set, both ranking and presentation. It isn’t good for diagnostics, but it is great for customer satisfaction. I call this approach “Kitten War”.
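For anyone who wants to run a test like this, here is a minimal sketch of the bookkeeping in Python. It is not what Sandia used; the function names, the engine labels, and the sample query are all made up for illustration. The idea is to show each person the two result lists in random left/right order, keep the mapping hidden, and only translate "left" or "right" back into an engine name when tallying.

```python
import random
from collections import Counter


def make_trial(query, results_by_engine, rng):
    """Pair two engines' results for one query in random left/right order.

    The "key" entry records which engine landed on which side. It is kept
    away from the participant (hypothetical structure, for illustration).
    """
    engines = list(results_by_engine)  # expects exactly two engines
    rng.shuffle(engines)
    left, right = engines
    return {
        "query": query,
        "left": results_by_engine[left],
        "right": results_by_engine[right],
        "key": {"left": left, "right": right},
    }


def preference_shares(trials, choices):
    """Map each "left"/"right" choice back to an engine and compute win shares."""
    wins = Counter(trial["key"][choice] for trial, choice in zip(trials, choices))
    total = sum(wins.values())
    return {engine: count / total for engine, count in wins.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder result lists; a real test would fetch these from each engine.
    results = {"engine_a": ["doc1", "doc2"], "engine_b": ["doc3", "doc4"]}
    trials = [make_trial("travel policy", results, rng) for _ in range(4)]
    # Simulate a participant who always prefers engine_a, whichever side it is on.
    choices = ["left" if t["key"]["left"] == "engine_a" else "right" for t in trials]
    print(preference_shares(trials, choices))
```

Randomizing the side per trial matters because people tend to favor one position on the page, and that bias would otherwise be counted as a preference for one engine.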
Ultraseek won the experiment, 75% to 25%. That is a three-to-one preference. I’ve never seen that magnitude in a search experiment. In search, we break out the champagne when we get a one percentage point improvement in clickthrough. I’m not kidding. This is beyond massive.
Whoever was pushing Google at Sandia asked them to re-run the experiment with the logos showing. With that change, Google won 55% to 45%.
Also, performance? Ultraseek was spec’ed for 15 queries/sec and surpassed that spec. The first release of the Googlebox was spec’ed at 30 queries/min, thirty times slower. They later increased that to 60 queries/min. That is one query per second.
Ultraseek actually ran at around 25+ qps, though some new features dropped us closer to 15 qps.
We were the public search engine for irs.gov through Andersen Consulting. Instead of quoting the spec, Andersen promised the rate they had measured, then complained to us. They were massive a-holes about it, even after I made it very clear that it was their fault. But we made Ultraseek even faster, because who wants the IRS search to be slow? irs.gov ran a cluster of fifteen Ultraseek servers. I would not want to try to make that rate with Googleboxes.
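To put the spec numbers side by side, here is the arithmetic as a short Python sketch. The spec figures are the ones quoted above. The cluster estimate is only an assumption for illustration: it supposes each of the fifteen servers ran at the 15 queries/sec spec and that throughput adds up linearly, neither of which was measured here.

```python
import math

ULTRASEEK_SPEC_QPS = 15   # Ultraseek spec, queries per second
GSA_FIRST_SPEC_QPM = 30   # first Googlebox release, queries per minute
GSA_LATER_SPEC_QPM = 60   # later Googlebox spec, queries per minute


def qpm_to_qps(queries_per_minute):
    """Convert queries per minute to queries per second."""
    return queries_per_minute / 60


gsa_first_qps = qpm_to_qps(GSA_FIRST_SPEC_QPM)  # 0.5 queries/sec
gsa_later_qps = qpm_to_qps(GSA_LATER_SPEC_QPM)  # 1.0 queries/sec

# How many times slower the first Googlebox spec was than the Ultraseek spec.
slowdown = ULTRASEEK_SPEC_QPS / gsa_first_qps   # 30.0

# Assumption (not a measurement): fifteen servers, each at the spec rate,
# with throughput scaling linearly across the cluster.
IRS_SERVERS = 15
cluster_qps = IRS_SERVERS * ULTRASEEK_SPEC_QPS  # 225 queries/sec

# Googleboxes needed to match that rate at the later one-query-per-second spec.
boxes_needed = math.ceil(cluster_qps / gsa_later_qps)  # 225

if __name__ == "__main__":
    print(slowdown, cluster_qps, boxes_needed)
```

Under those assumptions, matching the cluster would take a couple of hundred appliances, which is the point of the comparison rather than a precise sizing.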
Sadly, the relevance test was the point when Ultraseek should have just given away the source code and gone home. The Google logo was enough to sell a massively inferior product. There was nothing we could do in engineering, sales, whatever, to compete with the Google logo.
Sandia Labs did stay with Ultraseek and we continued on for a number of years, but the writing was on the wall.
I’m confused about one point… towards the end you say GSA was a massively inferior product [to Ultraseek]. But you conceded that Ultraseek failed the relevance test. (search ranking/results I assume you mean). Isn’t that very very important for a search engine?
BTW I once observed a panel of search industry sales guys from most of the major companies (including Google for GSA) at an event private to a large company I once worked for. I remarked to myself how cocky the Google GSA sales guy was — it was all about the huge Google brand.
Sorry, I wasn’t clear. Ultraseek won the blind relevance test, very handily. Without showing the Ultraseek or Google logos, Ultraseek won.
It lost the “we trust this search engine because it has a Google logo” test. When they added the logos to the results page, Ultraseek lost.
So we won the technology battle and lost the branding battle.
So has Ultraseek become abandonware under HP? That would be a real pity if so. We had an Ultraseek license for 10 years, and when HP refused to confirm that they were updating Ultraseek to run natively on Windows 2008 R2 or later, we gave up and moved into the Lucene world. Why don’t they turn the code over to independent developers? It was a very efficient search engine.
Autonomy swapped out the guts of Ultraseek with the IDOL engine, so the relevance was probably already dead.
Earlier, we had made the spider portion into an outboard spider for Verity K2 (Ultraspider). That was converted into a spider for IDOL.
I see a few search hits for “HP Autonomy Ultraseek”, so they might provide a bit of support for it. A vulnerability was reported in Dec 2013, so it was in use then.
But…it would have needed some major work over the past ten years. The spider is not designed for AJAX-centric web pages. And search collections are just too big now. The Infoseek Ultra search core could handle 16 million documents, period. Today, I’m running a Solr collection in prod with 17.7 million docs.
I did propose open-sourcing Ultraseek. I suggested it to Mike Lynch, in person. I suggested it could undercut Google’s enterprise products. Lynch said that Google didn’t matter, because their engine wasn’t as good as IDOL. That was probably because it was invented by people who didn’t go to Cambridge.