Project Cassini – Pulling the wraps off eBay’s top-secret new search engine project

January 25, 2012

By ChannelAdvisor



Long-time readers will recall that I have a background in Computer Engineering, so from time to time I like to geek out on the intersection of e-commerce and technology.

Today’s post is one of those times!! But hang on, I’ll ‘land the plane’ and we’ll talk about eBay buyers and sellers in a bit.

Also, by way of background, there is a big trend in software called BigData, driven by online businesses like eBay and by cloud computing. BigData is all about how you can store a ton of online data, manipulate it, analyze it, etc. – essentially, use data to grow your business more rapidly.

In the world of BigData there is a very popular system called Hadoop, which features a cute little yellow elephant (big data, get it?) for the logo and is modeled after Google’s MapReduce and Google File System technologies. Hadoop is becoming the go-to system for companies looking to tackle BigData problems, and eBay is no exception.

At the end of 2011, eBay technologists started speaking about their top-secret Hadoop-based project called Cassini as a way of recruiting engineers to help with their efforts.

Interestingly (for you eBay vs. Amazon watchers) the efforts appear to be driven out of the Seattle engineering centre for eBay which is led by Ken Moss, who along with a ton of current eBay technologists came from Microsoft.  You can read the job openings here which give you some more insights into what they are up to.

Finally, I’ve been on the record for the last 5+ years as being frustrated with eBay’s lack of innovation in the buyer experience. It hasn’t really moved much in the last two years, even though eBay is reportedly ramping up R&D spending.

I’m optimistic that Project Cassini will answer all of these questions – What’s eBay doing with Hadoop? What’s eBay doing to improve the buyer/search experience? Why is eBay ramping up a Seattle dev office?

What is the mysterious Project Cassini?

We’ve been hearing rumblings and whispers about this ‘big’ (i.e. lots and lots of resources) project at eBay for the last two years. In the last 60 days, two presentations from talks that eBay has given on Cassini have surfaced on the net, and they reveal what is going on with this (previously) top-secret project:

  • Hugh E. Williams, VP of Search, gave this talk at Hadoop World/Cloudera.

Hadoop World 2011 Keynote: Ebay – Hugh Williams

  • Juhan Lee, Director of Hadoop engineering, gave a talk in China and his slides are available here.

This slide from Juhan’s preso does a great job of explaining what eBay’s up to with Project Cassini.


In a nutshell, Project Cassini appears to be a complete re-write of the entire eBay search infrastructure and software system.  In Hugh’s talk, he highlights that the current search system (Voyager) goes back to 2002 and has tons of problems and flaws.

eBay has had over 100 engineers on Cassini for the last 18 months and will be launching it in 2012 to deliver an entirely new eBay search experience.

Details on the release (timing and features) are scant, but there are hints in these decks and other comments:

  • Cassini uses all data by default – I presume this means that Voyager was limited in the data fields it could index and manage, but Cassini explodes that out.
  • “Platform for ranking innovation” – This hints that the new system will have an entirely new ranking/ordering algorithm (just when we got used to BestMatch!), or that eBay will be able to do a lot more testing and tweaking than the current system allows for.
  • Much more history in our index – The BM RecentSales algorithm only looks back ~15-30 days, presumably because they have to get older data out to keep the system crisp. It sounds like Cassini will keep a lot more data and thus allow for a longer RecentSales window.
  • Ability to rescore entire site inventory – eBay could implement 4 flavors of best match (or give it a new name) and then pick the winner much faster than with Voyager.
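For the fellow geeks, here is a rough sketch of what "implement 4 flavors and pick the winner" could look like in miniature. Everything in it is my own invention for illustration: the `Listing` fields, the toy inventory, the four scoring functions and the use of hand-labeled relevance as the yardstick are all assumptions, not anything from eBay's decks. A real system would presumably judge the flavors on live buyer behavior rather than on labels like these.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    title: str
    price: float
    recent_sales: int  # hypothetical: units sold in some look-back window
    is_relevant: bool  # hypothetical: a human judged this a good result


# Toy inventory, made up purely for illustration.
INVENTORY = [
    Listing("iPhone 4S 16GB new in box", 500.0, 40, True),
    Listing("iPhone 4S empty box only", 30.0, 2, False),
    Listing("iPhone 4S 16GB sealed", 520.0, 25, True),
    Listing("iPhone 4S case", 5.0, 90, False),
    Listing("iPhone 4S lot of 10 for parts", 900.0, 1, False),
]

# Four hypothetical ranking "flavors"; each maps a listing to a score.
FLAVORS = {
    "lowest_price": lambda item: -item.price,
    "most_sales": lambda item: item.recent_sales,
    "highest_price": lambda item: item.price,
    "sales_x_price": lambda item: item.recent_sales * item.price,
}


def rescore(inventory, score):
    """Re-rank the whole inventory under one scoring function, best first."""
    return sorted(inventory, key=score, reverse=True)


def precision_at_k(ranked, k):
    """Fraction of the top k results that were labeled relevant."""
    top = ranked[:k]
    return sum(item.is_relevant for item in top) / len(top)


def pick_winner(inventory, flavors, k=2):
    """Score every flavor on the same inventory and return the best one."""
    results = {
        name: precision_at_k(rescore(inventory, score), k)
        for name, score in flavors.items()
    }
    return max(results, key=results.get), results


winner, results = pick_winner(INVENTORY, FLAVORS)
print(winner, results)
```

The point of the sketch is the shape of the loop: if rescoring the full inventory is cheap, trying another flavor is just one more entry in the dictionary.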

The new system will be able to handle the 250m searches/day and 2b pageviews so it has some big shoes to fill.

What does Cassini mean for Buyers?

It’s early to tell and these decks have no screen shots or anything, so this is pure speculation (with a healthy sprinkling of wishful thinking).  Let’s face it, BestMatch just isn’t very good.  You have to think WWGD (What Would Google Do) and WWAD (What Would Amazon Do) on this topic.

Amazon – well we know what they would do – they would have a gold-standard catalog and then sellers would list against that.

Google – Google would make the search engine so good you wouldn’t have to have bestmatch, lowest, highest, lowest with shipping, lowest without shipping, highest with shipping, ending soonest, newly listed and the myriad of other confusing options that the eBay search engine has today.

Going back to 2009 – eBay started the NPSE (New Product Shopping Experience) which we covered here. The experience was and still is pretty buggy – you can tell there’s something just not ‘right’ behind the scenes. Also, eBay has not made much progress at all rolling this out, so something is definitely up – it either doesn’t perform well or breaks their infrastructure. Perhaps Cassini will solve the back-end problem and we’ll finally see the experience a) work and b) roll out further.

One concern I do have, and one I worry Cassini+eBay aren’t doing enough to solve: if you do any general search on eBay for hot items and sort by lowest price, there is still a lot of really weird and scary stuff on the site. Here’s just one random example:

I did a search for an iPhone 4S, chose “New in Box” and “cell phone (not accessories)”, and sorted by lowest price. I took a screen shot and dropped in red numbers so I can reference the top 8 results here:



  1. Ok this one is an accessory and I explicitly said no accessories – seller miscategorized
  2. Not sure what this is, but it’s not an iPhone
  3. Seems to be an iphone 4s for $495, but look at the picture, this is listed as unopened – ok?
  4. Complete fraud if you ask me, you are getting an iPhone 4s box for $500 – this is going to create a great buyer experience.
  5. Ok – looks like we get an iPhone, but what does “BAD ESN” mean?  Here’s what the seller says: “You are bidding on a sprint apple iPhone 4s black 16gb in brand new condition with a bad esn .still in plastic as you can see in the photos. Cannot activate on the sprint network, however it can be flashed to work on other networks .”  I’m going to go out on a limb and assume most folks won’t notice the BAD ESN thing and won’t be able to get this phone to work.
  6. I selected NEW IN BOX – this says “NO BOX” right there in the title? (Top Rated Seller!)
  7. Seems like a real iphone4s for $500
  8. Seems real

So out of the top 8 results, we don’t hit what seems to be a real example of the item we are looking for until we get to results 7 and 8. That’s 6/8, or 75%, of the top listings that are bad, for those keeping track at home.
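For those who really are keeping track at home, the tally can be written down in a few lines. This is just my scorecard from the list above turned into a small Python sketch; the function names are made up for the occasion, and the True/False labels are my own verdicts on the eight results, not anything eBay publishes.

```python
# My verdict on each of the eight numbered results, in order:
# True means the listing looks like a genuine new-in-box iPhone 4S.
looks_legit = [False, False, False, False, False, False, True, True]


def bad_fraction(labels):
    """Share of results that are not what the buyer searched for."""
    return labels.count(False) / len(labels)


def first_good_rank(labels):
    """1-based position of the first genuine result, or None if there is none."""
    for rank, ok in enumerate(labels, start=1):
        if ok:
            return rank
    return None


print(f"{bad_fraction(looks_legit):.0%} bad, "
      f"first good result at #{first_good_rank(looks_legit)}")
# prints: 75% bad, first good result at #7
```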

In CompSci we have the idea of GIGO (Garbage In, Garbage Out) – eBay could have the world’s best search technology and if it just serves up bad listings faster, it won’t improve the buyer experience.

Hopefully eBay has a dual approach that will yield results: better technology, plus cleaning out the garbage.

What does Cassini mean for Sellers?

Obviously a better buyer experience is good for everyone in the eBay ecosystem. Using my example above, not only is it bad for the sellers of iPhones number 7 and 8 that there are 6 bad results above them, but when the eBay buyer spends $500 for that empty box in listing 4, or gets an iPhone without a box when they thought they would get one, they will do everything they can to get a refund and will probably leave eBay forever.

What I think sellers can anticipate as a ripple from Cassini is:

  • eBay will have to go through a cleansing period to clean up the bad data that has calcified.
  • eBay will then want a whole bunch of new data from sellers – this can be painful, but it does improve the buyer experience, so it is a necessary evil.
  • Most sellers ignore or don’t even know about the NPSE, so they will need to revisit it as (hopefully) it rolls out site-wide post Cassini.

Timing and what do you think?

eBay has been very hush hush on the Cassini roll-out, saying only ‘maybe 2012’.  In the past they have rolled out new technology like this in smaller markets (usually AU, IT, DE and UK are the first guinea pigs), so we’ll keep a watch out for that.

That’s all the scoop on Project Cassini – let us know your thoughts and questions in the comments.

SeekingAlpha Disclosure – I am long Google and Amazon. eBay is an investor in ChannelAdvisor, where I am CEO.