The sorry state of ebook search results
Why is Google so popular and how does it quickly help you find what you’re looking for? It’s all about their algorithm. Google uses a variety of metrics, including how many inbound links a site has, to determine what’s in their search results and how those results are presented.
Imagine Google without their algorithm. Rather than using all these metrics to figure out which site is most relevant, they just give you a list of sites that happen to contain your search phrase. Pretty worthless, right? So why do we accept that same, lame functionality in ebooks today?
Let’s look at an example. I remember reading Jean Edward Smith’s terrific FDR biography a while back and wanting to re-read the details about Hyde Park. The author provided information about the location earlier and I wanted to find that specific part of the book. Here’s what I got in the Kindle in-book search results:
What an awful user experience. I get every instance of the phrase, listed in the order of appearance in the book. There’s absolutely no indication of how in-depth the coverage of Hyde Park is in any section; I’m left to figure that out on my own.
Now take a look at these search results:
There I searched for the phrase “BeagleBone” in a technology book, and each result has a score associated with it; the results with higher scores offer more in-depth coverage of the topic.
How did I produce those results? They came from an ebook reading platform that does much more than simply reproduce “print under glass.” The content comes into the system as a simple, text-embedded PDF. It’s then analyzed and converted to render in a browser-based reading engine. No third-party apps or plug-ins are required.
The magic is in the content ingestion process. This platform knows when a phrase appears in a first-level heading vs. a second-level heading as well as how many times it appears on that page or in that section. In short, it applies technology to produce a far superior set of search results.
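Here’s a rough sketch of that idea in Python. To be clear, this is not the platform’s actual code: the Section class, the function names and the weights are all hypothetical, and I’m assuming the ingestion step has already split the book into sections with a known heading level. It simply scores a section higher when the phrase shows up in its heading (more for a first-level heading than a second-level one) and adds a point for each appearance in the body text.

```python
from dataclasses import dataclass

# Hypothetical weights, chosen only for illustration: a hit in a
# first-level heading counts more than one in a second-level heading,
# and both count more than a plain mention in the body text.
HEADING_WEIGHTS = {1: 10.0, 2: 5.0}
BODY_WEIGHT = 1.0


@dataclass
class Section:
    heading: str
    heading_level: int  # 1 = first-level heading, 2 = second-level, ...
    body: str


def count_phrase(text: str, phrase: str) -> int:
    """Count case-insensitive, non-overlapping occurrences of phrase."""
    return text.lower().count(phrase.lower())


def score_section(section: Section, phrase: str) -> float:
    """Weight heading hits by heading level, then add body hits."""
    heading_hits = count_phrase(section.heading, phrase)
    body_hits = count_phrase(section.body, phrase)
    # Heading levels without an explicit weight fall back to the body weight.
    heading_weight = HEADING_WEIGHTS.get(section.heading_level, BODY_WEIGHT)
    return heading_hits * heading_weight + body_hits * BODY_WEIGHT


def ranked_search(sections: list[Section], phrase: str) -> list[tuple[float, Section]]:
    """Return (score, section) pairs for matching sections, best first."""
    scored = [(score_section(s, phrase), s) for s in sections]
    hits = [(score, s) for score, s in scored if score > 0]
    # sorted() is stable, so sections with equal scores stay in book order.
    return sorted(hits, key=lambda pair: pair[0], reverse=True)
```

Compare that with the plain in-book search described earlier, which would just return every matching section in order of appearance. Here a chapter titled with the search phrase rises to the top, while a passing mention sinks to the bottom. A real system would presumably tune these weights and use more signals than this.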
When will we see this type of functionality in any of the popular ebook reading apps? I’m not holding my breath. The leading vendors apparently don’t see a need to bring their search capabilities out of the dark ages.
If you’d like to learn more about this platform you’ll find summary information here. You’ll also notice it’s the ebook platform solution offered by my employer, Olive Software, Inc. I may not be a book publisher anymore but I’m thrilled to be part of an organization that’s helping lead the industry forward. Relevance-ranked search is just one of the cool innovations that set us apart. Let me know if you’d like to learn more.
The "awful user experience" on a Kindle looks about the same as you would get reading a printed book and trying to find something - go to the index at the back of the book, search, end up with a list of page numbers. At least Kindle gives you a small extract for each location found; printed books generally do not.
Posted by: Henry Wood | February 25, 2014 at 01:44 AM
Hi Henry. Why should we have to accept the limitations of print when the digital platform has so many more capabilities? Frankly, I think that's one of the big problems in the publishing industry: It's always trying to emulate print in digital. That only limits the results and is one of the reasons I'm so glad so many startups are being created by people *outside* publishing, so we don't limit the possibilities.
Posted by: Joe Wikert | February 25, 2014 at 08:59 AM
I'd amend that to the sorry state of any word-dependent search. Pity some poor soul (me on some occasions) who must look for a term about which he knows little, but that's also the name of: 1. a rock band or 2. some new bit of technology. You'll be flooded with a great pile of hay in which you must look for that needle. You don't even know enough to come up with limiting terms.
That said, Google search usually does a good job. My gripe lies with the utter sloppiness of Google Books, scans made by poorly paid people who didn't care and entered into a database whose contents show no awareness of the need for additional information about that book.
Posted by: Inkling | February 26, 2014 at 10:30 AM