In search of a better search

Imagine Google's search results with no sophisticated algorithm behind them. Rather, when you type in your search phrase and press Enter, Google simply shows you a list of websites where that phrase can be found. No indication of relevance. No ranking mechanism. It's just a list of the sites that contain the phrase. Maybe the list is arranged in chronological order, where the older sites containing your phrase appear first.

Pretty worthless, right? So why do we accept that as our search solution in every major ebook reader app? Open an ebook, search for a phrase and the results merely list each occurrence of it, arranged from page 1 through the end of the book.

You might think a sophisticated search would only be useful for specialty products like textbooks and other reference materials. I disagree. I could see something like this being quite useful in novels, for example. Let's say you forgot who a minor character is and you'd like to quickly learn more about them. Sure, you could do a simple text search and see where the character first appears, but wouldn't you prefer results that provide some context? Maybe it's really the fourth occurrence of that character's name where you'll get the real details on their role in the story. That wouldn't be easy to figure out with today's ebook search capabilities.
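As a rough illustration of the minor-character scenario, here is a minimal Python sketch that ranks a book's paragraphs by how often they mention a name, instead of listing hits in page order. The function name `rank_passages` and the use of mention count as a stand-in for "context" are assumptions made for this example; a real reader app would need a much richer signal.

```python
import re
from typing import List, Tuple


def rank_passages(paragraphs: List[str], term: str, top_n: int = 3) -> List[Tuple[int, int, str]]:
    """Return (mention_count, paragraph_index, text) for the best passages.

    Assumption: a paragraph that names the term more often probably
    says more about it. This is a crude proxy, not a relevance model.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    scored = []
    for index, text in enumerate(paragraphs):
        mentions = len(pattern.findall(text))
        if mentions:
            scored.append((mentions, index, text))
    # Most mentions first; ties fall back to reading order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top_n]


# Hypothetical sample text, invented for the example.
paragraphs = [
    "Mr. Pell opened the door.",
    "Nothing happened for an hour.",
    "Pell, the old gardener, had served the family for years; Pell knew every secret.",
    "They thanked Pell and left.",
]
best = rank_passages(paragraphs, "Pell")
```

With this sample, the third paragraph comes out on top even though it is not the first occurrence, which is the kind of result a page-order list cannot surface.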

And yes, I'm aware of at least two specialty ebook platforms that offer better search results. That's because they have editors who spend hours and hours parsing the content to build this feature manually. I have one word for them: scale. Their solution simply doesn't scale...more on that in a moment.

Here's another use case for a smarter ebook search feature: catalog-wide search. I challenge you to go to Amazon or a publisher's website and use their site search feature to tell you which book has the most in-depth coverage of a particular topic. Let's say you're looking for a book about creating websites. You want one that provides thorough coverage of HTML but also offers a solid introduction to JavaScript. You can't use site search on Amazon or a publisher's site to figure this out. You'll have to look at each book's table of contents and determine the answer yourself, one book at a time. A better solution would show you exactly how deep the JavaScript coverage is in each book, with the books offering the most in-depth coverage listed first.
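To make the catalog-wide idea concrete, here is a minimal Python sketch that scores each book by how densely it mentions a topic and sorts the catalog accordingly. The names `coverage_depth` and `rank_catalog`, and the "mentions per 1,000 words" measure, are assumptions for illustration only; real coverage depth would need more than keyword counting.

```python
import re
from typing import Dict, List, Tuple


def coverage_depth(text: str, topic: str) -> float:
    """Mentions of `topic` per 1,000 words of `text` (a rough depth proxy)."""
    words = re.findall(r"\w+", text)
    if not words:
        return 0.0
    pattern = r"\b" + re.escape(topic) + r"\b"
    mentions = len(re.findall(pattern, text, re.IGNORECASE))
    return 1000.0 * mentions / len(words)


def rank_catalog(catalog: Dict[str, str], topic: str) -> List[Tuple[str, float]]:
    """Return (title, score) pairs, deepest coverage first."""
    ranked = [(title, coverage_depth(text, topic)) for title, text in catalog.items()]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical two-book catalog with toy "full text", invented for the example.
catalog = {
    "Book A": "HTML HTML HTML JavaScript",
    "Book B": "JavaScript JavaScript JavaScript HTML",
}
ranking = rank_catalog(catalog, "JavaScript")
```

Because the scoring runs over the text itself, no editor has to tag anything by hand.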

Now back to the scale problem... The key here is to enable these richer search results without requiring a bunch of manual labor. Book publishers are trying to reduce staff and cut costs, not add more of either. So the only way to deliver this service is through a software solution where the content is analyzed and a rich, context-sensitive index is created.

Does that sound far-fetched to you? I don't think so. In fact, I believe we'll see a service like this very soon. I know I'll get a lot of use out of it and I bet you will too.


David Grace

I would settle for decent search tools in Amazon itself. For example, if they put three radio buttons, Title, Author and Keywords, under the search box, that would vastly improve the search results. If you entered "steel" with the Title button checked, you would get a book titled Blue Steel. With Author checked, you would get Death Moon by Bret Steel, etc.

8000 titles appear if you enter the key words "police procedural." You can't sort them by author. You can't limit them to books first published in the last year. You can't limit them to ebooks between $3 and $8. You can't limit them to books with at least a 4 star rating AND sorted by price. Etc.
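The kind of front-end tools described here can be sketched in a few lines of Python. Everything below is hypothetical: the `Book` record, the `search` function and its parameters, and the prices and ratings are invented for illustration and say nothing about how Amazon's search actually works.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Book:
    title: str
    author: str
    price: float   # invented sample values
    rating: float  # invented sample values


def search(books: List[Book], keyword: str, field: str = "title",
           min_price: Optional[float] = None, max_price: Optional[float] = None,
           min_rating: Optional[float] = None,
           sort_by: Optional[str] = None) -> List[Book]:
    """Match `keyword` against one field, then apply optional filters and a sort."""
    needle = keyword.lower()
    results = []
    for book in books:
        if needle not in getattr(book, field).lower():
            continue
        if min_price is not None and book.price < min_price:
            continue
        if max_price is not None and book.price > max_price:
            continue
        if min_rating is not None and book.rating < min_rating:
            continue
        results.append(book)
    if sort_by is not None:
        results.sort(key=lambda b: getattr(b, sort_by))
    return results


books = [
    Book("Blue Steel", "A. Writer", 4.99, 4.5),
    Book("Death Moon", "Bret Steel", 6.99, 4.1),
    Book("Steel Town", "C. Author", 12.99, 3.2),  # made-up third title
]
by_title = search(books, "steel", field="title")
by_author = search(books, "steel", field="author")
filtered = search(books, "steel", min_price=3, max_price=8, min_rating=4, sort_by="price")
```

Here `by_title` holds Blue Steel and Steel Town, `by_author` holds Death Moon, and `filtered` narrows the title matches to those priced between $3 and $8 with at least a 4-star rating.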

Unless you know what you're looking for in advance, Amazon and the rest are essentially a big dogpile, but good luck getting them to have a programmer spend maybe two days adding some front-end search and sort tools. I admit I don't understand the reluctance.

--David Grace
