Here’s a better model for book search and discovery

How are you helping consumers find the perfect book for their needs or interests? If you’re like most publishers, you offer a search function on your site. Visitors simply type in a topic and relevant titles from your catalog are displayed.

This is pretty similar to how search works on Amazon. In both cases, book metadata is used to determine the best matches. So if the search phrase happens to be in a book’s title, description, etc., that title is likely to float to the top of the results.

That’s great, but why not leverage the book’s contents, not simply its metadata, in the search process? Amazon’s Search Inside feature lets you do this, but only after you’ve selected a particular book. What if you’re a publisher with a deep catalog on religion and someone is looking for the book with the most in-depth coverage of Pope Francis? Metadata-only searches can help, but the full contents are the only way to truly measure topical depth, especially if you want to compare two similar titles to see which one has the most extensive coverage of the search phrase.

Google Book Search (GBS) offers this sort of visibility but most publishers have a cap on the percentage of content visible to GBS users. That’s primarily because publishers want to prevent someone from reading the entire book without buying it.

I believe the solution is to expose all the contents to a search tool and display results that only show snippets, not full pages. That’s exactly what we’re now offering on our bookstore website at Our Sunday Visitor. If you click on the Power Search link at the top of the page you’ll be taken to this new search tool.

If I search for “Pope Francis” I get these results. The top title has 203 hits, so if I click “view 203 results” I can then take a close look at every occurrence of my search phrase in the highest ranked title. Note that this platform takes proximity into consideration, so if you have a multi-word search you can limit the results to just those instances where the words are closest to each other. At any point the user can click on the cover image to read title details or buy the book.
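
To make the ranking idea concrete, here is a minimal sketch of counting hits per title and tightening or loosening proximity for a multi-word search. It is not MarpX’s algorithm, which isn’t described here; the function names, the `max_gap` parameter and the tiny sample catalog are all assumptions made up for illustration.

```python
import re
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_proximity_hits(text: str, query: str, max_gap: int = 0) -> int:
    """Count places where the query words appear in order, allowing at most
    max_gap other words between consecutive query words (hypothetical knob)."""
    words = tokenize(text)
    terms = tokenize(query)
    if not terms:
        return 0
    hits = 0
    for start, word in enumerate(words):
        if word != terms[0]:
            continue
        pos = start
        matched = True
        for term in terms[1:]:
            # Look only at the next (max_gap + 1) words for the following term.
            window = words[pos + 1 : pos + 2 + max_gap]
            if term in window:
                pos = pos + 1 + window.index(term)
            else:
                matched = False
                break
        if matched:
            hits += 1
    return hits


def rank_titles(
    catalog: Dict[str, str], query: str, max_gap: int = 0
) -> List[Tuple[str, int]]:
    """Return (title, hit_count) pairs, highest hit count first, omitting titles with no hits."""
    scored = [
        (title, count_proximity_hits(body, query, max_gap))
        for title, body in catalog.items()
    ]
    scored = [item for item in scored if item[1] > 0]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up sample text standing in for full book contents.
    sample_catalog = {
        "Book A": "Pope Francis spoke. Later Pope Francis travelled.",
        "Book B": "Pope Francis wrote a letter.",
        "Book C": "Nothing relevant here.",
    }
    for title, hits in rank_titles(sample_catalog, "Pope Francis"):
        print(f"{title}: view {hits} results")
```

With `max_gap=0` only exact adjacent phrases count; raising it admits looser matches, which is roughly the trade-off a proximity setting exposes to the user.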

Think about how powerful this tool is for publishers with deep lists on vertical topics (e.g., cooking, math, science, self-help). Instead of relying exclusively on the book description to make the sale, the contents are fully searchable and comparable across a list of related titles.

We’re in the early experimentation phase with this platform. We’re planning to use a variety of ads that say something like, “find your next great read”; users who click on those ads will be taken to the search landing page where they can explore the full contents of our entire ebook catalog.

This search platform is powered by the outstanding team at MarpX. If you’d like to experiment with this on your site, you’ll find contact info at the bottom of their home page. MarpX has been a wonderful partner for us and I highly recommend you explore their solution as well.

I hope you’ll join us in this effort to move content search and discovery to the next level.


2016 Trend Report: What publishers need to know

The Future Today Institute has created a terrific, free report summarizing key technology trends and what they mean for tomorrow. I’ve embedded the report below so you can quickly flip through it.

I read the whole report and highlighted the most noteworthy elements for publishers below. That leads me (once again) to the topic of curation, a very important (current and) future publishing trend. Curation is becoming as important as creation, especially as we’re bombarded with more information than we can possibly consume.

As you read through my curated list below, with slide numbers in parentheses, be sure to look at each item through the lens of publishing. How will each one of these affect how your content is discovered, acquired and consumed in the future?

Bots (slide 15) – This type of automation will be combined with other emerging technologies, leading to things like highly customized audio learning platforms where the UI is totally voice-controlled (see SVPAs below).

Natural Language Generation (slide 17) – I’ve written before about Narrative Science and I’m confident we’ll see more and more algorithmically-generated content in the future.

Smart Virtual Personal Assistants, or SVPAs (slide 22) – Alexa is the one I use every day when interacting with my Amazon Tap device. Expect this one to evolve quickly as today’s functionality will be considered very primitive in a year or so.

Ambient Proximity (slide 23) – Beacons haven’t taken off yet but they represent such an interesting opportunity. Think of all the interesting things your local bookstore could do with beacons and promotional content.

Attention (slide 25) – Despite the lame name, this one will have a significant impact on the ongoing evolution of content presentation, especially when married to beacons and additional knowledge of the user’s current state.

Ownership (slide 36) – Up to now, creators of user-generated content seem more interested in visibility than compensation, but how long will that be the case?

One-to-few Publishing (slide 39) – Podcasts are dead, right? No, in fact there’s a significant opportunity in smaller, more tightly-focused audiences. This market concentration likely leads to higher subscription prices and/or advertising rates.

Intentional Rabbit Holes (slide 42) – Great concept that’s all about deeper engagement. What services can you add to your site or content to encourage readers to take a deeper dive and perhaps expose them to additional monetization opportunities?

Augmented Reality (slide 52) – It’s been around for a while but was only recently legitimized by Pokemon Go. Think of all the ways your content could be augmented via tools like Layar, for example.

Internet of X (slide 63) – Let’s say you’re a publisher of architecture books and other short-form content about design and construction. What’s preventing you from creating The Internet of Architecture?

Each of these is on a different timeline, of course, and they won’t affect content at the same moment. All of them, however, are likely to have a profound impact on just about every type of content in the next few years.


What can publishers learn from Pokemon Go?

Long considered nothing more than a gimmicky fad, it turns out that augmented reality (AR) is actually alive and well. At least that’s the case when it’s associated with a brand as large as Pokemon.

By now you’ve undoubtedly heard all the Pokemon Go stories and maybe you’ve even dodged a player or two, overly focused on their phone while embarking on a virtual hunting expedition. On the surface it’s nothing more than another time-wasting game but I believe it offers some very important lessons for publishers.

Let’s start with the hybrid, print-plus-digital opportunity. Recent reports indicate ebook sales have plateaued and growth has shifted back to the print format. There are a number of underlying reasons for these trends including higher ebook prices as well as the adult coloring book phenomenon. But as I’ve said before, publishers need to stop thinking about print and digital as an either/or proposition. Some customers prefer print while others lean towards digital. Many readers are in both camps, switching between print and digital based on genre, pricing, convenience, etc.

Most publishers overlook the fact that digital can be used to complement and enhance print. Skeptical? Have a look at a few of the demos Layar offers on this page.

Stop and think about how something like Layar could be used to bring your static pages to life. Maybe you publish how-to guides, print is your dominant format and you’ve always wondered how you could integrate videos with the text. You’ve tried inserting URLs but very few readers bother typing them in. QR codes are an option but they’re clunky and take up precious space on the page. Why not use AR to virtually overlay those videos on the page without having to dump in a bunch of cryptic-looking URLs or QR codes?

Are you looking to engage your readers in the book’s/author’s social stream? Here’s your chance to integrate them virtually using a platform like Layar.

Better yet…have you always wanted to know who all those nameless, faceless consumers are who bought your print book from third-party retailers like Amazon or Barnes & Noble? Here’s an opportunity as a publisher or author to initiate a conversation directly with your readers. Add an Easter egg to the print edition where readers can receive a reward via an AR-powered offer; you will, of course, ask for each reader’s name and email address before handing out those rewards.

This approach to marrying digital to print is totally unobtrusive. Print readers who don’t want to bother with their phones can continue reading the book without interruption. Those customers interested in learning more, interacting with authors or uncovering special publisher offers will likely see the value of connecting their phones with the printed page.

The possibilities are endless. So the next time you see a Pokemon Go player wandering aimlessly be sure to thank them for helping identify new ways of distributing, promoting and enriching content.


Here’s how Siri, Alexa and other IPAs will revolutionize publishing

For the past several years I’ve been writing about how containers such as books, newspapers and magazines are slowly fading away. They’ll certainly be around for many years but their relevance will slip into the background as personalized, digital content streams become more important.

The more I think about the future the more I believe two other trends will have an even more significant impact on reading, learning and engaging with content: voice user interfaces (VUI) and artificial intelligence (AI).

Today Apple’s Siri and Amazon’s Alexa are mostly perceived as gimmicks. Tomorrow these intelligent personal assistants (IPAs) will become the gateway to a whole new way of consuming and interacting with content.

A few weeks ago I wrote about how these IPAs need to break free of their current apps and devices, becoming platforms for a broader set of content services. It’s great that Amazon’s Alexa can now be experimented with via the Echoism.io site, but how long will it take before these services realize their full potential rather than simply serving as a way to ask whether or not it will rain tomorrow?

Ultimately, I’m convinced these IPAs will enable us to have conversations with the most knowledgeable experts we’ll never meet and who really don’t even exist. Think about that for a moment.

It’s one thing to ask Alexa questions like, “what was the score of last night’s Cubs game?” or “what was Muhammad Ali’s most famous quote?”. It’s entirely different when you treat the device like a trusted advisor or teacher by asking things like, “who was the best Cubs player of all time?”; in this case, the response can’t simply be retrieved from a reference guide as it requires a highly subjective answer based on gathering and interpretation of facts as well as a healthy dose of conjecture. That’s where AI comes into play.

The model I’m describing likely requires AI capabilities that are more powerful than today’s. In 2016 a company like Narrative Science can take a baseball game box score and turn it into a two-paragraph newspaper summary; tomorrow these AI platforms will need to be able to tell more of the story as well as answer questions like, “how did Anthony Rizzo get to second base in the fourth inning?”.

Let’s apply this to a more interesting, lengthier use-case. Maybe I want to learn about electricity and electrical wiring for a home project I’m working on. I want to do this all via voice and audio during my daily commute to and from work. Today I could turn to a variety of YouTube videos, websites and books. Tomorrow I want to simply start with this request: Tell me the essentials of electricity.

The IPA then dives right into a tutorial, perhaps taken from one of those resources noted earlier (e.g., books or websites). The session is highly interactive though. Every so often I might ask a clarifying question like, “what’s the difference between the black wire and the white wire?” or “is a wire nut OK on its own or should I also wrap the connection in electrical tape?”, and the assistant provides the answers then returns to the lesson.

To contrast, in today’s world we’re used to thinking in terms of the document model and how search results are simply an intermediate step. That step might just be one of many the user has to proceed through to ultimately get their answer. In the IPA world of tomorrow the experience needs to feel more like a conversation with an old friend or instructor; the IPA selects the best path rather than relying on you to find the needle in the search results haystack.

All of this dialog presumably will go through the Amazons and Googles of the world and the answers will come back through those same gatekeepers as well. But ultimately consumers will insist on the dialog and answers coming from other trusted brands and sources. So one day I might start that electricity session by saying something like, “take me to the Home Depot channel” and then I can have my dialog within an ecosystem of more reliable, highly relevant content and responses.

In order to make this giant leap the content must be richly tagged, thoroughly analyzed by a powerful AI platform, or a little bit of both. Either way I’m excited about the new opportunities it represents.


Let’s take “Search Inside the Book” to a whole new level

Do you remember when Amazon introduced both “Look Inside” and “Search Inside” functionality for books? They were such simple yet revolutionary features at the time. Before Look/Search Inside it was impossible to do a simple flip test like you could at a brick-and-mortar store.

Fast-forward to today where we take Look/Search Inside features for granted, so much so that there’s been virtually no innovation on this front. I believe there’s a real opportunity here though to help consumers find what they’re looking for as well as significantly improve the overall content discovery and evaluation process.

Let’s start with a simple question: Why are Search and Look Inside both limited to individual books? What if my first problem is to figure out which book has the most in-depth coverage of topic xyz? Let’s say I want to do some research on the Pittsburgh Pirates, specifically looking for coverage of a former player named Dave Parker. How do I find the book with the most in-depth coverage of Parker?

The typical approach is to search on Amazon. The search results there are initially sorted by relevance and you might think that’s the end of the story. But all Amazon is really doing is searching the metadata associated with each book; they’re not searching the actual contents of the books to push titles with higher relevance to the top of the results. That means books with that name or phrase in the title often get pushed to the top.

Take a closer look at those search results and you’ll quickly appreciate just how ineffective the current Amazon solution is. You’ll need to skip past the first four results as they’re not books at all; I requested “books” only but the results reflect the challenges Amazon has with internal product types and definitions. Those are followed by a couple of titles that have nothing to do with Dave Parker the former baseball player but they happen to be authored by another guy named Dave Parker. This shows how much Amazon’s search prioritizes a book’s metadata; there are probably very few references to “Dave Parker” inside those books but these titles float toward the top of the results simply because of the author name. Next is a book about Dave Winfield, another former baseball player, which looks promising. The problem here is that it made it to the first page of results because the book’s co-author is Tom Parker, so when Amazon sees “Dave Winfield” and “Tom Parker” next to each other it thinks there’s a hit because of the former’s first name plus the latter’s last name. Ugh.
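
The Winfield/Parker false hit is easy to reproduce in a toy example. Amazon’s actual ranking logic isn’t public, so the sketch below only illustrates the kind of behavior described above: a naive matcher that pools the words from every metadata field versus a stricter one that requires the phrase to appear intact inside a single field. The record layout and function names are hypothetical.

```python
import re
from typing import Dict, List


def words(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def naive_metadata_match(record: Dict[str, str], query: str) -> bool:
    """Match if every query word appears anywhere in any metadata field."""
    terms = words(query)
    if not terms:
        return False
    pool = set()
    for value in record.values():
        pool.update(words(value))
    return all(term in pool for term in terms)


def phrase_in_field_match(record: Dict[str, str], query: str) -> bool:
    """Match only if the query words appear consecutively within one field."""
    terms = words(query)
    if not terms:
        return False
    for value in record.values():
        field_words = words(value)
        for i in range(len(field_words) - len(terms) + 1):
            if field_words[i : i + len(terms)] == terms:
                return True
    return False


if __name__ == "__main__":
    # Hypothetical metadata record; the title is a placeholder, not a real listing.
    record = {
        "title": "A Baseball Memoir",
        "author": "Dave Winfield",
        "coauthor": "Tom Parker",
    }
    print(naive_metadata_match(record, "Dave Parker"))
    print(phrase_in_field_match(record, "Dave Parker"))
```

The naive matcher accepts the record because “Dave” and “Parker” each appear somewhere in the metadata, while the stricter matcher rejects it because no single field contains the full name.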

At this point you might think the solution is to go to Google Book Search. Take a look at Google's results and I think you’ll agree I’m no closer to finding the right book than I was at the start. To be fair, Google Book Search is a better solution than Amazon’s search but there are still some enormous holes. For example, although Google’s service is searching the book contents it’s still highly biased by the metadata. Just look at the author names of the first several titles in those search results and you’ll see what I mean. Also, Google is severely limited because their solution is tightly connected to their book preview service. That means Google will only show you some of the pages with hits, hiding many others and then completely cutting off your view once you reach a certain threshold.

What we really need is something like Google Book Search across an entire library, with full visibility into all the content, featuring an algorithm that’s smart enough to focus on true relevance and isn’t thrown off simply by metadata. The results would show two or three lines of the text surrounding each hit so the reader can appreciate the context throughout.
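
The snippet display could be approximated with something like the following. This is only a rough sketch under stated assumptions: it uses a fixed character window (the hypothetical `context_chars` parameter) as a stand-in for “two or three lines” and a plain case-insensitive phrase match, not any particular vendor’s implementation.

```python
import re
from typing import List


def find_snippets(text: str, phrase: str, context_chars: int = 120) -> List[str]:
    """Return one short excerpt per occurrence of phrase, with surrounding context."""
    if not phrase:
        return []
    snippets = []
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    for match in pattern.finditer(text):
        start = max(0, match.start() - context_chars)
        end = min(len(text), match.end() + context_chars)
        # Collapse line breaks and repeated spaces so the excerpt reads cleanly.
        excerpt = " ".join(text[start:end].split())
        prefix = "..." if start > 0 else ""
        suffix = "..." if end < len(text) else ""
        snippets.append(prefix + excerpt + suffix)
    return snippets


if __name__ == "__main__":
    # Made-up text standing in for a chapter of a book.
    chapter = (
        "The early chapters cover the team's history. "
        "This section discusses Dave Parker at length, "
        "and later sections return to the rest of the roster."
    )
    for snippet in find_snippets(chapter, "Dave Parker", context_chars=30):
        print(snippet)
```

Because only a window around each hit is returned, the full text can be indexed without ever handing a reader complete pages.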

This uber-search would be powerful for some types of books and totally useless for others. For example, there’s absolutely no need for it in the fiction space but think about how useful it would be in non-fiction areas like business, science, technology, biography, cooking, etc. I see this as a service a publisher could place on their website, dramatically improving the current metadata-only search results you typically find.

In fact, this uber-search vision is a service my OSV colleagues and I are currently exploring with a third-party developer. Before we get too far along with it we wanted to describe it for the publishing community to see if anyone knows of a better solution that already exists. We haven’t found one yet but as we roll it out we’ll be sure to describe the process here so other publishers can learn from our experience and potentially embrace our solution as well.