The lost art of indexes in ebooks
When was the last time you used an index in an ebook? Maybe the better question is this: Have you ever used an index in an ebook? One of the challenges here is that most ebooks don’t have indexes, the result of the misguided notion that text search is a better solution.
Every so often I come across an ebook with an index. More often than not it’s just the print index at the end of the book, sometimes with nothing more than the physical page references that offer almost no value in a reflowable e-format.
Fiction represents a large chunk of ebook sales and those books generally don’t benefit from an index. The same is true for some types of non-fiction books. But for pure reference guides, in-depth how-tos and other works, an index can be pretty useful.
If you’re relying exclusively on text search in an ebook you have to know exactly what you’re looking for. More importantly, why do we settle for such a lame text search solution when we’re spoiled every day with powerful, relevance-ranked search tools like Google?
When you search for a phrase in an ebook the results are shown in order of appearance. You see all the occurrences from the beginning of the book to the end. Imagine if Google worked that way: you type in a phrase and Google shows you the first (oldest) site to use that phrase, then the next oldest site that used it, etc. Users would laugh and reject it, yet that’s exactly what we’re forced to accept in ebook search.
What I really want is relevance-based results. Show me the location in the book with the highest density of that phrase and prioritize occurrences of it in a heading over occurrences in body text. I’m sure there are other attributes that could be rolled into an effective ebook search algorithm but I’ll take just those two features for starters.
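To make those two features concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the `Section` structure, the `heading_weight` default, and the scoring formula itself. It is not how any existing reading app ranks results, just one simple way density and heading hits could be combined.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Section:
    """A hypothetical unit of the book: one heading plus the text under it."""
    heading: str
    body: str


def count_phrase(text: str, phrase: str) -> int:
    """Count case-insensitive, non-overlapping occurrences of phrase in text."""
    return text.lower().count(phrase.lower())


def score(section: Section, phrase: str, heading_weight: float = 5.0) -> float:
    """Body density of the phrase, plus a bonus for each hit in the heading.

    heading_weight is an arbitrary illustrative value, not a tuned one.
    """
    words = len(section.body.split())
    density = count_phrase(section.body, phrase) / words if words else 0.0
    return density + heading_weight * count_phrase(section.heading, phrase)


def ranked_search(sections: list[Section], phrase: str) -> list[Section]:
    """Return the sections that mention the phrase, most relevant first."""
    if not phrase.strip():
        return []
    hits = [s for s in sections if score(s, phrase) > 0]
    return sorted(hits, key=lambda s: score(s, phrase), reverse=True)


book = [
    Section("Introduction", "An index helps. Search is different from an index."),
    Section("Building an index", "Plan the index first."),
    Section("Fonts", "Nothing relevant here."),
]

for section in ranked_search(book, "index"):
    print(section.heading)
```

With this sample data the section whose heading contains the phrase is listed ahead of the introduction, even though the introduction comes first in the book, and the section that never mentions the phrase is left out.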
The other problem with relying on search instead of an index is that you lose the benefit of synonyms and related terms. An indexer takes all that into consideration so you’re much more likely to find everything you’re looking for with a good index than a simple text search.
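A small sketch shows why that matters. The `BookIndex` class, its method names, and the location strings below are all hypothetical, invented for this example. The point is only that an index can send a reader from the word they typed to the term the indexer chose, which a literal text search cannot do.

```python
from collections import defaultdict


class BookIndex:
    """A toy index: preferred terms map to locations, synonyms map to terms."""

    def __init__(self):
        self._locations = defaultdict(list)  # preferred term -> locations
        self._see = {}                       # alternate wording -> preferred term

    def add_entry(self, term, location):
        """Record that a preferred term is discussed at a location."""
        self._locations[term.lower()].append(location)

    def add_synonym(self, alternate, preferred):
        """Record a 'see' reference from an alternate wording to a term."""
        self._see[alternate.lower()] = preferred.lower()

    def lookup(self, query):
        """Resolve the query through any synonym, then return its locations."""
        key = query.lower()
        key = self._see.get(key, key)
        return list(self._locations.get(key, []))


index = BookIndex()
index.add_entry("automobiles", "chapter 3, section 2")
index.add_entry("automobiles", "chapter 7, section 1")
index.add_synonym("cars", "automobiles")

print(index.lookup("cars"))
```

A reader who looks up "cars" gets both locations filed under "automobiles", even if the word "cars" never appears in the text at those spots.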
I’m not lobbying for back-of-book indexes in ebooks like they appear in print books. That’s another aspect that needs to change when you go digital. I want to see index functionality right there on the page I’m reading. The trick here is to offer it in a manner that’s not disruptive for the reader.
Remember that article I wrote a few weeks ago with the video showing a vision for auto-enriched ebooks? The same UI approach described there could be used here. The content is initially presented in as clean a manner as ebooks are today. But when you tap the screen on your tablet all the phrases that are indexed magically change color or are denoted with some other UI effect (e.g., underline). Just tap the phrase you’re interested in and a pop-up appears with relevance-ranked index results. These would be presented in a scrollable list with each entry having a preview of the text from that location in the ebook. Make it easy for me to bookmark those entries right in the pop-up. The net result is a way to quickly and easily access a smarter index without having to leave your current location.
This feature doesn’t exist today because we’re still stuck in the print-under-glass era of ebooks. I’m optimistic that one or two of the popular reading applications will eventually add such a capability though and help us get beyond today’s model where we’re consuming so much dumb content on all these smart devices.
Hi Joe: I think of indexing the way Bill Kasdorf explained it in The Columbia Guide to Digital Publishing: "a good index provides an intellectual view of the content unavailable by any other means. It is the result of an intelligent reading by an indexer trained in recognizing and documenting the interrelationships of the intellectual content; the indexer not only notes topics and subtopics, but also makes judgments about them, selecting the most important and relevant sections to direct readers to."
Put that way the index forms a semantic abstraction of the text, and its value goes beyond just locating words or phrases toward communicating concepts about the content of the text.
Your ideas about embedding the index within the book make sense, but I still find value having the index available in one place.
Posted by: Thad McIlroy | March 13, 2016 at 10:45 PM
You're right, Thad, and I believe our points of view are not mutually exclusive. The end-of-book index can co-exist with what I described in the article.
Posted by: Joe Wikert | March 14, 2016 at 08:12 AM
The American Society for Indexing and other indexing societies around the world certainly promote indexes in eBooks (See http://www.asindexing.org/about-indexing/digital-trends-task-force/).
Many pundits in the industry are pro-indexes (See http://www.levtechinc.com/indexing-resources/why-create-indexes.asp). There are several reasons, often misguided, given for why indexes aren't needed (See http://www.levtechinc.com/pdf/ExecutiveSummaryForPublishers.pdf). Last fall the IDPF approved a specification for EPUB3 Indexes (See http://www.idpf.org/epub/idx/). Some software has been developed to provide linked and EPUB3 indexes but publishers must integrate these into their workflow.
Posted by: David K Ream, Co-Chair IDPF Indexes WG | March 14, 2016 at 10:12 AM
One of the biggest issues is that Amazon actively discourages indexes in CreateSpace. Indexers have devised many ways of getting active indexes in ebooks, each aimed at specific publishing workflows. Adobe has now made it possible to export an ebook as EPUB with an active index in one step in InDesign, so if there is an embedded index, you have your ebook index ready to go.
The EPUB 3 indexes spec has been approved, and has the capacity to enable a lot of the functionality you describe. We need to get the hardware reader companies to support and implement it. We are ready for live indexes, we have published many live indexes already, but we need the reading hardware and software to support the markup. We are open to the discussion and have knocked on the doors of companies like Amazon. I wish I could say someone answered, but no one has.
Posted by: Jan Wright | March 14, 2016 at 11:41 AM
Thanks for adding your valuable comments to this discussion, Jan and David.
Posted by: Joe Wikert | March 14, 2016 at 11:48 AM
I bought a Kindle version of a nonfiction book originally published in print with an excellent index. The index was reprinted in the Kindle edition with all the locators stripped out.
Posted by: Laura Gottlieb | March 14, 2016 at 12:54 PM
Indexes can be useful, particularly with complex non-fiction. But there are issues beyond usefulness.
I've done tolerable-but-no-more indexes for my books and those of others. But a well-done index requires an experienced expert and doesn't come cheap. Since most users don't expect to see an index, particularly in ebooks, there's little incentive for publishers to pay for them. If anything, their incentive is to leave them out so that expectation never develops.
Ebooks do allow users to take notes as they read and that can become a crude, personalized index.
Adobe certainly deserves commendation for allowing their print index to become an active index in the ebook version.
Posted by: Michael W. Perry | March 14, 2016 at 02:08 PM
Thanks for posting this thought-provoking article (especially for introducing me to the "print-under-glass" metaphor). As an indexer, editor, and book producer, I think your idea of a "subsurface," pop-up index holds excellent possibilities, but it doesn't support an important back-of-the-book index feature: browse-ability. When I consider reading (and purchasing) a nonfiction title, my first stop is the front matter, but the real test is the index. By browsing through it, I'll know almost immediately whether the book holds information pertinent to my interests and needs. I don't know whether a live index could provide such a tool, but if not, couldn't a static-page pop-up index offer much the same capability? The other element I'd miss is the visual structure; by itself, a concept main entry provides little insight. It's the cluster of subentries that fall under it that demonstrate the author's principal interests.
Concerning Michael Perry's observation about the cost, I'd like to note that most publishers' author contracts put the onus of creating the index on the author. Chipping away even more from a typically paltry advance is no incentive to add an index (although writers of serious nonfiction are apt to recognize its importance anyway). It can also persuade authors to index the book themselves, which is a bad idea in just about every instance.
Posted by: Brian Hotchkiss | March 15, 2016 at 10:41 AM
You're quite welcome, Brian. I'd like to clarify something... The vision I described in the article isn't mutually exclusive with an index in the back of the book. I can see where it would still be beneficial to include the back-of-book index as well as the inline one, mostly for the points you described.
Posted by: Joe Wikert | March 15, 2016 at 12:27 PM
It is possible to create a linked index in Word using XE tags. When transferred into the XML properly, these tags generate page references for the print version of the book, and hyperlinks for the ebook. It is definitely possible to put the XE anchor tags against one phrase in the book, but have it appear under another word or phrase in the index. You can add cross references, invert names to last name, first name, and create subentries, and this is all done at the manuscript stage, before typesetting, so the index appears with the first-pass page proofs.
Cambridge University Press has been using this method for quite a few years. At least a couple of their professional freelance indexers have adapted to this system. Authors sometimes find it more cumbersome, but it saves time in the production process, and makes the index more usable in the ebook.
Posted by: Cathy Felgar | March 16, 2016 at 02:48 PM