Dumb content on smart devices

The original Amazon Kindle is almost seven years old and the first iPad was released more than four years ago. Plenty of other e-readers and tablets have followed and the digital content marketplace is vibrant. So why do we spend most of our time reading dumb content on smart devices?

I’m talking about ebooks, newspapers and magazines that are oftentimes nothing more than digital replica editions of the print versions. “Print under glass” is how they’re sometimes described and I think that sums it up.

The phone in your pocket contains more computing power than an Apollo spacecraft yet we’re still largely consuming content that rarely, if ever, takes advantage of the device’s capabilities.

Why?

One reason is that the old model works. People are used to reading content in a print format, so why change? More importantly, change is hard: it often requires the publisher to overhaul their editorial and production process, which is costly and risky, particularly when the new model is unproven.

An alternative is to add digital capabilities in layers to the existing print format. In other words, keep your current editorial and production model in place but layer on digital content and capabilities as a post-production process. Test a few titles, see which ones do well and invest in even more digital layering for the winners.

Curious to hear more about this? Join us next Tuesday, September 30, at 1:00PM ET for a short webinar where you’ll hear from a publisher who is already doing this. Click here to register now for this free webinar.


How print is slowly killing publishers

It’s a textbook example of The Innovator’s Dilemma. The crazy part is we all know it’s a big problem and yet very few publishers are taking evasive action.

I’m talking about the reliance on print, even at the expense of digital transformation and growth. Here are a few reasons why print is a publisher’s silent killer:

Presentation style – A newspaper is pretty much defined by what appears on the front page as well as how everything else follows it in each edition. Books aren’t that different; they have a beginning, middle and an end. Digital content, on the other hand, isn’t as rigidly defined. Have you ever reached the end of a Google News feed, for example? Publishers with deep roots in the print world tend to focus too much on how the print product is presented, and they let it drive their digital product.

Workflow – Presentation style is, of course, closely tied to workflow. A publisher who built their editorial and production models around print is likely to apply the same existing model to digital. This is the main reason many publishers struggle to evolve from static to dynamic content. And when they did experiment with digital, they often overhauled their workflow with disastrous results. The problem wasn’t the new workflow. Rather, they created a product nobody wanted, and the new workflow got tossed aside with the failed product.

“It’s all we know, what we do best” – Editorial teams that have perfected print delivery often have problems adapting to digital. It’s foreign to them and outside their comfort zone. One publisher recently told me they’re pulling digital strategies out of editorial and into a small, centralized team; publishers and editors are to think about print and print only. Meanwhile, print revenues continue to decline. It might make the editors more comfortable but it’s also got to be pretty demotivating. Now is not the time to revert to comfort zones.

Opens the door to startups – As The Innovator’s Dilemma teaches us, disruption is great for the startup and often not so kind to the incumbents. It’s actually an excellent opportunity for established publishers to engage with startups but that rarely happens. As one startup founder recently shared with me, “I get the impression the publisher wants to simply copy our technology, not partner with us.”

Print defines your brand – This is probably the biggest killer of all if your brand is directly associated with print. When consumers hear your brand name all they can think of is a print product. There’s no association with digital whatsoever. Newspapers struggle mightily with this one. The solution is tough to swallow: Create a new brand that’s built around and tightly aligned with digital. It’s OK to say “Powered by old-print-publisher”, but the main brand needs to be detached from your existing, print-centric name. 


Structured documents for science: JATS XML as canonical content format

It’s only my 7th day on the job here at PLOS as a product manager for content management. So it’s early days, but I’m starting to think about the role of JATS XML in the journal publishing process.

I come from the book-publishing world, so my immediate challenge is to get up to speed on journal publishing. And that includes learning the NISO standard JATS (Journal Article Tag Suite). You may know JATS by its older name, NLM. As journal publishing folks know, JATS is used for delivering metadata, and sometimes full text, to the various journal archives.

But here’s where journal and book publishing share the same dilemma: just because XML is a critically important exchange format, is it the best authoring format these days? Should it be the canonical storage format for full text content? And how far upstream should XML be incorporated into the workflow?

Let’s look at books for a minute. The book-publishing world has standardized on an electronic delivery format of EPUB (and its cousin, MOBI). This standardization has helped publishers drill down to a shorter list of viable options for canonical source format. Even if most publishers haven’t yet jumped to adopt end-to-end HTML workflows, it’s clear to me that HTML makes a lot of sense for book publishing. Forward-thinking book publishers like O’Reilly are starting to replace their XML workflow with an HTML5/CSS3 workflow. HTML/CSS can provide a great authoring and editing experience, and then it also gets you to print and electronic delivery with a minimum of processing, handling, or conversion. (O’Reilly’s Nellie McKesson gave a presentation about this at TOC 2013.) And which technology will get the most traction and advance the most in the next few years, XML or HTML? I know which one I’m betting on.

In terms of canonical file format, journal publishing may have one less worry than book publishing, because many journals are moving away from print to focus exclusively on electronic delivery whereas most books still have a print component. Electronic journal reading—or at least article discovery—happens in a browser; therefore, HTML is the de facto principal delivery format. And as much as I’d like to think HTML is the only format that matters, I know that many readers still like to download and read articles in PDF format. But as I mentioned, spinning off attractive, readable PDF from HTML is pretty easy to automate these days. So I ask:

If XML is being used as an interchange format only, what do we gain from moving the XML piece of the workflow any further upstream from final delivery?

Well, why does anyone adopt an XML workflow? The key benefits are: platform/software independence (which HTML also provides), managing and remixing content to the node level (which is not terribly useful for journal articles), and transforming the content to a number of different output formats such as PDF, HTML, and XML (HTML5/CSS3 can be used for this transformation as well, with a bit of toolchain development work).
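
As a rough illustration of the single-source idea, here is a minimal Python sketch that wraps one HTML fragment in two different stylesheets, one for the browser and one for paged output. The names `SOURCE_HTML`, `SCREEN_CSS`, `PRINT_CSS` and `build_outputs` are made up for this example, and the step that turns the print variant into an actual PDF (an HTML-to-PDF renderer of your choice) is assumed rather than shown.

```python
from html import escape
from string import Template

# One canonical HTML fragment; both outputs are derived from it.
SOURCE_HTML = "<h1>Sample Article</h1><p>Same content, two presentations.</p>"

# Screen stylesheet: a flexible reading column for browsers.
SCREEN_CSS = "body { max-width: 40em; margin: 0 auto; font-family: sans-serif; }"

# Print stylesheet: the CSS @page rule describes the page box for paged output.
PRINT_CSS = "@page { size: A4; margin: 2cm; } body { font-family: serif; font-size: 11pt; }"

# Shared page shell; only the stylesheet differs between outputs.
PAGE = Template(
    '<!DOCTYPE html><html><head><meta charset="utf-8">'
    "<title>$title</title><style>$css</style></head>"
    "<body>$body</body></html>"
)


def build_outputs(title: str, body: str) -> dict[str, str]:
    """Return a web variant and a print variant built from the same body."""
    safe_title = escape(title)
    return {
        "web": PAGE.substitute(title=safe_title, css=SCREEN_CSS, body=body),
        "print": PAGE.substitute(title=safe_title, css=PRINT_CSS, body=body),
    }


if __name__ == "__main__":
    outputs = build_outputs("Sample Article", SOURCE_HTML)
    for name, html_doc in outputs.items():
        print(name, len(html_doc), "characters")
```

The point of the sketch is only that the body text is never converted: the same fragment feeds both outputs, and the toolchain work lives in the stylesheets and the renderer.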

But XML workflows come with a hefty price tag. The obvious cost is conversion, which is expensive in both money and time. Another downside is the learning curve for the people actually interacting with the XML, and how many people should that be? In the real world, will you ever get authors, editors, and reviewers to agree to interact with their content as XML? So more likely than not, you’re either going to need to hide the fact that the underlying format is XML through a WYSIWYG-ish editor that you either buy or build (both are expensive), or you’re doing your XML conversion towards the end of the process. On a similar note, how easy is it to hire experienced XSL-FO toolchain developers? Developers who work in the world of HTML5, CSS3, and JavaScript, by contrast, are plentiful.

So building an entire content management system and workflow for journal publishing around XML—specifically JATS XML, which is just one delivery format, that isn’t needed until basically the end of the process—doesn’t seem like a slam-dunk to me. I should clarify that using JATS XML for defining metadata does seem like the obvious way to go. But I’m not so sure it’s a good fit to serve as the canonical storage format for the full text. One idea is to separate article metadata from the article body text, to leverage the ease-of-editing of HTML for the text itself.
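
To make the metadata/body split a little more concrete, here is a minimal Python sketch, assuming the metadata is stored as a JATS-style `<front>` fragment and the body as a separate HTML fragment. The sample record is invented and is not a complete or validated JATS file, and the `Article`, `load_article` and `render_page` names are hypothetical helpers for this example only.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from html import escape

# Illustrative JATS-style metadata fragment (not a complete, valid JATS file).
JATS_FRONT = """
<front>
  <article-meta>
    <title-group>
      <article-title>An Example Article</article-title>
    </title-group>
    <contrib-group>
      <contrib contrib-type="author">
        <name><surname>Doe</surname><given-names>Jane</given-names></name>
      </contrib>
    </contrib-group>
  </article-meta>
</front>
"""

# Body text kept as plain HTML, edited independently of the metadata.
BODY_HTML = "<section><h2>Introduction</h2><p>Body text lives here.</p></section>"


@dataclass
class Article:
    title: str
    authors: list[str]
    body_html: str


def load_article(front_xml: str, body_html: str) -> Article:
    """Read title and author names from the XML; pass the HTML body through."""
    root = ET.fromstring(front_xml)
    title = root.findtext("./article-meta/title-group/article-title", default="")
    authors = []
    author_path = "./article-meta/contrib-group/contrib[@contrib-type='author']/name"
    for name in root.iterfind(author_path):
        given = name.findtext("given-names", default="")
        surname = name.findtext("surname", default="")
        authors.append(f"{given} {surname}".strip())
    return Article(title=title, authors=authors, body_html=body_html)


def render_page(article: Article) -> str:
    """Combine escaped metadata with the untouched HTML body for the browser."""
    byline = ", ".join(article.authors)
    return (
        "<article>"
        f"<h1>{escape(article.title)}</h1>"
        f'<p class="byline">{escape(byline)}</p>'
        f"{article.body_html}"
        "</article>"
    )


if __name__ == "__main__":
    print(render_page(load_article(JATS_FRONT, BODY_HTML)))
```

In this arrangement the XML stays small and machine-oriented, while authors and editors only ever touch the HTML. Producing full-text JATS for the archives would still need a conversion step at the end, which the sketch does not attempt.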

What about moving HTML upstream, and focusing efforts on delivering better, more readable HTML in the browser? What about shifting focus away from old print models and toward leveraging modern browser functionality, maybe by adding inline video or interactive models, or by making math, figures, and tables easier to read and work with?

Just to throw a curve ball into the discussion, I attended Markdown for Science last weekend, where Martin Fenner and Stian Håklev led the conversation about whether it makes sense to use markdown plus Git for academic authoring and collaboration. I want to hear from as many sides of the content format conversation as possible.

So, what do YOU think?

This article, written by contributor Molly Sharp, appeared earlier on the PLOS site and is presented here with permission of the author. Molly has worked in various content management-related roles since the late ’90s, when she led the implementation of an XML editing and production system for Sybex, a tech book publisher. Most recently, Molly was the Director of Content Management at Safari Books Online, an electronic reference library of 30,000 tech & business titles, where she created and managed a Content Team to ensure the quality of incoming content; designed and maintained content-related processes and workflows; and managed a publishing partner community of more than 100 organizations.