EPUB 3 and Interactivity

February 14, 2014

Adapted with permission from EPUB 3 Best Practices (O'Reilly Media)

Why would you want a book to be interactive? For many readers, a book’s immutability is a feature rather than a bug. A print book does not demand anything from the user but their full attention. It does not entice the reader to click, to comment, to share, or to tweet. It promises total immersion in the text, a direct conduit to the author’s thoughts.

Yet there are many cases where the static nature of the traditional book is perhaps an artifact of print technology rather than the canonical best form for the content. Books that aim to teach complex, real-world subjects could benefit from the opportunity for readers to engage with the material. New forms of storytelling can encourage readers to choose new paths, or let the reader dig deeply into the narrative, uncovering hidden motivations through careful discovery. Books packaged with their primary source material could be rich scholarly resources, allowing researchers to independently verify assertions or build follow-on experiments out of raw data. Far from being a de facto distraction, interactive publications have only begun to be explored.

EPUB 3 provides the capability to do all of the above, but author beware: interactivity remains at the vanguard of ebook support. The more the publication deviates from a traditional book, the less likely it is to be fully cross-platform. Many EPUB 3 reading systems will never support interactivity, and the standard is not specific about how conditional or partial support should best be implemented. Careful testing is required.

Support for interactive elements is described in EPUB 3 Content Documents. This document will be referenced throughout this section.

The EPUB 3 specification defines two inclusion models that affect the scope of the document that scripting can potentially modify in response to user input or script action. The larger the scope of the interaction, the less likely it is that the EPUB will be interoperable.

Ebooks present a particular challenge here for two reasons:

For both of these reasons, scripts should modify as little of the DOM as possible. The specification even has support for reading systems that prohibit global modification of the DOM. These systems are said to support only container-constrained scripting and should indicate that as part of their epubReadingSystem object support. Conversely, potentially modifying any part of the DOM is known as spine-level scripting, meaning such scripts are capable of modifying any DOM object that can be discovered via the reading system by traversing documents found in the Publication Metadata spine.

Progressive enhancement is a design principle in which the goal is to develop the most accessible version of content first, as the primary version, and gradually add layers of enhancement based on the capabilities of the reading system, device, and end user. While the final product of a progressive enhancement workflow can be indistinguishable from one arrived at via a graceful degradation approach, the intent behind progressive enhancement is to view the accessible version of the content as the canonical one, rather than designing a high-resolution, heavily visual, interactive experience as the “real” content and then downgrading that presentation to an accessible version as an afterthought.

Progressive enhancement can be seen as complementary to a “mobile first” design approach. In both cases, the content creator understands that all users, regardless of ability, benefit from fast, concise, flexible access to information. By designing for a mobile and/or accessible audience first, the content creator is able to pare down the enhancements to only those that are most critical to consuming the content. This typically pays dividends by reducing costly investment in eye candy that only serves to distract readers.

Because digital books are often specifically meant to educate, creating thoughtful publications that are highly accessible and cross-platform is even more critical than in general web development. All EPUB 3 content authors are strongly encouraged to consider developing interactive publications in a progressive enhancement approach, limiting those scripted qualities to only the most necessary, and ensuring that the nonscripted presentation contains valuable content in its own right.

 

Executing embedded source code from a procedural programming language is a prerequisite for most forms of ebook interactivity. While EPUB 3 does allow content creators to embed an arbitrary executable through the object element, in the majority of cases, interactive publications will be created through the use of in-book source code. Because JavaScript is the de facto standard scripting language for SVG and HTML5, EPUB 3 content documents can be assumed to be scriptable only if they contain JavaScript code. The standard does not define which versions of JavaScript (ECMAScript) are required for support; content creators should defer to the most commonly supported features in web browsers for best results.

Reading systems that support scripting are required to provide the navigator.epubReadingSystem property. This allows developers to query the reading system about its scripting support and the specific scripting properties that it makes available.

A simple way to find out whether a script-aware reading system supports this property is to query the epubReadingSystem object:

<script type="text/javascript">
// Guard first: the property is undefined where epubReadingSystem is not supported.
var rs = navigator.epubReadingSystem;
if (rs) {
  alert("Name: " + rs.name +
        " / version: " + rs.version +
        " / layoutStyle: " + rs.layoutStyle);
}
</script>

At this time, navigator.epubReadingSystem is supported in both iBooks (Figure 7-1) and Readium (Figure 7-2).

Note

Consult the Book Industry Study Group (BISG) EPUB 3 support grid for updated information on support of this and other scripting properties.

Through its hasFeature() method, epubReadingSystem allows content authors to query the exact characteristics of the reading system’s scripting support. The following features are required to be reported by any system that implements epubReadingSystem:

Each of these will return true or false depending on the support available.
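As a rough sketch of how such a query might look, the helper below asks a reading system about each feature in turn and collects the answers. The feature names are taken from the EPUB 3.0 Content Documents specification (only touch-events and mouse-events are named in this section, so treat the rest as an assumption to verify against the spec version you target), and reportFeatures is a hypothetical helper name. A missing object or an unrecognized feature is simply reported as unsupported.

```javascript
// Feature names as listed in the EPUB 3.0 Content Documents specification
// (an assumption here; check the version of the spec you target).
const FEATURES = [
  "dom-manipulation",
  "layout-changes",
  "touch-events",
  "mouse-events",
  "keyboard-events",
  "spine-scripting"
];

// Hypothetical helper: returns { featureName: boolean } for every feature.
// Anything other than an explicit true is reported as false.
function reportFeatures(readingSystem) {
  const report = {};
  const canQuery = Boolean(readingSystem) &&
                   typeof readingSystem.hasFeature === "function";
  for (const name of FEATURES) {
    report[name] = canQuery && readingSystem.hasFeature(name) === true;
  }
  return report;
}

// In a content document, pass the real object if it exists.
const rs = typeof navigator !== "undefined"
  ? navigator.epubReadingSystem
  : undefined;
const support = reportFeatures(rs);
```

A script can then branch on the result, for example checking support["dom-manipulation"] before rewriting any markup.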

The important point to note about EPUB 3 event handlers is that the reading system passes the event to the content. Modern screen-based reading systems will naturally support input from a mouse or keyboard, but to provide consistent navigational interfaces, the reading system may choose not to pass those events down into the book’s DOM. This could be true even for reading systems implemented in JavaScript: they may deliberately suppress event propagation into the content viewport.

Even if a reading system explicitly does not support passing through mouse events, it is safe to assume that it will still allow users to trigger ordinary hyperlink and anchor-based navigation. (If it doesn’t, consider reporting a bug!)

Note

In its default implementation, Readium will return “true” for all but touch-events. The absence of touch-events should not be taken to mean that the reading system is wholly touch-incompatible: since any web-based reading system could potentially be run on a touch-capable device, consider touch-events to mean events which are specific to touch only, such as multitouch or swipe. Since touch events are typically delegated to mouse events in the absence of touch-specific listeners, most publications can safely concern themselves only with mouse-events when considering non-keyboard interaction.
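One way to act on this advice is to register only the listeners that the reading system says it will deliver. The sketch below is illustrative rather than prescriptive: interactionEvents and bindInteraction are hypothetical names, and the choice to register nothing when no reading system object is present is a deliberately conservative assumption, leaving the content to work through ordinary hyperlinks.

```javascript
// Hypothetical helper: decide which DOM event types to listen for, given
// a reading system object (or undefined when none is exposed).
function interactionEvents(readingSystem) {
  const has = (name) =>
    Boolean(readingSystem) &&
    typeof readingSystem.hasFeature === "function" &&
    readingSystem.hasFeature(name) === true;

  const events = [];
  if (has("mouse-events")) events.push("click");
  if (has("keyboard-events")) events.push("keydown");
  return events; // empty: rely on plain hyperlinks instead
}

// Attach the same handler for every event type the reading system reports.
function bindInteraction(element, readingSystem, handler) {
  const events = interactionEvents(readingSystem);
  for (const type of events) {
    element.addEventListener(type, handler);
  }
  return events;
}
```

Touch-specific listeners are left out on purpose, in line with the observation that taps are typically delivered as mouse events.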

EPUB 3 scripted documents define the range of elements that can be accessed by interactive components using one of two inclusion models: container-constrained or spine-level scripting. The inclusion model describes the extent to which the script can access and manipulate the DOM of the parent, spine-level content document. For many applications, the container-constrained model would result in fairly onerous development, because it requires creation of discrete subdocuments for each interactive component. While most publications are likely to use the spine-level scripting model, understanding the intent of the container-constrained model is useful, because emulating that will provide the best results in cross-platform, accessible EPUBs.

The container-constrained inclusion model requires that a script be associated only with an embedded subcomponent rather than the main content document. The most likely example would be an iframe with an explicit width and height that contained a smaller content document with its own script. The contained script would interact only with the DOM of its own content document, not the parent.

The potential advantage of this approach is that the scripted area has an explicitly defined bounding box, and thus could be laid out by the reading system without concern for shifts in numbers of pages or other document stability issues.

The specification requires that reading systems that support scripting always support the container-constrained model. It might be natural to assume, then, that a best practice would be to always use container-constrained markup if the use case supports it. But this would be a mistake, because wrapping each interactive component in an object or iframe is awkward and bears no resemblance to interactive content authoring on the Web. Additionally, iframes in EPUB publications are extremely uncommon (and sometimes explicitly disallowed in legacy EPUB 2 reading systems). It is unlikely that most reading systems have been extensively tested with documents that include iframe. At this time, we strongly recommend avoiding creating EPUB 3 documents with the container-constrained inclusion model.

Scripts can directly modify only the EPUB content document to which they are attached. This is not a limitation for most simple interactivity use cases, where the script drives a quiz or animated example. It does present a challenge when a choice a user makes in one chapter should affect content presentation in another—for example, in a branching story. In nonfiction, the restriction could also be a problem for authors who want to have large interactive publications, such as textbooks, and want to track student performance on in-book quizzes throughout the reading experience.

One solution to this problem of intra-publication communication is to store information in a globally accessible place and have each content document check that area for updated data. HTML5 offers a number of storage options, from traditional web cookies to the localStorage and sessionStorage models, to one of the full-blown HTML5 database APIs such as IndexedDB or the now-deprecated but widely available Web SQL.

The EPUB 3 specification neither requires nor prohibits the presence or use of any of the storage mechanisms above. Indeed, like JavaScript itself, many of these will come “for free” with the HTML rendering engine packaged with the ebook reader. However, reading system developers may choose to exclude or block access to any of those APIs. In some cases, they may be unofficially supported in one version and then removed in a later version. Ebook ecosystems that support syncing across devices are not likely to sync ancillary storage or state. Publications that use these features may need to clearly communicate their limitations to readers, and provide adequate fallbacks for reading systems that support scripting but not the desired storage mechanism.
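Because storage may be absent or blocked, a script should probe before relying on it. The following is a minimal sketch, assuming localStorage is the desired mechanism: openStore and defaultStorage are hypothetical helpers, and the in-memory fallback keeps state only for the lifetime of the current content document, so it does not solve cross-chapter communication on its own.

```javascript
// Hypothetical helper: returns an object with getItem/setItem backed by the
// given Web Storage object when it works, or by an in-memory Map otherwise.
function openStore(candidate) {
  try {
    const probe = "__epub_probe__";
    candidate.setItem(probe, "1");
    const ok = candidate.getItem(probe) === "1";
    candidate.removeItem(probe);
    if (ok) {
      return {
        persistent: true,
        getItem: (key) => candidate.getItem(key),
        setItem: (key, value) => candidate.setItem(key, String(value))
      };
    }
  } catch (err) {
    // Storage is missing, blocked, or full; fall through to memory.
  }
  const memory = new Map();
  return {
    persistent: false,
    getItem: (key) => (memory.has(key) ? memory.get(key) : null),
    setItem: (key, value) => { memory.set(key, String(value)); }
  };
}

// Merely touching localStorage can throw in some environments.
function defaultStorage() {
  try {
    return typeof localStorage !== "undefined" ? localStorage : undefined;
  } catch (err) {
    return undefined;
  }
}

const store = openStore(defaultStorage());
store.setItem("chapter1.choice", "left-path");
```

The persistent flag lets the publication tell readers when their choices will not survive beyond the current page.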

A content document that includes scripts of either inclusion model must be declared as such in the Publication Metadata, as defined in the Publications 3.0 specification. The property scripted must be included as part of the item definition for that document:

<item
      properties="scripted"
      id="c1"
      href="chapter-with-scripting.xhtml"
      media-type="application/xhtml+xml" />

If a reading system encounters a scripted publication and does not support scripting, it has two options according to the specification:

At the time of this writing, it is unknown how many reading systems will implement epub:switch. It was a mechanism included in EPUB 2 [as opf:switch] that did not see wide adoption. Similarly, publication-level fallbacks may also not be widely used or deployed.

A third, and likely best, fallback mechanism is to implement the scripted content as progressive enhancement rather than with any explicit declarative fallback mechanism: create the fallback content as the native HTML markup and replace that content with the scripted enhancement through JavaScript itself. For example:

<div class="quiz">
  <p class="question">How much wood could a woodchuck chuck? </p>
  <ol class="answers">
    <li> <a href="answers.html#correct" role="button">Two</a> </li>
    <li> <a href="answers.html#correct" role="button">Five</a> </li>
    <li> <a href="answers.html#incorrect" role="button">Ten</a></li>
  </ol>
</div>

In this example, the EPUB would contain a static page that simply displayed “correct” or “incorrect” based on the quiz and could be traversed as a normal hyperlink in a non-scripted reading environment. But when packaged with a script, the script could “hijack” the anchor link, display an in-page pop-up with the answer instead, and potentially even tally the number of correct/incorrect results. No explicit fallback is required.
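A script along these lines could perform the hijacking. This is a sketch under stated assumptions, not the book's own implementation: enhanceQuiz is a hypothetical name, the show callback stands in for whatever pop-up the publication provides, and correctness is read from the link's fragment (#correct or #incorrect) exactly as it appears in the markup above.

```javascript
// Hypothetical helper: wire up every answer link inside `root`.
// `show(result)` is supplied by the caller, for example to fill a pop-up.
function enhanceQuiz(root, show) {
  const tally = { correct: 0, incorrect: 0 };
  const links = root.querySelectorAll(".answers a");
  for (const link of links) {
    link.addEventListener("click", (event) => {
      event.preventDefault(); // stay on the page instead of following the link
      const href = link.getAttribute("href") || "";
      const result = href.endsWith("#correct") ? "correct" : "incorrect";
      tally[result] += 1;
      show(result);
    });
  }
  return tally; // updated in place as the reader answers
}
```

If the script never runs, nothing is lost: the links still lead to the static answers page.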

The HTML5 canvas element allows a publication to contain an almost unlimited variety of widgets. Canvas-based interactivity has a number of attractive qualities for a content author:

A significant downside is that canvas is not natively accessible. Content authors should take care to provide thoughtful fallbacks to canvas-based interactivity.

It is possible to simply place textual canvas fallbacks inside the canvas element itself:

<canvas width="200" height="200" id="leaves">
  An interactive animation that, when clicked, presents an image of trees
  changing their leaf color and then shedding their leaves.
</canvas>

However, typically, the purpose of a canvas element is to convey information that would be difficult to express using textual information itself. (If the canvas is simple enough to describe in words, it probably didn’t need to be coded in the first place!) Also, just because a canvas element exists doesn’t mean that it’s impossible to make it accessible. WAI-ARIA roles can instead help to describe the canvas and potentially let a visually impaired user interact with it.

In this case, role="button" informs the reading system that the canvas is one that a user can click on to interact:

<canvas ... role="button" tabindex="0" aria-label="Tree leaf demo">
  An interactive animation that, when clicked, presents an image of trees
  changing their leaf color and then shedding their leaves.
</canvas>
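Note that role="button" only announces the canvas as a button; it does not add button behavior. A keyboard user who tabs to the canvas still needs Enter and Space to do something. The sketch below shows one way to supply that, with makeCanvasButton as a hypothetical helper and activate standing in for whatever starts the animation.

```javascript
// Keys that conventionally activate a button: Enter and Space.
// "Spacebar" is the value reported by some older rendering engines.
function isActivationKey(event) {
  return event.key === "Enter" ||
         event.key === " " ||
         event.key === "Spacebar";
}

// Hypothetical helper: run `activate` on click and on Enter/Space.
function makeCanvasButton(canvas, activate) {
  canvas.addEventListener("click", activate);
  canvas.addEventListener("keydown", (event) => {
    if (isActivationKey(event)) {
      event.preventDefault(); // keep Space from scrolling the page
      activate(event);
    }
  });
}
```

Whether the keydown listener ever fires still depends on the reading system passing keyboard events through to the content.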

Another option for creating an accessible canvas is to wrap it in elements that are themselves more accessible, such as figure:

<figure>
   <figcaption>
      <span class="caption">Figure 1 — During certain seasons, trees will drop
      their leaves in preparation for cold weather. The leaves typically change
      color from green to red, orange, or yellow just before falling.</span>
   </figcaption>
   <canvas width="200" height="200" id="leaves" role="button"
           tabindex="0" aria-label="Tree leaf demo">
     An interactive animation that, when clicked, presents an image of trees
     changing their leaf color and then shedding their leaves.
   </canvas>
</figure>

This approach is likely to be appropriate for many uses of canvas in EPUB 3, where they are typically playing the role of an illustrative figure.

The most accessible approach is to prepare a complete shadow DOM inside the canvas element. In this implementation, the script controlling the canvas also manipulates fallback elements. This approach works best when presenting textual data, such as charts or graphs, but can also be used to hook into audio events or to present semantic elements that help explicate the text.

In this final example, the shadow DOM elements can be revealed one after the other, read either by a screenreader or through embedded audio, conveying the progressive nature of tree leaf change even without the benefit of the canvas animation or any kind of visual presentation:

<figure>
   <figcaption>
      <span class="caption">Figure 1 — During certain seasons, trees will drop
      their leaves in preparation for cold weather. The leaves typically change
      color from green to red, orange, or yellow just before falling.</span>
   </figcaption>
   <canvas width="200" height="200" id="leaves" role="button"
           tabindex="0" aria-label="Tree leaf demo">
    <header>
       This demonstration describes the stages of leaf change that deciduous
       trees undergo during transition to the cold season.
    </header>
    <ol>
      <li>Trees will respond to decreasing length of daylight with a series of
         chemical changes that reduce the presence of chlorophyll in the
         leaves.</li>
      <li>When chlorophyll disappears from a leaf, previously hidden colors of
          red, orange, and yellow are revealed.</li>
      <li>Gradually the leaves turn from their fall colors
          into dead leaves.</li>
      <li>Eventually the dead leaves fall from the tree due to wind, rain, or
         their own weight.</li>
    </ol>
  </canvas>
</figure>
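The script that drives the animation could keep the fallback list in step with it. The following is only a sketch: revealStep is a hypothetical helper, items is assumed to be an array of the li elements inside the canvas, and the approach assumes the reading system exposes canvas fallback content to assistive technology, which should be tested per platform.

```javascript
// Hypothetical helper: show fallback items up to and including `step`
// (zero-based) and hide the rest. Returns the text of the current item,
// which the caller could hand to an audio cue or a live region.
function revealStep(items, step) {
  // Clamp the step so out-of-range values never throw.
  const last = Math.max(0, Math.min(step, items.length - 1));
  items.forEach((item, index) => {
    item.hidden = index > last;
  });
  return items.length ? items[last].textContent : "";
}
```

In a content document, the array could be built with Array.from(canvas.querySelectorAll("li")) and revealStep called each time the animation advances to a new stage.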

A curious consequence of using HTML5-compatible rendering engines as ebook readers is that there may be cases where the reading system does not want to support scripted content, but may not default to rendering an inline canvas fallback.

The HTML5 canvas specification requires that the absence of JavaScript support be sufficient to trigger the display of canvas fallbacks:

However, some reading systems may only be post facto disabling JavaScript in contained publications, while retaining JavaScript use for their own purposes (or may be implemented entirely in JavaScript). In this scenario, the reading system cannot rely on the browser’s canvas fallback behavior and will have to implement fallback rendering itself.

Since it is unclear whether this will occur in practice, content authors should consider using canvas as part of a progressive enhancement approach, injecting the canvas element with JavaScript, while authoring the document markup using an image or other fallback. For best results, ensure that the block to be replaced has the same dimensions as the canvas, to allow the reading system to properly paginate as soon as the DOM is available. If a canvas or other element is added by JavaScript after rendering/pagination, unexpected results may occur.
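A minimal sketch of that replacement follows, assuming the authored fallback is an img with explicit width and height. upgradeToCanvas is a hypothetical helper; it copies the fallback's dimensions and id to the new canvas so that pagination and any id-based styling are unaffected by the swap.

```javascript
// Hypothetical helper: swap a fallback element (for example an <img>) for a
// canvas of the same size. `doc` is the document that owns the fallback.
function upgradeToCanvas(doc, fallback) {
  const canvas = doc.createElement("canvas");
  if (typeof canvas.getContext !== "function") {
    return null; // no canvas support: leave the fallback in place
  }
  canvas.width = fallback.width;   // match dimensions so pagination
  canvas.height = fallback.height; // does not shift after the swap
  canvas.id = fallback.id;
  fallback.parentNode.replaceChild(canvas, fallback);
  return canvas;
}
```

Running the swap as early as possible, before the reading system paginates, reduces the chance of the unexpected results described above.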