White paper: Theory Behind Players and Intelligent Reading Systems

Author: George Kerscher
last revised: 5 July 2004
Status: draft version 4, added comments from Lynn, Jennifer, and Michael

Abstract

DAISY Players are different from DAISY Intelligent Reading Systems. Understanding the differences between these two concepts will help the DAISY Consortium make strategic decisions as it plans for the future.

Definitions

MP3 Music CD Player
A mainstream mass-market CD-ROM player that will play Red Book CDs and CD-ROMs that contain MP3 music.
If a DAISY DTB is authored using the utilities in our existing production tools that order and name the MP3 files, then an MP3 Music CD Player can be used to play back the audio of DAISY DTBs. Some mainstream MP3 players also support the CEA specification, and the files it requires could be generated by a (not yet created) DAISY utility.
DAISY Player
A hardware or software player that interprets and renders the NCC of DAISY 2.02 or the NCX of DAISY 3. These players can be used to navigate the structured audio content of the DTB and to go to pages. They may also provide additional features, such as retaining the 'stop point', speeding up and slowing down the audio, placing bookmarks, and conforming to the DAISY IPP (DRM) specification.
These DAISY Players are also excellent MP3 Music Players.
DAISY Intelligent Reading System
A software reading system (and possibly a hardware reading system) that understands DTBook, the XML source content document. A DAISY Intelligent Reading System uses its understanding of the marked-up text to provide the reader with intelligent (rule-based) presentation and full-featured functionality.
Such a system is also an MP3 Music CD Player and a DAISY Player.

Note: DAISY 3 is fully defined in ANSI/NISO Z39.86. The first release of this specification was in 2002, and an updated release is scheduled for 2004. The DAISY Consortium has expressed great interest in moving this specification to the International Organization for Standardization (ISO).

Goals of the DAISY Consortium

I am not trying to explain all the goals of the DAISY Consortium. Rather, I want to identify some goals that are relevant to the discussion of Players versus Intelligent Reading Systems.

Develop a Replacement for the Analog Cassette

The DAISY Consortium has been successful in building the specifications for the replacement of analog cassette players. This replacement has been envisioned as a portable player that plays digital recordings of DTBs. While we still want to see the prices of these DAISY Players drop, we have been very successful in encouraging the development of players. Plextor, VisuAide, and Telex have excellent products that meet this requirement.

Intelligent Reading Systems

The development of DAISY Reading Systems has progressed, but the lack of full-text content has given implementors little incentive to move forward. It seems that we are just at the beginning of Intelligent Reading System development. The current generation of software players mimics hardware player functionality and provides text display. Some simple searching is also provided, but the Intelligent Reading Systems of the future will be able to go far beyond what is available now. These reading systems depend on the availability of full-text content that is richly marked up.

Opportunities that DAISY 3 Provides

Easier to Implement

DAISY 3 is easier for player manufacturers to implement than DAISY 2.0 and 2.02. The specification is very precise and provides the information developers need in a clear, compact form. It is also easy to move valid DAISY 2.02 content forward to DAISY 3.

Resource File Provides Book-Specific Prompts

Developers can also choose to add features and functions that are not available in DAISY 2.02. For example, there is the concept of a "resource file" that enables the producing library or organization to add prompts to the navigation model. Today, every heading is reported by its level number: level 1, level 2, and so on. Through the resource file, these prompts could be changed to say part, chapter, and section, and they can be book specific. The player manufacturer may choose to support this functionality or stay with the common approach used today. In any event, if the book does not provide a resource file, the player will need to fall back to the current way of reporting levels.
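
The fallback behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary stands in for the parsed contents of a book's resource file, and the names level_prompt and book_prompts are hypothetical, not taken from any DAISY specification.

```python
from typing import Mapping, Optional


def level_prompt(level: int, book_prompts: Optional[Mapping[int, str]] = None) -> str:
    """Return the prompt a player announces for a heading at the given level.

    book_prompts is a hypothetical stand-in for a parsed resource file:
    a mapping from navigation level to a book-specific label.
    """
    if book_prompts and level in book_prompts:
        return book_prompts[level]
    # No resource file, or no entry for this level: report the level number,
    # which is the approach players use today.
    return f"level {level}"


# A book whose (assumed) resource file names its first three levels.
book_prompts = {1: "Part", 2: "Chapter", 3: "Section"}

print(level_prompt(2, book_prompts))  # Chapter
print(level_prompt(4, book_prompts))  # level 4
print(level_prompt(1))                # level 1
```

A player that does not support resource files would simply always take the fallback branch, so the same book remains usable on both kinds of player.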

Support for All Six Types of DTB

Currently, most organizations provide books that contain audio (MP3) files. The DAISY specifications also allow for text-only books and for books with varying amounts of mixed audio and text. Players that cannot generate synthetic speech can only play the recorded audio portions of books. Once we begin to provide books that have varying amounts of audio and text, only Intelligent Reading Systems will be able to play these types of books.
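
The consequence for players without synthetic speech can be shown with a minimal Python sketch. The Segment class and the renderable function are hypothetical names introduced for illustration; a real DTB describes its audio and text through SMIL and DTBook files rather than a list like this one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """One passage of a book; either field may be absent (hypothetical model)."""
    audio_file: Optional[str] = None
    text: Optional[str] = None


def renderable(segments: List[Segment], has_tts: bool) -> List[bool]:
    """For each segment, report whether the device can speak it.

    Recorded audio can always be played; a text-only segment can be
    spoken only if the device can generate synthetic speech.
    """
    return [
        seg.audio_file is not None or (has_tts and seg.text is not None)
        for seg in segments
    ]


book = [
    Segment(audio_file="ch1.mp3", text="Chapter one text"),
    Segment(text="A footnote that was never recorded"),
    Segment(audio_file="ch2.mp3"),
]

print(renderable(book, has_tts=False))  # [True, False, True]
print(renderable(book, has_tts=True))   # [True, True, True]
```

The device without synthetic speech silently skips the second segment, which is the gap the paragraph above describes.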

Semantic Reading, the True Test of an Intelligent Reading System

The key test of what I would call an Intelligent Reading System is the ability to understand the XML markup and then tailor the reading experience based on this understanding. This was described in the paper, Theory Behind the DTBook DTD. The Intelligent Reading System does at least two things: it changes the presentation of information based on rules associated with a particular markup construct, and it changes how the user interacts with that construct. For example, when a reader encounters a table, the reading system should be able to change how the person reads. It is common in Excel spreadsheets to hold Alt+Ctrl and then use the arrow keys to move by cell (cell navigation) within the spreadsheet. It would be fairly straightforward for an Intelligent Reading System to do this, but it must also change the presentation. When the reader moves right one cell to the next column, the system should read the column head and then the cell content. When the reader moves down, it should read the row head before the cell content. However, this intelligent behavior depends on understanding the XML table markup; it cannot be derived from a straight audio recording. As we add more complex markup, such as mathematics, it will be essential for reading systems to give the end user the power and flexibility to read mathematics properly. This will require research and development, but it can only be built on Intelligent Reading Systems, not players.
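
The table example can be made concrete with a short Python sketch. It assumes the table has already been extracted from the markup into rows, with the first row holding column heads and the first column holding row heads; the function name move, the sample data, and the behavior for left, up, and the table edge are assumptions added for illustration.

```python
from typing import List, Tuple

# Hypothetical table: row 0 holds column heads, column 0 holds row heads.
TABLE = [
    ["Item",   "Quantity", "Price"],
    ["Apples", "3",        "1.20"],
    ["Pears",  "5",        "0.90"],
]

DELTAS = {"right": (0, 1), "left": (0, -1), "down": (1, 0), "up": (-1, 0)}


def move(table: List[List[str]], pos: Tuple[int, int],
         direction: str) -> Tuple[Tuple[int, int], str]:
    """Move one data cell and return (new_position, announcement).

    Moving left or right announces the column head before the cell content;
    moving up or down announces the row head first. A move past the edge of
    the data cells leaves the position unchanged.
    """
    d_row, d_col = DELTAS[direction]
    row, col = pos[0] + d_row, pos[1] + d_col
    if not (1 <= row < len(table) and 1 <= col < len(table[0])):
        return pos, "edge of table"
    head = table[0][col] if d_row == 0 else table[row][0]
    return (row, col), f"{head}: {table[row][col]}"


pos = (1, 1)                            # start on the cell "3"
pos, spoken = move(TABLE, pos, "right")
print(spoken)                           # Price: 1.20
pos, spoken = move(TABLE, pos, "down")
print(spoken)                           # Pears: 0.90
pos, spoken = move(TABLE, pos, "right")
print(spoken)                           # edge of table
```

The announcement rule depends entirely on knowing which cells are heads, which is exactly the information that the XML table markup carries and a straight audio recording does not.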

Some Strategic Implications

Conclusions

I do not want to draw any conclusions, but the issue of players versus Intelligent Reading Systems does force strategic decisions:

  1. If we focus on players only, our work has a narrow scope. Our strategy would then be to drive down the price of players while increasing quality. We could also focus on a worldwide distribution model for these players and the content.
  2. If we see that Intelligent Reading Systems are of the highest long-term importance, then we will want to focus on publisher relations and acquiring the text. We will want to move R&D in the direction of mathematics and other technical areas, including dictionaries and reference works. This direction also points to production tool development that supports this technology.
  3. If we decide that Intelligent Reading Systems are not really in scope for the DAISY Consortium, some other organization, similar to DAISY, will need to be formed. This is a real threat and could happen if the DAISY Consortium is viewed as not moving fast enough.
  4. If we believe that standards for both players and Intelligent Reading Systems are important for the DAISY Consortium to provide, then we will need to clearly communicate these directions and describe these differences. The organizations will need to decide whether they are providing player content, Intelligent Reading System content, or DTBs that support both. Our library catalogs will need to indicate the differences.