By George Kerscher, International Project Manager, DAISY Consortium email@example.com
And Jim Fruchterman, CEO, the Benetech Initiative jim@benetech.org
The Open eBook Forum (OeBF) provides the ideal "forum" for the exploration of issues related to ePublishing. In addition to standards and growing the industry, there are issues, sometimes controversial, and often relating to various rights holders that come before the OeBF. This document will lay out the heated rights controversy concerning the use of synthetic speech -- Text-To-Speech (TTS) as it relates to the use of eBook publications by persons with disabilities.
Books are increasingly coming in formats other than the traditional printed book. These formats appeal both to members of the general population as well as to people with disabilities, who cannot read a printed book. Audio and eBook formats provide both new opportunities and new challenges, some of which are illustrated by the following examples.
Imagine a famous actor seated in a recording studio, with a printed book in front of him. The room has been carefully constructed to shield the recording process from extraneous sounds, and the sound engineers take great care to achieve the best results. The sonorous tones of Shakespeare are carefully captured in this soundproof room, and the resulting cassette tapes and CDs will delight many thousands of listeners. Whether the listener has a long commute, or has a visual impairment, this audio form of the book delivers great value.
Imagine the digital form of this same text, the eBook form. This same book is available in many different digital formats, delivering the text of the book rather than a recording. The form of the text is analogous to the text in a word processing file or a Web page. Dedicated eBook reading systems will provide a way to download this eBook and read it on the device's display, and PC-based reading systems will present this text on the computer's screen. Thanks to access technology, individuals who are blind and other persons with print disabilities will be able to read this text independently. Depending on the needs of the individual, the text may be presented as braille or enlarged text, but the most common access method is synthetic Text-to-Speech (TTS). TTS has improved over the past twenty years, but still sounds distinctly mechanical. While most words are pronounced clearly, there is much room for improvement in phrasing and expression. However, its ability to make digital text accessible has made it an incredible instrument of equality for many people with disabilities.
Now imagine a man who is blind sitting at a PC, and listening to that Shakespeare passage with an eBook reader using TTS. The mechanical tones are far less evocative than those of the professional narrator, but the listener who is blind can see past the quirks of the robotic voice. Long experience has made the TTS familiar and the user can focus on the content.
Next, the user goes to the Web, makes a purchase and downloads the latest best-selling novel, just out in eBook format. With the same eBook reader, the user opens the eBook and hits the start speaking button. And, nothing happens. No sound comes out of the PC's speaker. The eBook has been soundproofed. If the person is lucky, it may be possible to get a refund for this eBook purchase. But, the person won't get access.
This paper will describe the technical and legal issues behind the soundproofing of this book. The differing rights of publishers and individuals with print disabilities under contract and under law have led to choices in the structure and delivery of eBooks that sometimes deliver access and sometimes do not.
One-minute sample recording of an audio book using a professional narrator: Human recording of passage
One-minute recording of the same text using synthetic speech: Synthetic Version
Note: This audio extract is taken from the title Almost a Crime by Penny Vincenzi. It may not be reproduced outside the context of this paper, and may not be re-published or broadcast in any form. We thank the publisher, "Orion," for their kind permission to use this 1999 publication.
That wonderful vehicle for presenting ideas, thoughts, and experiences - the book - is usually inaccessible to persons who are blind and to others who have disabilities that prevent them from reading standard print. Blindness, obviously, prevents one from getting information visually, and laying one's hands on published materials in accessible formats has always been a lifelong challenge. In the USA, for example, fewer than 7,000 of the 70,000+ books published each year are ever made accessible in a recorded or braille format. Recording For the Blind & Dyslexic (RFB&D), the largest provider in the world, adds 4,000 titles per year to its collection by drawing on over 5,000 volunteers in 32 recording studios around the country (see http://www.rfbd.org for more information). The National Library Service for the Blind and Physically Handicapped (NLS), a division of the Library of Congress, records approximately 2,500 titles each year, and other smaller organizations contribute additional titles in braille or audio, but collectively less than 10% of published books ever make it into an accessible format.
To say, "the plight of the student with a print disability is extremely difficult," would be a gross understatement. It is common for students to be without their accessible version of a textbook at the beginning of the school year. It takes many months for volunteers to produce the recorded version at RFB&D. For the college student, the situation is even worse. As a student progresses in his or her education, the likelihood of textbooks being accessible decreases due to the specialized nature of the material being studied. At the colleges, Services for Students with Disabilities have the responsibility to make courseware accessible. The use of scanners and OCR software is common, but there is no consistent effective way to create a high quality accessible textbook within the time demands and financial constraints facing the students and the Disabilities Services offices. The result is that students drop courses and change careers based on information availability in their field.
For books not available in accessible form, readers with print disabilities either rely on a human reader or on a scanning system. Reliance on a human reader is expensive and not always available when access is needed. Scanning an entire book with OCR software can take hours, and the resulting text will have recognition errors. Although a new Web-based service has been recently launched to provide shared access to scanned books, Bookshare.org (see http://www.bookshare.org for more details), scanned books are not equal in quality to electronic books. Textbooks and technical books are often unusable when scanned because of complex content beyond today's character recognition technology. Access to eBooks offers a major step forward for people with print disabilities.
When the work on the eBook Publication Structure started, the disability community eagerly joined in the effort. The focus was to ensure that the file specification for eBooks was completely accessible. The disabled community found great support in the working group's development of the eBook Publication Structure 1.0, which became a standard in September of 1999. The XML data encoded in this file specification is completely accessible. However, it is important to point out that the XML data is compiled for distribution into a proprietary wrapper that includes a Digital Rights Management (DRM) component, which often prevents accessibility. Nevertheless the disability community continues to work within the OeBF to ensure that eBooks will evolve as accessible reading material "right off the shelf." If you have the structure and content encoded in XML with sufficiently rich semantics, there is no reason why the presentation of the information cannot be tailored to meet each person's needs. This is true for all people and at all times; this is the promise that ePublishing holds for persons who are blind and print disabled.
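The point about tailoring presentation from structured XML can be illustrated with a minimal sketch. It assumes a tiny, made-up markup (a `chapter` element containing `title` and `p` elements) rather than the actual eBook Publication Structure vocabulary, and the function names are hypothetical. The same source text is rendered once for large print and once as a sequence of utterances that a TTS engine could speak.

```python
import textwrap
import xml.etree.ElementTree as ET

# Assumed sample markup for illustration only; not the real OEB vocabulary.
SAMPLE = """<chapter>
  <title>Act One</title>
  <p>Two households, both alike in dignity.</p>
</chapter>"""


def render_large_print(xml_text: str, width: int = 30) -> str:
    """Render for enlarged text: short lines, headings in capitals."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root:
        text = (elem.text or "").strip()
        if elem.tag == "title":
            lines.append(text.upper())
        else:
            # Short lines stay readable at a large font size.
            lines.extend(textwrap.wrap(text, width=width))
    return "\n".join(lines)


def render_for_speech(xml_text: str) -> list[str]:
    """Render as utterances for a TTS engine, announcing headings aloud."""
    root = ET.fromstring(xml_text)
    utterances = []
    for elem in root:
        text = (elem.text or "").strip()
        prefix = "Heading: " if elem.tag == "title" else ""
        utterances.append(prefix + text)
    return utterances


if __name__ == "__main__":
    print(render_large_print(SAMPLE))
    print(render_for_speech(SAMPLE))
```

Because the structure lives in the markup and not in the layout, each renderer can decide on its own how a heading or a paragraph should be presented.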
The personal computer is the information access tool of choice for many persons who are blind. The computer is made accessible through a screen reader program. Screen readers use a text-to-speech synthesizer (TTS) to speak aloud the information that a sighted person would visually read on the computer screen. These screen readers intercept the text being written to the display and keep track of it, so that it can be vocalized in response to the user's control. For example, pressing certain keys will cause the screen reader to read the current word, line or paragraph. Screen readers also permit the use of dynamic braille displays instead of, or in addition to, the TTS.
The screen readers are external applications to the PC-based eBook reading software. The DRM wrappers are designed to work with reading applications that present the text visually without allowing the text to be copied, to prevent the illegal distribution of the book. Unfortunately, these anti-copying provisions also prevent the screen reader from providing access with TTS or braille. The secure reading application views these external applications as security threats and blocks their access. As a result, persons who try to use their screen reader with eBook reading systems find that their screen reader is not allowed to do its job and leaves the person who is blind with no access to the ePublication, unless the reading application builds access directly into the user interface.
In 2000, Adobe was the first to provide a version of an eBook reading system with speech capabilities. This product uses TTS to present the textual information. Blind people and their advocacy organizations were disappointed when Microsoft's initial eBook products came out and didn't work with the screen readers. Late in 2001, Microsoft's Reader group released a version that included an interface that used TTS to present information. Of the host of eBook reading systems on the market, only Adobe and Microsoft provide access to persons with print disabilities through TTS.
NOTE: Various federal laws mandate accessibility for persons with disabilities, such as the Americans with Disabilities Act or Section 508 of the Rehabilitation Act, which specifies that the federal government should purchase products that are accessible to people with disabilities. In addition, the copyright law concept of fair use is often used as a justification for access. For example, an individual scanning a printed book for TTS or braille access for his or her personal use is generally considered fair use. In addition, there is a provision in the copyright law permitting nonprofit organizations such as RFB&D and Benetech to provide accessible books. However, the interaction of these laws with eBooks is an open question.
In some cases the authors and publishers have sold the rights to the audio version of their books. The intention of the audio publisher is to make a sound recording of the book available for sale commercially, usually in the form of a cassette or Compact Disc, but also, more recently, as a digital product available for download distribution via the Internet.
When technology companies such as Adobe and the Microsoft Reader group discussed requirements with publishers, the topic of TTS came up. Both Adobe and the Microsoft Reader Group were told that in many cases the potential eBook publisher had sold the audio rights to another company. This evolved into the requirement for the technology provider to disable TTS in certain classes of eBooks. The requirement to have control over the use of TTS is being put forward by the publishers to resolve this rights issue; both Adobe and Microsoft have implemented this disabling feature. In simple terms, some people consider the TTS presentation an audio rendition, and therefore permitting the TTS presentation would be an infringement on the audio rights holder. People with disabilities do not agree with this interpretation, since the eBook is delivered as electronic text and not as recorded human speech, and since turning off access prevents them from reading the eBook.
NOTE: Persons with print disabilities purchase and enjoy commercial audio recordings. While productions of audio books are increasing, the availability of audio books falls well behind that of traditionally published titles. In addition, the print disabled community can benefit from the combination of textual presentation on the screen, accompanied by the synthetic speech, word spelling capabilities, and the added flexibility of specific page positioning.
Microsoft and Adobe, which have implemented the use of TTS in their eBook reading systems, have heard from publishers that the audio rights to their eBooks may have been sold. Therefore a feature has been added that allows the use of TTS to be turned off. This means that at the time of creation, a decision can be made by the publisher to disable the use of TTS for this particular eBook.
NOTE: The cost of TTS has dropped from $4,000 in 1985 to almost free, now that it is being implemented as software using the standard PC sound cards. The quality of TTS is steadily improving, and while the quality of TTS may not be considered serious competition to a professional narration today, this may change sometime in the future.
In the case of Microsoft Reader, if the highest level of security is selected, TTS access will be disabled. Unfortunately for people with disabilities, the latest and most popular eBooks are almost always released at this highest level of security. So, while some eBooks formatted for Microsoft Reader now talk, the ones in greatest demand generally do not.
Adobe takes a different approach that does not associate TTS with security. Adobe's eBook authoring tool provides the option to turn off TTS access. Publishers using this option sometimes turn off this access because they are not certain they have the rights to turn it on.
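The two approaches described above can be contrasted in a short sketch. This is a simplified model under stated assumptions, not the logic of either product: the numeric security levels, the `EBook` fields, and the function names are all hypothetical.

```python
from dataclasses import dataclass

# Assumption: security is modeled as an integer, where 3 is the highest level.
HIGHEST_SECURITY_LEVEL = 3


@dataclass
class EBook:
    title: str
    security_level: int = 1            # consulted by the security-linked model
    publisher_allows_tts: bool = True  # consulted by the publisher-option model


def tts_allowed_security_linked(book: EBook) -> bool:
    """Security-linked model: TTS is off only at the highest security level."""
    return book.security_level < HIGHEST_SECURITY_LEVEL


def tts_allowed_publisher_option(book: EBook) -> bool:
    """Publisher-option model: TTS follows an explicit switch set at creation."""
    return book.publisher_allows_tts


if __name__ == "__main__":
    bestseller = EBook("Latest Bestseller", security_level=3,
                       publisher_allows_tts=False)
    classic = EBook("Public Domain Classic", security_level=1)
    for book in (bestseller, classic):
        print(book.title,
              tts_allowed_security_linked(book),
              tts_allowed_publisher_option(book))
```

In the first model a publisher who simply wants maximum protection loses TTS as a side effect. In the second, TTS is lost only through an explicit choice, which a publisher may still make when unsure of its audio rights.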
At the beginning of this document are links to sample MP3 files that represent a TTS version of a passage and the same passage being read by a professional narrator. The relevant issues and positions of the rights holders have been described. In a nutshell, you have the facts, but to summarize:
Audio publishers feel that having TTS enabled infringes on their rights; eBook publishers want maximum security for their electronic documents; and persons who are blind and print disabled believe they have a human right to read published documents, especially ePublished materials they have purchased, not to mention rights under various federal and state statutes.
Clearly, the Open eBook Forum must provide for discussion of the issues surrounding this conflict. We have produced this unbiased presentation of the facts to clearly explain the issues. Now, it is up to the various rights holders to discuss ways to address this controversy. We invite thoughtful comments through the OeB Forum's Web site http://www.openebook.org. The discussion items relating to this thread will be placed on the OeBF web site. It is our hope that a clear direction will emerge from this discussion and all rights holders, including people with disabilities, will be the winners.