The DAISY Consortium
Last revised: June 8, 2007
Version 2
Scribe: Lynn Leith
Marc Van der Aa (Plextor Europe), Dinesh Kaushal (Code Factory, India), Greg Kearney (Individual Supporter, USA)
Relevant documents and specifications were noted. HumanWare, Plextor, and Solutions Radio each provided relevant documentation that was sent to the DAISY Online list. In addition portions of the PDTB Specification were noted as being relevant. Presentations and discussion on each will be added to Agenda item 7.
Lynn Leith was confirmed as the note taker for the meeting.
An overview of how DAISY working groups operate, the role of a project charter and the draft policy governing our operation was given by George Kerscher.
This was covered under agenda item 2; however, the following question was posed to the participants at this point: "Is there anything patented in any of the existing implementations?" The response from the group was "NO".
Peter Osborne summarized the expected outcomes of this meeting:
The next face to face meeting will be held October 3rd in London prior to the DAISY Technical Conference. Dates of two other meetings are yet to be determined. We have land line conferencing as well as Skype. He suggested that we have weekly conference calls during intense development.
What are the key issues we are trying to tackle and where are the boundaries? Our discussion should refine Peter's initial notes and the subsequent postings on this issue into an agreed project scope. Peter hopes the Working Group can identify key dependencies with other project areas.
James asked Peter to describe what he thinks the project is about in one sentence. Discussion, questions and suggestions followed:
Lynn proposed the following overarching statement for the protocol: "The protocol will define the interaction between the service provider and the user agent or reading system for the online delivery of DAISY content." The group agreed with the summary statement.
Definitions will be required following the overall scope statement. For example:
The "content" will need to be defined:
A "Message" category is in scope. Examples of categories may be firmware updates or player/reading system management.
Also in scope are:
The nature of content/messages that may be sent is out of scope.
Discussion of the terminology "online delivery" followed. Suggestions such as "delivery via the Internet" were put forward. Questions such as "How is the content handled" and "What are the messages at the user level?" were raised. In simplified terms there are two areas of content: valid DAISY content (all formats), and messages players may receive (firmware upgrades etc).
A Principle for DAISY Online delivery is that it supports/is compatible with all versions of DAISY content, 2.02, DAISY/NISO, and "DAISY Next". Delivering DAISY publications is in scope.
Decision:
To continue using the term "online delivery" and possibly revisit it at a later date
The IDPF Container Specification may be applicable for the DAISY Online specification. It is a zip file.
Is streaming in scope? With streaming, nothing is stored. After it is played, it is "dropped" from the system. Progressive downloading and streaming are very similar. The user would be navigating the book when the book is on the server rather than on the player/reading system. The player would download an NCC that simply points to files on the server. The content provider and service provider may not necessarily be the same.
Decision:
Streaming is in scope.
Requirement:
Ability to preview a book, without having to download the entire book.
Requirement:
Ability to skip to any part of the book, that is, to have a DAISY reading experience when the book is not "on" the player; DAISY access to remote content; the DAISY application running on the server. (A company in Sweden is doing this.)
Decision:
It is out of scope to define the reading system.
Requirement:
No modification to DAISY DTBs is required for them to be publishable in the DAISY Online environment.
Decision:
Content negotiation is in scope for release 2 of the Online specification.
Decision:
A minimum set of requirements for reading systems is in scope. See DAISY OK - Minimum Requirements.
The acronym "ROAD" was suggested for "realtime online access to DAISY".
Group discussion on related work and its influence on agreed scope, agree what information is relevant to the project and how we are going to share it.
The initial library concept was "why can't we have a system that automatically delivers DAISY books over the Internet?" The idea was to define a way in which players and servers communicate, to define what the player is capable of. A fully automated system of delivery of content is possible. In Section 5 of the precirculated document, there is a series of things listed that have been mentioned but not explored. The report describes what they did. There is public disclosure, there is no IP. The scope of the project was seniors with little or no computer/Internet experience.
The Plextor Online delivery system is PlexNet, which is both PC based and player based. It is implemented in 3 libraries in Japan. It was introduced in 2004 at the Japan Braille Library and there are now over 800 users. It uses HTTP and therefore should be compatible with most network environments. The server sends a list of available books to the player bookshelf (My Library). The selected book is streamed. The user can resume playback at the place in the book where he left off. The environments are Windows, Linux, and possibly Java. Encryption and authentication are not yet implemented: Plextor will look at this.
The protocol defines the relationship between the server and the player. The player downloads the NCC and SMIL files, then the audio files are streamed. Copyright law in Japan does not allow the download of content to the player.
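The player-side step described above (fetch the navigation files first, then address the content on the server) can be sketched as follows. This is a minimal illustration and not Plextor's implementation: it assumes a DAISY 2.02 NCC in which each heading links to a SMIL file, and the sample NCC, the base URL, and the names `NccLinkParser` and `remote_smil_urls` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class NccLinkParser(HTMLParser):
    """Collects (link text, href) pairs from the <a> elements of an NCC."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.entries.append(("".join(self._text).strip(), self._href))
            self._href = None


def remote_smil_urls(ncc_html, book_base_url):
    """Return the distinct SMIL URLs on the server, in reading order."""
    parser = NccLinkParser()
    parser.feed(ncc_html)
    urls = []
    for _title, href in parser.entries:
        # Resolve against the book folder and drop the "#id" fragment.
        absolute, _fragment = urldefrag(urljoin(book_base_url, href))
        if absolute not in urls:
            urls.append(absolute)
    return urls


# Hypothetical NCC fragment and book folder URL, for illustration only.
SAMPLE_NCC = """
<html><body>
<h1 class="title" id="h1"><a href="chapter1.smil#t1">Sample Book</a></h1>
<h1 id="h2"><a href="chapter1.smil#t2">Chapter 1</a></h1>
<h1 id="h3"><a href="chapter2.smil#t1">Chapter 2</a></h1>
</body></html>
"""

smil_urls = remote_smil_urls(SAMPLE_NCC, "http://example.org/books/1234/")
```

A player built along these lines keeps only the small navigation files locally and requests audio from the resolved URLs as the user moves through the book.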
In 2003 Solutions Radio began radio streaming. From there they went on to include streaming audio books. They have experience with Internet delivery protocols and are hopeful that the DAISY Online protocol will use HTTP. Their system downloads the NCC HTML and SMIL files to support navigation (similar idea to the Plextor system). The most difficult factor they have encountered is how to tell the player what books to download. Solutions Radio is now using the playlist from WinAmp.
Content security (user authenticity, signatures, DRM file encryption) could cause problems. The content on SD cards can be protected as well.
Some parts are relevant to this project, for example, unique identifiers for end users (unique identifier scheme). Authentication should be transparent to user. Assurance is needed on both ends.
It is a fully open specification. James suggested that the Online specification should be flexible and open. The PDTB Specification can be implemented online or for delivery on physical media.
There is a dependency on DAISY OK. The issue holding it back is sample content, both 2.02 and DAISY/NISO 2005.
There is a proposed revision to Z39.86 - the DAISY/NISO 2005 Specification. "DAISY Next" will be a modular specification, with different profiles, targeting different types of players.
IDPF is close to approving their Ebook Specification. It incorporates the DAISY DTD and NCX. There is no SMIL, just text and navigation.
To enable the gathering of information on which we can base a technical specification - George, Markus Gylling, Ed Chandler
Production tool use cases development:
RNIB user profiles:
There was some discussion as to whether requirements such as the player interface are in scope. As noted earlier in the meeting, they are not.
The two "ends" of the protocol are the Server side and the End User side
Distribution scenarios:
The Working Group then began the process of gathering use cases/user stories, both in scope and out of scope. Each use case was to include a title of case, the identity of "actor", and a story/sequence of events. From this, requirements will be developed.
Ed compiled the use cases. They will be reviewed, sorted by "in scope" and "not in scope" and distributed along with the resulting requirements list.
The subgroup that will review and sort the use cases/user stories and derive the Requirements will consist of Ed (lead), Kathy, Lynn, Dave, Clive and Ron.
The agenda was revised for day three, based on the activities of the first two days.
A clear set of user requirements supported by use cases/user stories must be developed. The Protocol needs to describe the demands that it will make on the client (server). Links to other specifications and perhaps a glossary should be included. A formal DAISY Specification for online delivery will be developed. A clear statement of what is required and what is optional is required. Reference implementations should include client implementations (reading systems) and server implementations.
There may be a discovery phase in which existing protocols will be examined to determine which may meet our needs and that we can incorporate into the DAISY Online specification. There are 3 reference implementations in place within this group.
Strategically it is important that this specification is harmonized with others.
Definitions of compliance, for example, "a compliant server will do xxx" should be developed.
How does this relate to DAISY OK? Do we need to define what a compliant player must do to conform to the specification? How the protocol relates to the server and to the client is relevant for the implementation.
Requirement:
Fundamental support for the DAISY Online Specification must be present in a compliant Online DAISY player.
Player UI is out of scope, but how reading systems implement the protocol is in scope. A list of required functionality that an online player must have, and a list of functionality that is desired, that is compliance, may be handled within DAISY OK. "A player/reading system must implement xxx", for example, when it fails it must send an error message to the server, and must provide network status. However, how the player does this is not in scope. It will be necessary to state the mandatory specification of the online protocol for players.
DAISY OK will need to include requirements for online reading systems, both required and optional features. Online player requirements will be added to DAISY OK at the end of the process after implementations are in place.
The Working Group was cautioned against "toy" (small scale) implementations.
We will need to define the transfer protocol in the specification.
Risk: sample implementations defining the protocol
Server implementations will be different for each organization or service provider, but each will need to conform.
The RNZFB pilot is a sample implementation. Because RNZFB has not implemented CD distribution, it is critical that they implement online services as soon as possible. A reference implementation is owned by the standards organization. A sample implementation will identify the features against which systems can rate themselves. Current implementations are "proof of concept". RNZFB is committing to provide its implementation as a sample implementation, as will RNIB. They can be used as a model for others.
The question was put to the group: "What do developers gain by waiting until the specification is ready, when the tools are ready now?"
The purpose is to develop a protocol that homogenizes online delivery. Sample implementations may identify objects in the protocol that are not needed. We need a common ground from which to work, that is the protocol. It is possible to have all three of the players in any given sample implementation.
Action Item X:
Define deliverables
Responsibility: Peter and Lynn
Due date: XX
The purpose of the project was to prove that the DAISY online delivery system works. Online is the way to go. The interface for borrowers with the Library did not change. They set up an experimental telephone ordering system. There was nothing on the RNZFB Web site to request books for this project. The concept was to show that instead of delivering by mail, DAISY DTBs could be delivered to the player, which is a modified HumanWare player, and to show that they could arrive automatically on the player. The borrowers were not aware of the processes involved. There were no interactive aspects in the pilot. The player could "magically" get books. It was a simple electronic replacement of the postman, to prove it could be done and that people would enjoy the experience.
In terms of technology RNZFB is not wedded to their approach. They chose FTP because at both ends it was clear how it could be implemented. The FTP protocol they developed was for proof of concept - exchange between player and server. At the server end, the software is also doing SQL queries on the library system; it "knows" what requests the client has made. Once the server "knows", it responds by sending the book/s to the particular player/s.
When a book was requested, a message went to the software on the server, or the server did an SQL query, and the server "knew" whether a player was online. Players logged in every 90 seconds. A display indicates which players are online. When a player is online and the library knows there is a request, the protocol takes over; the server is instructed as to what files to send. The books are sent and then appear on the player bookshelf, and a message is sent to the server saying the book is on the player.
The first thing the borrower knows is that the book has arrived. This is not necessarily how it would be in a full implementation. Many issues raised at this meeting would be considered important. Streaming would be considered. RNZFB is keen to push technology to the max, and within the next few months to implement even if the specification isn't ready. They see online delivery as being the most viable approach. They will add newspapers to content for online delivery. Instantaneous delivery is seen as a real benefit. They have looked at alternatives, as they need an alternative for those who cannot be reached online. They have not yet looked at things such as parts of books being distributed, etc.
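The check-in behaviour described above can be sketched as a small presence tracker. This is an illustration only and not the RNZFB server software: the 90 second interval comes from the minutes, while the grace factor, the class name `PlayerPresence`, and the player identifiers are assumptions. Time is passed in explicitly so the logic can be exercised without waiting.

```python
CHECK_IN_INTERVAL = 90  # seconds between player logins, as reported in the minutes
GRACE_FACTOR = 2        # assumption: tolerate one missed check-in before marking offline


class PlayerPresence:
    """Tracks when each player last checked in with the server."""

    def __init__(self):
        self._last_seen = {}

    def check_in(self, player_id, now):
        """Record that a player contacted the server at time `now` (seconds)."""
        self._last_seen[player_id] = now

    def online_players(self, now):
        """Return the players seen recently enough to be treated as online."""
        limit = CHECK_IN_INTERVAL * GRACE_FACTOR
        return sorted(
            player for player, seen in self._last_seen.items()
            if now - seen <= limit
        )


presence = PlayerPresence()
presence.check_in("player-01", now=0)
presence.check_in("player-02", now=100)
online = presence.online_players(now=200)  # player-01 was last seen 200 s ago
```

A server built this way could consult `online_players` before starting a transfer, which corresponds to the "display indicates which players are online" step in the pilot.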
The connection is wireless. Magazines and books have arrived on Clive's laptop during this meeting. The system uses an ADSL connection and wireless router that the player is hooked up to. It is possible for the player to have a wireless card built into it. The project involved broadband service being put into people's houses by RNZFB. Clive wrote the server software.
Each player was assigned an FTP account. Players moved from one user to the next. There were 15 players, and 40 users in total. The library server was not altered. The protocol mandated how the books appeared on the players, that is, the transfer. Books are on the server, with a URL for each book folder. The book card has the book files, sort of like DAISY/NISO package file. The system itself is not really DAISY aware, it just moved packages of files.
The project was kept simple for proof of concept. They learned a great deal. The player interface was the HumanWare Classic interface (simplest of the Victor players). 39 of 40 users had never used a computer.
The group was given permission to use the papers from Plextor, HumanWare and Solutions Radio as reference.
The set up on the library server side is very simple. The player must contact the server. The transport layer, protocol (bookshelf), and protection are defined. Interaction is from player to server. Transport layers available are HTTP, FTP and RTSP (mostly used for streaming, more complex than needed), and MMS. SSH is not a good option, as it is not designed for data transport. BitTorrent not applicable. HTTP seems best. HTTP and FTP are both free. The bookshelf/list of books requested is needed. (See Solutions Radio document for additional details)
The current system supports streaming, but will support download. A demonstration was given. The player has built-in WiFi, connects to the server and checks for books. "My Library" (bookshelf) resides on the server for each client. The library system controls the bookshelf. There are other options such as a list of most popular books, list of recent releases, news (info from the library), etc. Borrowers can choose a book by title, author, etc. It is possible to search the entire collection on the server, by title, author etc. with the player. Features include change reading speed, search by page number, set bookmarks (stored on the player). The bookshelf is on the server and the player has a copy of it. The bookshelf on the player is updated.
A client may have more than one player, each being registered on the service provider server: single client, multiple players. Clients are registered on the system. It uses HTTP for transport, SOAP, and Pure HTTP. User and password are sent by the SOAP protocol, session ID is added. XML goes over the wire that says 'this is your set of books'. Raw XML is generated on the fly on the server (using SOAP). The system uses their own schema to say these are the new books available. An XML message is sent to the player. Books are accessed through a URL. (See the paper for additional details).
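The "this is your set of books" message can be illustrated with a short parsing sketch. The vendor's actual schema is not reproduced in these minutes, so the element and attribute names below (`bookshelf`, `book`, `title`, `session`, `url`) are hypothetical, as is the function name `parse_bookshelf`; only the ideas of a session ID and a URL per book come from the discussion.

```python
import xml.etree.ElementTree as ET

# Hypothetical message; the real schema is defined in the vendor's paper.
SAMPLE_BOOKSHELF = """<bookshelf session="abc123">
  <book id="b1" url="http://example.org/books/b1/">
    <title>First Title</title>
  </book>
  <book id="b2" url="http://example.org/books/b2/">
    <title>Second Title</title>
  </book>
</bookshelf>"""


def parse_bookshelf(xml_text):
    """Return (session_id, books) from a bookshelf message."""
    root = ET.fromstring(xml_text)
    books = [
        {
            "id": book.get("id"),
            "url": book.get("url"),
            "title": book.findtext("title", default=""),
        }
        for book in root.findall("book")
    ]
    return root.get("session"), books


session_id, books = parse_bookshelf(SAMPLE_BOOKSHELF)
```

The player would then present the titles to the user and fetch the selected book from its URL.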
The process will not be completed during the meeting, but this will get the process started.
What is the risk, what is likelihood (low, med, high), what is impact of risk on project, what mitigation can we deploy to minimize risks?
Discussion: Release V1, then distribute the second release soon after. Continue the process until the requirements are met.
In light of time available, "likelihood" and "impact" were not discussed following the first risk factor.
Mitigation: commit to moving beyond Release 1, and seek the Board's commitment to this. Also, ensure that Release 1 includes in its scope the overall structure that will allow us to move to Release 2.
Discussion: This needs to be considered in the architecture - leave hooks in place - as part of the scope of Release 1. If we know which organizations must move quickly, identify their requirements and ensure they're in Release 1, leaving out pieces that are important but not required by this small group. Define in Release 1 that there is an inventory of tools required by the content provider, for example, portable bookmarks.
Mitigation: clear communication about the project, remaining open.
Discussion: how will manufacturers approach this? Will they buy into the process or just produce their own tools because they have to get on with it?
Mitigation: We don't want to become involved with business cases/rules in library systems. Having clear requirements and clear scope will address this.
An example is the Polish group, which is implementing an online service. The level of risk depends on the impact on player developers. In California, the driving priority is online delivery. There is a danger that communication around this work preaches to the DC community only.
Mitigation: identify those organizations outside DAISY. Clear communication is essential. After a charter is approved and the scope is known, there needs to be a second, wider call for participation, making it known that participation is open to all in and outside the DC.
Mitigation: This may or may not be the case. Many are putting in place mobile data networks which are cost effective. There is no snail mail postal service or money for hardware players, but these networks will facilitate online DTB distribution. The DC will encourage a sample implementation in developing countries.
Due to time restrictions, there was little discussion of the following identified risks.
Mitigation: communications
Action Item X:
Pull this into a risk document that will continue to be maintained
Responsibility: Peter and Lynn
Due date: XX
Code Factory has developed a DAISY player.
This is a supporting, secondary, rather than primary DC specification.
Deliverable: identification of the prime movers interested in online delivery to facilitate communications with these groups.
Action Item X:
The political issues will be extracted from the discussion of risks
Responsibility: Peter
Due date: XX
How we get there from here will be based on the requirements and scenarios. It should be tasked to a smaller group and should operate on a fundamental assumption, such as HTTP. On the server side: "here's a list of things, here is my response".
We need to identify the types of transactions the protocol will support, then boil these down into more "primitive" elements: how things interact with each other. It should be a simple request transaction model, making a simple structure, ideally based on a number of simple "primitives/basic transactions", for example, here's a list of the things I want, this is how it is handled. This is similar to the way the resource file works in DAISY/NISO if the content provider is providing human readable labels.
Question: Each organization has a different server side, we have the protocol between the server and the client, how much is involved in getting the protocol to talk to another application that talks to the server? Answer: The server interacts with the asset management system, catalogue etc. The protocol should be easy so that it can be implemented by different organizations with a variety of servers.
Most program environments are set up for interacting with Internet. Most rely on the cataloging systems in existence. On the server side, many are customized implementations, all customer experiences are different. One principle would be to keep the protocol as simple as possible - fundamental kinds of transactions that can be composed in different combinations. A big part will be defining XML data structures for things like book lists, player information, etc.
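The idea of a few fundamental transactions that can be composed in different combinations can be sketched as a dispatch table. This is an assumption-laden illustration, not a proposed design: the primitive names `list_books` and `confirm_receipt`, the request shape, and the in-memory `state` are all hypothetical, and a real server would sit in front of an asset management system or catalogue.

```python
def list_books(state, params):
    """Primitive: return the books waiting on a user's bookshelf."""
    return {"books": sorted(state["bookshelves"].get(params["user"], set()))}


def confirm_receipt(state, params):
    """Primitive: the player reports that a book has arrived."""
    state["bookshelves"].setdefault(params["user"], set()).discard(params["book"])
    state["delivered"].append((params["user"], params["book"]))
    return {"ok": True}


# Each request type maps to exactly one handler; new primitives are added here.
PRIMITIVES = {
    "list_books": list_books,
    "confirm_receipt": confirm_receipt,
}


def dispatch(state, request):
    """Route one request to its primitive, or report that it is unsupported."""
    handler = PRIMITIVES.get(request["type"])
    if handler is None:
        return {"error": "unsupported request type: " + request["type"]}
    return handler(state, request.get("params", {}))


state = {"bookshelves": {"user-1": {"book-a", "book-b"}}, "delivered": []}
first = dispatch(state, {"type": "list_books", "params": {"user": "user-1"}})
dispatch(state, {"type": "confirm_receipt",
                 "params": {"user": "user-1", "book": "book-a"}})
second = dispatch(state, {"type": "list_books", "params": {"user": "user-1"}})
unknown = dispatch(state, {"type": "stream", "params": {}})
```

Keeping each primitive independent of the transport means the same handlers could be reached over HTTP or any other transport the group selects.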
The server must be able to support different kinds of transport. The server must be able to respond to these kinds of requests, even for a reference implementation. It would be interesting if DAISY could host a reference implementation that all can use for testing. A reference implementation should receive and send every kind of message possible, and satisfy all of the requirements of the specification.
HTTP and FTP are obvious contenders.
Those who want to move forward quickly will feel better when the fundamental decisions are made. A small group needs to think about how the protocol will work, but not about payload/packets of the information to be moved. Once we have a basic protocol, we will know where we are heading and will know the structures involved. Once the key decisions are made and we have confidence in them, things will fall into place.
Between now and the face to face meeting in October, this smaller technical group will work together remotely.
We need to research HTTP: is it scalable? We also need to look at available tools/protocols/implementations.
Is there a need for SOAP if all that's needed is to send an XML document?
The technical group will need to come forward with a recommendation to this larger group. The work of full groups and teams will have to be done in parallel.
The protocol needs to be neutral in terms of content and needs to be extensible; all implementers need to be able to say what they need to say in their organizations' messages. Some messages will be core messages. Should there be several types of pick lists, or alternatively, should we define a generic pick list that is extensible/changeable: picklist and response for example.
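One way to picture a generic, extensible pick list is a small set of core elements plus named extension entries that each implementer can define. The sketch below is only an illustration of that option: the element names `picklist`, `item`, and `extension`, the function names, and the sample values are hypothetical, and no existing XML doctype is implied.

```python
import xml.etree.ElementTree as ET


def build_picklist(name, items, extensions=None):
    """Serialize a pick list with core items and optional named extensions."""
    root = ET.Element("picklist", {"name": name})
    for item_id, label in items:
        item = ET.SubElement(root, "item", {"id": item_id})
        item.text = label
    # Extensions carry organization-specific data without changing the core.
    for key, value in (extensions or {}).items():
        ext = ET.SubElement(root, "extension", {"name": key})
        ext.text = value
    return ET.tostring(root, encoding="unicode")


def read_picklist(xml_text):
    """Parse a pick list; extensions the reader does not know are still kept."""
    root = ET.fromstring(xml_text)
    items = [(item.get("id"), item.text) for item in root.findall("item")]
    extensions = {ext.get("name"): ext.text for ext in root.findall("extension")}
    return root.get("name"), items, extensions


message = build_picklist(
    "new-releases",
    [("b1", "First Title"), ("b2", "Second Title")],
    extensions={"loan-period-days": "21"},
)
name, items, extensions = read_picklist(message)
```

A reading system that understands only the core elements can ignore the extension entries, which is the property the discussion of core versus organization-specific messages is after.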
Question: How might discovery phase continue? How do we benefit maximally from the existing implementations? How much input can this group derive from these existing implementations?
RNZFB will not carry on with FTP.
The interface could be the simple API group of W3C.
All three vendors present indicated they are open to changing from their current implementation.
We should keep it plain XML, keeping it more generic.
Decision:
All agreed that XML will be part of the solution. If there are applicable XML doctypes in existence we should use them.
Action Item X:
The technical subgroup will identify the technologies under consideration: HTTP, SOAP, XML, and any others in the arena, and they will research emerging technologies. The subgroup will come forward to the larger Working Group with recommendations as to what they think should be used.
Responsibility: Technical subgroup
Due date: XX
We don't know what resources are needed to deliver content. For example, a library with 10,000 users - what is the capacity; if there are 15,000 users, what are implications for throughput, etc.
RNIB and RNZFB are committed to sample implementations on the server side - what is the validation process? Sample implementations prove that it can be done.
Server capacity varies depending upon peaks in service demands.
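The capacity question can be framed with some rough arithmetic. In the sketch below only the user counts (10,000 and 15,000) come from the discussion; the books per user per month, the book size, and the peak factor are placeholder assumptions that each organization would replace with its own figures, and the function name is hypothetical.

```python
def average_bandwidth_mbps(users, books_per_user_per_month, book_size_mb, days=30):
    """Average outbound bandwidth in megabits per second over the period."""
    total_megabits = users * books_per_user_per_month * book_size_mb * 8
    seconds = days * 24 * 3600
    return total_megabits / seconds


# Placeholder assumptions, not measurements from any library.
BOOKS_PER_USER_PER_MONTH = 4
BOOK_SIZE_MB = 300
PEAK_FACTOR = 5  # assumed ratio of peak demand to average demand

avg_10k = average_bandwidth_mbps(10_000, BOOKS_PER_USER_PER_MONTH, BOOK_SIZE_MB)
avg_15k = average_bandwidth_mbps(15_000, BOOKS_PER_USER_PER_MONTH, BOOK_SIZE_MB)
peak_10k = avg_10k * PEAK_FACTOR

print(f"10,000 users: about {avg_10k:.1f} Mbit/s average, {peak_10k:.1f} Mbit/s at peak")
print(f"15,000 users: about {avg_15k:.1f} Mbit/s average")
```

Under these assumptions the average load grows linearly with the number of users, so the peak factor and the choice between streaming and download are likely to matter more for sizing than the headline user count.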
Decision:
Scalability is a priority
The technical subgroup will consist of: James Pritchett, Niel Bernstein, Simon Roy, Hiromitsu Fujimori, Jelle Martijn Kok, Markus Gylling (staff contact), Clive Lansink (Lead), Leon Gulikers, Nick Williamson.
The responsibilities of the lead are: develop conference call agendas and meeting schedules, lead the conference calls, prepare documents, and submit status updates to the full Working Group. Dinesh Kaushal may join the technical subgroup.
Peter stated that we have made a great deal of progress including initial discussion of the technical approach. Initial work on the Requirements is well underway. Teams are in place. The schedule has been set.