ZedDist Strawman2

Strawman 2 Overview

The second ZedDist strawman was created during the November 2009 Redmond F2F. It is based on, and refines, Strawman 1, which was published prior to the meeting.

The operationalized requirements underlying this strawman are available in Design Goals.

The text below describes a strawman for both the logic of the ZedDist specification and its disposition: the spec would first define the abstract framework for Profile and Feature composition (centered around the Component Pool), followed by definitions of a number of concrete Profiles and Features. The defined Profiles and Features would be listed in Catalogs (just as in ZedAI), and each of them would have a Resource Directory provided at a canonical URI (again, just as in ZedAI).

The spec would also define incubation processes used for adding new Features and Profiles to the catalogs. Note that the intent is that new Features can be added without a spec revision. For Profiles, a spec revision is required to formally add a Profile; having an incubation process available for Profiles is nevertheless useful as it provides an organized testbed mechanism for new Profiles.

Note: the Component Pool section below is quite detailed: if you are just looking for an overview of the concrete Profiles, skip directly to the Profiles and Features sections.

Component Pool

The Component Pool Principle

The Component Pool is a principle for Profile composition. It serves as an abstraction layer underlying the concrete profiles. The characteristics of the pool define the maximum amount of variability between profiles, and inversely, a set of traits that all Profiles must share.

The spec defines which components are available in the pool; these components and the rules/behaviors defined around and within them cannot be changed without a spec revision. Thus, this abstraction is static.

Each Component in the pool has a set of characteristics and features:

  • It serves one well-scoped purpose in the context of the DTB logic
  • It defines whether its inclusion is optional or required when creating a DTB Profile. (Note: a Profile adopting an optional component may in turn make the component required within the scope of the Profile.)
  • It may define one static concrete form, or it may define patterns with which a Profile creator can modify its concrete form at Profile creation time
  • It defines dependencies: the inclusion of Component X in a Profile may depend on (a specific aspect of) Component Y also being included. Component X is responsible for declaring these dependencies.
  • It may define "extension points": slots which the Profile creator may activate to allow injection of "foreign" constructs dynamically in instances. In a concrete Profile, these extension points are typically used as slots for optional Features.
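The characteristics listed above could be modelled roughly as follows. This is only an illustrative sketch: the class and field names (Component, Inclusion, dependencies, extension_points) are assumptions made for the example and are not defined by the strawman, and only a subset of each component's properties is shown.

```python
from dataclasses import dataclass
from enum import Enum


class Inclusion(Enum):
    REQUIRED = "required"
    OPTIONAL = "optional"


@dataclass(frozen=True)
class Component:
    """One entry in the Component Pool (hypothetical field names)."""
    name: str
    purpose: str
    inclusion: Inclusion
    # Names of other components this one depends on or impacts.
    dependencies: frozenset = frozenset()
    # Slots a Profile creator may activate to allow Feature injection.
    extension_points: frozenset = frozenset()


AUDIO = Component(
    name="Audio",
    purpose="Makes the Audio media type available in the DTB presentation",
    inclusion=Inclusion.OPTIONAL,
    dependencies=frozenset({"Director"}),
)

DOCUMENT_TEXT = Component(
    name="Document Text",
    purpose="Makes the Text media type available in the DTB presentation",
    inclusion=Inclusion.OPTIONAL,
    dependencies=frozenset({"Director"}),
    # Only the two compound-document slots are listed in this sketch.
    extension_points=frozenset({"CDI", "CDR"}),
)

print(AUDIO.inclusion.value, sorted(DOCUMENT_TEXT.extension_points))
```

Making the instances immutable mirrors the statement that the pool is static between spec revisions.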

Media Type Components

The Media Type category in the component pool contains components that contribute actual content. These components have previously been referred to as channels.

The Audio Component

Purpose 
  • Inclusion of this component makes the Audio media type available in the DTB presentation
  • Provides a specification-wide definition of which audio formats are eligible for inclusion at profile creation time.
Inclusion 
Optional
Defines 
  • An audio codec 'pool'. Tentative members: MP3, Speex. (AMR-WB+ etc TBD).
  • Rule: each file can have a different bitrate
  • Rule: VBR not allowed
  • Rule: Each heading should be one audio file
Extension Points 
None.
Dependencies 
Impacts Director: activation of the Audio Component requires <smil:audio/>
Actions at profile creation time 
Select one or several codecs from the codec pool.
Notes 
Including WAV in distribution profiles is not necessarily a good thing (cost increase for reading devices). See The Archival Feature, whose sole purpose would be to contribute WAV support and to state explicitly that it is not targeted at reading devices.

The Video Component

Purpose 
  • Inclusion of this component makes the Video media type available in the DTB presentation
  • Provides a specification-wide definition of which video formats are eligible for inclusion at profile creation time.
Inclusion 
Optional
Defines 
  • A video format 'pool'. Tentative members: Ogg Theora, MPEG-1. (Dirac? TBD).
Extension Points 
None.
Dependencies 
  • Impacts Director: activation of the Video Component requires <smil:video/> and other associated SMIL constructs
  • Enables Timed Text for inclusion in the profile being created
Actions at profile creation time 
Select one or several formats from the format pool.
Notes 

The Image Component

Purpose 
  • Inclusion of this component makes the Image media type available in the DTB presentation
  • Provides a specification-wide definition of which image formats are eligible for inclusion at profile creation time.
Inclusion 
Optional
Defines
An image format 'pool'. Tentative members: PNG (non-animated), JPEG/JFIF
Extension Points 
Features may contribute additional media types to the format pool. For example, the SVG feature would extend the allowed mimetypes on smil:img with the SVG mimetype.


Dependencies 
Impacts Director: activation of the Image Component requires <smil:img/>
Actions at profile creation time 
Select one or several image formats from the format pool.
Notes 
Note that images embedded in Document Text are a different thing. TBD whether the formats selected here hold for embedding as well, or if that is declared separately.

The Document Text Component

Purpose 
  • Inclusion of this component makes the Text media type available in the DTB presentation
  • Provides a specification-wide definition of what XML grammars are allowed.
Inclusion 
Optional
Defines 
Extension Points 
  • a mechanism to plug in default RDF vocabs for semantic decoration
  • CDI (Compound Document by Inclusion, meaning XML fragments embedded in the host document, typically in a namespace separate from the host namespace)
  • CDR (Compound Document by Reference, meaning external XML fragments, referenced via the <object/> element)
Dependencies 
  • Impacts Director: activation of the Document Text Component requires <smil:text/>
Actions at profile creation time 
  • Select one or several document types, that thus will become allowed members of the DTB fileset.
    • (this may be redefined as: define one or several document types based on the restrictions on the module pool)
  • Elect whether to activate the CDI extension point
  • Elect whether to activate the CDR extension point
  • Define allowed image types (see Image)
Notes 
  • IDPF revision dependent
  • Sharing vocabularies with ZedAI
  • Future of @role is uncertain

The Timed Text Component

Purpose 
  • Inclusion of this component makes the TTML media type available in the DTB presentation
  • Provides a specification-wide definition of what subset of TTML is allowed.
Inclusion 
Optional (TBD: required when Video is activated?)
Defines 
  • One or several TTML document types
Extension Points 
None.
Dependencies 
  • Impacts Director: activation of the Timed Text Component requires <smil:textstream/> ??
Actions at profile creation time 
define the TTML-subset document type to be allowed for video captions
Notes 
  • Component may benefit from a rename (everything is timed in a DTB)

Logic Components

The Logic category in the component pool contains components that contribute to the DTB logic, defining ways to describe the presentation flow, provide metadata properties, etc.


The Director Component

Purpose 
Provides the presentation spine, and optionally timing and multimedia synchronization
Inclusion 
Required; note however that a profile can delegate the role of the Director to <spine> of the Package.
Defines 
Extension Points 
  • Features can impact the SMIL3 document type (injecting additional modules, defining additional behavioral requirements in normative prose)
  • Mechanism for defining default vocab for semantic annotation (@role)
Dependencies 
@@@
Actions at profile creation time 
  • define whether SMIL3 or package:spine assumes the role of Director
  • if the former, define 1 SMIL3-namespace document type using the module pool
    • in there, define the URI scheme used (unless this is globally static)
Notes 

The Package Component

Purpose 
Provides the Package document type, which contains publication metadata (bibliographical, presentational, and physical)
Inclusion 
Required
Defines 
  • An OPF based module pool
Extension Points 
  • None
Dependencies 
@@@
Actions at profile creation time 
define 1 concrete OPF document type using the module pool
Notes 
IDPF 2010 revision dependent.

The Navigation Component

Purpose 
Provides global navigation through the NCX document type
Inclusion 
Required
Defines 
  • An NCX-namespace based module pool
Extension Points 
None? Alternatively, consider allowing contributing markup in <ncx:label> using foreign grammars as a part of the doctype creation process
Dependencies 
None. Note that the NCX is remade to be generic enough not to require SMIL as link targets.
Actions at profile creation time 
  • define 1 concrete NCX document type using the module pool
    • in there, define the URI scheme used (unless this is globally static)
Notes 

The Semantic Overlay Component

TBD


Distribution Components

The Distribution category in the component pool contains components that contribute to the DTB as a distribution unit.


The Container Component

Purpose 
Provides the physical container for a distribution unit, and a way to express metadata about the contents of the container
Inclusion 
Required, at least the abstract container.
Defines 
  • Inheritance of the OCF-based abstract container concept
  • TAR, ZIP and FileSystem (FS) concrete containers
Extension Points 
None
Dependencies 
None
Actions at profile creation time 
select which (1..n) of the concrete containers are allowed by the profile
Notes 
Refer to Fallback for a discussion on the Container's role in supplying Profile-level fallbacks.

The Search Index Component

Purpose 
Provides a pre-compiled search index (Lucene/Solr style)
Inclusion 
optional
Defines 
  • The search index document type (static)
Extension Points 
None
Dependencies 
None. Assuming that the index's links can resolve to SMIL, Audio and Video media positions, there is no reason to prevent the index from being used in an audio-only (or video-only) context; Document Text is therefore not required to be activated.
Actions at profile creation time 
None.
Notes 
This should be proposed for inclusion in the Epub revision as well

The Annotations Component

Purpose 
Provides a document type for user annotations (bookmarks, notes, annotations)
Inclusion 
optional
Defines 
  • The annotations document type
Extension Points 
None
Dependencies 
None
Actions at profile creation time 
  • define the URI scheme used (unless this is globally static)
Notes
IDPF 2010 revision dependent.
This component pertains not only to User bookmarks/annotations, but also to annotations as commercial add-ons/overlays to existing publications.

Profiles

A Profile is a definition of a concrete DTB fileset, defined using the rules and components of the Component Pool.
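As a rough illustration of how a Profile definition could be checked against the pool, the sketch below records the inclusion status stated for each component in the Component Pool section and reports required components that a Profile fails to activate. The names POOL and check_profile are hypothetical, and dependency rules are left out for brevity.

```python
# Inclusion status per component, as stated in the Component Pool section.
# The Semantic Overlay Component is omitted because it is still TBD.
POOL = {
    "Audio": "optional",
    "Video": "optional",
    "Image": "optional",
    "Document Text": "optional",
    "Timed Text": "optional",
    "Director": "required",
    "Package": "required",
    "Navigation": "required",
    "Container": "required",
    "Search Index": "optional",
    "Annotations": "optional",
}


def check_profile(activated):
    """Return a list of problems for a set of activated component names."""
    problems = []
    for name in sorted(set(activated) - set(POOL)):
        problems.append(f"unknown component: {name}")
    for name, inclusion in POOL.items():
        if inclusion == "required" and name not in activated:
            problems.append(f"missing required component: {name}")
    return problems


# Components activated by the Audio Profile, per its Composition list.
audio_profile = {
    "Audio", "Package", "Director", "Navigation",
    "Container", "Annotations", "Search Index",
}
print(check_profile(audio_profile))
print(check_profile({"Audio", "Package"}))
```

The first call reports no problems; the second lists the Director, Navigation and Container components as missing.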

A Profile may or may not allow Features to be injected dynamically to extend its feature set/exposed functionality.

User and Processing Agents declare support for Profiles separately from Features.

Refer to Container-level Fallback for a discussion on Profile fallbacks.


The Audio Profile

Audio Profile: Target Usage Domain

An audio book on steroids: an audio book format whose feature set satisfies the avid audio book user while still being highly resource- and cost-efficient.

Audio Profile: Composition

  • Activates the Audio Component and selects MP3 (and Speex? TBD re AMR-WB+) from that component's codec pool.
    • Recommends that audio files be chunked at chapter/section level
    • Recommends a naming convention for audio files (so that they can be naturally sorted ordinally by devices that do not recognize OPF spine). TODO: option to introduce a metadata flag that specifies whether collation is used
    • Postulates a DTB compliance requirement that the entire presentation content must be available in the Audio channel
  • As required by the rules of the component pool, activates the Package Component. TBD whether the Package File will be modularized and hence customizable; for example, in this Profile, we may want to overcome the manifest←spine IDREF obstacle. The codec selections made on activation of the audio component become allowed mimetypes in the OPF manifest.
  • As required by the rules of the component pool, activates the Director Component, and postulates that the role of Director is delegated to the OPF spine (in other words, no SMIL in this Profile, just the sequential order of audio files as given by spine).
  • As required by the rules of the component pool, activates the Navigation Component, and, using the NCX module pool, constructs a minimal NCX document type (a draft of which can be reviewed in the sandbox)
  • As required by the rules of the component pool, activates the Container Component allowing both the FS and TAR physical containers.
  • Activates the Annotations Component, essentially resulting in the annotations document type being an allowed mimetype in the OPF (which of course doesn't preclude annotation documents from being shipped separately). Support for annotations is not a Reading Device compliance criterion.
  • Activates the Search Index Component. Support for search indices is not a Reading Device compliance criterion.
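The naming-convention recommendation in the list above does not yet specify a pattern. One possible convention, shown purely as an assumption, is a zero-padded ordinal file name, so that a plain string sort on a device unaware of the OPF spine still yields reading order. The function name audio_file_name and the minimum width of three digits are hypothetical.

```python
def audio_file_name(index, total, extension="mp3"):
    """Build a zero-padded name such as '007.mp3' (hypothetical convention)."""
    # Pad to at least three digits, or more if the book has 1000+ files.
    width = max(3, len(str(total)))
    return f"{index:0{width}d}.{extension}"


names = [audio_file_name(i, 12) for i in (10, 2, 1, 11)]
# A plain lexicographic sort now matches the intended playback order.
print(sorted(names))

# Without padding, string sorting would place "10.mp3" before "2.mp3".
print(sorted(["2.mp3", "10.mp3"]))
```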

Audio Profile: Supported Features

The Audio Profile is by intent essentially non-extensible, supporting only

Audio Profile: Characteristics

... as compared to general marketplace audio books (CD/A, playlist formats)
  • support for page navigation
  • support for hierarchical structure navigation (with possibility for 'closeable/openable depths' depending on reading device implementation)
  • space-efficient, speech-oriented audio codecs (or are we only supporting MP3?)
  • ability for custom text and image labels in NCX
  • using file naming conventions (and ID3 tags in the case of MP3), compatible with non-ZedDist-aware devices
  • support for single-file distribution via the Container
... as compared to DAISY 2.02 NCC Only / Z2002/2005 NCX-Only
  • much reduced complexity in and cost of implementation, allowing implementation on resource-constrained hardware devices
  • no explicit phrase markup (that's a con, for once)
  • support for annotations
  • support for search index
  • support for single-file distribution via the Container

The Classic Profile

Classic Profile: Target Usage Domain

The Classic Profile represents the successor to the tremendously successful audio+text DTB type, as introduced in DAISY 2.02 and evolved in Z2002/2005.

While the Classic Profile introduces a number of bug fixes and enhancements to the audio+text DTB concept (as listed in Evolutionary Targets), it does not change the nature of the text+audio DTB concept.

The Classic Profile also serves as the natural migration path for existing DAISY 2.02 and Z2002/Z2005 content. Because of changes introduced to the Document Text Component (i.e. the introduction of a loose XHTML-based document type), the migration of 2.02 becomes viable under automation.

Classic Profile: Composition

  • Activates the Audio Component, thus making <audio/> an allowed element in SMIL. Selects MP3 and Speex (TBD re AMR-WB+) from that component's codec pool.
    • Recommends that audio files be chunked at chapter/section level
    • Recommends a naming convention for audio files (so that they can be naturally sorted ordinally by devices that do not recognize OPF spine)
    • Postulates a DTB compliance requirement that the entire presentation content must be available in the Audio channel
  • Activates the Image Component, thus making <img/> an allowed element in SMIL. TBD image formats
  • Activates the Document Text Component, thus making <text/> an allowed element in SMIL.
    • Using the module pool provided by the Document Text Component, defines two document types (tentatively called 'loose' and 'strict'). It is a DTB compliance criterion that shipped Text documents validate to one of these two document types.
    • Activates the CDR extension point of the Document Text Component, thus allowing the use of <object/> as a hook for Features that have been designed to allow use in CDR mode
  • As required by the rules of the component pool, activates the Package Component. The codec/format selections made on activation of the Audio, Image and Document Text components become allowed mimetypes in the OPF manifest.
  • As required by the rules of the component pool, activates the Director Component. Using the SMIL3 module pool of the Director Component, defines one SMIL document type (which in terms of complexity roughly resembles the Z2002/2005 SMIL document type). It is a DTB compliance criterion that shipped SMIL documents validate to this document type.
  • As required by the rules of the component pool, activates the Navigation Component, and, using the NCX module pool, constructs an NCX document type, roughly resembling the NCX of Z2005 (a draft of which can be reviewed in the sandbox)
  • As required by the rules of the component pool, activates the Container Component allowing both the FS and TAR physical containers.
  • Activates the Annotations Component, essentially resulting in the annotations document type being an allowed mimetype in the OPF (which of course doesn't preclude annotation documents from being shipped separately). Support for annotations is not a Reading Device compliance criterion.
  • Activates the Search Index Component. Support for search indices is not a Reading Device compliance criterion.

Classic Profile: Supported Features

The Classic Profile is by intent extensible only to a limited extent, supporting


Classic Profile: Characteristics

... as compared to the Audio Profile
  • While the Audio Profile is uni-media, the Classic Profile is synchronized multimedia (audio, text and image), and in addition to this benefits from richer presentation control possibilities using facilities provided by SMIL.
  • However, the Classic Profile can also be used for uni-media (audio-only) DTBs, since the Document Text component is not required in a Classic DTB. Doing this adds support for richer navigation, skippability and phrase markup (since Classic uses SMIL). Note that the intent is that distribution units can ship with both Profiles, using the Container-level fallback mechanism.
... as compared to DAISY 2.02
  • better i18n (Ruby support in particular)
  • additional audio codecs allowed
  • richer navigation (NCX is richer than NCC)
  • richer content control (skippability/escapability)
  • explicit support for multiple Text Document chunking
  • support for annotations
  • support for search index
  • improvements/clarifications as a consequence of adoption of SMIL3
  • support for single-file distribution via the Container
  • support for CDR-based extensions (which allows referencing for example MathML and SVG through the respective features)
... as compared to Z2002/Z2005
  • all the items listed under "... as compared to DAISY 2.02" above
  • in addition, removes styling/display/rendering issues inherent to DTBook

The Pro Profile

Pro Profile: Target Usage Domain

The Pro Profile is designed to allow extending DAISY into new domains. Being the host for a majority of the Revolutionary Targets, it is the Profile which allows the largest number of advanced Features.

Pro Profile: Composition

Same as the Classic Profile Composition, with the following deviations:

  • In the Document Text Component, also activates the CDI extension point (allowing Compound Document by Inclusion)
  • In the Director Component, defines a core SMIL3 document type that is a superset of the Classic's SMIL3 document type (adding: <switch>, anything else?).
  • In the Director Component, activates the SMIL3 CDI extension point (so that the activation of features can include the injection of additional constructs into the SMIL grammar)
  • Does not postulate the requirement that the entire presentation must be available in the audio channel.
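Because the Pro Profile is defined as a set of deviations from the Classic Profile, its composition can be pictured as a copy of the Classic settings with a few overrides. The sketch below is a simplified assumption: the dictionary keys are invented for illustration, and only the SMIL elements named in this document are listed.

```python
import copy

# Classic Profile settings as described in its Composition list (simplified).
CLASSIC = {
    "document_text_extension_points": {"CDR"},
    "smil_elements": {"audio", "img", "text"},
    "smil_cdi_extension_point": False,
    "full_audio_required": True,
}


def derive_pro(classic):
    """Apply the Pro Profile deviations to a copy of the Classic settings."""
    pro = copy.deepcopy(classic)
    pro["document_text_extension_points"].add("CDI")
    pro["smil_elements"].add("switch")  # superset of Classic's SMIL type
    pro["smil_cdi_extension_point"] = True
    pro["full_audio_required"] = False
    return pro


PRO = derive_pro(CLASSIC)
print(sorted(PRO["document_text_extension_points"]))
print(PRO["smil_elements"] >= CLASSIC["smil_elements"])
```

The deep copy keeps the Classic definition untouched, which matches the idea that each Profile is a separate, concrete definition.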

Pro Profile: Supported Features

The Pro Profile is by intent the most extensible of the Profiles shipped with the spec. Supporting

Pro Profile: Characteristics

... as compared to the Classic Profile
  • Whereas the Classic Profile has intentionally limited extensibility options (limited to CDR, e.g. <object/> in the Document Text Component), the Pro Profile opens up for full Feature injection (see the Video and Forms Features as examples, which are only available to the Pro Profile). The MathML and SVG Features are available in both CDR and CDI manifestations in this Profile.
  • The Pro Profile's core SMIL document type is more advanced than the Classic Profile's SMIL, even in the absence of Features. In particular, the Pro Profile by default supports SMIL <switch>.
  • Whereas Reading Device implementations for the Classic Profile can get by with limited/homebrew SMIL implementations, support for the full Pro Profile (including its Features) will require a more full-blown SMIL engine (and that's a con, for once).
... as compared to DAISY 2.02
  • All characteristics of the Classic Profile also apply to the Pro Profile.
  • In addition, adds the ability to include Features such as Video and Forms in the DTB presentation (and additional Features can be added later).
... as compared to Z2002/Z2005
  • All characteristics of the Classic Profile also apply to the Pro Profile.
  • In addition, adds the ability to include Features such as Video and Forms in the DTB presentation (and additional Features can be added later).

The EBook Profile

TBD - IDPF revision dependent. We will either define a local Text-only Profile here, or refer to (a specific manifestation of) the EPUB 3.0 spec.

Features

A Feature is a functionality addon to a Profile. It consists of a set of concrete contributions to the DTB logic (effectively extending the DTB grammars on activation), and associated normative and informative prose.

Features range from relatively "light" (such as only contributing an XML island type to the CDR port of the Document Text Component), to fully fledged (such as the Forms and Video Features, which contribute changes to both Document Text, Director, and possibly more, all depending on which components in the Component Pool allow such extensions).

DTB producers "activate" features in DTBs declaratively, likely using metadata in the Package file. User and Processing Agents declare support for Features separately from Profiles.
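Since support for Profiles and support for Features are declared separately, an agent could evaluate a DTB in two independent steps. The following sketch assumes hypothetical names (AgentSupport, check_support) and assumes that the activated Features have already been read from the Package metadata; the strawman does not yet define how that metadata looks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSupport:
    """What a User or Processing Agent declares (hypothetical structure)."""
    profiles: frozenset
    features: frozenset


def check_support(agent, profile, activated_features):
    """Return (profile_supported, set of unsupported Features)."""
    profile_ok = profile in agent.profiles
    missing = frozenset(activated_features) - agent.features
    return profile_ok, missing


player = AgentSupport(
    profiles=frozenset({"Audio", "Classic", "Pro"}),
    features=frozenset({"Math"}),
)
ok, missing = check_support(player, "Pro", {"Math", "Video"})
print(ok, sorted(missing))
```

In this example the Profile is supported but the Video Feature is not, which is the situation Package-level Fallback is meant to handle.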

Refer to Package-level Fallback for a discussion on Feature fallbacks.


The Video Feature

Refer to Video Use Cases for a description of the intended usage domain.

Video Feature: Composition

  • Activates the Video Component
    • selects 1..n video formats from that component's format pool.
    • contributes the needed SMIL constructs to the SMIL grammar of the host Profile (e.g. <video/>, <excl/> etc)
  • Activates the Timed Text Component
    • defines 1 TTML grammar (using the subsetting features available in TTML)
    • contributes the needed SMIL constructs to support TTML to the SMIL grammar of the host Profile (e.g. <textstream/> etc)

Note: it doesn't look like the Video Feature needs to change the NCX (the NCX targets would still resolve to SMIL fragments).


The Forms Feature

The Forms Feature enables DAISY to be used in scenarios that require User input, and submission of that input. Examples include educational testing, quizzes, questionnaires, etc.

Forms Feature: Composition

  • Contributes a (self-provided) forms markup module to the Document Text component. TODO: whether this is XForms 1.1, WebForms, or something else
  • Contributes the needed SMIL constructs to the SMIL grammar of the host Profile

The Math Feature

Math Feature: Composition

  • Contributes the MathML3 module to the CDI port of Document Text
  • Contributes the MathML3 module to the CDR port of Document Text

The SVG Feature

SVG Feature: Composition

  • Contributes an SVG module to the CDI port of Document Text. TODO: whether this is SVG 1.0, 1.1, 1.2 or all
  • Contributes an SVG module to the CDR port of Document Text (essentially extending the allowed mimetypes on <object/>)
  • Contributes the SVG mimetype to the Image Component
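The Math and SVG Composition lists suggest that a Feature's contributions only take effect where the host Profile has activated the matching extension point. A minimal sketch of that filtering follows; the slot labels such as "Document Text/CDR" and the function name are assumptions made for the example. The Classic slot set contains only CDR, since the document does not say whether Classic opens the Image Component's extension point.

```python
# Contributions per Feature, keyed by the extension point they target,
# following the Math and SVG Composition lists.
FEATURE_CONTRIBUTIONS = {
    "Math": {
        "Document Text/CDI": "MathML3",
        "Document Text/CDR": "MathML3",
    },
    "SVG": {
        "Document Text/CDI": "SVG",
        "Document Text/CDR": "SVG",
        "Image/format pool": "SVG",
    },
}


def applicable_contributions(feature, active_extension_points):
    """Keep only the contributions whose target slot the Profile activated."""
    contributions = FEATURE_CONTRIBUTIONS[feature]
    return {
        slot: module
        for slot, module in contributions.items()
        if slot in active_extension_points
    }


classic_slots = {"Document Text/CDR"}
pro_slots = {"Document Text/CDI", "Document Text/CDR"}
print(applicable_contributions("SVG", classic_slots))
print(sorted(applicable_contributions("Math", pro_slots)))
```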

The Archival Feature

This Feature's purpose is to inject WAV (and possibly other storage-oriented codecs such as MPEG-4 ALS and FLAC) as an allowed audio format in the DTB. The Feature would explicitly state that it is not expected to be used in DTBs shipped to end users, but rather during production and archival stages. In other words, the Feature would normatively state that its activation does not impact Reading Device conformance.

By making this a Feature with such accompanying prose, we spare Reading Device implementors from being forced to add WAV support to their devices (which has been the case for 2.02 and Z2002/2005).

Archival Feature: Composition

  • Extends the allowed formats list of the Audio Component with the WAV (PCM) format, and possibly also MPEG-4 ALS and/or FLAC

Principles for Fallback/Graceful Recovery

Container-level Fallback

The advent of Profiles brings with it the risk of device incompatibility. For example, a user may have a device that supports only the Audio and Classic Profiles, and get hold of a DTB distribution unit that contains a Pro Profile DTB.

Container-level Fallback addresses this problem by describing how the Container is used to provide fallbacks at the Profile level. An example is available at Container_Research_Committee#EPUB_Compatible_Container, showing how multiple Profiles (all representing different "views" of the same publication) coexist in the same Container, sharing media resources wherever possible.

The User Agent would process the abstract container using a predefined "preference hierarchy", evaluating each "view" of the publication until it finds a view that it supports. At the bottom of this hierarchy there would be an Audio Profile DTB, or an EPUB, or both.
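The preference-hierarchy evaluation described above might look like the following sketch. It assumes that the views in a Container can be represented as an ordered list, most preferred first; the dictionary layout and the package paths are hypothetical.

```python
def select_view(views, supported_profiles):
    """Return the first view whose Profile the User Agent supports.

    `views` is assumed to be ordered from most to least preferred.
    """
    for view in views:
        if view["profile"] in supported_profiles:
            return view
    return None


# Hypothetical container holding three views of the same publication.
container_views = [
    {"profile": "Pro", "package": "pro/package.opf"},
    {"profile": "Classic", "package": "classic/package.opf"},
    {"profile": "Audio", "package": "audio/package.opf"},
]

chosen = select_view(container_views, {"Audio", "Classic"})
print(chosen["profile"])
```

A device supporting only the Audio and Classic Profiles skips the Pro view and settles on Classic; a device supporting none of the listed Profiles gets no view at all.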

Package-level Fallback

Package-level Fallback provides a universal (i.e. not Feature- or content-type-specific) method for within-Profile fallbacks. It utilizes the fallback mechanism defined by OPF 2.0 (which may eventually be OPF 3.0 by the time of publication) to provide alternate media resources within a given publication. This fallback mechanism is typically used for Features (e.g. providing an XHTML+MathML document as the top-level choice, and a fallback where the MathML is replaced with, say, images+alt text).

Since the OPF uses mimetypes to signal content types, we may need to use mimetype parameters to signal which features/XML islands are present in a particular document.
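To make the idea concrete, the sketch below walks a chain of manifest items linked by fallback references and picks the first one the device can handle. The manifest is modelled as a plain dictionary rather than parsed OPF XML, and the "features" mimetype parameter is purely an assumption: the strawman only notes that some such parameter may be needed.

```python
def parse_media_type(value):
    """Split 'type/subtype; features=a,b' into (type, set of features)."""
    base, _, params = value.partition(";")
    features = set()
    for param in params.split(";"):
        key, _, val = param.strip().partition("=")
        if key == "features" and val:  # hypothetical parameter name
            features.update(val.split(","))
    return base.strip(), features


def make_can_handle(supported_features):
    """Build a predicate for a device that renders XHTML plus some Features."""
    def can_handle(media_type):
        base, features = parse_media_type(media_type)
        return base == "application/xhtml+xml" and features <= supported_features
    return can_handle


def resolve_item(manifest, item_id, can_handle):
    """Follow the fallback chain from item_id to the first usable item."""
    seen = set()  # guards against circular fallback references
    while item_id is not None and item_id not in seen:
        seen.add(item_id)
        item = manifest[item_id]
        if can_handle(item["media_type"]):
            return item
        item_id = item.get("fallback")
    return None


manifest = {
    "ch1": {
        "href": "ch1.xhtml",
        "media_type": "application/xhtml+xml; features=mathml",
        "fallback": "ch1-img",
    },
    "ch1-img": {
        "href": "ch1-images.xhtml",
        "media_type": "application/xhtml+xml",
    },
}

print(resolve_item(manifest, "ch1", make_can_handle(set()))["href"])
print(resolve_item(manifest, "ch1", make_can_handle({"mathml"}))["href"])
```

A device without MathML support ends up with the image-based document, while a MathML-capable device keeps the top-level choice.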

Known Issues in Strawman 2

CDR (in Classic) may be redundant

The original intent with allowing CDR extension in Classic was that it provides a lightweight (from an implementor's perspective) extensibility pattern. Given that <object/> is a member of (X)HTML5, it can be expected that Reading Devices incorporating browser components for text display could delegate to the browser to handle the object embedding and fallbacks. However,

  • Given that MathML and SVG are allowed to be embedded (CDI) in (X)HTML5, we can expect increased support in browser components for direct inclusion; using <object/> then possibly becomes an unnecessary roundtrip.
  • Given the Package-level Fallback mechanism, we have a lightweight fallback mechanism already in place for those Reading Devices that don't support CDI.

In other words, the Classic Profile could be changed to allow MathML and SVG CDI directly, utilizing the default Package-level Fallback mechanism. Note that this doesn't necessarily preclude us from keeping the CDR extension point (for referencing other future Features that aren't blessed by (X)HTML5).

Classic may be overloaded

The Classic Profile currently includes both the NCX+Audio and NCX+Audio+Text DTB types. Would this be better split into two profiles? (comment from Romain)

Original Redmond F2F Whiteboard Images

Component Pool

image:Strawman2_whiteboard_components.jpg

Profiles / Features

image:Strawman2_whiteboard_profiles.jpg
