ZedAI Iteration0 Report

2008-11-04

Summary

This iteration was largely exploratory and preparatory. The WG had set itself the task of checking the feasibility of the cluster of technologies put forward, during the September 2008 Google F2F, as solution candidates for the Authoring and Interchange format.

The feasibility check had a positive outcome. The ZedAI WG have found that the outlined approach has the potential, and, as far as we can see now, the necessary properties to be the foundation for an effective and extensible DAISY Authoring and Interchange Framework.

We therefore recommend that the ZedAI WG continue down this path for the coming iteration. See "Deliverables for the next iteration" below for details.

Terminology

Please refer to ZedAI_Terminology for explanation of terms used herein.

Iteration 0 Deliverables Report Part A: Technological Feasibility Check

Hypothetical leisure profile

The technology feasibility check was performed by creating a hypothetical document model (aka an Authoring and Interchange Profile).

This profile is a for-testing-purposes-only, minimalistic leisure book profile, with the following properties:

  • It uses XHTML2 as the host grammar.
  • It uses a subset of the XHTML2 modules.
  • It enforces additional restrictions on various XHTML2 content models (additional to the restrictions of the XHTML2 canonical schemas themselves).
  • It injects a DAISY pagenum element into the XHTML2 content model. The DAISY pagenum is in a namespace separate from XHTML2.
  • It poses restrictions on where in the XHTML2 host document model the daisy pagenum element may occur.
  • It employs RDF and xhtml:role to provide behavioral and structural semantics for elements foreign to the host namespace (in this case, the DAISY pagenum).

Schemas, Modules

A RelaxNG Schema Module for the DAISY pagenum element was written. This module declares one single element (pagenum) in a hypothetical future ZedNext namespace.

Two schemas were written to declare the desired document model and to enable validation of document instances.

The first schema is pure RelaxNG. This approach is based on a custom version of the RelaxNG schema driver that ships with the XHTML2 specification. Note that while this approach is effective and powerful in terms of expressivity and restrictability, it does require that all modules to be included in a profile are written in RelaxNG.

The second schema uses NVDL. This schema demonstrates how to declare and validate the document model without any customization of the host grammar schema whatsoever. It is also applicable in the circumstance where module schemas are written in different schema languages.
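
The dispatch idea behind the NVDL approach can be illustrated with a minimal Python sketch. This is not NVDL itself, and it is not the schema written for this iteration: real NVDL validates whole namespace sections against real schemas, whereas this sketch only groups elements by namespace and hands each group to a stand-in callable. The ZedNext namespace URI and the validator functions are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XHTML2_NS = "http://www.w3.org/2002/06/xhtml2/"
ZED_NS = "http://example.org/zednext"  # hypothetical placeholder, not a real namespace


def namespace_of(elem):
    """Return the namespace URI of an element ('' if it has none)."""
    tag = elem.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def dispatch(root, validators):
    """Group elements by namespace and hand each group to its validator.

    `validators` maps a namespace URI to a callable that takes a list of
    elements and returns a list of error strings. A namespace with no
    validator is reported as an error, loosely mirroring an NVDL reject.
    """
    groups = defaultdict(list)
    for elem in root.iter():
        groups[namespace_of(elem)].append(elem)
    errors = []
    for ns, elems in groups.items():
        validator = validators.get(ns)
        if validator is None:
            errors.append(f"no validator for namespace {ns!r}")
        else:
            errors.extend(validator(elems))
    return errors


doc = ET.fromstring(
    f'<section xmlns="{XHTML2_NS}" xmlns:z="{ZED_NS}">'
    '<z:pagenum page="normal">1</z:pagenum><h>Title</h><p>Text</p></section>'
)
validators = {
    # Stand-in for validation against the unmodified host grammar schema.
    XHTML2_NS: lambda elems: [],
    # Stand-in for the pagenum module schema.
    ZED_NS: lambda elems: [
        "pagenum lacks page attribute" for e in elems if e.get("page") is None
    ],
}
print(dispatch(doc, validators))
```

The point of the design is that the host grammar validator is used untouched; only the mapping from namespaces to validators is profile-specific.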

Imposed Constraints

Below is a list of constraints enforced by the schemas. (Note - these are just select sample restrictions. In a completed profile additional ones would be added, and some of the below could be removed. We also expect that upcoming XHTML2 drafts will introduce some cleanup of their own.)

  • The section element must contain one and only one element from the heading class (i.e. h and h1 to h6).
  • The heading element must be the first child of the section element except for one and only one optional pagenum.
  • Further siblings are required after the heading.
  • The section element may not contain text.
  • The pagenum element may only be a child of the section and p elements.
  • The pagenum element must have a page attribute with one of the values "special", "front" or "normal".
  • The pagenum element may have any of the Core Attributes as defined in the XHTML2 module: xml:id, class, layout and title.
  • If the value of the page attribute is "front" the content of the pagenum element must be a roman numeral in either upper- or lowercase.
  • If the value of the page attribute is "normal" the content of the pagenum element must be a positive integer.
  • There are no content restrictions if the value of the page attribute is "special". In that case, the pagenum element may even be empty.
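
To make the rules above concrete, the following is a minimal Python sketch of how a subset of them could be checked procedurally. It is an illustration only, not the RelaxNG or NVDL schemas written for this iteration. The ZedNext namespace URI is a hypothetical placeholder, and the function names are made up for this example. The sketch does not check the allowed parents of pagenum or the Core Attributes.

```python
import re
import xml.etree.ElementTree as ET

XHTML2_NS = "http://www.w3.org/2002/06/xhtml2/"
ZED_NS = "http://example.org/zednext"  # hypothetical placeholder namespace

PAGENUM = f"{{{ZED_NS}}}pagenum"
HEADINGS = {f"{{{XHTML2_NS}}}{n}" for n in ("h", "h1", "h2", "h3", "h4", "h5", "h6")}
# Strict uppercase roman numeral pattern; emptiness is ruled out separately.
ROMAN = re.compile(r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})")


def pagenum_errors(elem):
    """Check the page attribute and the content rule that depends on it."""
    page = elem.get("page")
    text = (elem.text or "").strip()
    if page not in ("special", "front", "normal"):
        return [f"invalid or missing page attribute: {page!r}"]
    if page == "front":
        # Must be entirely uppercase or entirely lowercase, never mixed.
        same_case = text.isupper() or text.islower()
        if not (same_case and ROMAN.fullmatch(text.upper())):
            return [f"front page number is not a roman numeral: {text!r}"]
    if page == "normal":
        if not (re.fullmatch(r"[0-9]+", text) and int(text) > 0):
            return [f"normal page number is not a positive integer: {text!r}"]
    return []  # "special" carries no content restriction and may be empty


def section_errors(section):
    """Check the structural rules for one section element."""
    errors = []
    if (section.text or "").strip() or any((c.tail or "").strip() for c in section):
        errors.append("section contains text")
    children = list(section)
    headings = [c for c in children if c.tag in HEADINGS]
    if len(headings) != 1:
        errors.append(f"expected exactly one heading, found {len(headings)}")
        return errors
    idx = children.index(headings[0])
    before = children[:idx]
    if len(before) > 1 or any(c.tag != PAGENUM for c in before):
        errors.append("only one optional pagenum may precede the heading")
    if idx == len(children) - 1:
        errors.append("content is required after the heading")
    return errors


sample = ET.fromstring(
    f'<section xmlns="{XHTML2_NS}" xmlns:z="{ZED_NS}">'
    '<z:pagenum page="front">iv</z:pagenum>'
    "<h>Preface</h><p>Some text.</p></section>"
)
print(section_errors(sample))     # structural rules for the section
print(pagenum_errors(sample[0]))  # content rules for the pagenum
```

Both calls return an empty list for this sample, since it satisfies every rule the sketch covers.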

Tested and Proven Features

With the sample document instance and the schemas, we successfully demonstrated the following required features of the AI format:

  • It is possible to apply a reductionist approach and eliminate any unwanted content model features of the host grammar while remaining conformant to the host grammar (e.g. an NVDL dispatch to the canonical XHTML2 schema would validate).
  • It is possible to inject additional elements in additional namespaces into chosen slots in the host model, effectively creating a compound document. The method does not depend on entity extension slots, as is common when trying to do the same with DTDs.
  • It is feasible and economical to create schemas that declare the desired document model of the profile, and to validate document instances. Note that in neither case (RelaxNG, NVDL) have the source schemas been modified; both approaches allow an externalized and relatively agile approach to content model modification.

Sample document and schemas download

A sample document instance, and the schemas to validate it, can be downloaded from The ZedNext Google Code Repository. Inline comments in the schemas and document instance provide further discussion of details. Please see readme.txt in the root of the downloadable archive for setup details.

RDF/Role Framework

A separate document, ZedAI_Roles_Report20081026, provides the report for the RDF/Role work done during this iteration.


Iteration 0 Deliverables Report Part B: W3C Modularization and XHTML2 Overview

A separate document, ZedAI_W3C_Modularization_and_XHTML2, provides an overview discussion of the perceived benefits of using W3C Modularization and XHTML 2 as the basis for the Z39.86 Authoring and Interchange Framework.


Preparatory research done during Iteration 0

Periodicals profile

As a sidetrack during this iteration, select WG members were tasked to prepare background documentation regarding the upcoming composition of a periodicals profile. The Wiki documents on NewsML and the RNIB DTDs constitute this deliverable.

IFLA/LBS WG creating a ZedAI metadata set

A working group consisting of members of the IFLA Cataloguing Steering Committee has been created to work on a set of metadata elements for the ZedAI format. This will eventually become a DAISY-namespaced module that can be injected into any of the ZedAI profiles. Matt Garrish of CNIB is leading the group. The group will deliver a first draft in February, which will likely coincide with the end of (and become a deliverable in) ZedAI iteration 3.

Risk analysis

XHTML2 Dependency

risk level: medium

By building on the XHTML2 host grammar and modularization framework, we are effectively creating a dependency on this spec being a final W3C Recommendation before the end of the ZedNext revision process. ZedAI WG members are communicating with XHTML2 WG members to try to establish an understanding of the timeline from a W3C perspective. At the W3C TPAC meeting held at the end of October 2008, it was noted that the W3C XHTML2 WG is targeting a Final Call of XHTML2 in January 2009. This is good news, but it is not a guarantee that XHTML2 will be a final recommendation by the time we would need it to be.

This is reminiscent of Z39.86-2002, where we had to wait an additional 6 months(?) for SMIL2 (Boston) to become a final rec.

Complexity of Profile Creation

risk level: low

While the ZedNext spec will ship with a set of readymade profiles (which aim to cover 80% of the community's needs), the ZedAI Framework will also allow organisations to create their own profiles that address input type semantics or output formatting concerns not covered in the readymade profiles. The question becomes: how complex will profile creation be?

Even though the ZedAI framework will include a step-by-step guide on custom profile creation, creating a profile will, it seems, require relatively proficient XML skills, including an understanding of various XML schema languages, RDF, and the nature of compound document formats. The good news is that the use of modern schema technologies such as NVDL and RelaxNG makes at least parts of this process much easier and more economical than previous technologies (DTD modularization in particular) have allowed.

Also, note that custom profile creation comes in different flavors:

  1. Create a profile by expressing additional restrictions on an existing profile. This is the least challenging approach, and would minimally require some Schematron rule authoring and/or RelaxNG/XSD modification.
  2. Create a profile by adding one module to an existing profile. (Example: add the Ruby module to the Leisure profile.) This is a moderately challenging approach, which would minimally require some RelaxNG or NVDL modification.
  3. Create a profile by assembling an entirely new set of modules, some of which may be custom written for the profile in question. This is going to require more work and a higher skill set, requiring module authoring, schema authoring, and the authoring of other document types required for the Framework Conformance (RDF ontologies, Resource Directories, etc).

By going for an 80/20 rule in terms of needs coverage in the shipped profiles, we are hoping to create the best situation possible. Perhaps one can also assume that organisations that do not fall within the 80 category are organisations with a higher likelihood of having XML-savvy staff available.

Deliverables for the next iteration

Iteration time span: 3 November 2008 - 19 December 2008

ZedAI Profile Composition Guidelines and Conformance Requirements (first draft)

Responsible: MG.

This will be a section of the future spec. Target information items include:

  • Normative profile conformance requirements
  • Normative general principle for document instance format/version declaration
  • Normative general principle for document instance resource discovery
  • Informative composition step-by-step guide

A complete ZedAI Leisure Profile (first draft)

Group: PS, MG, SO. Responsible: PS.

This will be a simple but complete ZedAI profile. If included in the final spec, this profile can serve as a starting point for new users, and will fully cover the structural semantics of standard leisure books.

  • define what XHTML2 modules are to be included and how they are to be combined and restricted
  • define what DAISY-specific modules/elements are needed, and where they should be allowed in the host document model
  • discuss DTBook-current inheritance issues with entire committee: what aspects to keep, what aspects to drop (this will be fun)
  • define schema design pattern principles
  • create a schema that declares the content model
  • outline and demonstrate a principle for inline documentation in schema modules
  • author sample documents
  • demonstrate document creation in an existing authoring environment (typically by import of profile schema)
  • provide a Resource Directory for the Profile [MG] and means to reference it inline.

The Leisure profile is also expected to be the starting point for the creation of more advanced print book oriented profiles (at minimum through reuse of modules from the Leisure profile).

An RDF/Role taxonomy for the DAISY Semantics Leisure Profile (first draft)

Group: BG, MdM. Responsible: BG.

  • In parallel with the evolving Leisure Profile, compose RDF/Role constructs for all DAISY-specific constructs in the document model.

Periodicals Profile Document Model Proposal (first draft)

Group: KJ, OHA, SP. Responsible: KJ.

  • Create (without going into the details of schema composition etc) a proposal for a Periodicals Profile document model.

The deliverable consists of a set of concrete document instances which demonstrate the desired document model, and a Wiki page that discusses the suggested approach.

  • Recommend whether the resulting profile (which will be put together in the following iteration) should be a compound document type importing modules/constructs from the NewsML and NITF grammars, or if we should create DAISY-namespaced modules, and use RDF/Role to map to NewsML/NITF constructs. (For the latter alternative, refer to the RNIB magazine DTD.)

An RDF/Role construct for the DAISY Periodicals Profile (outline)

Group: BG, MdM. Responsible: BG.

  • In parallel with the Periodicals Profile document model work, compose at least one RDF/Role construct that demonstrates the expression of inheritance from foreign grammars (NewsML, NITF).

Iteration 0 Issue Log

The issues logged here are being brought along for subsequent solution during coming iterations.

Note - atomic grammar design issues are not mentioned here. These are tracked individually in the ZedNext Google Code issue tracker.

Categories are: Framework, Schemas, RDF.

0-1 Instance Profile Declaration [Category - Framework]
There needs to be a universal method for document instances to declare which profile they adhere to. One approach is to use the XHTML2 root element version attribute, or to declare our own root element attribute if using the former would skew the semantics of xhtml2:@version. This attribute is currently undefined, but there is talk in the XHTML2 WG of specifying it more clearly.
0-2 Schema Inline Documentation [Category - Schemas]
We need a principle for inline schema documentation, and tools to generate human-readable documents from this.
0-3 Always NVDL? [Category - Schemas]
Should we require that all profiles use NVDL for validation, or should we allow RelaxNG or WXS to be used where that is viable for the particular profile?
0-4 Define a module/driver design pattern [Category - Schemas]
By following module design patterns (for those modules we create/own ourselves) we will maximize predictability and reusability. In the case of RelaxNG schemas, the use of design patterns can also be beneficial in terms of allowing conversion to W3C Schema.
0-5 xml:id [Category - Framework]
Should we require xml:id to be used to express IDness? The benefits are that IDness would have a universal form and name (e.g. not vary per module nor per Profile), and that processing agents can identify fragments in the absence of a schema.
0-6 Resource Discovery [Category - Framework]
We want to avoid inlining things like xhtml:role and concrete schema references. This is both to reduce instance verbosity, and (in terms of schemas) to adhere to the design approach of loose coupling. By using an approach akin to RDDL we can allow processing agents to discover resources - these resources may be RDF taxonomies, schemas, transformation resources, prose descriptions, etc. The discovery could be based on using a resolving URI as the basis for the Instance Profile Declaration.
0-7 Clarification of uses of roles vs. elements
We should agree on and document the distinction between which types of information are to be carried by roles, and which by XML elements and attributes.
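
As an illustration of issues 0-1 and 0-5, the following Python sketch shows how a processing agent might read a profile declaration from the root element and locate a fragment by xml:id without any schema. It assumes the version-attribute option floated in issue 0-1, which has not been decided. The profile URI and both function names are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# The xml prefix is bound to this namespace by the XML Namespaces spec.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def declared_profile(root):
    """Return the value of the root element's version attribute, or None."""
    return root.get("version")


def find_fragment(root, fragment_id):
    """Locate an element by its xml:id value without consulting a schema."""
    for elem in root.iter():
        if elem.get(XML_ID) == fragment_id:
            return elem
    return None


doc = ET.fromstring(
    '<html xmlns="http://www.w3.org/2002/06/xhtml2/" '
    'version="http://example.org/zedai/profiles/leisure">'  # hypothetical URI
    '<body><section xml:id="ch1"><h>One</h><p>Text</p></section></body></html>'
)
print(declared_profile(doc))
print(find_fragment(doc, "ch1").tag)
```

Because xml:id has a fixed name and namespace, the lookup works the same way for every module and profile. An ID attribute declared only in a schema would not be recognizable to this code.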