ZedAI W3C Modularization and XHTML2
This document discusses the working hypothesis that W3C Modularization and W3C XHTML2 should form the foundations of the Z39.86 Authoring and Interchange Framework (hereafter ZAIF). It is provided as part of the ZedAI WG Iteration 0 deliverables.
Please refer to ZedAI_Terminology for explanation of terms used herein.
On building a Modularization framework
In the ZAIF context, the Z39.86 committee has settled on the approach of using modularization as the underlying architecture for grammar composition. The following general principles form the rationale behind this approach:
- It enables us to create reusable building blocks (modules) that have well-scoped semantics and functionality. These components can be injected into any grammar (Z39.86-owned or other) where that particular functionality is needed.
- It enables us to "inherit" modules from other grammars and incorporate them into the ZAIF grammars where applicable. The ability to reuse components created by other organisations yields a setup that is both expressive and economical.
- Given a dynamic collection of modules and a well-architected grammar composition algorithm, we can create an open-ended set of "tailor-made" grammars that faithfully represent given types of input data, without introducing unwanted "noise" in the form of contextually irrelevant content models. The customization process can also incorporate modules relating to output formatting (such as modules whose scope is pronunciation instructions for TTS engines, or Braille formatting). The customization process can be inter- or intraorganizational.
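The composition principle described in the list above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `Module` class, the `compose` function, the module names and the heavily simplified content models are illustrative assumptions, not part of any ZAIF or W3C specification.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """A reusable building block (hypothetical model, not a ZAIF or W3C API)."""
    name: str
    # Maps an element name to the set of child elements the module allows in it.
    elements: dict


def compose(modules):
    """Merge the content models of the selected modules into one grammar."""
    grammar = {}
    for module in modules:
        for element, children in module.elements.items():
            # When several modules contribute to one element, take the union.
            grammar.setdefault(element, set()).update(children)
    return grammar


# Illustrative, heavily simplified modules; the real content models differ.
structure = Module("structure", {"section": {"h", "p"}})
lists = Module("list", {"section": {"ul"}, "ul": {"li"}})
# A profile-owned module for a construct the host grammar does not define.
pagenum = Module("pagenum", {"section": {"pagenum"}})

textbook_grammar = compose([structure, lists, pagenum])
minimal_grammar = compose([structure])
```

In this sketch, a profile that has no use for lists or page numbers simply omits those modules, so its grammar carries no contextually irrelevant content models.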
In XHTML™ Modularization 1.1, the following rationale is provided for W3C's modularization of XHTML:
The modularization of XHTML refers to the task of specifying well-defined sets of XHTML elements that can be combined and extended by document authors, document type architects, other XML standards specifications, and application and product designers to make it economically feasible for content developers to deliver content on a greater number and diversity of platforms.
[...]
Modularizing XHTML provides a means for product designers to specify which elements are supported by a device using standard building blocks and standard methods for specifying which building blocks are used. These modules serve as "points of conformance" for the content community. [...] By specifying a standard, either software processes can autonomously tailor content to a device, or the device can automatically load the software required to process a module.
Modularization also allows for the extension of XHTML's layout and presentation capabilities, using the extensibility of XML, without breaking the XHTML standard. This development path provides a stable, useful, and implementable framework for content developers and publishers to manage the rapid pace of technological change [...].
Within the W3C, the provision of modularized grammars has become a common modus operandi. Examples of W3C recommendations that use modularization include XHTML 1.1, XHTML 2.0, SVG 1.2, SMIL3 and CSS3. Outside the W3C, grammars such as TEI P5 are also designed using modularization principles.
Benefits of adopting the W3C modularization framework
- Adopting an existing modularization framework minimizes development and maintenance costs
- Common-sense principle: if there is an existing modularization framework that meets the ZAIF needs, we should adopt it instead of rolling our own.
- Adopting an established modularization framework maximizes technological compatibility
- By using an existing framework provided by an organisation with as wide a reach and as high an adoption rate as the W3C, we maximize the chances of being able to reuse mainstream processing agents with little or no customization. This argument is well aligned with the Z39.86 revision high-level requirement of maximizing DAISY's mainstream compatibility.
- Compatibility with W3C modularization leads to a clear inheritance chain
- As many of the W3C recommendations are intentionally compatible, or are being made compatible, with XHTML modularization, we ensure that we are architecturally well aligned with a large body of contextually relevant XML specifications. This means that incorporating grammars such as MathML, SVG, ITS, etc. into our document types should be anything from fairly straightforward to trivial.
XHTML2 as the host grammar
XHTML2 is a fundamental rewrite of XHTML 1.x with a strong focus on structure, accessibility and internationalization. At the time of writing it is a W3C Working Draft, with Last Call expected in early 2009.
The following properties of XHTML2 make it a viable candidate for election as the host grammar of the ZAIF:
- Sufficiently neutral
- XHTML 2 defines itself as a "general purpose markup language designed for representing documents for a wide range of purposes". This means that we adopt a host language that is sufficiently neutral semantically and structurally to be able to host the intents of various ZAIF Profiles without skewing or distorting their content models.
- Natively modularized
- XHTML2 is natively modularized and provides all necessary means for specialization and customization of the content model. Remember that for ZAIF purposes, a modularization framework must allow not only for injecting modules into the host grammar, but also for imposing additional restrictions on third-party modules as well as on modules of the host grammar itself.
- A good starting point for grammar composition
- XHTML2 provides a collection of modules that a Profile creator should be able to elect for reuse. Some of the XHTML2 modules are quite generic in nature, such as The Tables Module and The List Module, whereas others are more specialized (but still with a high likelihood of applicability for ZAIF), such as The Role Attribute Module, The Ruby Module, The XForms Module and RDF/A.
- Minimized duplication of effort
- By using XHTML2 as the host language, we will implicitly be recommending the reuse of XHTML2 modules where applicable. Through this, Profile creators can create grammars with minimized duplication of effort. It is easy to argue that DAISY (or any other ZAIF Profile creator) should not have to maintain its own Tables Module for example. And, if by chance the XHTML2 Tables Module should be deemed insufficient for the needs of a Profile being created, XHTML2 provides the means for changing it as opposed to forking and rewriting it from scratch. (Note that elements that XHTML2 does not define will still need to be created and incorporated into modules. The simplest example of this is the DAISY notion of the print page number).
- Support for complex content in processing agents
- Through using XHTML2 as the host language, we increase the chances of getting functional support in processing agents for complex content types. Classic examples include the rendering of grammars such as MathML, Ruby and XForms in authoring tool viewports.
- Increased compatibility with mainstream tools
- By using XHTML2 as the host we inherit an XML document framework whose constructs have well-known semantics and well-known behaviors. We adopt a host provided by an organisation with a wide reach and a high adoption rate. In the longer run, this is likely to increase the chances of being able to use mainstream processing agents with little or no customization. This argument is well aligned with the Z39.86 revision high-level requirement of maximizing DAISY's mainstream compatibility.
- Simplicity where needed
- ZAIF Profiles can vary without limit in complexity, and vary to a large extent in the number of XHTML2 constructs they include. Even so, we open up the opportunity to provide ZAIF Profiles that, through the reuse of common XHTML constructs, are remarkably simple to learn and use. Such Profiles can serve as viable starting points for organisations that are entering the domain of accessible content, and for organisations that cannot afford to author content using more complex (and therefore more costly) ZAIF Profiles. It is an important aspect and selling point of the ZAIF to be able to say "If you know XHTML2, this is actually quite easy to do", whilst at the same time not creating hindrances for those organisations that have chosen to author content using more intricate ZAIF Profiles.
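The requirement noted under "Natively modularized", that a Profile must be able to restrict the content models of modules it imports, can be sketched as follows. This is a minimal illustration under stated assumptions: `restrict` and `allows` are hypothetical helpers, and the content models shown are simplified stand-ins rather than the actual XHTML2 definitions.

```python
def restrict(grammar, element, disallowed):
    """Return a copy of grammar where element no longer allows the given children."""
    if element not in grammar:
        raise KeyError(f"unknown element: {element}")
    # Copy every content model so the host grammar itself is left untouched.
    restricted = {name: set(children) for name, children in grammar.items()}
    restricted[element] -= set(disallowed)
    return restricted


def allows(grammar, parent, child):
    """Check whether child may appear directly inside parent."""
    return child in grammar.get(parent, set())


# Illustrative, simplified host content model (not the actual XHTML2 definition).
host = {"td": {"p", "ul", "table"}, "table": {"tr"}, "tr": {"td"}}

# A hypothetical profile that forbids tables nested inside table cells.
profile = restrict(host, "td", {"table"})
```

Because the restriction produces a new grammar instead of editing the host in place, the host module stays reusable by other Profiles, which mirrors the idea of changing a module through the framework as opposed to forking it.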
Questions and Answers
- Couldn't we do the same thing, but with a rewritten DTBook as the host grammar?
- In theory, yes, we could. But given the arguments above on semantic neutrality, the reuse of existing technologies and maximizing mainstream compatibility, there seems to be little reason to do so. In order to make DTBook do what we need in a profiling context, we would have to rewrite it to look a lot like XHTML2.
- I am content with DTBook-current. What will happen to it?
- Some organisations have found DTBook-current sufficient for their needs, others have not. In particular, the need to use DAISY for content types other than the print textbook has made us move towards a modularization framework and profiles as opposed to DTBook-current (which is a static, print-textbook-centric grammar). One of the ZAIF Profiles that will ship with the revised Z39.86 standard is likely to be a functional replacement of DTBook-current in that it will focus on capturing print textbook semantics. How much it will resemble DTBook-current we do not know at the moment, but we will do our best to carry over the good parts (and discard the less stellar bits).
- Doesn't XHTML introduce a lot of overly loose content models and web-related noise that we are not interested in?
- First, XHTML2 does so less than XHTML 1.x. Second, the content models defined by XHTML2 can be restricted, mildly or heavily, by a ZAIF Profile. The ZAIF WG successfully explored this during Iteration 0.
- How much structural bias will it impose to use XHTML2 as the host grammar? What will be the minimally required XHTML2 modules to use in a profile?
- We do not know yet. It is clear that if the XHTML 1.1 host conformance rules remain for XHTML2, we cannot adopt them as they stand, because they require too large a minimal set of modules to be included, and they do not allow restricting the content models of imported XHTML modules. The ambition is to find the balance between minimalistic module inclusion requirements (which would guarantee expressive freedom for ZAIF Profile creators) and maintaining a certain degree of predictability for any given profile (which would reduce processing complexity).
- Are there any other Modularization/Profiling frameworks that could serve as alternatives to XHTML Modularization as provided by the W3C?
- Yes. An example is TEI P5 ODD.
