ZedAI RDFa Usage Guidelines

Overview

The ZedAI specification introduces many new and powerful means of attaching metadata to documents. The larger set of attributes also brings new potential for confusion about how to apply them properly, even for experienced producers of DAISY formats. This document seeks to address that confusion.

Classes are not metadata

The class attribute is a perennial source of confusion when it comes to metadata. The attribute is not intended to carry semantic information about the elements it is attached to; it defines one or more classes to which the element belongs (typically a class represents a formatting definition in a CSS stylesheet or is used as a means to apply styles in transformation/rendering).

At best, the class attribute serves a general-purpose role in rendering and identifying elements, but nothing controls how class names are defined or what those names mean in any context other than the one defined by their creator.

For the sake of Z39.86-AI document production, never look to this attribute when you need to attach metadata.

RDF is metadata

The RDF specifications were developed as part of the W3C's Semantic Web Activity to enable Web pages to be marked up so that the information they contain can be extracted and processed by machines.

RDF is too big a topic to cover in detail in this article, but full knowledge of how it works under the hood is fortunately not necessary in order to use and accrue its benefits in Z39.86-AI documents.

One of the deliverables of the ZedAI Working Group was to build a number of RDF vocabularies defining common metadata terms and allowing for their simplified expression within documents. There are three primary vocabularies that resulted from this work that content creators will encounter:

  • the Instance Metadata vocabulary - which defines terms for expressing the required metadata for documents (who created it, the profile the document conforms to, etc.);
  • the Structural Semantics vocabulary - which defines terms for identifying properties of the document (author, title, etc.) and the function of elements within it; and
  • the Periodicals vocabulary - similar to the Structural Semantics vocabulary, but defines a limited set of terms specific to the production of articles (specifically for the newsfeeds aggregator profile).

If you navigate to the vocabulary pages linked above, you will find XHTML pages with a lot of terms and definitions. That's the simplest presentation and form for the RDF terms, but don't mistake the pages for plain XHTML.

What those pages represent is the RDFa serialization of the RDF vocabularies that are maintained by the Working Group (a serialization being another way of representing the same information). Buried in the markup are RDFa attributes defining the association between each term and its definition, but how that is done is not central to understanding how to use them.

All you need to be aware of is the terms and their definitions. When creating your documents, you can use the terms as their definitions indicate to layer in the additional semantics they represent (more on how to do that below).

An RDFa-aware processing agent would also be able to read those vocabulary pages and make the same associations between the terms and their definitions (using the RDFa attributes in the markup as a guide), and in that way would know what it had found in your document when it comes to the terms you've used.

And that's how you get predictable, meaningful metadata from RDF through RDFa. Now on to how to use the vocabularies and terms in your documents.

RDFa profiles

Every Z39.86-AI document must define a default RDFa profile using the profile attribute on the document root:

<document xmlns="http://www.daisy.org/ns/z3986/authoring/" xml:lang="en"
          profile="http://www.daisy.org/z3986/2011/vocab/profiles/default/">

If you copy and paste the link in the profile attribute into a browser, you will find another XHTML document with RDFa, like the vocabularies discussed above. This document, however, does not define terms for use in your documents. Instead, it acts as a map for getting back from your document to the definitions of the terms you've used (otherwise your metadata would mean as little as a class name!).

The RDFa profiles serve two purposes: one is to define a default vocabulary for your document and the other is to define a set of prefixes for using terms from other common vocabularies. To understand the benefit of a default vocabulary and the purpose prefixes serve in simplifying your metadata needs, we first need to look at how terms are used in attributes.

The metadata attributes in the Z39.86-AI specification employ a data type called a CURIE for defining each term and where it comes from. A CURIE (or compact URI) typically has two parts: a prefix and a term. An example of a CURIE is z3986:profile (which you'll come across in every document as it is used when defining the Z39.86-AI profile a document conforms to). A CURIE may look very similar to a prefixed element name in markup, which is not coincidental: both operate in a similar fashion.

In markup, you can use the xmlns namespace declaration on your root element to indicate the default namespace all your elements belong to so that you don't have to prefix them all. This is the same feature that the default RDFa profile provides in declaring a default vocabulary.

You can use any of the terms from the default vocabulary without prefixing them. The profile also defines a number of default prefixes that you must use to reference terms from additional vocabularies, which also means you don't have to declare all the prefixes in each and every document you create.
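
The lookup described above can be sketched in a few lines of Python. This is only an illustration of how a processing agent might expand a CURIE, not how any particular agent does it. The dcterms URI is the Dublin Core terms namespace; the default vocabulary URI and the z3986 prefix URI below are hypothetical placeholders, not the values the default profile actually declares.

```python
# Illustrative sketch: expand prefixed and unprefixed terms to full URIs.
# URIs marked "hypothetical" are placeholders, not the profile's real values.

DEFAULT_VOCAB = "http://example.org/vocab/structure/#"  # hypothetical
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",              # Dublin Core terms
    "z3986": "http://example.org/vocab/z3986/#",         # hypothetical
}


def expand_curie(curie, prefixes=PREFIXES, default_vocab=DEFAULT_VOCAB):
    """Return the full URI for a prefixed or unprefixed term."""
    prefix, sep, term = curie.partition(":")
    if not sep:
        # No colon: the term belongs to the default vocabulary.
        return default_vocab + curie
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return prefixes[prefix] + term


print(expand_curie("dcterms:publisher"))  # term from a prefixed vocabulary
print(expand_curie("author"))             # term from the default vocabulary
```

An unprefixed term such as author resolves against the default vocabulary, while a prefix that has not been declared anywhere is treated as an error in this sketch.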

You can only have one default vocabulary, but you are not limited to the pre-defined prefixes in the default profile. You can attach a prefix attribute to your document root to reference additional vocabularies; more information on how to do this is available in the specification.
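
As a rough illustration, and assuming the prefix attribute follows the RDFa 1.1 convention of whitespace-separated "prefix: URI" pairs (check the specification before relying on this), a processing agent might read the attribute value as sketched below. The function name and the ex prefix are made up for the example.

```python
# Illustrative sketch: read a prefix attribute value, assuming the
# RDFa 1.1 "prefix: URI prefix: URI" syntax.

def parse_prefix_attribute(value):
    """Split 'p1: uri1 p2: uri2' into a {prefix: uri} dictionary."""
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating 'prefix:' and URI tokens")
    mapping = {}
    for name, uri in zip(tokens[0::2], tokens[1::2]):
        if not name.endswith(":"):
            raise ValueError(f"prefix must end with a colon: {name!r}")
        mapping[name[:-1]] = uri  # drop the trailing colon
    return mapping


# "ex" and its URI are hypothetical; foaf is the FOAF vocabulary namespace.
extra = parse_prefix_attribute(
    "foaf: http://xmlns.com/foaf/0.1/ ex: http://example.org/terms/"
)
print(extra)
```

The resulting dictionary supplements the prefixes already declared by the default profile.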

Metadata in use

So now that we've gone through all the details of what RDF metadata is, why it's important, and how it's defined through RDFa, we finally get to the practical part on how to use it.

The first confusion people have with metadata is which attributes to use to attach it.

Head metadata

In the case of required metadata in the document head, you will find a mixture of RDFa attributes, as in the following example:

<head>
    <meta rel="z3986:profile"
          resource="http://www.daisy.org/z3986/2011/auth/profiles/book/1.0/" />
    <meta property="dcterms:identifier" content="daisy-z2011-exemplar-01" />
    <meta property="dcterms:publisher" content="DAISY Consortium" />
    <meta property="dcterms:date" content="2011-07-27T13:50:05-05:00" />
</head>

Although a little daunting to look at the first time, this format never changes and quickly becomes commonplace; these four metadata declarations must appear in every conformant document.
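
Because the four declarations are always the same, their presence is easy to check mechanically. The following Python sketch is one possible check, not an official validator: it uses the standard library's ElementTree parser and the namespace from the document root shown earlier, and the function name is made up for the example.

```python
# Illustrative sketch: report which required head declarations are missing.
import xml.etree.ElementTree as ET

NS = "{http://www.daisy.org/ns/z3986/authoring/}"  # namespace of the document root

REQUIRED = {
    ("rel", "z3986:profile"),
    ("property", "dcterms:identifier"),
    ("property", "dcterms:publisher"),
    ("property", "dcterms:date"),
}

HEAD = """
<head xmlns="http://www.daisy.org/ns/z3986/authoring/">
  <meta rel="z3986:profile"
        resource="http://www.daisy.org/z3986/2011/auth/profiles/book/1.0/" />
  <meta property="dcterms:identifier" content="daisy-z2011-exemplar-01" />
  <meta property="dcterms:publisher" content="DAISY Consortium" />
  <meta property="dcterms:date" content="2011-07-27T13:50:05-05:00" />
</head>
"""


def missing_declarations(head_xml):
    """Return the required (attribute, term) pairs not found in the head."""
    head = ET.fromstring(head_xml)
    found = set()
    # Only direct children of head are considered here.
    for meta in head.findall(NS + "meta"):
        for attr in ("rel", "property"):
            if attr in meta.attrib:
                found.add((attr, meta.attrib[attr]))
    return REQUIRED - found


print(missing_declarations(HEAD))  # an empty set means nothing is missing
```

The check only confirms that each term is declared; it does not verify that the values themselves are sensible.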

Each meta element tells you (and a processing agent) something about the document. The first says that there is a normative profile for the document at the specified URI (rel indicates the relationship of the resource to the current document). The other three are more straightforward in how they define document properties. Anyone familiar with (X)HTML will notice that the name attribute does not exist for Z39.86-AI meta elements, as it is an imprecise means of identifying a relationship.

A metadata record can also be attached to a document via meta elements in the header:

<meta rel="decl:meta-record" resource="daisy-z2011-exemplar-01-mods.xml">
    <meta property="decl:meta-record-type" 
            about="daisy-z2011-exemplar-01-mods.xml" content="decl:mods" />
    <meta property="decl:meta-record-version"
            about="daisy-z2011-exemplar-01-mods.xml" content="3.3" />
</meta>   

Here we have a slightly more complicated example. The property attributes on the nested meta elements no longer apply to the document because they have explicitly declared what they apply to in their about attributes (in this case, the metadata record). But if you look at the rest of the metadata attributes, you will notice they are the same property/content and rel/resource declarations as before.
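
To make the effect of the about attribute concrete, the Python sketch below lists what each meta element in the example is describing. It is a deliberate simplification: the subject is taken from about when present and otherwise defaults to the document, which ignores the finer points of full RDFa processing. The DOCUMENT placeholder and the function name are made up for the example.

```python
# Illustrative sketch: work out what each meta element is a statement about.
import xml.etree.ElementTree as ET

DOCUMENT = "<this document>"  # placeholder standing in for the document itself

RECORD = """
<meta rel="decl:meta-record" resource="daisy-z2011-exemplar-01-mods.xml">
  <meta property="decl:meta-record-type"
        about="daisy-z2011-exemplar-01-mods.xml" content="decl:mods" />
  <meta property="decl:meta-record-version"
        about="daisy-z2011-exemplar-01-mods.xml" content="3.3" />
</meta>
"""


def statements(meta_xml):
    """Return (subject, term, value) tuples for a meta element and its children."""
    result = []
    # iter() visits the outer meta element first, then the nested ones.
    for meta in ET.fromstring(meta_xml).iter("meta"):
        subject = meta.get("about", DOCUMENT)  # about overrides the default subject
        if "rel" in meta.attrib:
            result.append((subject, meta.get("rel"), meta.get("resource")))
        if "property" in meta.attrib:
            result.append((subject, meta.get("property"), meta.get("content")))
    return result


for statement in statements(RECORD):
    print(statement)
```

The outer element produces a statement about the document, while the two nested elements produce statements about the metadata record file.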

You will rarely, if ever, need to add metadata to the head that is any more complex than these examples.

Body metadata

When it comes to adding metadata to your document body, the decision will primarily be between the property and role attributes. Although these attributes may seem confusingly similar, there is an easy way to keep them apart when it comes to simple metadata annotations:

  • property should only be used to express properties of the document (the author, editor, translator, title, subtitle, etc.);
  • role should only be used to express the function of an element within the document (a section representing a part or chapter, or a quote serving as an epigraph).

Document metadata in the body (i.e., that uses the property attribute) typically will not employ content attributes as were found in the header; the text of the element is assumed to represent the content. For example:

<p property="author">Charles Darwin</p>

The role attribute, on the other hand, can be thought of as setting the nature of the element it is attached to. Although the role influences how the content of the element is understood, that content is not a value of the role.
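
The difference between the two attributes can be seen in how a processing agent might read them, sketched here in Python. The markup is an assumption made for illustration: the section and quote elements, the chapter and epigraph role terms, and the placeholder epigraph text do not come from a real document, and the function name is made up.

```python
# Illustrative sketch: property takes its value from the element's text,
# while role only labels the element it is attached to.
import xml.etree.ElementTree as ET

BODY = """
<section role="chapter">
  <p property="author">Charles Darwin</p>
  <quote role="epigraph"><p>Placeholder epigraph text.</p></quote>
</section>
"""


def annotations(xml_text):
    """Return ([(property, value)], [(element name, role)]) for a fragment."""
    properties, roles = [], []
    for element in ET.fromstring(xml_text).iter():
        if "property" in element.attrib:
            # Use a content attribute if one is given, otherwise the element text.
            text = "".join(element.itertext()).strip()
            properties.append((element.get("property"), element.get("content") or text))
        if "role" in element.attrib:
            # A role describes the element; the element's text is not its value.
            roles.append((element.tag, element.get("role")))
    return properties, roles


properties, roles = annotations(BODY)
print(properties)
print(roles)
```

The author property yields a value ("Charles Darwin"), whereas the roles yield only a label for each element.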

Metadata support

Although the Z39.86-AI specification requires a limited set of metadata for document identification, there is no requirement that a processing agent be able to do anything specific with the metadata it finds.

This should not discourage you from adding metadata to your documents, though. Processing agents are expected to be able to make use of property and role values; they are simply not required to resolve the terms back to their vocabularies or to do anything specific from an RDF perspective.

Attaching metadata is not just an exercise in markup pedantry; it makes documents richer and fuller for processing agents and reading systems. The more comprehensively you annotate your document with metadata, the better suited it will be for multi-format outputs now and in the future. The richer your documents are, the better the reading experience your users will have.

RDFa metadata can also be used to build very complex information structures, which we have not covered in this guide (such as dramatis personae sections). At this point in time, any application beyond the basic uses described above should be considered experimental and no processing agent support should be assumed even at the value level.
