Appendix 1: Glossary of Terms

DAISY 3 Structure Guidelines
Last Revised: June 4, 2008

Note: Terms marked with an asterisk (*) have been extracted from XML: A Primer by Simon St. Laurent.

A-D
Analog To Digital. The process of converting an analog audio signal, usually recorded on magnetic tape, into a digital representation for use within a computer. An analog recording may have structure indicated by low-frequency index tones, and an A-D conversion system may capture that information to automatically generate structured markup.


Application*
Either a program that does something (formats, sorts, imports, etc.) with XML or a set of markup tags created with XML. HTML, for example, is an application of SGML, defined with an SGML DTD.


ASCII
American Standard Code for Information Interchange. See http://www.ietf.org/rfc/rfc20


Attribute*
A source of additional information about an element. Attribute values may be fixed in the DTD or listed as name-value pairs (name="value") in the start tag of an element.
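
As a minimal sketch of the name="value" syntax, the following Python snippet reads attributes from a start tag using the standard library's xml.etree.ElementTree. The element name book and its attributes are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical element with two name="value" pairs in its start tag.
element = ET.fromstring('<book lang="en" format="audio"/>')

# Attributes are exposed as a mapping of attribute names to values.
attributes = dict(element.attrib)
language = element.get("lang")

# A default can be supplied for an attribute that is not present.
narrator = element.get("narrator", "unknown")
```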


Audio Book
A recording of a human voice reading the book.


Cascading Style Sheets (CSS)*
A standard that provides formatting control over elements using information contained in <STYLE> tags and STYLE attributes. Less powerful than XSL, it nonetheless looks like it has a bright short-term future as the only style mechanism already recommended by the W3C and (partially) implemented in major browsers.


Block Elements (Structures)
Block structures are discrete segments of text that are often separated from surrounding text by blank lines, indentation, etc. The most common block structure is the paragraph.


Character data (CDATA)*
Information in a document that should not be parsed at all. This allows the use of the markup characters &, <, and > within the text, even though no elements or entities may appear in the section. CDATA declarations may appear in attributes, and CDATA-marked sections may appear in documents.
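
One way to see the effect of a CDATA-marked section is to parse a small document and inspect the resulting text. This is only a sketch using Python's standard xml.etree.ElementTree; the element name code and its content are hypothetical.

```python
import xml.etree.ElementTree as ET

# The markup characters inside the CDATA section are not parsed as markup.
cdata_document = "<code><![CDATA[if (a < b && b > c) { run(); }]]></code>"

code = ET.fromstring(cdata_document)

# The parser hands back the section's content as literal text.
literal_text = code.text
```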


Child elements*
An element directly nested inside another element. In <FIRST><SECOND/></FIRST>, the SECOND element is a child element of the FIRST element.
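
The relationship can be checked with a short, illustrative Python sketch based on the standard xml.etree.ElementTree module. The THIRD element is a hypothetical addition that shows only directly nested elements count as children.

```python
import xml.etree.ElementTree as ET

# SECOND is nested directly inside FIRST; THIRD is nested inside SECOND.
first = ET.fromstring("<FIRST><SECOND><THIRD/></SECOND></FIRST>")

# Iterating over an element yields only its direct children.
child_names = [child.tag for child in first]

# THIRD is a child of SECOND, not of FIRST.
grandchild_names = [child.tag for child in first[0]]
```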


DAISY
Digital Accessible Information System. The DAISY Consortium is an international consortium made up of "members," which are organizations producing and/or providing accessible reading materials, and "friends," which are companies developing and/or manufacturing related services or products.


DAP
Digital Audio Project


Delivery media
The media received by the end user.


Document*
A "textual object." In HTML, documents (or "pages") were single files containing HTML. In XML, documents may contain content from several files or chunks and should include markup structures that make them valid or well-formed.


Document object model (DOM)*
A means of addressing elements and attributes in a document from a processing application or scripts. The W3C has a Document Object Model Working Group that is developing a standard model for HTML and XML documents.


Document type declaration*
In valid documents, the declaration that connects a document to its document type definition. The declaration may connect to an external file or include the definition within itself.


Document type definition (DTD)*
A set of rules for document construction that lies at the heart of all SGML development and all valid XML document construction. Processing applications and authoring tools rely on DTDs to inform them of the parts required by a particular document type. A document with a DTD may be validated against the definition.


Double Byte Characters
Characters that require two bytes each to represent; many Asian characters are examples. The term is commonly used to describe how computers represent the thousands of characters needed for international text.


DVD
Digital Versatile Disc. The successor to Compact Disc (CD) as a storage medium.


Element*
The fundamental logical unit of an XML document. All content in XML documents must be contained within one or more elements.


Empty element*
An element that has no textual content. An empty element may be indicated by a start tag and end tag placed next to each other (<EMPTY></EMPTY>) or by a start tag that ends with /> (<EMPTY/>). Empty elements may contain attributes only.
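
The two notations for an empty element (a start tag immediately followed by its end tag, or a single tag ending in />) can be compared with a brief Python sketch. The helper is_empty and the id attribute are assumptions made for illustration, not part of any standard.

```python
import xml.etree.ElementTree as ET

# The same empty element written in both permitted forms.
paired = ET.fromstring('<EMPTY id="a"></EMPTY>')
self_closed = ET.fromstring('<EMPTY id="a"/>')


def is_empty(element):
    """Return True when the element has no text and no child elements."""
    return not element.text and len(element) == 0
```

Both forms parse to an element with no content, and both may still carry attributes.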


End tag*
A tag that closes an element. An end tag follows the syntax </Name>, where Name matches the element name declared in the start tag.


Entity*
A reference to other data that often acts as an abbreviation or a shortcut. By declaring entities, developers can avoid entering the same information in a document or DTD repetitively.


E-text
Books in electronic format.


Extensible Markup Language (XML)*
A widely adopted W3C recommendation that provides a much simpler set of rules for markup than SGML, while offering more flexibility than HTML.


Extensible Style Language (XSL)*
A style sheet standard submitted by Microsoft, ArborText, and Inso Corporation to the W3C. An XML vocabulary for specifying formatting semantics. XSLT is a language for transforming XML documents into other XML documents.


External DTD subset*
The portion of a document type definition that is stored outside of the document. External DTDs are convenient for storing document type definitions that will be used by multiple documents, allowing them to share a centrally managed definition.


General entity*
An entity for use in document content. When used in documents, the name of a general entity must be preceded by an ampersand (&) and followed by a semicolon (;).
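
A general entity declared in an internal DTD subset can be seen expanding in the sketch below, which relies on Python's standard xml.etree.ElementTree. The entity name org, its replacement text, and the note element are hypothetical.

```python
import xml.etree.ElementTree as ET

# The internal DTD subset declares a general entity named "org".
document = """<!DOCTYPE note [
  <!ENTITY org "DAISY Consortium">
]>
<note>Published by the &org;.</note>"""

note = ET.fromstring(document)

# The parser replaces the reference &org; with the declared text.
expanded_text = note.text
```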


Hypertext Markup Language (HTML)*
The most popular markup language in use today, HTML is an application of SGML. HTML is one of the foundations of web development, providing formatting and basic structures to documents for presentation via browser applications.


Hypertext Transfer Protocol (HTTP)*
The protocol that governs communications between clients and servers on the World Wide Web. HTTP allows clients to send requests to servers, which reply with an appropriate document or an error message.


I/O Interface
Input/Output Interface


Instance*
The actual use of an element or document type in a document, as opposed to its definition. An instance may also refer to an entire document; a document may be an instance of a DTD if it can be validated under that DTD.


Internal DTD subset*
The portion of a document type definition that appears inside the document to which it applies. Internal DTD subsets can be hard to manage, but provide developers an easy way to test out new features or develop DTDs without disrupting other documents.


ISO*
The International Organization for Standardization (the short name ISO is derived from the Greek isos, meaning "equal"), which sets industrial standards relating to everything from character sets to quality processes to SGML.


Keyboard Equivalents
Keyboard options that are equivalent to a mouse click. CTRL+O, for example, is equivalent to using the mouse to open a file. Keyboard equivalents are also known as key combinations, hot keys, or accelerator keys. Normally these are two keys pressed together to perform a function. They are often listed in pull-down menus, where they take the place of navigating the menu by mouse clicks, and they can also replace clicking icons with a mouse.


Legacy Data
Data that was produced with previous technologies, for example, analog recordings on tape.


Link
A reference (link) from some point in one hypertext document to (some point in) another document or another place in the same document. A browser usually displays a hyperlink in some distinguishing way, e.g. in a different color, font or style. When the user activates the link (e.g. by clicking on it with the mouse), the browser will display the target of the link.


Markup*
Structural information stored in the same file as the content. Traditionally, structural information is separated from the content and isolated in elements (defined with tags) and entities.


Markup declaration*
The contents of document type declarations, which are used to define the elements, attributes, entities, and notations. They specify the kinds of markup that will be legal in a given document.


Name*
A name must begin with a letter or an underscore; the remainder may contain letters, digits, hyphens, underscores, and full stops. (Full stops in Latin character sets are periods.)


Name characters*
Letters, digits, hyphens, underscores, and full stops. (Full stops in Latin character sets are periods.)


Name token*
Any string composed of name characters.


NCX - Navigation Control Center
The file in a DAISY DTB which provides a view of all the points in a text to which a user may navigate. Each navigation point in the NCX is linked through the SMIL file to the corresponding location in the audio and XML textual content files, providing direct access to that location.


NISO
National Information Standards Organization. A non-profit standards organization that develops, maintains and publishes technical standards related to bibliographic and library applications.


Notation*
An XML structure that identifies the type of content contained by an element and suggests a viewer to present it.


Note reference
A link to a note.


Note
A note is a marked footnote, endnote, or some other piece of text in a document.


Parameter entity*
An entity used to represent information within the context of a document type definition. Parameter entities may be used to link the content of additional DTD files to a DTD, or as an abbreviation for frequently repeated declarations. Parameter entities are distinguished from general entities by their use of a percent sign (%) rather than an ampersand (&).


Parent element*
An element that directly contains another element. In <FIRST><SECOND/></FIRST>, the FIRST element is the parent element of the SECOND element.


Parsed character data (#PCDATA)*
Parsed character data is text that will be examined by the parser for entities and markup. Parsed character data should not contain any &, <, or > characters; these need to be represented by the &amp;, &lt;, and &gt; entities, respectively.
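
The substitution described here can be sketched with the escape function from Python's standard xml.sax.saxutils module, which replaces these three characters with their entity references. The sample sentence and the p element are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Text containing all three characters that need entity references.
raw_text = "AT&T says 1 < 2 and 3 > 2"

# escape() replaces &, <, and > with &amp;, &lt;, and &gt;.
escaped = escape(raw_text)

# Once escaped, the text can be placed inside an element and parsed.
element = ET.fromstring(f"<p>{escaped}</p>")

# The parser resolves the entity references back to the original characters.
round_tripped = element.text
```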


Phrase detection
A system that detects a pause in the speaker's voice and somehow marks that time for later use.


Processing application*
An application that takes the output generated by a parser (it may include a parser, or be a parser itself) and does something with it. That something may include presentation, calculation, or anything else that seems appropriate.


Processing instruction*
Directions that allow XML authors to send instructions directly to a processing application, for tasks that may be outside the native capacities of XML. A processing instruction is differentiated from normal element markup by question marks after the opening < and before the closing > (i.e., <?Instruction?>). The XML declaration is itself a processing instruction.


Prolog*
The opening part of a document, containing the XML declaration and any document type declarations or markup declarations needed to process the document.


Recursion*
A programming technique in which a function may call itself. Recursive programming is especially well-suited to parsing nested markup structures.
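
As an illustration of why recursion suits nested markup, the sketch below computes how deeply elements are nested by having a function call itself on each child. The function name depth and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def depth(element):
    """Return the nesting depth of an element; a childless element has depth 1."""
    # The function calls itself once for every child element.
    return 1 + max((depth(child) for child in element), default=0)


# book > chapter > para > em gives four levels of nesting.
tree = ET.fromstring("<book><chapter><para/><para><em/></para></chapter></book>")
nesting = depth(tree)
```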


Root element*
The first element in a document. The root element is not contained by any other elements and forms the base of the tree structure created by parsing the nested elements.


Scrubbing
The ability to move forward or backward through the audio while listening to it. Similar to cueing during wind/rewind on analog recorders.


Semantic structure
The relationship between a document's content and its structure.


SGML
see Standard Generalized Markup Language


Side Information
The elements within a document which are not part of the main body text. Often called side bars, notes, marginalia, margin notes.


Simple link*
A link that includes its target locator in an HREF attribute.


SMIL
Synchronized Multimedia Integration Language. SMIL supports the integration of independent multimedia objects into a synchronized multimedia set.


Spanning
"To extend across", e.g. a project may not fit on one CD-ROM so it is spanned to three CD-ROMs.


Standard Generalized Markup Language (SGML)*
The parent language of HTML and XML. SGML provides a complex set of rules for defining document structures. HTML uses structures defined under that set of rules, whereas XML provides a subset of the rules for defining document structures. SGML is formally standardized as ISO/IEC 8879-1986, although a series of later amendments have continued its development.


Start tag*
The opening tag that begins an element. The general syntax for a start tag is <Name attributes>, where Name is the name of the element being defined, and attributes is a set of name-value pairs. All start tags in XML must either have end tags or use empty element syntax, <Name attributes/>.


Style sheet*
A formatting description for a document. Style sheets may be stored in separate files from the documents they describe.


Style Sheets
Style sheets describe and define the presentation of a document, and can be used for visual and audio presentation.


Unicode*
A standard for international character encoding. Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. Unicode was originally designed around characters that are 2 bytes wide rather than the 1 byte supported by older systems, allowing it to include 65,536 characters rather than the 256 available to 1-byte systems; later versions of the standard extend well beyond that range. Visit http://www.unicode.org/standard/WhatIsUnicode.html for more information.
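
The idea of one number per character can be sketched in Python, whose strings are sequences of Unicode characters. The sample characters are arbitrary, and the byte counts depend on which encoding is chosen.

```python
# A character outside the 1-byte range and a plain ASCII letter.
character = "日"
letter = "A"

# ord() returns the unique number (code point) assigned to a character.
code_point = ord(character)

# The number of bytes used depends on the encoding.
utf16_bytes = character.encode("utf-16-be")  # 2 bytes for this character
utf8_bytes = character.encode("utf-8")       # 3 bytes for this character
letter_utf8 = letter.encode("utf-8")         # 1 byte for an ASCII letter
```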


Valid*
A document is valid if it conforms to a declared document type definition (DTD) and meets the conditions for well-formedness. All elements, attributes, and entities must be declared in the DTD, and all data types must match their definition's requirement.


W3C*
The World Wide Web Consortium, the standards body responsible for many of the standards key to the functionality of the World Wide Web, including HTML, XML, and Cascading Style Sheets. The W3C site includes the latest public versions of their standards as well as other information about the web and standards processes. Visit http://www.w3c.org for more information.


WAI
Web Accessibility Initiative. The WAI is the international portion of the W3C with the goal of making the web usable by all persons.


Wav
A sound format developed by Microsoft.


Well-formed*
A well-formed XML document is syntactically correct. It does not have angle brackets that are not part of tags. (Entity references are used to embed angle brackets in an XML document.) All tags have an ending tag or are themselves self-ending, and all tags are fully nested: they never overlap. A well-formed document can be processed, but it may not be valid; determining that requires a validating parser and a DTD.
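
A rough check for well-formedness can be sketched with Python's standard xml.etree.ElementTree, which raises ParseError on input that is not well-formed. The helper is_well_formed is a hypothetical name; because this parser is non-validating, the check says nothing about validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


nested = is_well_formed("<a><b></b></a>")        # tags fully nested
overlapping = is_well_formed("<a><b></a></b>")   # tags overlap
unclosed = is_well_formed("<a><b></a>")          # b has no ending tag
stray_bracket = is_well_formed("<a>1 < 2</a>")   # angle bracket outside a tag
```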


XML*
see eXtensible Markup Language


XML declaration*
The processing instruction at the top of an XML document. It begins with <?xml, includes a version identifier and, optionally, an encoding identifier and a standalone document declaration, and closes with ?>. (The XML declaration is case-sensitive: the name xml must be written in lowercase.)


XSL*
see eXtensible Style Language.