Appendix I: Glossary Of Terms

Version 1.0
Last Revised: June 14, 2002

Note: Terms with an asterisk (*) have been extracted from XML: A Primer, written by Simon St. Laurent.

Analog To Digital
The process of converting an analog audio signal, usually recorded on magnetic tape, into a digital representation for use within a computer. An analog recording may have structure indicated by low-frequency index tones, and an A-D conversion system may capture that information to automatically generate structured markup.


Application*
Either a program that does something (formats, sorts, imports, etc.) with XML or a set of markup tags created with XML. HTML, for example, is an application of SGML, defined with an SGML DTD.


ASCII
American Standard Code for Information Interchange


Attribute*
A source of additional information about an element. Attribute values may be fixed in the DTD or listed as name-value pairs (name="value") in the start tag of an element.


Audio Book
A recording of a human voice reading the book.


Cascading Style Sheets (CSS)*
A standard that provides formatting control over elements using information contained in <STYLE> tags and STYLE attributes. Less powerful than XSL, it nonetheless looks like it has a bright short-term future as the only style mechanism already recommended by the W3C and (partially) implemented in major browsers.


Character data (CDATA)*
Information in a document that should not be parsed at all. This allows the use of the markup characters &, <, and > within the text, even though no elements or entities may appear in the section. CDATA declarations may appear in attributes, and CDATA-marked sections may appear in documents.
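
As a rough illustration, the sketch below parses a CDATA-marked section with Python's standard xml.etree.ElementTree module and shows that the markup characters come through as plain text. The NOTE element and its content are hypothetical, chosen only for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the NOTE element and its text are invented here.
source = "<NOTE><![CDATA[if a < b & b > c]]></NOTE>"

note = ET.fromstring(source)

# The parser returns the section's text unchanged; &, < and > are not
# treated as markup inside the CDATA-marked section.
cdata_text = note.text
print(cdata_text)
```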


Child elements*
An element nested inside another element. In <FIRST><SECOND/></FIRST>, the SECOND element is a child element of the FIRST element.
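
The nesting in this entry's example can be inspected with a few lines of Python. This is a minimal sketch using the standard xml.etree.ElementTree module; it is one of many ways to walk a parsed document, not a required tool.

```python
import xml.etree.ElementTree as ET

# The document fragment used in the glossary entry.
first = ET.fromstring("<FIRST><SECOND/></FIRST>")

# Iterating over an element yields its child elements in document order.
child_names = [child.tag for child in first]
print(first.tag, "contains", child_names)
```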


DAISY
Digital Audio-based Information System. The DAISY Consortium's digital talking book system, developed as a result of a project by the Swedish Library of Talking Books and Braille (TPB).


DAP
Digital Audio Project


Delivery media
The media received by the end user.


Document*
A "textual object." In HTML, documents (or "pages") were single files containing HTML. In XML, documents may contain content from several files or chunks and should include markup structures that make them valid or well-formed.


Document object model (DOM)*
A means of addressing elements and attributes in a document from a processing application or scripts. The W3C has a Document Object Model Working Group that is developing a standard model for HTML and XML documents.
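
To make the idea of addressing elements and attributes concrete, here is a small sketch using xml.dom.minidom, the lightweight DOM implementation in Python's standard library. The BOOK and CHAPTER elements and the title attribute are hypothetical names made up for the example.

```python
from xml.dom.minidom import parseString

# Hypothetical document; element and attribute names are illustrative only.
dom = parseString('<BOOK><CHAPTER title="One"/><CHAPTER title="Two"/></BOOK>')

# Address elements by name, then read an attribute from each of them.
chapters = dom.getElementsByTagName("CHAPTER")
titles = [chapter.getAttribute("title") for chapter in chapters]

print(dom.documentElement.tagName, titles)
```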


Document type declaration*
In valid documents, the declaration that connects a document to its document type definition. The declaration may connect to an external file or include the definition within itself.


Document type definition (DTD)*
A set of rules for document construction that lies at the heart of all SGML development and all valid XML document construction. Processing applications and authoring tools rely on DTDs to inform them of the parts required by a particular document type. A document with a DTD may be validated against the definition.


Double Byte Characters
Characters that require two bytes to represent a single character; many Asian characters are examples. The term is commonly used to describe a computer's international representation of thousands of characters.


DVD
Digital Versatile Disc. The next generation of Compact Disc (CD).


Element*
The fundamental logical unit of an XML document. All content in XML documents must be contained within elements.


Empty element*
An element that has no textual content. An empty element may be indicated by a start tag and end tag placed next to each other (<EMPTY></EMPTY>) or by a start tag that ends with /> (<EMPTY/>). Empty elements may contain attributes only.
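
The following sketch suggests that a parser treats the start-tag/end-tag pair and the /> shorthand alike. It uses Python's standard xml.etree.ElementTree module; the kind attribute and the is_empty helper are hypothetical, added only for illustration.

```python
import xml.etree.ElementTree as ET

# The same empty element written both ways; the "kind" attribute is invented.
pair = ET.fromstring('<EMPTY kind="a"></EMPTY>')
short = ET.fromstring('<EMPTY kind="a"/>')

def is_empty(element):
    """Return True when the element has no text and no child elements."""
    return element.text is None and len(element) == 0

# Both forms are empty, and both still carry their attributes.
print(is_empty(pair), is_empty(short), pair.attrib == short.attrib)
```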


End tag*
A tag that closes an element. An end tag follows the syntax </Name>, where Name matches the element name declared in the start tag.


Entity*
A reference to other data that often acts as an abbreviation or a shortcut. By declaring entities, developers can avoid entering the same information in a document or DTD repetitively.
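
As a sketch of the shortcut in action, the example below declares an entity once in an internal DTD subset and refers to it twice. It assumes Python's standard xml.etree.ElementTree parser, which expands entities declared this way; the entity name org and the DOC element are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity "org" is declared once, then reused.
source = """<!DOCTYPE DOC [
  <!ENTITY org "DAISY Consortium">
]>
<DOC>Copyright &org;. Contact the &org; for details.</DOC>"""

doc = ET.fromstring(source)

# Each &org; reference is replaced by the declared text during parsing.
expanded = doc.text
print(expanded)
```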


E-text
Books in SGML format


Extensible Markup Language (XML)*
A standard under development by the W3C that provides a much simpler set of rules for markup than SGML, while offering considerably more flexibility than HTML.


Extensible Style Language (XSL)*
A style sheet standard submitted by Microsoft, ArborText, and Inso Corporation to the W3C. XSL allows developers to specify formatting far more precisely than Cascading Style Sheets permit. XSL seems promising, but is not yet a W3C working draft recommendation.


External DTD subset*
The portion of a document type definition that is stored outside of the document. External DTDs are convenient for storing document type definitions that will be used by multiple documents, allowing them to share a centrally managed definition.


General entity*
An entity for use in document content. When used in documents, the name of a general entity must be preceded by an ampersand (&) and should be followed by a semicolon (;).


Generalized Markup Language (GML)*
The predecessor to SGML, developed in 1969 by IBM in efforts led by Charles Goldfarb. GML originated the use of <, >, and / for markup and is still in use for document applications.


Hybrid book
A book, newspaper, journal, etc. that contains both the text and the audio recording. It may be (1) an [SGML] marked-up version of the document together with a recorded version of the document, or (2) a recorded version of a document together with a Braille index and tactile diagrams.


Hypertext Markup Language (HTML)*
The most popular markup language in use today, HTML is an application of SGML. HTML is one of the foundations of web development, providing formatting and basic structures to documents for presentation via browser applications.


Hypertext Transfer Protocol (HTTP)*
The protocol that governs communications between clients and servers on the World Wide Web. HTTP allows clients to send requests to servers, which reply with an appropriate document or an error message.


I/O Interface
Input/Output Interface


Instance*
The actual use of an element or document type in a document, as opposed to its definition. An instance may also refer to an entire document; a document may be an instance of a DTD if it can be validated under that DTD.


Internal DTD subset*
The portion of a document type definition that appears inside the document to which it applies. Internal DTD subsets can be hard to manage, but provide developers an easy way to test out new features or develop DTDs without disrupting other documents.


ISAT
International Structured Audio Team


ISO*
The International Organization for Standardization (the acronym is derived from its French name), which sets industrial standards relating to everything from character sets to quality processes to SGML.


Keyboard Equivalents
Keyboard options that are equal to a mouse click. CTRL-O, for example, is equal to using the mouse to open a file. Keyboard equivalents are also known as key combinations, hot keys, or accelerator keys. They are normally two keys pressed together to perform a function. They are shown in pull-down menus and take the place of going through the menu. These keys can also replace the need for clicking icons with a mouse.


Legacy Data
Data that is left over from previous technologies. The analog recordings on tape are the legacy data that the Consortium wants to address. This normally involves a conversion process.


Link
A reference (link) from some point in one hypertext document to (some point in) another document or another place in the same document. A browser usually displays a hyperlink in some distinguishing way, e.g. in a different color, font or style. When the user activates the link (e.g. by clicking on it with the mouse), the browser will display the target of the link.


Markup*
Structural information stored in the same file as the content. Traditionally, structural information is separated from the content and isolated in elements (defined with tags) and entities.


Markup declaration*
The contents of document type declarations, which are used to define the elements, attributes, entities, and notations. They specify the kinds of markup that will be legal in a given document.


Name*
A name must begin with a letter or underscore; the remaining characters may be letters, digits, hyphens, underscores, and full stops. (Full stops in Latin character sets are periods.)


Name characters*
Letters, digits, hyphens, underscores, and full stops. (Full stops in Latin character sets are periods.)


Name token*
Any string composed of name characters.
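
The rules for names, name characters, and name tokens can be approximated with regular expressions. The sketch below is a simplification that assumes ASCII letters and digits only, whereas XML itself permits a much wider range of characters; the function names is_name and is_name_token are hypothetical.

```python
import re

# Simplifying assumption: ASCII only. A name character is a letter, digit,
# full stop, underscore, or hyphen.
NAME_CHAR = r"[A-Za-z0-9._-]"

# A name token is one or more name characters.
NAME_TOKEN_RE = re.compile(NAME_CHAR + r"+")

# A name starts with a letter or underscore, then any name characters.
NAME_RE = re.compile(r"[A-Za-z_]" + NAME_CHAR + r"*")

def is_name_token(text):
    """Return True if text consists only of name characters."""
    return NAME_TOKEN_RE.fullmatch(text) is not None

def is_name(text):
    """Return True if text is a name token with a valid first character."""
    return NAME_RE.fullmatch(text) is not None

print(is_name("chapter-1"), is_name("1st"), is_name_token("1st"))
```

Under these assumptions every name is also a name token, but a token such as 1st is not a name because it begins with a digit.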


NCX - Navigation Control Center
Part of the DAISY System which supports the linked navigation around a document.


NISO
National Information Standards Organization. The goal in using technical standards in information services, libraries, and publishing is to achieve compatibility and therefore interoperability between equipment, data, practices, and procedures. Using technical standards makes information services more productive.


Notation*
An XML structure that identifies the type of content contained by an element and suggests a viewer to present it.


Note reference
A link to a note.


Note
A marked footnote, endnote, or some other piece of text in a document.


Parameter entity*
An entity used to represent information within the context of a document type definition. Parameter entities may be used to link the content of additional DTD files to a DTD, or as an abbreviation for frequently repeated declarations. Parameter entities are distinguished from general entities by their use of a percent sign (%) rather than an ampersand (&).


Parent element*
An element that contains another element. In <FIRST><SECOND/></FIRST>, the FIRST element is the parent element of the SECOND element.


Parsed character data (#PCDATA)*
Parsed character data is text that will be examined by the parser for entities and markup. Parsed character data should not contain any &, <, or > characters; these need to be represented by the &amp;, &lt;, and &gt; entities, respectively.
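
One possible way to apply this substitution is with the escape and unescape helpers in Python's standard xml.sax.saxutils module, as in the sketch below. The sample sentence is invented for the example.

```python
from xml.sax.saxutils import escape, unescape

# Invented sample text containing all three markup characters.
raw = "AT&T says 1 < 2 and 3 > 2"

# escape() replaces &, < and > with &amp;, &lt; and &gt;.
escaped = escape(raw)
print(escaped)

# unescape() reverses the substitution.
restored = unescape(escaped)
print(restored)
```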


Phrase detection
A system that detects a pause in the speaker's voice and somehow marks that time for later use.


Processing application*
An application that takes the output generated by a parser (it may include a parser, or be a parser itself) and does something with it. That something may include presentation, calculation, or anything else that seems appropriate.


Processing instruction*
Directions that allow XML authors to send instructions directly to a processing application that may be outside the native capacities of XML. A processing instruction is differentiated from normal element markup by question marks after the opening < and before the closing > (i.e. <? Instruction ?> ). The XML declaration is itself a processing instruction.


Prolog*
The opening part of a document, containing the XML declaration and any document type declarations or markup declarations needed to process the document.


Recursion*
A programming technique in which a function may call itself. Recursive programming is especially well-suited to parsing nested markup structures.
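
A minimal sketch of recursion over nested markup follows: a function that calls itself on each child element to measure how deeply a document is nested. It uses Python's standard xml.etree.ElementTree module; the depth function and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

def depth(element):
    """Return the nesting depth of element, counting the element itself as 1."""
    # The function calls itself once per child; an element with no children
    # ends the recursion because max() falls back to the default of 0.
    return 1 + max((depth(child) for child in element), default=0)

# Invented sample: A contains B and D, and B contains C.
tree = ET.fromstring("<A><B><C/></B><D/></A>")
print(depth(tree))
```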


Root element*
The first element in a document. The root element is not contained by any other elements and forms the base of the tree structure created by parsing the nested elements.


Scrubbing
Ability to move forward/reverse in the audio while listening to the audio. Similar to cueing during wind/rewind on analogue recorders.


Semantic structure
The relationship between a document's content and its structure.


SGML
see Standard Generalized Markup Language


Side Information
The elements within a document which are not part of the main body text. Often called side bars, notes, marginalia, margin notes.


Simple link*
A link that includes its target locator in an HREF attribute.


SMIL
Synchronized Multimedia Integration Language. SMIL allows integrating a set of independent multimedia objects into a synchronized multimedia presentation.


Spanning
"To extend across", e.g. a project may not fit on one CD-ROM so it is spanned to three CD-ROMs.


Standard Generalized Markup Language (SGML)*
The parent language of HTML and XML. SGML provides a complex set of rules for defining document structures. HTML uses structures defined under that set of rules, whereas XML provides a subset of the rules for defining document structures. SGML is formally standardized as ISO/IEC 8879-1986, although a series of later amendments have continued its development.


Start tag*
The opening tag that begins an element. The general syntax for a start tag is <Name attributes>, where Name is the name of the element being defined, and attributes is a set of name-value pairs. All start tags in XML must either have end tags or use empty element syntax, <Name attributes/>.


Style sheet*
A formatting description for a document. Style sheets may be stored in separate files from the documents they describe.


Style Sheets
A style sheet language offers a powerful and manageable way for authors, artists, and typographers to handle their special presentation needs by creating the visual effects they want.


Text block
Text block is a term used in SGML and HTML to describe various elements that are considered parts of normal text. These include: headings, paragraphs, ordered lists (letters or numbers), unordered lists (bullets), footnotes, endnotes, back matter notes (all notes), dictionary terms, dictionary definitions, etc. Note that tables may be considered text blocks, but their structure is quite different, so they are treated separately from straightforward text. SGML allows document-based information to be shared and re-used across applications and computer platforms in an open, vendor-neutral format.


Unicode*
A standard for international character encoding. Unicode supports characters that are 2 bytes wide rather than the 1 byte currently supported by most systems, allowing it to include 65,536 characters rather than the 256 available to 1-byte systems. Visit http://www.unicode.org for more information.
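
The two figures in this entry follow directly from the number of bits available: 8 bits in one byte and 16 bits in two. The short sketch below works through that arithmetic and, as an illustration only, encodes one accented letter in a 2-byte form (UTF-16, chosen here as an assumption about which encoding the entry has in mind).

```python
# One byte holds 8 bits, two bytes hold 16 bits.
one_byte_values = 2 ** 8
two_byte_values = 2 ** 16
print(one_byte_values, two_byte_values)

# Illustration: the letter e with an acute accent, encoded as UTF-16
# (big-endian, no byte-order mark), occupies exactly two bytes.
encoded = "\u00e9".encode("utf-16-be")
print(len(encoded))
```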


Valid*
A document is valid if it conforms to a declared document type definition (DTD) and meets the conditions for well-formedness. All elements, attributes, and entities must be declared in the DTD, and all data types must match their definition's requirement.


W3C*
The World Wide Web Consortium, the standards body responsible for many of the standards key to the functionality of the World Wide Web, including HTML, XML, HTTP, and Cascading Style Sheets. The W3C site includes the latest public versions of their standards as well as other information about the web and standards processes. Visit http://www.w3c.org for more information.


WAI
Web Accessibility Initiative. The WAI is the international portion of the W3C with the goal of making the web usable by all persons.


Wav
A sound format developed by Microsoft.


Well-formed*
A well-formed document may or may not have a DTD. A well-formed document must begin with an XML declaration and contain properly nested and marked-up elements.
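
A rough check of the nesting requirement can be sketched by handing the text to a parser and seeing whether it is rejected. The example below uses Python's standard xml.etree.ElementTree module; the is_well_formed helper is hypothetical, and note that this particular parser does not insist on an XML declaration, so the sketch tests nesting and markup only.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text, False if it reports an error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# Properly nested elements are accepted.
print(is_well_formed('<?xml version="1.0"?><A><B/></A>'))

# Overlapping elements (B closed after A) are rejected.
print(is_well_formed("<A><B></A></B>"))
```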


XML*
see eXtensible Markup Language


XML declaration*
The processing instruction at the top of an XML document. It begins with <?XML, includes a version identifier, required markup declaration, and encoding identifier, and closes with ?>. (The XML declaration may be case-sensitive at some point; the standard at present is unclear on this issue.)


XSL*
see eXtensible Style Language.


Copyright © 2002 DAISY Consortium