DAISY Pipeline Use Cases

The DAISY Pipeline is a compact tool that separates the complex from the complicated: the complex part, deciding what you want to do, remains with the user, while the complicated part, implementing that decision, is taken care of by the Pipeline.

The following descriptions show the paths to the transformations that the Training Team has found to be the most widely used in the DAISY Pipeline. Most of them start from a single file type: DTBook. This is the *.xml file that can be produced with the Save as DAISY plug-in for Microsoft Word and OpenOffice. Save as DAISY is the recommended tool for DAISY editing and for producing that file, but only for that: all further transformations, including TTS narration, should be performed by the DAISY Pipeline until certain kinks in Save as DAISY have been addressed. You will be informed when this has been done.

The original reason for creating the DAISY Pipeline was the DAISY Consortium's guarantee to its members that content produced at any stage in the development of the DAISY Standard would be future safe and could thus be migrated to the current version of the standard. This promise was not only kept: dozens of other transformations have been added, and DAISY members are still adding more.

As of today, the user interface offers 52 such transformations, each composed of several steps that follow one another in a logical path to the desired outcome.

Transformation example one: Validation

(to check the validity, that is, the conformance with the DAISY Standard, of a finished DAISY book or file)

Validate everything: a DTBook file, a finished DAISY 2.02 book, a finished DAISY 3 book and then some…

To start: this is not a transformation, so it is found under the section 'Verify'. Validation is probably the single most important step in the production of anything related to DAISY. Valid input is essential because almost all transformations in the DAISY Pipeline require it, but even more important is the validity of the resulting DAISY DTBs:

DAISY DTBs that don't conform to the standards:

  • may not be called that

  • are not guaranteed to be usable by any or all DAISY reading devices

  • are not future safe

So as content is handed from one step to the next along a DAISY production line, it should be validated after each step.

Here it goes:

Ctrl + N (or click: New Job Wizard). You can list all keyboard shortcuts with Ctrl + Shift + L.

Image of 'Pipeline Wizard' icon in DAISY Pipeline

then choose=>Verify

Image of Pipeline wizard with highlighted path to DTBook Validator

There are now six different validators of which three will be explained below.

Case one: DTBook validation

(this is the *.xml file generated by Save as DAISY)

choose=>DTBook Validator

Here you need to browse for the path to your *.xml DTBook file as Input

Image of Pipeline configuration window with highlighted Parameters

The Optional Parameters can be left empty, but if you would like a more extensive report, set the path to the folder where the report should end up and check the tick box 'Generate Context Info', which will give you details about the errors and exactly where they can be found.

Hit 'Finish'

After that, run the job with Ctrl + F1 or by clicking the 'run' icon:

Image of Pipeline 'run' icon

Since you should have validated the DTBook file when it was produced by Save as DAISY, you will be congratulated for your efforts:

Image of Pipeline message window with 'Congratulations' highlighted
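Before setting up a DTBook Validator job, it can save a round trip to confirm that the *.xml file is at least well-formed XML. The following is only a minimal sketch using Python's standard library; it is not part of the DAISY Pipeline and does not check conformance with the DTBook grammar, so it never replaces the validator. The function name `check_well_formed` is made up for this illustration.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def check_well_formed(xml_text: str) -> Tuple[bool, str]:
    """Return (True, "") if xml_text parses as XML, else (False, reason).

    Assumption: this only tests well-formedness (matching tags, legal
    characters). It says nothing about DTBook validity.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple pointing at the problem.
        line, column = err.position
        return False, f"line {line}, column {column}: {err}"
    return True, ""


if __name__ == "__main__":
    good = "<dtbook><book><bodymatter/></book></dtbook>"
    bad = "<dtbook><book></dtbook>"  # <book> is never closed
    print(check_well_formed(good)[0])
    print(check_well_formed(bad)[0])
```

Note that this parser does not read an external DTD, so a file that relies on named entities defined only in the DTD may be reported as broken even though the DTBook Validator would accept it.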

Case two: DAISY DTB 2.02 validation

Pretty much the same as above, with the difference that here you choose =>DAISY 2.02 DTB Light Validator

Image of Pipeline wizard with highlighted path to DAISY 2.02 Validator

After hitting 'Next' you set the path to the *.ncc file of your DAISY 2.02 book and run the job with Ctrl + F1

Case three: DAISY DTB 3 validation

(to check the validity of a finished DAISY book)

As above, with the difference that the validator to choose is the Z3986 DTB Validator (ANSI/NISO Z39.86 being the official designation of DAISY 3)

Image of Pipeline wizard with highlighted path to Z3986 DTB Validator
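The three cases above boil down to picking the validator that matches the file you have in hand. As a memory aid, the choice can be sketched as a small lookup. This is an illustration only, not Pipeline code; the function name `suggest_validator` is hypothetical, and it assumes that a DAISY 2.02 book is recognised by its NCC file (named like ncc.html or ending in .ncc) and a DAISY 3 book by its *.opf package file.

```python
from pathlib import PureWindowsPath


def suggest_validator(file_name: str) -> str:
    """Suggest which of the three validators described above fits a file.

    Assumptions (not taken from the Pipeline itself): *.opf marks a
    DAISY 3 book, an NCC file marks a DAISY 2.02 book, and any other
    *.xml file is treated as a DTBook file.
    """
    # PureWindowsPath accepts both "\" and "/" as separators.
    name = PureWindowsPath(file_name).name.lower()
    if name.endswith(".opf"):
        return "Z3986 DTB Validator"
    if name.startswith("ncc.") or name.endswith(".ncc"):
        return "DAISY 2.02 DTB Light Validator"
    if name.endswith(".xml"):
        return "DTBook Validator"
    raise ValueError(f"Don't know which validator fits {file_name!r}")


if __name__ == "__main__":
    for candidate in ("mybook.xml", "book/ncc.html", "C:\\books\\mybook.opf"):
        print(candidate, "->", suggest_validator(candidate))
```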

Transformation example two: Using the Narrator to transform a DTBook xml file to a DAISY book in two versions

This will get you a DAISY 2.02 and a DAISY 3 book narrated with a synthetic voice, provided such a voice (TTS) is installed on your computer.

Input required: a valid(!) DTBook (DAISY XML) file

Path:

Ctrl + N (or click: New Job Wizard) [http://daisymfc.sourceforge.net/doc/enduser/01-gui-user-guide.html#jobsPerspNewJobWizard]

choose=>Create and Distribute=>DAISY Book=>TTS Narrator (DAISY XML to DAISY Book)

Image of Pipeline wizard with highlighted path to TTS Narrator

=>Next=>

Here you need to browse for the path to your *.xml DTBook file as Input

AND

do the same for your output directory

Image of Pipeline configuration window with highlighted Parameters

Then there are Optional Parameters:

  • DTBook fix: If you are sure of the validity of your xml then there is no need to choose this.

  • Apply sentence detection: This will increase local navigation possibilities for the end user

  • Multi-language support: If there are different languages present in the DTBook AND you have synthetic voices (TTS[link:FAQ]) installed that support those languages, this should be ticked.

  • MP3 Bitrate: There are several choices that influence the quality of the compressed audio. The higher the bit rate, the better the quality, but the larger the file. [link:FAQ]

  • 2.02 href target: The Pipeline Narrator will automatically produce two DAISY Books: one in DAISY 2.02 and another in DAISY 3 (ANSI/NISO Z39.86). If for any reason you intend to open the DAISY 2.02 book with SIGTUNA (an authoring software) afterwards, you should choose "txt" here.

Image of Pipeline configuration window with highlighted Parameters

After all this is done, hit 'Finish' and wait to be congratulated on a valid, synthetically narrated book, as shown in the last lines of the Messages box.

Transformation example three: From DTBook to DAISY 2.02 file set

This transformation converts a *.xml (DTBook) file derived from Save as DAISY into a DAISY 2.02 file set that is ready for recording with a human voice. Since the production environment of most DAISY members is geared towards studio recording with software that handles 2.02 file sets, this transformer comes in very handy: you get the advantages of DAISY editing in Save as DAISY, and the same source DTBook file can still be transformed to other output formats such as Braille and EPUB. This stays true to the DAISY mantra: one source file, many output formats.

Input required: a valid(!) DTBook (DAISY XML) file

Path:

Ctrl + N (or click: New Job Wizard) [http://daisymfc.sourceforge.net/doc/enduser/01-gui-user-guide.html#jobsPerspNewJobWizard]

choose=>Create and Distribute=>DAISY Book=>Daisy 2.02 Text-Only Fileset Generator (from DTBook)

Image of Pipeline wizard with highlighted path to Daisy 2.02 Text-Only Fileset Generator

As before you now browse for the path to your *.xml file and select a path for your new DAISY 2.02 file set.

There are Optional Parameters:

  • DTBook fix: Worth ticking, since it repairs common errors up front and saves you from having to fix them separately if errors turn up

  • Output encoding: best left at 'utf-8'

  • Identifier: If your book doesn’t already have one, include it now

  • Abbreviation and acronym detection: If you have used that function in Save as DAISY you should tick this box

  • Sentence detection: This will increase local navigation possibilities for the end user

  • Word detection: This will further increase local navigation possibilities for the end user, but is mostly used for special books

  • href Targets: If you intend to record your book with Sigtuna you MUST select ‘text’

Image of Pipeline configuration window with highlighted Parameters

Then hit ‘Next’ and run the transformer with Ctrl + F1

The resultant DAISY 2.02 file set is now ready to be opened by a DAISY 2.02 authoring software

Transformation example four: Error fixing

Errors do happen, and the DTBook Fixer takes care of some common ones. If you have errors in your DAISY files that can’t be fixed using this script (or other measures), you should report your problem to the appropriate DAISY Forum (http://www.daisy.org/forums/).

Input required: a DTBook (DAISY XML) file, which may be invalid (see ‘Force Execution’ below)

Path:

Ctrl + N (or click: New Job Wizard)

choose=>Modify and Improve=>DAISY XML (DTBook)=>DTBook Fixer

Image of Pipeline wizard with highlighted path to DTBook Fixer

Hit ‘Next’ and browse for the path to your *.xml Input File and select a path AND a name for your new (repaired) DTBook *.xml file.

Image of Pipeline configuration window with highlighted Parameters

There are Optional Parameters:

  • Active Categories: ‘Repair, then Tidy’ is recommended.
    For more detail on what each category includes, see the DTBookFix Categories.

  • Force Execution: This is best checked if your input file is not valid.

  • Simplify heading layout: This will simplify the structure of your book. In case of an automated mark-up there might be redundant levels of navigation in the book. Leave this unchecked if there is nothing wrong with the structure in your book.

  • Tidy inline whitespace: This will remove unnecessary spaces between words, but is normally only necessary for Braille conversion.

  • Fix Character set: If you had the error message "invalid byte sequence", you should check this box. Errors of this type are frequently found when a text file has been converted from a format like pdf, or in right-to-left scripted languages like Arabic, Hebrew or Urdu. But don’t check this box if you don’t have any of the aforementioned issues!

  • Document language: If your document is in only one language, you can enter the appropriate language code here. In case of English this would be: EN.

  • XML Validation Report: As a last item you can browse for the path where you would like to have that report.

Hit ‘Finish’ and run that job with Ctrl + F1

Please re-validate the result (see Transformation example one).
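The "invalid byte sequence" message mentioned under ‘Fix Character set’ usually means that the file contains bytes that do not fit its text encoding. If you want to see where the first such byte sits before running the fixer, a check along the following lines may help. It is a minimal sketch in Python, assuming the file is supposed to be utf-8; the function name is invented for this example and nothing here reflects how the DTBook Fixer works internally.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid utf-8 sequence, or None.

    Assumption: the file is meant to be utf-8. A file that is
    legitimately stored in another encoding will also be reported here.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start  # index of the first offending byte
    return None


if __name__ == "__main__":
    clean = "Zürich".encode("utf-8")
    # The same word stored as windows-1252 is not valid utf-8.
    mixed_up = "Zürich".encode("windows-1252")
    print(first_invalid_utf8_offset(clean))
    print(first_invalid_utf8_offset(mixed_up))
```

To check a real file, read it in binary mode (for example `open(path, "rb").read()`) and pass the bytes to the function.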

Transformation example five: From DAISY 2.02 to DAISY 3

The DTB Forward Migrator is the original ‘raison d'être’ of the DAISY Pipeline: to guarantee that all DAISY books produced in one version of the DAISY Standard can be migrated to a later version of that standard.

Input required: a valid(!) DAISY 2.02 Book

Path:

Ctrl + N (or click: New Job Wizard)

choose=>Modify and Improve=>DAISY XML (DTBook)=>DTB Forward Migrator

Image of Pipeline wizard with highlighted path to DTB Forward Migrator

Hit ‘Next’ and browse for the path to your *.ncc Input File and select a path for your new (migrated) DAISY 3 book.

Image of Pipeline configuration window with highlighted Parameters

There are Optional Parameters:

  • Input DTB Type : This is set by default as ‘Auto detect’ and can be left that way

  • Identifier : If nothing is entered here the identifier from the original DAISY 2.02 book will be used. But since it is not really a good idea to have books with identical identifiers in your library, you might want to change this. Most probably the library section of your organisation will provide an identifier.

  • Add NCX navLabel audio : navLabel is something only DAISY 3 books have, so it has to be added. The title and all headings in a DAISY 2.02 book should consist of one uninterrupted section of audio, so when navigating through a book the user hears the name of that heading. In DAISY 3 books these are explicitly navigable. Checking this box will ensure that.

  • Audio clip length : This sets the length of audio for the navLabels. It therefore only makes sense to specify a value for this parameter if the "Add NCX navLabel audio" check box is checked.

  • Transfer NCC metadata : By checking this you can import all the metadata from the 2.02 original, which might not be a good idea since you are creating a different and new book. If left unchecked, nothing will be imported.

  • Optional CSS : Your organisation might have a custom look for its books, for which a Cascading Style Sheet (CSS) would have been created. Here you can import that style sheet.

  • Optional Resource File : The Resource File is again something not found in DAISY 2.02 books. It gives a lot more navigational possibilities to a DAISY 3 book. If no resource file is specified, no such file will be available in the resulting talking book.

Hit ‘Finish’ and run that job with Ctrl + F1

Please re-validate the result (see Transformation example one).

Transformation example six: Encoding all wav files in a book to mp3

The DTB Audio Encoder will do exactly that: compress all *.wav audio files into *.mp3 files.

Input required: a valid(!) DAISY 2.02 Book

Path:

Ctrl + N (or click: New Job Wizard)

choose=>Modify and Improve=>DAISY XML (DTBook)=>DTB Audio Encoder

Image of Pipeline wizard with highlighted path to DTB Audio Encoder

Hit ‘Next’ and browse for the path to your *.ncc OR *.xml Input File and select a path for your new (mp3) DTB.

Image of Pipeline configuration window with highlighted Parameters

There is only one Optional Parameter:

  • Bitrate: This is the bit rate of the resulting mp3 files. The higher the number, the better the quality and the larger the size. 48 kbit/s is a good compromise between these two factors.

Hit ‘Finish’ and run that job with Ctrl + F1

Please re-validate the result (see Transformation example one).
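To get a feeling for what the Bitrate setting means for the size of a book, the arithmetic can be worked through for one hour of audio: size = bit rate × duration ÷ 8. The sketch below does this in Python. The wav figure is an assumption made for comparison only (uncompressed 44.1 kHz, 16-bit, mono audio); your own recordings may use other settings.

```python
def audio_size_mb(bitrate_kbps: float, hours: float) -> float:
    """Approximate audio size in megabytes (1 MB = 1,000,000 bytes)."""
    bits = bitrate_kbps * 1000 * hours * 3600  # kbit/s -> bits in total
    return bits / 8 / 1_000_000                # bits -> bytes -> MB


# Assumption for comparison: uncompressed wav at 44.1 kHz, 16 bit, mono.
WAV_KBPS = 44_100 * 16 * 1 / 1000  # 705.6 kbit/s

if __name__ == "__main__":
    print(f"mp3 at 48 kbit/s, 1 hour: {audio_size_mb(48, 1):.1f} MB")
    print(f"wav (assumed format), 1 hour: {audio_size_mb(WAV_KBPS, 1):.1f} MB")
    # prints 21.6 MB for the mp3 and 317.5 MB for the wav
```

Under these assumptions a ten-hour book shrinks from roughly 3.2 GB of wav audio to about 216 MB of mp3 at 48 kbit/s.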

Transformation example seven: Enabling a DAISY 2.02 fileset to represent language scripts faithfully in SIGTUNA.

The Character Set Switcher is necessary for a much-used but somewhat dated authoring software called Sigtuna. Whereas all newer authoring software represents text in utf-8, Sigtuna only ‘understands’ windows-1252 (or another encoding from the Windows character sets, for languages such as Arabic, Russian, Chinese or Japanese).

Input required: a valid(!) DAISY 2.02 file set

Path:

Ctrl + N (or click: New Job Wizard)

choose=>Modify and Improve=>Multi-Format=>Character Set Switcher

Image of Pipeline wizard with highlighted path to Character Set Switcher

Hit ‘Next’ and browse for the path to your *.ncc Input File and select a path for your changed file set.

NOTE: The ‘Browse’ function might be set to look only for *.xml files. Change it to show all (*.*) files so that the *.ncc file becomes visible!

Image of Pipeline configuration window with highlighted Parameters

There are Optional Parameters:

  • Output encoding: In the example above, windows-1252 is entered. If nothing is set, the output encoding will be the default, utf-8.

  • Linebreaks : This is normally left as System default, but can be changed.

  • XML Validation Report: As a last item you can browse for the path where you would like to have that report.

Hit ‘Finish’ and run that job with Ctrl + F1

Please re-validate the result (see Transformation example one).
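Switching from utf-8 to windows-1252 only works cleanly if every character in the book exists in windows-1252. To find out beforehand which characters would not survive, something like the following check can be used. It is a minimal Python sketch with an invented function name, independent of the Character Set Switcher itself.

```python
from typing import List


def unencodable_chars(text: str, encoding: str = "windows-1252") -> List[str]:
    """List the distinct characters of text that the encoding cannot hold."""
    missing: List[str] = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)  # keep first-seen order, no duplicates
    return missing


if __name__ == "__main__":
    # Western European text fits into windows-1252 ...
    print(unencodable_chars("Smörgåsbord €5"))
    # ... Russian text does not, but it fits into windows-1251.
    print(unencodable_chars("Привет"))
    print(unencodable_chars("Привет", "windows-1251"))
```

An empty list means the text can be stored in the chosen encoding without loss; a non-empty list suggests that a different Windows encoding is needed for that book.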

Transformation example eight: Create a playlist from your mp3 files

The Audio Tagger is a useful tool for organizations whose end users don’t have DAISY hardware or software players but use mp3 players as an alternative.

Input required: any valid(!) DAISY 2.02 or DAISY 3 file set

Path:

Ctrl + N (or click: New Job Wizard)

choose=>Modify and Improve=>Multi-Format=>Audio Tagger

Image of Pipeline wizard with highlighted path to Audio Tagger

Hit ‘Next’ and browse for the path to your Input Files and select a path for your changed file set.

NOTE: The ‘Browse’ function might be set to look only for *.xml files. Change it to show all (*.*) files so that the *.ncc or *.opf file becomes visible!

Image of Pipeline configuration window with highlighted Parameters

There are Optional Parameters:

  • ID3 Tagging: ID3 tags will give you Title, Album title, Artist and Track Number if such data can be retrieved from the metadata of your file set.

  • Playlist generation: This will give you these different playlists: M3U, M3U8, PLS, XSPF and WPL

Hit ‘Finish’ and run that job with Ctrl + F1 and see which playlist suits your purposes best.
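If you wonder what such a playlist looks like inside, the simplest of the formats listed above, M3U, is just a text file that names the audio files in playing order. The sketch below builds one in Python. It is an illustration under simple assumptions (the files play in alphabetical order of their names) and not the Audio Tagger's own implementation; the function name `build_m3u` is made up.

```python
from typing import Iterable


def build_m3u(file_names: Iterable[str]) -> str:
    """Return the text of a simple M3U playlist for the given files.

    Assumption: alphabetical file-name order is the playing order.
    Files that are not *.mp3 are ignored.
    """
    tracks = sorted(n for n in file_names if n.lower().endswith(".mp3"))
    lines = ["#EXTM3U"]  # header line of the extended M3U variant
    lines.extend(tracks)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    book_files = ["02_chapter.mp3", "01_title.mp3", "ncc.html"]
    print(build_m3u(book_files), end="")
```

An M3U8 playlist has the same layout; the difference is that the file is stored in utf-8, which matters when file names contain non-Latin characters.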

The DAISY Pipeline Project Home is at www.daisy.org/pipeline. Visit this page for software downloads, documentation and the development site. You may also want to visit the Pipeline Forum or view the FAQs.


This page was last edited by DAISY1 on Tuesday, June 26, 2012 12:39
Text is available under the terms of the DAISY Consortium Intellectual Property Policy, Licensing, and Working Group Process.