Save as DAISY: Troubleshooting common translation errors

Original Author(s): Prashant Ranjan Verma

This guide is for those who use the Save as DAISY add-in to produce DAISY and EPUB 3 files. Users may encounter validation or translation errors when converting a document to DAISY XML, or to full DAISY using the Narrator –DTBook to DAISY script.

The following tips have been compiled from the feedback received from regular users. If you are using the Save as DAISY add-in or plan to integrate it into your book production workflow, then it is highly recommended that you read this guide. Also make sure you read the Text mark-up guidelines for Save as DAISY.

Beginners are advised to first check out the "Save as DAISY Introduction" and Tutorials and videos on DAISYpedia before using this guide.

1. Remove spaces in file and folder names

It is highly recommended that you do not use spaces while typing file and folder names. Such documents may translate successfully with Save as DAISY but you can run into problems when you use other tools later to convert the output to different formats. You may use underscores or camel case to construct file and folder names. If you are inserting images into your Word document, then you should apply this rule to the file names of those image files too.

If you encounter problems such as missing images or images in the wrong order in the output created by the Save as DAISY add-in, carefully check all file and folder names. Renaming them can resolve such issues.
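The check described above can be automated before you start a conversion. The following is a minimal sketch, not part of the add-in: the function names names_with_spaces and suggest_name are made up for this illustration, and the underscore replacement is only one of the naming options mentioned above.

```python
from pathlib import Path


def names_with_spaces(root):
    """Return every file and folder under root whose own name contains a space."""
    root = Path(root)
    return sorted(p for p in root.rglob("*") if " " in p.name)


def suggest_name(name):
    """Suggest a replacement that joins the words of a name with underscores."""
    return "_".join(name.split())
```

For example, calling names_with_spaces on your book folder lists the documents, image files and sub-folders that should be renamed, and suggest_name("my book") proposes "my_book". The sketch only reports names; it does not rename anything, and images already inserted in a Word document may need to be re-inserted after renaming.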

2. Clear all formatting

It is safe to remove all formatting of text before applying styles. This is particularly important when text has been acquired by scanning and OCR or copied from different sources. The inconsistent formatting in such text often throws up translation errors which are difficult to locate and correct.

To remove all formatting, select the whole document and in the styles list, select "Clear all".

3. Take care of text orientation

Many users who have converted Asian language text with the Save as DAISY add-in have reported translation errors with messages like "The bdo tag does not match…..". Sometimes text is inadvertently marked as a bidirectional object, which often stops the conversion to XML or a DAISY book.

To avoid this error, apply "LTR run" or "RTL run" to the text as required. If the whole document is supposed to have left-to-right orientation, it is a good idea to select the whole document and apply "LTR run" just before starting the DAISY conversion process. Note that you can add these commands to the Word Quick Access Toolbar. The Quick Access Toolbar customization is described here.
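If you want to see whether a document carries direction mark-up before converting it, you can inspect the .docx file directly, since a .docx file is a ZIP archive whose main text is stored in word/document.xml. The sketch below counts the WordprocessingML elements w:bdo, w:dir, w:rtl and w:bidi. It is an assumption that these are the elements behind the error above; the add-in documentation does not say so, and the function names are invented for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# WordprocessingML elements that carry text direction information.
# Assumption: these are the ones worth reviewing for the "bdo" error.
DIRECTION_TAGS = ("bdo", "dir", "rtl", "bidi")


def count_direction_markup(document_xml):
    """Count direction-related elements in the XML of word/document.xml.

    Elements switched off with w:val="0" are counted too, so treat the
    result as a hint about where to look, not as proof of a problem.
    """
    root = ET.fromstring(document_xml)
    return {
        tag: sum(1 for _ in root.iter(f"{{{W}}}{tag}"))
        for tag in DIRECTION_TAGS
    }


def count_direction_markup_in_docx(path):
    """Open a .docx file and inspect its main document part."""
    with zipfile.ZipFile(path) as docx:
        return count_direction_markup(docx.read("word/document.xml"))
```

Non-zero counts in a document that should be entirely left to right suggest that applying "LTR run" to the whole document is worth trying.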

4. Assign text language

Use the Language button in the Save as DAISY Accessibility ribbon to make sure the text gets the appropriate language code upon translation. This step is particularly needed for languages other than English. Note that with this process, language is detected only at the paragraph level.

5. Check page number formatting

When custom page numbering is used, the "page number DAISY" style needs to be applied to the page numbers written in the document.

Often, people mistakenly apply this style to blank lines, spaces, headings, images etc. This creates errors in translation, with typical error messages like "the page number tag cannot contain …..".

It is a good idea to inspect all page numbers before translation to avoid such errors. In the Find dialog in Word, click the More button and then the Format button. Choose Styles, select Page number DAISY, and then click Find next to step through every instance where this style has been applied. Keep clicking Find next, or use the keyboard shortcut CTRL + PGDN, and check whether anything suspicious carries the page number style. If you find any, clear the formatting on it.

As per the DAISY specification, a page number is not allowed within a table. If you find the page number DAISY style inside a table, you will need to move the page number to the top or bottom of the table or, if acceptable, split the table at that point.

When you are searching for the page number DAISY style, look out for any heading that has the page number style and vice versa. This is a common mistake in mark-up.
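The manual inspection above can be supplemented by a small script that reads word/document.xml from the .docx file (a ZIP archive) and flags paragraphs that use the page number style but have no visible text, or that sit inside a table. This is a sketch under assumptions: the style ID "PageNumberDAISY" is a placeholder, so look up the real ID in word/styles.xml of your own document, and the function name is invented here.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def _uses_style(paragraph, style_id):
    """True if the paragraph, or any run in it, refers to style_id."""
    for tag in ("pStyle", "rStyle"):
        for el in paragraph.iter(f"{{{W}}}{tag}"):
            if el.get(f"{{{W}}}val") == style_id:
                return True
    return False


def check_page_number_style(document_xml, style_id="PageNumberDAISY"):
    """Return (paragraph number, problem, text) for suspicious page numbers.

    style_id is a hypothetical default; pass the ID used in your document.
    The text check works on whole paragraphs, so a styled blank run inside
    a paragraph that also has other text is not detected.
    """
    root = ET.fromstring(document_xml)
    paragraph_tag = f"{{{W}}}p"
    # Collect every paragraph that lives inside a table.
    in_table = {
        p for table in root.iter(f"{{{W}}}tbl") for p in table.iter(paragraph_tag)
    }
    problems = []
    for number, p in enumerate(root.iter(paragraph_tag), start=1):
        if not _uses_style(p, style_id):
            continue
        text = "".join(t.text or "" for t in p.iter(f"{{{W}}}t")).strip()
        if p in in_table:
            problems.append((number, "inside a table", text))
        elif not text:
            problems.append((number, "no visible text", text))
    return problems
```

The paragraph numbers count paragraphs in document order, including those inside tables, so use them only as a rough guide to where the problem is in Word.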

6. Do not use too many or unknown styles

In the styles list you will find dozens of styles. Unless you know the correct use of a style, it is better not to apply it. For most documents, you can limit your mark-up to headings and the page number style.

Some of the styles that are often applied incorrectly are the caption and poem styles. Most playback software at this time supports only the basic styles, so there may not be any value in applying many styles.

You may need to use the Bodymatter style sometimes. See the article "Treatment of Table of Contents, Front matter, Body matter & Rear Matter in Save As DAISY" for more information on its correct usage.

7. Remove line breaks

Manual line breaks often create a nuisance. You can easily find them and replace them with paragraph breaks if required. To find line breaks, type "^l" in the Find field. If you need paragraph breaks in their place, type "^p" in the Replace with field and click Replace all.
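To see how many manual line breaks a document contains before opening the Find dialog, you can count them in word/document.xml inside the .docx file (a ZIP archive). In WordprocessingML a manual line break is a w:br element with no w:type attribute, or with the type "textWrapping"; page and column breaks use other type values. The function name below is made up for this sketch.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_manual_line_breaks(document_xml):
    """Count w:br elements that are line breaks, ignoring page and column breaks."""
    root = ET.fromstring(document_xml)
    type_attr = f"{{{W}}}type"
    return sum(
        1
        for br in root.iter(f"{{{W}}}br")
        if br.get(type_attr) in (None, "textWrapping")
    )
```

A result of zero means the "^l" search in Word should find nothing either.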

8. Create a good structure using headings

It is very important to check all text to which a heading style has been applied. Keep the Navigation pane or the Outline view open at all times while applying styles in Microsoft Word. Before translating the document, have a close look at the list of entries in the Navigation pane.

A common mistake is applying a heading style to a blank line, which gives error messages like "Ensure there is no same level of Heading with empty text in between…."

At times, when sections and sub-sections are numbered (e.g. 1. Introduction, 1.1. Background), the numbers may be generated using the Word numbering feature. Translating a document containing such mark-up produces error messages like "call to function list failed…."

Make sure the numbers prefixed to headings are not auto-generated. This is not permitted when the text also has a heading style.

It is also important that large documents have several headings. When the document is converted to full DAISY, one audio file is created in the TTS voice for each heading. If the content under a heading is extremely large, e.g. 100 pages, the system may fail to create such a huge audio file and the conversion will stop.
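Two of the heading mistakes described above, blank headings and automatically numbered headings, can be looked for in word/document.xml inside the .docx file (a ZIP archive). The sketch below makes two assumptions: headings use the style IDs "Heading1" to "Heading6", which is what English versions of Word normally write, and automatic numbering is applied directly to the paragraph as a w:numPr element. Numbering that comes from the style definition itself is not detected. The function name is invented for this example.

```python
import re
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Assumption: English Word style IDs; localized versions may use other IDs.
HEADING_ID = re.compile(r"^Heading[1-6]$")


def check_headings(document_xml):
    """Return (style ID, problem, heading text) for suspicious headings."""
    root = ET.fromstring(document_xml)
    findings = []
    for p in root.iter(f"{{{W}}}p"):
        style = p.find(f"{{{W}}}pPr/{{{W}}}pStyle")
        if style is None:
            continue
        style_id = style.get(f"{{{W}}}val", "")
        if not HEADING_ID.match(style_id):
            continue
        text = "".join(t.text or "" for t in p.iter(f"{{{W}}}t")).strip()
        if not text:
            findings.append((style_id, "empty heading", text))
        # w:numPr on the paragraph means Word generates the number.
        if p.find(f"{{{W}}}pPr/{{{W}}}numPr") is not None:
            findings.append((style_id, "automatic numbering", text))
    return findings
```

An empty result does not guarantee a clean structure, so still review the Navigation pane before translating.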

9. Remove hyperlinks

It is recommended that you remove hyperlinks from the Word document. There is no need to delete the hyperlinked text; just select it and click Remove hyperlink in the context menu. If this is not done, error messages like "Call to function anchor failed…" may be encountered. Note that hyperlinks in footnotes also need to be removed.
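To confirm that no hyperlinks are left, including those in footnotes, you can count the w:hyperlink elements in the relevant parts of the .docx file, which is a ZIP archive. This is a rough sketch with an invented function name. It does not detect hyperlinks stored as HYPERLINK field codes, so a zero count is a good sign rather than a guarantee.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Parts of a .docx file that hold the body text and the notes.
PARTS = ("word/document.xml", "word/footnotes.xml", "word/endnotes.xml")


def count_hyperlinks(docx_file):
    """Return {part name: number of w:hyperlink elements} for the parts present."""
    counts = {}
    with zipfile.ZipFile(docx_file) as docx:
        present = set(docx.namelist())
        for part in PARTS:
            if part in present:
                root = ET.fromstring(docx.read(part))
                counts[part] = sum(1 for _ in root.iter(f"{{{W}}}hyperlink"))
    return counts
```

Pass the path of your document, for example count_hyperlinks("my_book.docx") where the file name is only an illustration, and remove hyperlinks in Word until every count is zero.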

10. Take care of images

If you get an error message like "Cannot access a closed stream….", carefully check all images. This issue has been reported when some image files inserted in the document were corrupted.

Decorative images should also be removed from the document since they do not add value to the resulting XML and DAISY versions of the document.

It is a good idea to compress the images to decrease the size of the output formats. Depending upon the Microsoft Word version, you will find the compress option in the SIZE dialog or in the FORMAT ribbon (Word 2010).

Finally, do remember to provide meaningful text descriptions to all images as ALT TEXT.
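Two of the image checks above can be partly automated by looking inside the .docx file, which is a ZIP archive: images are stored under word/media/, and the description of each drawing is kept in the descr attribute of its wp:docPr element in word/document.xml. The sketch below lists images without a description and media files of zero length. The function names are invented, older VML pictures are not covered, and a zero-length file is only one possible sign of a corrupted image.

```python
import zipfile
import xml.etree.ElementTree as ET

WP = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"


def images_missing_alt_text(document_xml):
    """Return the names of drawings whose description (alt text) is empty."""
    root = ET.fromstring(document_xml)
    return [
        doc_pr.get("name", "")
        for doc_pr in root.iter(f"{{{WP}}}docPr")
        if not (doc_pr.get("descr") or "").strip()
    ]


def empty_media_files(docx_file):
    """Return the embedded media files that contain no data at all."""
    with zipfile.ZipFile(docx_file) as docx:
        return [
            info.filename
            for info in docx.infolist()
            if info.filename.startswith("word/media/") and info.file_size == 0
        ]
```

The names returned by images_missing_alt_text are the internal names Word assigns, such as "Picture 2", which you can match against the Selection Pane in Word.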

11. Use appropriate TTS

If you are converting the Word document to full DAISY, check the Text to Speech settings in the Control Panel before starting the translation process. Make sure you choose a TTS voice that can speak the language of the document you are translating; for example, a document in French needs a French TTS voice. Then use the "Preview Voice" button to play the TTS and make sure that it is working properly.

Errors encountered during the Pipeline process are generally due to TTS issues.

This page was last edited by VLuceno on Wednesday, April 1, 2015 18:49
Text is available under the terms of the DAISY Consortium Intellectual Property Policy, Licensing, and Working Group Process.