NCX text encoding
| Project: | EPUB Maintenance |
| Component: | Open Packaging Format (OPF) |
| Category: | bug report |
| Priority: | normal |
| Assigned: | PSorotokin |
| Status: | completed @ 2.0.1 |
While following the TeleRead comment thread on Michael Volz's new Firefox plug-in for rendering ePub, I noticed two comments about improper rendering of non-Basic Latin characters (such as em-dashes and accented characters) in the NCX.
Upon studying the OPF and DTBook specs regarding the NCX, I realize that, unlike Content Documents and the Package, we apparently do not require the NCX to be UTF-8/16 encoded. That is, the NCX may use any encoding (so long as non-UTF-8/16 encodings are properly declared in the XML prolog).
Is this something we will want to firm up in the OPF spec? (Of course, if I missed anything in the OPF and DTBook specs, and we do already require UTF-8/16 encoding for the NCX, we should still state it more prominently.)
Amend OPF specification section 1.4.1.2 item (vii) to read "an NCX must be included and either UTF-8 or UTF-16-encoded; and"
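To illustrate what the amended requirement would mean for a checker, here is a minimal Python sketch, not taken from any spec or existing tool. The function names `ncx_declared_encoding` and `ncx_encoding_allowed` are hypothetical. It assumes the encoding can be determined from a byte order mark or from the `encoding` pseudo-attribute in the XML declaration, and it does not handle UTF-32 or UTF-16 without a byte order mark.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the start
# of an ASCII-compatible document, e.g. <?xml version="1.0" encoding="..."?>
_DECL_RE = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def ncx_declared_encoding(data: bytes) -> str:
    """Return the lowercased encoding name implied by the BOM or XML declaration.

    Hypothetical helper: a simplified version of the detection described in
    the XML specification, covering only the cases discussed in this issue.
    """
    # A UTF-16 byte order mark (either byte order) identifies UTF-16.
    if data.startswith(b"\xff\xfe") or data.startswith(b"\xfe\xff"):
        return "utf-16"
    # A UTF-8 byte order mark is optional; skip it before looking further.
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]
    match = _DECL_RE.match(data)
    if match is None:
        # No BOM and no declared encoding: XML defaults to UTF-8.
        return "utf-8"
    return match.group(1).decode("ascii").lower()


def ncx_encoding_allowed(data: bytes) -> bool:
    """True if the NCX would satisfy the proposed UTF-8/UTF-16 requirement."""
    return ncx_declared_encoding(data) in {"utf-8", "utf-16"}
```

For example, an NCX that begins with `<?xml version="1.0" encoding="ISO-8859-1"?>` is well-formed XML today but would be rejected under the proposed wording, while one with no encoding declaration would be accepted as UTF-8.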

Comments
#1
Allowing only UTF-8/16 encoding in XML stems from an interoperability requirement: this way all Reading Systems need only be able to process the mandatory XML encodings and do not need to carry encoding tables for all other languages (which would be a considerable burden). Allowing some non-optional XML-formatted content to escape that restriction would defeat the whole point. My view is that not requiring the NCX to be UTF-8/16 encoded is just an omission, and this requirement, while not in the letter of the spec, is in its spirit.
#2
Peter, I agree that in spirit we intended that the NCX must be UTF-8/16 encoded.
I propose that we explicitly state it in the updated specs.
#3
#4
Proposal:
Amend OPF specification section 1.4.1.2 item (vii) to read "an NCX must be included and either UTF-8 or UTF-16-encoded; and"
I do not see a good place to put a blanket XML encoding requirement; we should probably add it once all the specs are unified. At this point we can only express it as a WG opinion on the planned annotated spec wiki.
#5
Review time is up.
#6