CSS encoding

Project: EPUB Maintenance
Component: Open Publication Structure (OPS)
Category: bug report
Priority: normal
Assigned: PSorotokin
Status: completed @ 2.0.1

Like XML, CSS allows multiple encodings, and the spec explicitly says, "This specification does not mandate which character encodings a user agent must support." I think it is an oversight that we neither explicitly mandated UTF-8 and UTF-16 support nor limited the possible encodings to these two.

Since this is not likely to be a controversial issue and we want to finish this iteration quickly, I am taking the liberty of creating this issue, assigning it to myself, and proposing its resolution in one shot.

Description
Issue Id: 40
Resolution:

I propose we resolve it by adding a section to the OPS spec:

1.4.1.5 CSS Requirements

External CSS stylesheets referenced by OPS documents must use UTF-8 or UTF-16 encoding.

and adding item "v" to section 1.4.2 (BOM language is needed to clarify ASCII compatibility):

v. correctly process CSS stylesheets in both UTF-8 and UTF-16 encoding with or without byte order mark (BOM).

Comments

#1

I strongly agree.

#2

Status: proposed resolution » errata

#3

Are encodings other than UTF-8 or UTF-16 disallowed?

#4

I agree also. If the other documents in epub have to be utf-8 or -16, then they all should be the same for the sake of our sanity. I am writing an epub creation program (for fun and personal use), and my mind was blown that my Sony Reader wouldn't use my stylesheet since it was encoded in utf-16.

#5

I agree that UTF-8 and UTF-16 should be allowed and that no other encodings should be allowed.

However, I think that we have to further consider the BOM and the @charset rule.

 

First, UTF-16 has three charset names: "UTF-16LE", "UTF-16BE", and "UTF-16". "UTF-16BE" and "UTF-16LE" disallow the BOM, but "UTF-16" allows it. I believe that we can mandate the use of the BOM for UTF-16 CSS stylesheets in EPUB, but some Unicode expert might disagree.

Second, UTF-8 allows the BOM (EF BB BF). I believe that we cannot mandate this BOM for UTF-8 CSS stylesheets, since US-ASCII stylesheets would then be rejected. When the BOM is not present, we have to rely on other mechanisms.

Third, the link element of XHTML has the charset attribute. I think that the use of this attribute should be discouraged in EPUB. (Again, some other expert might disagree.)

 

Here is my proposal: a CSS stylesheet in EPUB shall begin with

1) the UTF-16 BOM (FF FE or FE FF),

2) the UTF-8 BOM (EF BB BF), or

3) the @charset rule for UTF-8, and

the BOM or @charset is authoritative.
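
The three leading sequences in this proposal could be tested roughly as follows. This is only a sketch: the function name `leading_encoding_marker` is hypothetical, the BOM constants come from Python's standard `codecs` module, and matching the @charset encoding name case-insensitively is an assumption rather than something the proposal states.

```python
import codecs
import re

# Hypothetical helper for illustration; not part of any EPUB tool or spec.
# Matches an @charset rule at the very start of the byte stream.
CHARSET_RULE = re.compile(rb'@charset "([^"]*)";')


def leading_encoding_marker(data: bytes):
    """Return "utf-16" or "utf-8" if data begins with one of the three
    markers listed in the proposal, otherwise None."""
    # 1) UTF-16 BOM: FF FE (little-endian) or FE FF (big-endian)
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # 2) UTF-8 BOM: EF BB BF
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # 3) @charset rule naming UTF-8 (case-insensitive match is an assumption)
    match = CHARSET_RULE.match(data)
    if match and match.group(1).lower() == b"utf-8":
        return "utf-8"
    return None
```

Under this reading, a stylesheet for which the function returns None (for example, plain ASCII with no BOM and no @charset rule) would not satisfy the proposal as worded.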

 

Note: We might want to contact W3C.

#6

I think allowing only "UTF-16" (and not "UTF-16LE", "UTF-16BE") and requiring BOM is the right approach.

I also agree that BOM for UTF-8 should be allowed, but cannot be mandated.

This can be accommodated by this slight language change:

"External CSS stylesheets referenced by OPS documents must use UTF-8 or UTF-16 encoding. A byte order mark is mandatory for UTF-16 encoding and optional for UTF-8 encoding."

"v. correctly process CSS stylesheets in UTF-8 encoding with or without byte order mark (BOM) and in UTF-16 encoding with byte order mark".

#7

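Section 4.4 (CSS style sheet representation) of CSS2 gives the following priorities for determining a style sheet's character encoding, from highest to lowest: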

  1. An HTTP "charset" parameter in a "Content-Type" field (or similar parameters in other protocols)
  2. BOM and/or @charset (see below)
  3. <link charset=""> or other metadata from the linking mechanism (if any)
  4. charset of referring style sheet or document (if any)
  5. Assume UTF-8
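
As a rough illustration of that priority order, the lookup can be written as a first-match scan over the available sources. The function and parameter names below are hypothetical, and each argument is assumed to be an encoding name already extracted from its source (or None when that source is absent). For a stylesheet read from inside an EPUB package there would presumably be no HTTP header, so the first argument would normally be None.

```python
from typing import Optional


def resolve_css_encoding(
    http_charset: Optional[str] = None,         # 1. HTTP Content-Type charset
    bom_or_charset_rule: Optional[str] = None,  # 2. BOM and/or @charset
    link_charset: Optional[str] = None,         # 3. <link charset=""> or similar
    referrer_charset: Optional[str] = None,     # 4. referring sheet or document
) -> str:
    """Return the first encoding name available, in priority order."""
    for candidate in (http_charset, bom_or_charset_rule,
                      link_charset, referrer_charset):
        if candidate:
            return candidate.lower()
    # 5. Nothing declared anywhere: assume UTF-8
    return "utf-8"
```

With no HTTP layer, the BOM or @charset rule is the highest-priority source that can actually be present, which fits the suggestion in comment #5 that it be treated as authoritative.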

#8

Status: errata » completed @ 2.0.1