CSS encoding
| Project: | EPUB Maintenance |
| Component: | Open Publication Structure (OPS) |
| Category: | bug report |
| Priority: | normal |
| Assigned: | PSorotokin |
| Status: | completed @ 2.0.1 |
Like XML, CSS allows multiple encodings, and the spec explicitly says, "This specification does not mandate which character encodings a user agent must support." I think it is an oversight that we did not explicitly mandate UTF-8 and UTF-16 support and did not limit the possible encodings to these two.
Since this is not likely to be a controversial issue and we want to finish this iteration quickly, I am taking the liberty of creating this issue, assigning it to myself, and proposing its resolution in one shot.
I propose we resolve it by adding a section to the OPS spec:
1.4.1.5 CSS Requirements
External CSS stylesheets referenced by OPS documents must use UTF-8 or UTF-16 encoding.
and adding item "v" to section 1.4.2 (BOM language is needed to clarify ASCII compatibility):
v. correctly process CSS stylesheets in both UTF-8 and UTF-16 encoding with or without byte order mark (BOM).
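To make item "v" concrete, here is a rough Python sketch of how a reading system might decode a stylesheet in either encoding, with or without a BOM. The function name `decode_css` is hypothetical, and the BOM-less UTF-16 branch is only a heuristic assumption (it guesses that the first character is ASCII, so one of the first two bytes is NUL); nothing in the proposed text prescribes this.

```python
import codecs


def decode_css(data: bytes) -> str:
    """Decode stylesheet bytes as UTF-8 or UTF-16, with or without a BOM.

    Hypothetical helper, not taken from any spec or reading system.
    """
    # UTF-16 with BOM: the "utf-16" codec reads the BOM to pick the byte
    # order and strips it from the result.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16")
    # UTF-8 with BOM: "utf-8-sig" strips the leading EF BB BF.
    if data.startswith(codecs.BOM_UTF8):
        return data.decode("utf-8-sig")
    # No BOM. Assumption: the stylesheet starts with an ASCII character,
    # so in UTF-16 exactly one of the first two bytes is NUL.
    if len(data) >= 2:
        if data[0] == 0 and data[1] != 0:
            return data.decode("utf-16-be")
        if data[0] != 0 and data[1] == 0:
            return data.decode("utf-16-le")
    # Otherwise treat it as UTF-8, which also covers plain US-ASCII.
    return data.decode("utf-8")
```

The BOM cases are unambiguous; the BOM-less UTF-16 case is the only one that needs guessing, and it fails for a stylesheet whose first character is not ASCII.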

Comments
#1
I strongly agree.
#2
#3
Are encodings other than UTF-8 or UTF-16 disallowed?
#4
I agree also. If the other documents in epub have to be utf-8 or -16, then they all should be the same for the sake of our sanity. I am writing an epub creation program (for fun and personal use), and my mind was blown that my Sony Reader wouldn't use my stylesheet since it was encoded in utf-16.
#5
I agree that UTF-8 and UTF-16 should be allowed and that no other encodings should be allowed.
However, I think that we have to further consider the BOM and the @charset rule.
First, UTF-16 has three charset names: "UTF-16LE", "UTF-16BE", and "UTF-16", where "UTF-16BE"
and "UTF-16LE" disallow the BOM, but "UTF-16" allows it. I believe that we can mandate the use
of the BOM for UTF-16 for CSS stylesheets in EPUB, but some Unicode expert might disagree.
Second, UTF-8 allows the BOM (EF BB BF). I believe that we cannot mandate this BOM for UTF-8
CSS stylesheets, since plain US-ASCII stylesheets would then be rejected. When it is not present, we have to rely on other
mechanisms.
Third, the link element of XHTML has the charset attribute. I think that the use of this attribute
should be discouraged in EPUB. (Again, some other expert might disagree.)
Here is my proposal: a CSS stylesheet in EPUB shall begin with
1) the UTF-16 BOM (FF FE or FE FF),
2) the UTF-8 BOM (EF BB BF), or
3) the @charset rule for UTF-8, and
the BOM or @charset is authoritative.
Note: We might want to contact W3C.
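As an illustration of the proposal above, a checker could look at the first bytes of a stylesheet roughly as follows. This is only a sketch in Python: the name `declared_css_encoding` and the exact `@charset` pattern are assumptions, and it does not decode or otherwise validate the rest of the file.

```python
import codecs
import re

# An @charset rule at the very start of an ASCII-compatible stylesheet.
_CHARSET_RULE = re.compile(rb'@charset "([^"]*)";')


def declared_css_encoding(data: bytes):
    """Return the encoding the stylesheet declares at its start, or None.

    Hypothetical helper that follows the three options listed above.
    """
    # 1) UTF-16 BOM, little-endian (FF FE) or big-endian (FE FF).
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # 2) UTF-8 BOM (EF BB BF).
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # 3) An @charset rule naming UTF-8.
    match = _CHARSET_RULE.match(data)
    if match and match.group(1).lower() == b"utf-8":
        return "utf-8"
    # Nothing authoritative at the start: not acceptable under the proposal.
    return None
```

A stylesheet for which this returns None (for example, plain `body{}` with no BOM and no @charset) would be rejected under this proposal.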
#6
I think allowing only "UTF-16" (and not "UTF-16LE", "UTF-16BE") and requiring BOM is the right approach.
I also agree that BOM for UTF-8 should be allowed, but cannot be mandated.
This can be accommodated by this slight language change:
"External CSS stylesheets referenced by OPS documents must use UTF-8 or UTF-16 encoding. Byte order mark is mandatory for UTF-16 encoding and optional for UTF-8 encoding."
"v. correctly process CSS stylesheets in UTF-8 encoding with or without byte order mark (BOM) and in UTF-16 encoding with byte order mark".
#7
In 4.4 CSS style sheet representation of CSS2,
<link charset=""> or other metadata from the linking mechanism (if any)
#8