Re: latin1 decoder implementation

From: Peter Krefting <peter_at_opera.com>
Date: Mon, 19 Nov 2012 09:23:14 +0100

Doug Ewell <doug_at_ewellic.org>:

> If he is targeting HTML5, then none of this matters, because HTML5 says
> that ISO 8859-1 is really Windows-1252.
>
> For example, there is no C1 control called NL in Windows-1252. There is
> only 0x85, which maps to U+2026 HORIZONTAL ELLIPSIS.

Windows-1252 does, however, contain a number of undefined code points of
its own, and in the conversion tables provided by Microsoft they really
are undefined. The new Encoding Standard for W3C defines a 1-to-1
mapping for those to the control characters (Windows-1252:0x81 to U+0081,
for instance) [1]. So doing the same for a "true" ISO 8859-1 map is
definitely not unreasonable.
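
To make the mapping concrete, here is a minimal Python sketch, not a
definitive implementation. It assumes that Python's built-in "cp1252"
codec rejects the five bytes left undefined in Microsoft's table (0x81,
0x8D, 0x8F, 0x90, 0x9D) and that its "latin-1" codec maps every byte to
the code point of the same value. The function names are my own,
chosen for illustration.

```python
# Bytes that Microsoft's Windows-1252 conversion table leaves undefined.
UNDEFINED_1252 = frozenset({0x81, 0x8D, 0x8F, 0x90, 0x9D})


def decode_windows_1252(data: bytes) -> str:
    """Decode Windows-1252, mapping undefined bytes 1-to-1 to C1 controls.

    Follows the approach described for the Encoding Standard:
    0x81 becomes U+0081, 0x8D becomes U+008D, and so on.
    """
    out = []
    for byte in data:
        if byte in UNDEFINED_1252:
            # Undefined in the Microsoft table: use the identity mapping.
            out.append(chr(byte))
        else:
            # Defined everywhere else, e.g. 0x85 -> U+2026.
            out.append(bytes([byte]).decode("cp1252"))
    return "".join(out)


def decode_iso_8859_1(data: bytes) -> str:
    """Decode "true" ISO 8859-1: every byte maps to the same code point."""
    return data.decode("latin-1")
```

With this, decode_windows_1252(b"\x85") gives U+2026 and
decode_iso_8859_1(b"\x85") gives U+0085, while both functions agree on
0x81 and map it to U+0081.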

-- 
\\// Peter Krefting - Core Technology Developer, Opera Software ASA
  [1] http://encoding.spec.whatwg.org/#windows-1252
Received on Mon Nov 19 2012 - 02:25:28 CST