Re: latin1 decoder implementation

From: Doug Ewell
Date: Sat, 17 Nov 2012 10:13:44 -0700

Martin J. Dürst wrote:

>> If he is targeting HTML5, then none of this matters, because HTML5
>> says that ISO 8859-1 is really Windows-1252.
> Yes. But unless Python wants to limit its use to HTML5, this should be
> handled on a separate level (mapping a "iso-8859-1" label to the
> Windows-1252 decoder logic), not by trying to change ISO-8859-1
> itself.

Normally I would agree. The HTML5 standard itself requires the mapping
of one encoding to another, but that could be handled at the appropriate
level, as you said. However, the redefinition of mapping algorithms and
tables is another matter.
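
To make the "separate level" idea concrete, here is a minimal Python
sketch of label mapping done outside the decoders. The table
HTML5_LABEL_OVERRIDES and the function decode_with_label are
hypothetical names invented for illustration, not part of Python's
codecs module, and the label list is deliberately incomplete. The
ISO-8859-1 decoder itself is left untouched; only the lookup changes
when HTML5 behavior is requested.

```python
# Hypothetical label table for an HTML5-style lookup layer.
# Assumption: only a few labels are shown; a real table would be longer.
HTML5_LABEL_OVERRIDES = {
    "iso-8859-1": "cp1252",
    "latin1": "cp1252",
    "us-ascii": "cp1252",
}


def decode_with_label(data: bytes, label: str, html5: bool = False) -> str:
    """Decode data using label, optionally applying HTML5 label overrides."""
    name = label.strip().lower()
    if html5:
        # Redirect the label; the underlying decoders are not modified.
        name = HTML5_LABEL_OVERRIDES.get(name, name)
    return data.decode(name)


# Byte 0x80 is a C1 control in ISO 8859-1 but the euro sign in Windows-1252.
legacy = decode_with_label(b"\x80", "ISO-8859-1")
html5 = decode_with_label(b"\x80", "ISO-8859-1", html5=True)
```

With this arrangement, callers who want strict ISO 8859-1 still get it,
and only the HTML5 path sees the Windows-1252 behavior.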

The "Encoding Living Specification" appears not to be normative, but if
it is to be followed, it essentially requires all conforming encoders
and decoders to be either rewritten or at least reviewed for
conformance. It specifies every step that the software must perform,
which may not match the steps performed by existing software, and it
specifies mapping tables which may not match the tables previously
published by vendors and standards development organizations (SDOs).
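
One place where the tables seem to differ, as far as I can tell: the
vendor-published Windows-1252 table leaves five bytes (0x81, 0x8D,
0x8F, 0x90, 0x9D) undefined, and Python's "cp1252" codec rejects them,
whereas my reading of the Encoding specification's windows-1252 index
is that it maps them to the C1 controls with the same values. The
sketch below illustrates that difference only. The function name
decode_windows_1252_whatwg is hypothetical, and this is not a
conforming implementation of the specification's decoder algorithm.

```python
# Bytes left undefined in the vendor-published Windows-1252 table.
UNDEFINED_IN_CP1252 = (0x81, 0x8D, 0x8F, 0x90, 0x9D)


def decode_windows_1252_whatwg(data: bytes) -> str:
    """Decode like cp1252, but pass the five undefined bytes through as C1.

    Assumption: this mirrors the Encoding specification's windows-1252
    index for these five bytes; everything else defers to Python's cp1252.
    """
    out = []
    for b in data:
        if b in UNDEFINED_IN_CP1252:
            out.append(chr(b))  # e.g. 0x81 becomes U+0081
        else:
            out.append(bytes([b]).decode("cp1252"))
    return "".join(out)


def legacy_rejects(data: bytes) -> bool:
    """Return True if Python's strict cp1252 codec refuses the input."""
    try:
        data.decode("cp1252")
    except UnicodeDecodeError:
        return True
    return False
```

The same input is therefore an error under one table and valid text
under the other, which is why existing decoders would need review
rather than just relabeling.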

If this document is taken up as a standard part of HTML5, it is not hard
to imagine that languages like Python will need to implement two flavors
of encoders and decoders: "legacy" and "HTML5-compliant."

Doug Ewell | Thornton, Colorado, USA | @DougEwell
Received on Sat Nov 17 2012 - 11:17:12 CST
