Re: latin1 decoder implementation from Doug Ewell on 2012-11-16 (Unicode Mail List Archive)

From: Doug Ewell <doug_at_ewellic.org>
Date: Fri, 16 Nov 2012 15:56:54 -0700

Buck Golemon wrote:

> Latin1 explicitly gives no semantics to several byte values (for
> example 0x81), but acknowledges that other standards will define their
> semantics.
> Unicode provides code-points with equally-undefined semantics so that
> these bytes can pass through without change.
> This allows a byte-level system using control codes in those ranges to
> interact with a Unicode-aware system, without loss of information.
>
> Does that summarize well?

That should be good enough. It would be a poor process indeed that would
not convert the control characters 1-to-1, so that CR and LF would
become replacement characters or nulls or something.
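
For what it's worth, here is a minimal Python sketch of the 1-to-1
conversion being discussed. The helper names decode_latin1 and
encode_latin1 are hypothetical, made up for illustration; the only
assumption is that each byte 0xNN maps to the code point U+00NN, which
is also what Python's built-in "latin-1" codec does.

```python
import unicodedata


def decode_latin1(data: bytes) -> str:
    """Map each byte 0xNN to code point U+00NN (hypothetical helper)."""
    return "".join(chr(b) for b in data)


def encode_latin1(text: str) -> bytes:
    """Inverse mapping; raises ValueError for code points above U+00FF."""
    return bytes(ord(c) for c in text)


# All 256 byte values, including the C0 and C1 control ranges.
all_bytes = bytes(range(256))
decoded = decode_latin1(all_bytes)

# The sketch agrees with the built-in codec and round-trips losslessly.
assert decoded == all_bytes.decode("latin-1")
assert encode_latin1(decoded) == all_bytes

# 0x81 passes through as U+0081, a control character (category Cc).
assert decoded[0x81] == "\u0081"
assert unicodedata.category(decoded[0x81]) == "Cc"

# CR and LF survive unchanged rather than becoming U+FFFD or NUL.
assert decode_latin1(b"\r\n") == "\r\n"
```

Nothing is interpreted along the way: the decoder only widens each
byte, so a byte-level system that assigns its own meaning to 0x81 gets
the same byte back after encoding.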

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell

Received on Fri Nov 16 2012 - 16:57:41 CST
