Re: UTF-8, ISO C Am.1, and POSIX

From: Keld Jørn Simonsen (keld@dkuug.dk)
Date: Wed Aug 13 1997 - 06:57:31 EDT

Next message: odonnell@zk3.dec.com: "Re: UTF-8, ISO C Am.1, and POSIX"
Previous message: Martin J. Duerst: "Re: SGML DESCSET for XML, HTML (was: XML and ISO 10646 ...)"
Maybe in reply to: Markus G. Kuhn: "UTF-8, ISO C Am.1, and POSIX"
Next in thread: odonnell@zk3.dec.com: "Re: UTF-8, ISO C Am.1, and POSIX"

Asmus Freytag writes:

(I presume it was Sandra Martin O'Donnell who wrote the first cited words.)
> >Yes, yes, I know UTF-8 and Unicode/UCS are universal
> >encodings, but from POSIX's point of view, that's irrelevant.
> >They're just encodings.
>
> That's just what's wrong with POSIX from the perspective of an implementer
> of the Unicode Standard. Unicode has well defined character semantics that
> are considered a property of the character itself and therefore not locale
> dependent. A shorthand notation to kick the standard library into supporting
> these is indeed called for. In an indirect way, it's analogous to the 'C'
> locale, with its minimal guarantees. A "Unicode" locale (or more correctly,
> the character type subset of a locale) seems a reasonable extension.

This is being worked on in the forthcoming ISO 14652 standard.
>
> BTW, there is nothing that prevents anybody from supporting the character
> semantics discovered and catalogued by Unicode for other character sets (for
> the corresponding characters). There has been more than one implementation
> of Unicode's bidi-algorithm on top of 8-bit character sets, to give just one
> example.

That is also how 14652 does it: it is defined on the repertoire
of 10646 but also applies to subrepertoires thereof.

Keld

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:36 EDT