Re: UTF-8 reg tags...

From: Alain LaBonté (alb@sct.gouv.qc.ca)
Date: Mon Sep 16 1996 - 16:25:30 EDT


At 12:53 16/09/1996 -0700, unicode@Unicode.ORG wrote:
>
>The UTC discussed the need for a generic designator along the lines
>of your argument. There was general agreement that a generic designator
>would be good for this purpose; however, we didn't come to complete
>closure on this. What we did agree was that a need for the versioned
>designator exists due to the incompatible changes from UC 1.1 to UC 2.0.
>The problem with using a generic designator to refer to UC 2.0 is that
>one can't be sure it isn't referring to UC 1.0, for which implementations
>may very well exist that haven't been upgraded to either 1.1 or 2.0.
>
>As for the difference between 10646 & Unicode, there are particular
>assumptions one must make with Unicode to remain conformant that don't
>apply to 10646; e.g., the default usage of level 3, the use of the
>canonical Unicode equivalence algorithm, the Unicode BIDI algorithm,
>the Unicode script shaping rules (not defined by 10646), the Unicode
>character semantics, the fact that Unicode doesn't directly make use
>of collection identifiers or level designators, etc. Unicode has its
>own conformance clause which does not apply to 10646. It is thus
>important to maintain the distinction at the level of MIME designation.

IMHO it is important that, even under the UCS, level 3 characters not be
filtered out, even if full functionality is not offered for rendering,
ordering, or processing in general.

I agree with François that it does not matter what you do with the
characters that you receive, but their integrity during the transfer should
never be in question: they shall be faithfully transmitted.

I am of the opinion that the tagging should be the same for the UCS and UNICODE.

The problem of UCS-4 support by UNICODE is also an issue, but in the same
way that a UCS level 1 implementation should not touch level 3 characters,
I think UNICODE implementations should not filter transmission of UCS-4
characters beyond what is allowed for UTF-16... There should be a mechanism
to quote these extra theoretical characters that could occur, say, in 20
years from now. But
nevertheless transmission or MIME tagging should not care about these
implementation problems. I don't see the harm. Any process has to care about
how to process the data, and that can be under a UNICODE or a UCS,
non-UNICODE, implementation.
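
To make the point about faithful transmission concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not anything defined by the UCS or by Unicode: the names `relay` and `filtering_relay` are hypothetical, and the sample text simply combines a level 3 style combining sequence with one character outside the range reachable without UTF-16 surrogates.

```python
import unicodedata


def relay(payload: bytes) -> bytes:
    """Hypothetical transport step: forwards the bytes untouched."""
    return bytes(payload)


def filtering_relay(payload: bytes) -> bytes:
    """Counter-example: drops what a limited implementation cannot handle.

    It removes combining marks and every character above U+FFFF, which is
    the kind of filtering argued against above.
    """
    text = payload.decode("utf-8")
    kept = [
        ch for ch in text
        if ord(ch) <= 0xFFFF and unicodedata.combining(ch) == 0
    ]
    return "".join(kept).encode("utf-8")


# "e" followed by COMBINING ACUTE ACCENT, then one character beyond U+FFFF.
message = "Que\u0301bec \U00010400"
wire = message.encode("utf-8")

faithful = relay(wire).decode("utf-8")
filtered = filtering_relay(wire).decode("utf-8")

print(faithful == message)  # the receiver gets exactly what was sent
print(filtered == message)  # the filtered copy has lost characters
```

The receiving process is free to render or ignore what it cannot handle; the sketch only shows that this decision belongs to the receiver and not to the transfer step.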

Alain LaBonté
Québec


