Re: PUA convention ID tags

From: Ruszlan Gaszanov
Date: Tue Jan 06 2009 - 03:12:34 CST


    On Monday, January 05, 2009 6:24 AM, Peter Constable wrote:

    > Sounds like you want to re-invent ISO 2022. No thanks.

    No, I don't. There is a fundamental difference between ISO 2022, which was a permanent stateful encoding solution, and my proposal to use plane 14 tags to help identify the particular PUA convention used in a text stream.

    My reasoning is that, whether we like it or not, a number of PUA-based encoding solutions are widely used around the world, but there is no way for a system encountering PUA codepoints in a text stream to know how to interpret them without some sort of out-of-band information. On the other hand, whether some of us like it or not, we already have the tagging mechanism in Unicode, which could easily be used to solve this problem.

    Basically, an application or user could tag chunks of text containing PUA codepoints with a specific convention ID, and other users or applications, if they want to, might use this information to interpret the PUA codepoints they encounter in the text stream more meaningfully. Otherwise, those tags could simply be ignored, as they would have no effect on the interpretation of non-PUA characters.
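To make the idea concrete, here is a rough Python sketch of how such tagging might look. The exact syntax is not defined anywhere in this proposal, so everything below is an assumption made for illustration: the convention ID is spelled out with plane 14 tag characters (U+E0020..U+E007E, which mirror printable ASCII) in front of the chunk, and CANCEL TAG (U+E007F) closes it. The function names and the ID "example-conv" are hypothetical.

```python
# Hypothetical layout (not a defined Unicode mechanism):
#   <convention ID in tag characters> <chunk with PUA codepoints> <CANCEL TAG>

TAG_BASE = 0xE0000          # tag characters mirror ASCII: U+E0020..U+E007E
CANCEL_TAG = "\U000E007F"   # U+E007F CANCEL TAG


def to_tag_chars(ascii_id):
    """Spell a printable-ASCII identifier using plane 14 tag characters."""
    if not all(0x20 <= ord(c) <= 0x7E for c in ascii_id):
        raise ValueError("convention ID must be printable ASCII")
    return "".join(chr(TAG_BASE + ord(c)) for c in ascii_id)


def tag_chunk(text, convention_id):
    """Wrap a chunk of text in the assumed ID prefix and a closing CANCEL TAG."""
    return to_tag_chars(convention_id) + text + CANCEL_TAG


def strip_tags(text):
    """What an application that ignores the tags would see."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


def split_tagged(text):
    """Return a list of (convention_id or None, chunk) pairs."""
    parts = []
    current_id = None
    id_buf = []
    chunk = []
    for ch in text:
        cp = ord(ch)
        if 0xE0020 <= cp <= 0xE007E:
            # Tag character: close any open chunk, then collect the ID.
            if chunk:
                parts.append((current_id, "".join(chunk)))
                chunk = []
            id_buf.append(chr(cp - TAG_BASE))
        elif ch == CANCEL_TAG:
            # End of the tagged chunk: following text has no convention.
            if chunk:
                parts.append((current_id, "".join(chunk)))
                chunk = []
            current_id = None
            id_buf = []
        else:
            # Ordinary (or PUA) character: a pending ID now takes effect.
            if id_buf:
                current_id = "".join(id_buf)
                id_buf = []
            chunk.append(ch)
    if chunk:
        parts.append((current_id, "".join(chunk)))
    return parts


stream = "abc " + tag_chunk("\uE000\uE001", "example-conv") + " xyz"
for conv, chunk in split_tagged(stream):
    print(conv, [hex(ord(c)) for c in chunk])
```

An application that knows the convention can look up the ID returned by `split_tagged` and pick the matching interpretation for the PUA codepoints; one that does not can call something like `strip_tags` and carry on, since the non-PUA text is unchanged.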

    I am by no means suggesting this as any sort of permanent encoding solution - this mechanism should only be used as a helper for ad-hoc PUA solutions for characters not-yet-encoded or not-worth-encoding.

