RE: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)

From: Philippe Verdy (
Date: Thu Nov 01 2007 - 09:16:34 CST

    Ed Trager wrote:
    > In the particular case of the glyph for Ben's name, is it not already
    > within the realm of possibility to construct an OpenType font that,
    > upon seeing the sequence "⿵門龍" would substitute a single "ligature"
    > glyph via the GSUB table or some other similar mechanism?

    Did you investigate the possibility of new security risks when using a system that transparently builds a custom ideograph that may be confusable with an existing encoded ideographic character? For Unicode, the IDS and the encoded ideographic character ARE distinct (even under compatibility equivalence, or under UCA).
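To illustrate the point about equivalences, here is a minimal Python sketch. It assumes the IDS from the quoted message (U+2FF5 followed by U+9580 and U+9F8D) and uses only the standard `unicodedata` module; the helper name `normalized_forms` is made up for this example. It checks that none of the four normalization forms folds the sequence into anything else, so a process comparing normalized strings would never treat the IDS as equal to a single encoded ideograph.

```python
import unicodedata

# IDS from the quoted message: U+2FF5 (surround from above), U+9580, U+9F8D.
IDS = "\u2FF5\u9580\u9F8D"


def normalized_forms(text):
    """Return the four Unicode normalization forms of text (illustrative helper)."""
    return {
        form: unicodedata.normalize(form, text)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }


forms = normalized_forms(IDS)
for form, value in forms.items():
    # Print the code points so any change made by normalization would be visible.
    print(form, [f"U+{ord(ch):04X}" for ch in value])
```

The sketch covers normalization only. UCA collation is not part of the Python standard library and is not exercised here.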

    So either you implement such a system and make sure that NO text will be interchanged using any IDS sequence that is confusable with an existing encoded character, or you can't use the system reliably. Such a transparent transform should only be performed where there's no security risk in automated processing.

    But there's no long-term guarantee that a character currently represented by an IDS will never be encoded. If it gets encoded, then the system described above will not detect the difference and won't be able to revert to the linear (not composed) presentation of the IDS.

    For example, it should not be tolerated when displaying a URL in the address bar of a browser that supports IDN.

    For almost all security-critical applications, rendering an IDS in a precomposed form should be disabled. Maybe this should be added to UTR #36 (Unicode Security Considerations), even though it is a tolerated rendering described in TUS.
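One way a security-critical application might apply this is to detect ideographic description characters before display and refuse any composed rendering. The Python sketch below is only an illustration under stated assumptions: the names `contains_idc` and `safe_for_composed_rendering` are hypothetical, and the check simply tests membership in the Ideographic Description Characters block (U+2FF0 to U+2FFF). It does not implement anything specified in UTR #36 or in any browser.

```python
# Ideographic Description Characters block (assumption: testing the whole
# block is acceptable, including code points unassigned in older versions).
IDC_FIRST, IDC_LAST = 0x2FF0, 0x2FFF


def contains_idc(text):
    """Return True if text contains any ideographic description character."""
    return any(IDC_FIRST <= ord(ch) <= IDC_LAST for ch in text)


def safe_for_composed_rendering(label):
    """Hypothetical policy: allow composed rendering only when no IDC is present."""
    return not contains_idc(label)


if __name__ == "__main__":
    for sample in ("\u2FF5\u9580\u9F8D", "\u9580", "example"):
        codepoints = " ".join(f"U+{ord(ch):04X}" for ch in sample)
        print(codepoints, "->", safe_for_composed_rendering(sample))
```

A caller such as an address bar could fall back to the linear presentation of the sequence whenever the check fails, which keeps the IDS visibly distinct from any encoded ideograph.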

    This archive was generated by hypermail 2.1.5 : Thu Nov 01 2007 - 10:05:02 CST