Re: Level of Unicode support required for various languages

From: Kenneth Whistler
Date: Mon Oct 29 2007 - 19:39:50 CST


    > I'm sure that John has never suggested that IDS sequences should be a
    > substitute for encoding, merely that given what the Unicode Standard
    > currently says, it would be a feasible interim solution.
    > The question is just what exactly the intent of that paragraph in the
    > Unicode Standard was. It sure sounds to me as if it is suggesting (and
    > Unicode is sanctioning) a mechanism for component-based representation
    > of unencoded ideographs --

    For component-based *description* of unencoded ideographs.

    And given such a description, an implementation *may* do what it
    will with it.

    But while focussing on "that" paragraph, don't forget the 4th
    paragraph of the section:

    "In particular, support for the characters in the Ideographic
    Description block does *not* require the rendering engine to
    recreate the graphic appearance of the described character."

    Translated, the combination of those two paragraphs means that
    if you use IDS, you can't depend on anybody *else* to draw
    a Han character for you -- nobody is required to do so, and
    in fact the whole section is so hedged with the difficulties
    and interpretative issues involved, that nobody is even
    encouraged to do so.

    But if you insist, the standard isn't going to say that you *cannot*
    write an implementation to do so for your own edification -- you
    "may" display an IDS as a constructed glyph.
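
    As an illustration of treating an IDS as a description rather than
    as something to be drawn, here is a minimal sketch in Python. It
    assumes only the twelve IDCs U+2FF0..U+2FFB, with U+2FF2 and U+2FF3
    taking three operands and the rest taking two. The names parse_ids
    and display_ids are hypothetical, and the sketch does not implement
    any length limits or other restrictions the standard places on IDS.

```python
# Arity of each Ideographic Description Character (IDC), U+2FF0..U+2FFB.
# Assumption: U+2FF2 and U+2FF3 take three operands; the others take two.
IDC_ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
IDC_ARITY["\u2FF2"] = 3
IDC_ARITY["\u2FF3"] = 3


def parse_ids(text):
    """Parse one IDS into a nested tuple (idc, operand, ...) or a leaf string.

    Raises ValueError if the sequence is incomplete or has trailing characters.
    """
    def parse_at(i):
        if i >= len(text):
            raise ValueError("incomplete IDS")
        ch = text[i]
        arity = IDC_ARITY.get(ch)
        if arity is None:
            # Not an IDC: treat the character as a leaf component.
            return ch, i + 1
        operands = []
        i += 1
        for _ in range(arity):
            node, i = parse_at(i)
            operands.append(node)
        return (ch, *operands), i

    tree, end = parse_at(0)
    if end != len(text):
        raise ValueError("trailing characters after IDS")
    return tree


def display_ids(text):
    """Hypothetical fallback display: validate, then show the sequence as-is.

    No composed glyph is produced; the IDCs and components stay visible.
    """
    parse_ids(text)
    return text
```

    For example, parse_ids("\u2FF0\u6728\u5BF8") yields the tuple
    ("\u2FF0", "\u6728", "\u5BF8"), and display_ids returns the same
    three characters unchanged. An implementation that chose to compose
    a glyph from the parsed tree would be doing so on its own account.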

    In principle, this is little different from the mini-markup
    conventions used by many types of forum software, which
    let you type strings like ":smile:" or ":wink:" or the like
    and autoconvert them into some privately represented smiley
    graphic. If such strings get out in the "wild", so to speak,
    you can't depend on anybody else's software to do anything
    meaningful with them, but you "may" use them in a controlled
    context to describe/designate a smiley which then gets rendered *as*
    a smiley.
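
    The forum-software analogy can be sketched the same way. The
    following is a hypothetical example, not any particular forum's
    implementation: the SMILEYS table, the image file names, and the
    render function are all assumptions made for illustration.

```python
import re

# Hypothetical private mapping from smiley names to local graphics.
SMILEYS = {"smile": "smile.png", "wink": "wink.png"}

# Matches mini-markup codes such as ":smile:" or ":wink:".
_CODE = re.compile(r":([a-z]+):")


def render(text):
    """Replace known smiley codes with an image tag; leave the rest alone."""
    def replace(match):
        image = SMILEYS.get(match.group(1))
        if image is None:
            # Unknown code: nothing is known about it, so show it verbatim.
            return match.group(0)
        return '<img src="{}">'.format(image)

    return _CODE.sub(replace, text)
```

    Software that has no such table simply shows ":smile:" as typed,
    which is the situation of an IDS that gets out into the wild.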

    > if the character was already encoded why
    > would you want the rendering system to render an IDS as a single glyph
    > and treat it as a single unit for editing purposes?
    > I guess it must have been written at a time when people didn't worry
    > so much about security and spoofing issues. I would suggest that the
    > UTC should consider removing the offending paragraph at the earliest
    > opportunity, and replace it with a statement that IDS sequences are
    > intended to be rendered as a visible sequence of IDC characters and
    > ideographic components, and not composed into a single glyph. But then
    > maybe it is too late for that now?

    Seems to me it already says that.


    > Andrew
