Re: Level of Unicode support required for various languages

Date: Wed Oct 31 2007 - 18:47:37 CST


    Quoting Andrew West:

    >> > I'm not quite sure what the point of the exercise is.
    >> To demonstrate that the whole process is non-trivial -- particularly
    >> for the kinds of characters, especially variant forms, taboo
    >> forms, personal names, and the like, that one would most
    >> likely have to resort to IDS in order to describe. Taboo
    >> forms, which remove a stroke, would tend to be particularly
    >> problematical for a component-based description.
    > Yes, indeed. Which is why some have called for an IDC "subtraction
    > operator", so that for example U+4E4C 乌 could be described as 鸟[-]丶
    > <9E1F - 4E36>. However this could be ambiguous (which dot is to be
    > subtracted, the top one or the one in the middle?).

    For a good component-based system you need the right set of
    components to play with. The present CJKV repertoire was not designed
    with this in mind, so one has both the problem of multiple ways of
    writing the same component and the problem of some components not
    being represented at all. More radicals/components need to be
    encoded for the IRG's IDS work on duplicates to run smoothly.

    The - operator has been suggested in at least two contexts:

            (1) for taking away one stroke in names, mentioned in
    connection with glyph generation;

            (2) in connection with IDS, as a way to write unambiguously
    components that are found in existing encoded characters but are
    not themselves separately encoded (any high-frequency components
    would be good candidates for encoding).
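
    To make the ambiguity concrete, here is a rough Python sketch of how
    such an expression might be modelled. Everything in it is an
    assumption for illustration: the "[-]" marker is not an encoded IDC,
    the "@position" suffix is a made-up disambiguator, and the table of
    dot positions simply restates the "top" and "middle" dots mentioned
    in the quoted message.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical marker for the proposed subtraction operator.
# It is NOT an encoded Ideographic Description Character.
SUBTRACT = "[-]"


@dataclass(frozen=True)
class Subtraction:
    base: str                       # character to start from, e.g. U+9E1F
    removed: str                    # component to take away, e.g. U+4E36
    position: Optional[str] = None  # hypothetical disambiguator


def parse(expr: str) -> Subtraction:
    """Parse 'BASE[-]COMPONENT' or 'BASE[-]COMPONENT@position'."""
    base, sep, rest = expr.partition(SUBTRACT)
    if not sep or len(base) != 1 or not rest:
        raise ValueError(f"not a subtraction expression: {expr!r}")
    removed, _, position = rest.partition("@")
    if len(removed) != 1:
        raise ValueError(f"expected a single component in {expr!r}")
    return Subtraction(base, removed, position or None)


# Illustrative data only: where the dot component occurs in U+9E1F,
# following the wording of the quoted message.
OCCURRENCES = {("\u9e1f", "\u4e36"): ("top", "middle")}


def is_ambiguous(s: Subtraction) -> bool:
    """True if the component occurs more than once and no valid position is given."""
    places = OCCURRENCES.get((s.base, s.removed), ())
    return len(places) > 1 and s.position not in places


if __name__ == "__main__":
    plain = parse("\u9e1f[-]\u4e36")
    print(plain, is_ambiguous(plain))
    pinned = parse("\u9e1f[-]\u4e36@middle")
    print(pinned, is_ambiguous(pinned))
```

    The point of the sketch is only that a bare "base minus component"
    pair underdetermines the result as soon as the component occurs
    more than once in the base, so any real operator would need some
    way of saying which occurrence is meant.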

    For some characters one may simply need more than IDS to describe
    them, or conversely one may treat such characters as components in
    their own right.


    > Andrew

