variants and code-page --> unicode conversion

From: Ostermueller, Erik (Erik.Ostermueller@fnf.com)
Date: Wed May 07 2003 - 14:01:37 EDT

    We're running some test data through
    two VB6 text controls on Win2k.
    We copy the Unicode ideographs from
    a browser and paste them into each text control.

    One text control supports Unicode; the other doesn't.
    The font for each is set to Arial Unicode MS.

    The Unicode control renders all the ideographs
    we throw at it. The non-Unicode text control (I presume DBCS)
    renders 80-90% of our Chinese ideographs.
    Each of the following Unicode scalars renders as a single question mark
    instead of a Chinese ideograph.

    9EC4, 53F7, 8D38, 533A, 53F7

    With the help of unicode.org, we've found
    variants of these that do render correctly (thanks).

    I could use some help understanding this behavior.

    1) We provided the control with the Unicode scalar,
    which is in the font (I looked it up in the font file).
    What is keeping the control from rendering
    the scalar correctly? I would understand if the scalar were
    bigger than 0xFFFF, but it's not.
    I suppose that the control expects code points to be
    in a particular code page and then it all goes
    downhill from there. But why do the variants
    (and the other 80-90%) render correctly?
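
    One way to test the code-page guess above is to check which legacy
    code pages can represent the failing scalars at all. The following is
    only a minimal sketch. It assumes Python's bundled codecs (gbk, cp950,
    cp932, cp949, cp1252) are close enough to the Windows code pages, and
    the list of candidate code pages is itself an assumption, since I don't
    know which one the Win2k box actually uses. The helper names are made
    up for illustration.

```python
# Scalars from the message that rendered as "?" in the non-Unicode control.
SCALARS = [0x9EC4, 0x53F7, 0x8D38, 0x533A]

# Candidate ANSI code pages. This list is an assumption: the message does
# not say which code page the system uses. Python's codecs only approximate
# the Windows conversion tables.
CODE_PAGES = ["gbk", "cp950", "cp932", "cp949", "cp1252"]


def encodable(scalar: int, code_page: str) -> bool:
    """Return True if the code page has a byte sequence for this scalar."""
    try:
        chr(scalar).encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


def coverage(scalars, code_pages):
    """Map each code page to the scalars it cannot represent."""
    return {
        cp: [f"U+{s:04X}" for s in scalars if not encodable(s, cp)]
        for cp in code_pages
    }


def lossy_round_trip(scalar: int, code_page: str) -> str:
    """Convert Unicode -> code page -> Unicode, substituting '?' on failure."""
    return chr(scalar).encode(code_page, errors="replace").decode(code_page)


if __name__ == "__main__":
    for cp, missing in coverage(SCALARS, CODE_PAGES).items():
        print(cp, "cannot encode:", ", ".join(missing) or "nothing")
```

    If a scalar has no entry in the code page the control converts to, the
    lossy round trip yields "?" no matter what the font contains. That
    would match the symptom, and it would also explain why a variant that
    does exist in that code page survives the conversion.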

    2) Can anyone refer me to a decent tutorial on how fonts
    map between Unicode and different code pages?

    3) How did variants come into existence?

    4) Do application vendors (in this same pickle) advise users
       not to use particular scalars? Do any IMEs help out in this task?

    Thanks,

    --Erik O.


