Re: Unicode characters List instead of hexadecimal equivalent

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Aug 30 2006 - 09:22:00 CDT

    Clearly, you're just missing font support. Unicode is not a font and does not provide fonts; it only defines how characters should be encoded and handled in applications and fonts, and it proposes a standard encoding for that.

    If you want to see *sample* glyphs, look at the code charts, then look at the "Resources" section to find links related to application and font support by third parties.
    Unicode compliance does not define how characters will be visualized (and even the glyphs in the PDF code charts are not normative: they just *inform* about the standard character identity, nothing else).

    So Unicode is not even a standard for fonts (fonts use other sets of conventions, which *may* and *should* conform to Unicode to make the font usable with internationalized texts, but there are lots of alternatives for designers).

    Unicode compliance in your application is nevertheless necessary if you want to support many international scripts in your database. Remember, though, that your database will not store images of characters in the text items it holds. For such compliance, you don't absolutely need to be able to render the individual characters (note also that some sequences of characters combine with each other to create graphical entities that are visually distinct from the mere aggregation of the individual glyphs).
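    To illustrate that storing characters does not depend on being able to display them, here is a minimal Python sketch (not the C# code from the quoted message) that turns records in the UnicodeData.txt format into characters. The two sample records are copied from the quoted message; the function name parse_unicode_data and the choice of returned fields are hypothetical.

```python
# Two sample records in the UnicodeData.txt format (semicolon-separated
# fields), copied from the quoted message below.
SAMPLE = """\
29CA;TRIANGLE WITH DOT ABOVE;Sm;0;ON;;;;;N;;;;;
29CB;TRIANGLE WITH UNDERBAR;Sm;0;ON;;;;;N;;;;;
"""


def parse_unicode_data(text):
    """Yield (code_point, character, name, general_category) per record.

    Hypothetical helper: it only reads the first three fields.
    """
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        code_point = int(fields[0], 16)  # field 0 is the hexadecimal code point
        # Surrogate code points (D800..DFFF) are not characters; skip them.
        if 0xD800 <= code_point <= 0xDFFF:
            continue
        yield code_point, chr(code_point), fields[1], fields[2]


rows = list(parse_unicode_data(SAMPLE))
for code_point, char, name, category in rows:
    # The code point and name identify the character even when no
    # installed font has a glyph for it.
    print(f"U+{code_point:04X} {name} ({category})")
```

    The character in each row is what would go into the database column; whether it shows up as a glyph or as an empty box in the GUI depends only on the fonts installed on the viewing machine.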

    Note that the database engine has to conform to Unicode for some operations if you don't want the text to be completely garbled, becoming meaningless even when correct font support is present to render it. Operations such as substring extraction, sorting results, case folding, and conversion from text to value are not based on the apparent glyphs you may see in the rendered text; they are performed only on the *encoded* text.
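    A small Python sketch of why these operations work on the encoded text rather than on glyphs. It uses only the standard library; the sample strings are illustrative and are not taken from the original question.

```python
import unicodedata

precomposed = "\u00E9"   # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# Both render as the same glyph, yet the encoded text differs.
same_raw = precomposed == decomposed
lengths = (len(precomposed), len(decomposed))

# Normalizing both sides to NFC makes the comparison follow character identity.
same_nfc = (unicodedata.normalize("NFC", precomposed)
            == unicodedata.normalize("NFC", decomposed))

# A naive substring on the decomposed form silently drops the accent.
truncated = decomposed[:1]

# Case folding is defined on characters, not on their appearance.
folded = "Stra\u00DFe".casefold()

# Conversion from text to value: ARABIC-INDIC DIGIT FOUR and TWO.
value = int("\u0664\u0662")

# Sorting by code point is not a linguistic sort: U+00E9 sorts after "z".
by_code_point = sorted(["z", "\u00E9", "e"])

print(same_raw, lengths, same_nfc, repr(truncated), folded, value, by_code_point)
```

    A database engine faces the same questions for its own substring, comparison and ordering operators, which is why its Unicode conformance matters independently of any font.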

    Unicode is the wrong place to look for fonts, and there's no such thing as a "Unicode font" (the term is commonly used elsewhere, but it just indicates that the font will behave correctly when rendering plain text encoded with Unicode, and that it supports a large *subset* of the common Unicode and ISO/IEC 10646 character repertoire).

    ----- Original Message -----
    From: "Adisesha Neelaiahgari" <adisesha@humaninference.com>
    To: "Dean Harding" <dean.harding@dload.com.au>; "James Cloos" <cloos@jhcloos.com>
    Cc: <unicode@unicode.org>
    Sent: Wednesday, August 30, 2006 7:08 AM
    Subject: RE: Unicode characters List instead of hexadecimal equivalent

    > Hi All,
    > Thanks for your suggestions.
    > I tried Convert.ToChar(System.Convert.ToInt32(value, 16)), working well for some unicodes. For ex: 29CA;TRIANGLE WITH DOT ABOVE;Sm;0;ON;;;;;N;;;;;
    > 29CB;TRIANGLE WITH UNDERBAR;Sm;0;ON;;;;;N;;;;; I could see only ⧊.
    > Requirement(FYI):
    > I need to provide a set of unicode chars to admin in my application, where admin will select some chars as restricted. If any user wants to add new name having restricted chars then warning is displayed "Special character is not allowed as part of name, contact administrator". Now initial filling of unicode chars is required to display in GUI for admin, from sqlserver Database. Since I don’t have unicode DB, I am trying to create a table having unicode characters. I got the hex numbers from http://www.unicode.org/Public/UNIDATA/UnicodeData.txt . These numbers must be converted to unicodes and then inserted in to database, to use them in my application. Below mentioned conversions are not working well for all unicodes to generate.
    >
    > --> Instead of reading this, converting and then inserting in to database, as James Cloos suggested http://jhcloos.com.nyud.net:8080/utf8s.txt (I could not see many chars for 02E5,02E6,02E7,02E8,02E9,02EA [ ˪]MODIFIER LETTER YIN DEPARTING TONE MARK) if we have any .txt file having unicode characters, then it will be imported directly in to database with out conversion logic.
    >
    > Please let me know whether requirement is clear.
    >
    > Thanks
    > Adisesha.N


