Re: Need encoding conversion routines

From: Pim Blokland (pblokland@planet.nl)
Date: Fri Mar 14 2003 - 07:57:03 EST


    askq1 askq1 wrote:

    > Character U+4321 is the unicode code-point but to store this character
    > into a file we need to use a certain encoding format.

    Yes. That depends on the implementation. If your character is kept in
    memory as a 16-bit type, that's simply a short integer with the
    hex value 0x4321, or decimal 17185. (Whether this is signed or
    unsigned, little-endian or big-endian doesn't matter.) Now if you
    want to convert this, you call the appropriate conversion routine
    from the CVTUTF library.
    E.g. if you need UTF-8 output, you supply the ConvertUTF16toUTF8
    function with pointers to this character and your output buffer, and
    you end up with the bytes 0xE4, 0x8C, 0xA1 in your output buffer.
    You can then dump this buffer to the file you mentioned.
    However, you have said this is not what you want!
    So what is it that you do want?
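
    To show where the bytes 0xE4, 0x8C, 0xA1 come from, here is a minimal
    sketch of the UTF-8 bit layout for a single 16-bit value. This is not
    the CVTUTF code: utf8_encode_bmp is a hypothetical helper written for
    illustration, and it assumes the input is one code point below
    U+10000 that is not a surrogate.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of CVTUTF): encode one 16-bit code point
 * as UTF-8. Returns the number of bytes written to out (1 to 3), or 0 if
 * cp is a surrogate (U+D800..U+DFFF), which cannot be encoded on its own. */
size_t utf8_encode_bmp(uint16_t cp, unsigned char out[3])
{
    if (cp >= 0xD800 && cp <= 0xDFFF) {
        return 0;                                   /* lone surrogate */
    }
    if (cp < 0x80) {                                /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                               /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    /* 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}
```

    Calling utf8_encode_bmp(0x4321, buf) returns 3 and leaves 0xE4, 0x8C,
    0xA1 in buf, which you could then write to the file with fwrite.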

    Pim Blokland


