Re: EUC-UTF8 is possible!

From: Doug Ewell (
Date: Sat Mar 17 2007 - 15:15:22 CST


    Dan Kogai <dankogai at dan dot co dot jp> wrote:

    > I am really surprised to find that EUC and UTF-8 can be mashed up
    > easily.
    > The secret is \xFF. This byte NEVER appears in EUC or UTF-8. So you
    > can define the combo character as follows:

    No no no no. Please don't do this. Nobody else will implement it and
    you will be effectively limited to using it internally within your own

    Just use UTF-8, or if saving bytes is that important to you, use SCSU or
    a general-purpose compression technique. See UTN #14 for more on
    Unicode text compression.
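
    As a rough illustration of the size trade-off, here is a minimal sketch
    that compares the byte counts of plain UTF-8, EUC-JP, and UTF-8 passed
    through a general-purpose compressor. It uses zlib from the Python
    standard library as a stand-in, since SCSU is not available there. The
    sample string and the helper name encoded_sizes are made up for this
    example, and the repeated sample compresses far better than real text
    would.

```python
import zlib


def encoded_sizes(text: str) -> dict:
    """Return byte counts for a few encodings of the same text."""
    utf8 = text.encode("utf-8")
    return {
        "utf-8": len(utf8),
        "euc-jp": len(text.encode("euc_jp")),
        # General-purpose compression applied on top of standard UTF-8.
        "utf-8 + zlib": len(zlib.compress(utf8, 9)),
    }


# Hypothetical sample: a short Japanese sentence repeated many times.
sample = "日本語のテキストです。" * 200
sizes = encoded_sizes(sample)

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

    The data stays ordinary UTF-8 underneath, so any receiver that can
    inflate the stream can read it without knowing a private encoding.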

    As someone who has created a number of alternative encoding schemes, I
    assure you that a scheme that "looks like" EUC or "looks like" UTF-8
    will cause you much more trouble than a completely new scheme that can't
    be confused for anything else.
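
    The interoperability problem can be sketched in a few lines. The code
    below first checks, by brute force, that the byte 0xFF is never produced
    by Python's utf-8 or euc_jp codecs, and then shows that a stream carrying
    a 0xFF marker is rejected by a standard strict UTF-8 decoder. The names
    contains_ff, decodes_as, and mixed are invented for this example, and the
    marker layout is an assumption, not the scheme proposed in the quoted
    message.

```python
def contains_ff(codec: str) -> bool:
    """Return True if any encodable code point yields byte 0xFF in `codec`."""
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates are not encodable scalar values
        try:
            if 0xFF in chr(cp).encode(codec):
                return True
        except UnicodeEncodeError:
            continue  # character not representable in this codec
    return False


def decodes_as(data: bytes, codec: str) -> bool:
    """Return True if `data` decodes cleanly under a strict decoder."""
    try:
        data.decode(codec)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical stream: a 0xFF marker followed by ordinary UTF-8 bytes.
mixed = b"\xff" + "日本語".encode("utf-8")

print("0xFF in UTF-8 output: ", contains_ff("utf-8"))
print("0xFF in EUC-JP output:", contains_ff("euc_jp"))
print("mixed stream is valid UTF-8:", decodes_as(mixed, "utf-8"))
```

    The same property that makes 0xFF usable as a private marker is what
    makes every conforming decoder treat the stream as malformed.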

    Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
