Re: How do I type unicode characters?

From: Mike Ayers (mayers@celequest.com)
Date: Tue Apr 11 2006 - 16:12:47 CST

    tom.kirkpatrick@virusbtn.com wrote:

    > Which one of these looks like a proper UTF-8 character: é or Ã© ?

            Neither. There is no such thing as a "UTF-8 character", just "UTF-8
    encoded Unicode data". In most cases I would be nitpicking to point
    this out, but in this case I think it is the cause of your problem:

    Characters:          é     Ã©
    Unicode code points: 233   195 169
    Unicode hex points:  E9    C3 A9

            It is interesting to note that C3 A9 is the UTF-8 encoding of U+00E9.
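
    A minimal Python sketch of that observation, for illustration only (the
    variable names are mine, not part of the original question). It encodes
    é as UTF-8 and then misreads the resulting bytes as ISO 8859-1, one
    character per byte:

```python
# Illustrative only: variable names are assumptions, not from the thread.
e_acute = "\u00e9"                         # é, code point 233 (hex E9)

# Encoding U+00E9 as UTF-8 produces two bytes.
utf8_bytes = e_acute.encode("utf-8")
print(" ".join(f"{b:02X}" for b in utf8_bytes))   # C3 A9

# Reading those same two bytes as ISO 8859-1 yields two characters,
# U+00C3 (Ã) and U+00A9 (©), which is the "Ã©" seen in the question.
misread = utf8_bytes.decode("iso-8859-1")
print([ord(c) for c in misread])           # [195, 169]
```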

    > Basically, if I enter the character 'é' (egrave) into my database, when
    > trying to display it on a webpage, it displays as a '?'. If I try to enter
    > it as 'Ã©' It displays ok. So does this mean that the correct way to type
    > an 'é' is to actually type 'Ã©'?

            No. It means that you should not handle text as binary. What you are
    doing is entering ISO 8859-1 characters (bytes) from one end, then
    interpreting the same stream as UTF-8 encoded Unicode at the other,
    which is why you have to enter gobbledygook in order to get the result
    you desire.
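
    The round trip can be sketched in Python as follows. This assumes the
    mismatch really is ISO 8859-1 on the way in and UTF-8 on the way out,
    which is only my guess about your setup; the variable names are
    hypothetical.

```python
# Sketch under an assumption: input is stored as ISO 8859-1 bytes and
# later interpreted as UTF-8. Names are illustrative only.

# Case 1: the user types é. ISO 8859-1 stores it as the single byte E9.
stored = "\u00e9".encode("iso-8859-1")
# A lone E9 is not a valid UTF-8 sequence, so a tolerant decoder
# substitutes U+FFFD, which browsers typically draw as a '?'-like glyph.
shown = stored.decode("utf-8", errors="replace")
print(hex(ord(shown)))                     # 0xfffd

# Case 2: the user types Ã©. ISO 8859-1 stores the bytes C3 A9,
# which happen to be the UTF-8 encoding of é.
workaround = "\u00c3\u00a9".encode("iso-8859-1")
displayed = workaround.decode("utf-8")
print(hex(ord(displayed)))                 # 0xe9
```

    The second case "works" only by accident: the two layers disagree about
    the encoding, and the typed gobbledygook compensates for it.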

            My guess is that your database is in ISO 8859-1 format, and your web
    page declares UTF-8 (there are many ways to get this particular error,
    so I guess). What you need to do is verify that your data is being
    extracted from the database as UTF-8 data. The storage fields are of
    type N* (e.g. NVARCHAR), correct?
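
    One way to check what is actually coming out of the database is to
    look at the raw bytes before anything decodes them. The helpers below
    are a hypothetical sketch (is_valid_utf8 and latin1_to_utf8 are names I
    made up, and from_db stands in for bytes fetched by whatever driver you
    use):

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def latin1_to_utf8(raw: bytes) -> bytes:
    """Transcode bytes known to be ISO 8859-1 into UTF-8."""
    return raw.decode("iso-8859-1").encode("utf-8")


# Hypothetical sample: "café" as it would be stored in ISO 8859-1.
from_db = b"caf\xe9"
print(is_valid_utf8(from_db))                  # False
print(is_valid_utf8(latin1_to_utf8(from_db)))  # True
```

    A passing check is only a hint, not proof: pure ASCII is valid in both
    encodings, and some ISO 8859-1 text (such as the bytes C3 A9) is also
    valid UTF-8.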

            HTH,

    /|/|ike


