Re: Problems encoding the spanish o

From: pepe pepe (
Date: Mon Nov 17 2003 - 09:10:37 EST


       My knowledge of encodings is very poor, and you seem to know a lot about
    this. Could you explain a bit more of what you have said? I have made the

    This is the problematic sequence: 11110011-01101110-00100000-01001101
    (F3-6E-20-4D). If I follow the instructions that appear under the question
    "What is UTF-8?" in the UTF-8 FAQ, I obtain
    011101110100000001101 instead of 1EE80D, 111101110100000001101. (Have I made a
    mistake?) Applying the UTF-16 encoding to my result, everything works out. So,
    finally, who do you think is responsible for this strange situation:
    the client, for saying that the document is UTF-8, or the parser?
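
For reference, the calculation described above can be sketched in Python.
This is only a sketch under an assumption: that the faulty process masks the
payload bits of each byte (3 bits from the lead byte F3, 6 bits from each
following byte) without checking that the following bytes are valid
continuation bytes. The function names are hypothetical, chosen for
illustration only.

```python
def lenient_utf8_4byte(b0, b1, b2, b3):
    """Combine the payload bits of a 4-byte UTF-8 pattern.

    Assumption: no validation at all, so trail bytes that do not
    start with the bits 10 are accepted and simply masked to 6 bits.
    """
    return (
        ((b0 & 0x07) << 18)   # 3 payload bits from the lead byte 11110xxx
        | ((b1 & 0x3F) << 12)  # 6 low bits of each following byte
        | ((b2 & 0x3F) << 6)
        | (b3 & 0x3F)
    )


def to_surrogates(cp):
    """Split a value above 0xFFFF into a UTF-16 surrogate pair."""
    v = cp - 0x10000
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


cp = lenient_utf8_4byte(0xF3, 0x6E, 0x20, 0x4D)
hi, lo = to_surrogates(cp)

print(f"{cp:021b}", f"{cp:06X}")      # 21-bit value in binary and hex
print(f"{hi:04X} {lo:04X}", hi, lo)   # surrogate pair in hex and decimal
```

Under that assumption the 21-bit value is 0EE80D (011101110100000001101), not
1EE80D, and its UTF-16 form is DB7A DC0D, that is 56186 and 56333, the two
numbers seen in the output.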


    >From: Pim Blokland <>
    >To: Unicode mailing list <>
    >Subject: Re: Problems encoding the spanish o
    >Date: Mon, 17 Nov 2003 13:26:19 +0100
    >pepe pepe schreef:
    > > We have the following sequence of characters "...ización Map.." that is
    > > the same as "...ización Map...", which after undergoing some
    > > transformations becomes "...izaci&#56186;&#56333;ap....".
    > > As you can see, the two characters 56186 and 56333 seem to represent the
    > > sequence "ón M". Any idea?
    >Yes, your input text obviously gets flagged as being in UTF-8
    >format, even if it is Latin-1 (or any codepage that has a ó at index
    >Not only that, but the process making the mistake of thinking it is
    >UTF-8 also makes the mistake of not generating an error for
    >encountering malformed byte sequences, AND of outputting the result
    >as two 16-bit numbers instead of one 21-bit number.
    >If you take the byte sequence (hex) F3 6E 20 4D and treat it as
    >UTF-8 and don't care it's not valid, this maps to the value
    >(hex)1EE80D. Again, not caring this is not a valid codepoint,
    >turning this into UTF-16 would yield U+DB7A U+DC0D, which is what
    >you got in your output.
    >Pim Blokland
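
The mislabelling described in the quoted reply can be illustrated with a short
Python sketch. It assumes the original text was stored as Latin-1 and uses
Python's built-in codecs as a stand-in for a strict UTF-8 decoder; it is not
the client or parser discussed in this thread.

```python
# "ón M" stored as Latin-1 gives exactly the bytes F3 6E 20 4D.
raw = "\u00f3n M".encode("latin-1")
assert raw == bytes([0xF3, 0x6E, 0x20, 0x4D])

# A strict UTF-8 decoder rejects the sequence: F3 announces a 4-byte
# character, but 6E is not a continuation byte (it does not start with 10).
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError as exc:
    strict_ok = False
    print("rejected as UTF-8:", exc.reason)

# Decoding with the encoding the bytes were really written in works.
print(raw.decode("latin-1"))
```

A decoder that behaves like this would have reported the malformed sequence
instead of silently producing two surrogate values.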


    This archive was generated by hypermail 2.1.5 : Mon Nov 17 2003 - 10:10:36 EST