Re: UTF-8 can be used for more than it is given credit ( Re: UTF-7 - is it dead? )

From: Doug Ewell (dewell@adelphia.net)
Date: Sun Jun 04 2006 - 11:52:38 CDT


    Theodore H. Smith <delete at elfdata dot com> wrote:

    > What's that? Like Levenshtein? (EditDistance) If you are talking about
    > a Levenshtein-like thing on Unicode, well you can't do it with
    > codepoint processing, because a character is not a codepoint, a
    > character is a string of codepoints. So if your "cells" must now be
    > strings instead of bytes or UInt32s... you might as well use a string
    > of UTF-8 instead of a string of UTF-32.

    A character is a code point that has an assignment.

    Some "letters" consist of a string of characters, and some characters
    can be decomposed into a string of characters. But it is not correct to
    say that a character is a string of code points.
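
    As a rough illustration of the distinction above, here is a minimal
    Python sketch using the standard unicodedata module. The helper names
    is_assigned and decompose are hypothetical, chosen only for this
    example, and treating general category "Cn" as "no assignment" is a
    simplifying assumption (it also covers noncharacters).

```python
import unicodedata

def is_assigned(cp: int) -> bool:
    """Rough check: general category 'Cn' means no character is assigned
    to this code point (reserved or noncharacter) in Python's Unicode data."""
    return unicodedata.category(chr(cp)) != "Cn"

def decompose(text: str) -> list[str]:
    """Return the code points of the canonical decomposition (NFD) of text."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", text)]

# U+00E9 (e with acute) is one code point and one character ...
precomposed = "\u00e9"
print(len(precomposed), is_assigned(ord(precomposed)))

# ... that can be decomposed into a string of two characters.
print(decompose(precomposed))

# U+FFFF is a code point, but no character is assigned to it.
print(is_assigned(0xFFFF))
```

    So the single "letter" can be written as one character or as a string
    of two, but in either spelling each character is still one code point.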

    --
    Doug Ewell
    Fullerton, California, USA
    http://users.adelphia.net/~dewell/
    

