Re: what kind of this unicode?

From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Sun Mar 11 2007 - 10:17:09 CST

    On Sun, 11 Mar 2007, Ernest Hann wrote:

    > I want to know, what the meaning of this characters:
    >
    > cœHœJɼ$]Mȹ

    Most likely, it is meaningless gibberish as such and needs to be
    interpreted as something other than plain text.

    > And what kind of this unicode?

    As such, as a string included in an email message declared to be
    ISO-8859-1 encoded, it is not Unicode at all. It is a meaningless string
    of characters from the ISO-8859-1 repertoire.

    > I found this characters from MSAccess document (.mdb)

    It's probably data in MSAccess format and need not be textual data at all,
    or it might contain some text characters and some other data. People
    familiar with the MSAccess format might be able to make some educated
    guesses on what it is, but generally you cannot expect to be able to
    process data extracted from a binary file as if it were text.
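
    As a rough illustration of why bytes pulled out of a binary file cannot
    be trusted as text, here is a minimal Python sketch. The byte values are
    invented for the example and are not taken from any real .mdb file.

```python
# Hypothetical bytes, standing in for a fragment of some binary file.
raw = bytes([0x63, 0x00, 0x9C, 0x48, 0xFF, 0x24])

# Python's iso-8859-1 codec maps every byte value to a code point, so
# decoding never fails, even when the bytes were never meant to be text.
as_latin1 = raw.decode("iso-8859-1")

# UTF-8 is stricter: byte sequences that are not well-formed are rejected.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

    A successful decode therefore says nothing about whether the result is
    meaningful text.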

    The œ parts are somewhat odd, but they might result from the use of
    some program that encodes characters using character references as in
    HTML, SGML, and XML. In those languages, &#339; refers to the character
    with code number 339 in decimal, which is œ (and the code number is
    usually a Unicode code number in this context). But the notation has
    such a meaning by special conventions only.
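
    For what it is worth, the character reference convention can be checked
    with a short Python sketch. The input string here is only an assumed
    reconstruction of the first few characters of the data, written with
    numeric character references.

```python
import html

# Assumed reconstruction: the start of the string with numeric
# character references in place of the non-Latin-1 characters.
encoded = "c&#339;H&#339;J"

# html.unescape applies the HTML convention for character references.
decoded = html.unescape(encoded)

# The code number of the resulting character, in decimal.
code_number = ord(decoded[1])

# The character is outside the ISO-8859-1 repertoire, so it cannot be
# sent as such in a message declared to be ISO-8859-1 encoded.
try:
    decoded.encode("iso-8859-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False
```

    Without that convention, &#339; is just six separate ISO-8859-1
    characters.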

    -- 
    Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
    

