RE: Unicode conformant character encodings and us-ascii

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Thu May 15 2003 - 10:19:19 EDT

Next message: Marco Cimarosti: "RE: how to sort by stroke (not radical/stroke)"

Previous message: Marco Cimarosti: "RE: how to sort by stroke (not radical/stroke)"
Maybe in reply to: Yael.Aharon@nokia.com: "Unicode conformant character encodings and us-ascii"
Next in thread: Kenneth Whistler: "Re: Unicode conformant character encodings and us-ascii"

Yael Aharon wrote:
> I see now why you thought the question was odd. I actually
> meant to ask about the various iso (e.g. 8859 variants) and
> windows character encodings.

OK, but those encodings do not "conform to Unicode specs": they are simply
different encodings, which can be *converted* to Unicode because Unicode
contains all the characters that they contain.

That said, the answer to your question is "yes" for all ISO 8859 and Windows
encodings. It is "no", however, for most DOS encodings (which are still
sometimes used in Windows) and for some Japanese encodings (also used in
Windows, e.g., on the Internet or in e-mail).

You can check this from the mapping files found here:

http://www.unicode.org/Public/MAPPINGS

Each line in those files contains the mapping between a 3rd-party encoding
character (1st column) and Unicode (2nd column):

        ...
        0x41 0x0041 # LATIN CAPITAL LETTER A
        ...
        0xC7 0x0627 # ARABIC LETTER ALEF
        ...

You could write a quick script to check whether any 3rd-party character in
the range 0x00 to 0x7F maps to a different Unicode value.
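
Such a check might look roughly like the Python sketch below. It assumes the
simple two-column layout shown above (hex code, hex Unicode value, then a
"#" comment); some files under MAPPINGS use other layouts and would need a
different parser. The function name ascii_mismatches and the sample lines
are made up for illustration, and entries with no Unicode value are simply
skipped.

```python
def ascii_mismatches(lines):
    """Return (code, unicode) pairs in 0x00-0x7F that do not map to themselves.

    Assumes each data line looks like "0x41 0x0041 # NAME".
    """
    mismatches = []
    for line in lines:
        # Drop the trailing "# NAME" comment, then split the columns.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line, comment-only line, or undefined entry
        try:
            code = int(fields[0], 16)  # int() accepts the "0x" prefix in base 16
            uni = int(fields[1], 16)
        except ValueError:
            continue  # not a plain two-column hex line; ignored in this sketch
        if code <= 0x7F and code != uni:
            mismatches.append((code, uni))
    return mismatches


# Illustrative sample lines (not copied from any particular mapping file).
sample = [
    "0x41\t0x0041\t# LATIN CAPITAL LETTER A",
    "0x5C\t0x00A5\t# YEN SIGN",
    "0xC7\t0x0627\t# ARABIC LETTER ALEF",
]
for code, uni in ascii_mismatches(sample):
    print(f"0x{code:02X} -> U+{uni:04X}")
```

To run it on a real file, pass an open file object instead of the sample
list, e.g. ascii_mismatches(open(path, encoding="ascii", errors="replace")),
where path points to a mapping file you have downloaded. An empty result
means every listed code in 0x00 to 0x7F maps to the same Unicode value.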

_ Marco


This archive was generated by hypermail 2.1.5 : Thu May 15 2003 - 11:08:52 EDT