Re: Languages supported by UTF8 and UTF16

From: Peter Kirk
Date: Sat Sep 10 2005 - 11:37:56 CDT


    On 10/09/2005 15:20, Jukka K. Korpela wrote:

    > ...
    > All living languages, and many dead languages, can be written in their
    > normal writing system(s) using Unicode characters. However, some
    > of their characters cannot be represented as single Unicode characters
    > but as combinations. ...
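
The quoted point about combinations can be illustrated with a small sketch. It assumes Python's standard `unicodedata` module and uses two arbitrary examples chosen here for illustration (they are not from the thread): "e" plus COMBINING ACUTE ACCENT, which has a precomposed form, and "q" plus the same mark, which does not. The helper name `composed_length` is hypothetical.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

def composed_length(base: str, mark: str) -> int:
    """Return how many code points remain after NFC composition of base + mark."""
    composed = unicodedata.normalize("NFC", base + mark)
    return len(composed)

if __name__ == "__main__":
    # "e" + acute composes to the single character U+00E9.
    print(composed_length("e", ACUTE))
    # "q" + acute has no precomposed form, so it stays a two-code-point sequence.
    print(composed_length("q", ACUTE))
```

A length of 2 after normalization means the letter exists in Unicode only as a combining sequence, which is the case the quoted text describes.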

    In principle this should be true. But in practice it is NOT TRUE. This
    is very clear in that there are a number of characters proposed for
    Unicode 5.0, which are not yet in the standard, which are required for
    writing living minority languages in their normal writing systems - see And it would be arrogant to
    suppose that this process will be complete even with Unicode 5.0,
    especially as many orthographies of minority languages are being developed.

    That is why it is misleading for António to try to insist that "what
    Unicode does "cover" are not languages, but writing systems." The cases
    I am talking about are ones where Unicode does cover the writing system,
    but not the specific character repertoires required for certain languages.

    > ...
    > Well, that's not very short, really. Neither is it very
    > understandable, since it lacks examples. ...

    I will be more explicit and give examples: the proposed Cyrillic
    characters for ranges 04FA..04FF and 0510..0513, which are required as
    part of the normal orthography of various languages of Russia, as
    proposed at
    - these characters are not yet in Unicode. As a result the languages
    Nivkh, Itelmen, Enets, Chukchi and Khanty are not yet supported by Unicode.
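
As a rough way to check a claim like this against a given version of the standard, the sketch below asks Python's `unicodedata` module whether each code point in the two ranges named above has a character name. The function name `unassigned_in` is hypothetical, and the result depends entirely on which Unicode version the Python build bundles (reported by `unicodedata.unidata_version`), so a modern interpreter may well report nothing missing.

```python
import unicodedata

# The two ranges named in the message (inclusive bounds).
PROPOSED_RANGES = [(0x04FA, 0x04FF), (0x0510, 0x0513)]

def unassigned_in(ranges):
    """Return the code points in the inclusive ranges that have no character
    name in this Python build's Unicode database."""
    missing = []
    for start, end in ranges:
        for cp in range(start, end + 1):
            # name() returns the default (None) when no name is defined.
            if unicodedata.name(chr(cp), None) is None:
                missing.append(cp)
    return missing

if __name__ == "__main__":
    print("Unicode database version:", unicodedata.unidata_version)
    for cp in unassigned_in(PROPOSED_RANGES):
        print(f"U+{cp:04X} has no assigned character")
```

Note that this only tests whether individual code points are assigned; it says nothing about whether a language's full orthography is covered.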

    > ... The point, anyway, is that "support to a language" can mean much
    > more than just presence of all characters used in a language. It's
    > also debatable, since people may disagree on what really belongs to a
    > language, even at the character level. ...

    True, but most languages have at least one official or semi-official
    orthography, and if these orthographies include characters not in
    Unicode, that is enough to show that Unicode does not "support" the

    Peter Kirk
