Re: Languages supported by UTF8 and UTF16

From: Peter Kirk (peterkirk@qaya.org)
Date: Sat Sep 10 2005 - 11:37:56 CDT

Next message: Rein: "Re: [ATypI] IJ"

Previous message: Michael Everson: "Re: Old Norse orthography"
In reply to: Jukka K. Korpela: "Re: Languages supported by UTF8 and UTF16"
Next in thread: António Martins-Tuválkin: "Re: Languages supported by UTF8 and UTF16"
Reply: António Martins-Tuválkin: "Re: Languages supported by UTF8 and UTF16"

On 10/09/2005 15:20, Jukka K. Korpela wrote:

> ...
>
> All living languages, and many dead languages, can be written in their
> normal writing system(s) using Unicode characters. However, some
> of their characters cannot be represented as single Unicode characters
> but as combinations. ...

In principle this should be true, but in practice it is NOT TRUE. The
clearest evidence is that a number of characters proposed for Unicode
5.0, and so not yet in the standard, are required for writing living
minority languages in their normal writing systems; see
http://www.unicode.org/alloc/Pipeline.html. And it would be arrogant to
suppose that this process will be complete even with Unicode 5.0,
especially as orthographies for many minority languages are still being
developed.

That is why it is misleading for António to insist that "what Unicode
does 'cover' are not languages, but writing systems." The cases I am
talking about are ones where Unicode does cover the writing system, but
not the specific character repertoires required for certain languages.

> ...
>
> Well, that's not very short, really. Neither is it very
> understandable, since it lacks examples. ...

I will be more explicit and give examples. The Cyrillic characters
proposed for the ranges 04FA..04FF and 0510..0513 are required as part
of the normal orthography of various languages of Russia, as set out in
the proposal at
http://scripts.sil.org/cms/scripts/render_download.php?site_id=nrsi&format=file&media_id=Cyrillic_2005-08-09&filename=N2933.pdf&ei=yAgjQ_jGAsbeRO3RzcwH
These characters are not yet in Unicode. As a result the languages
Nivkh, Itelmen, Enets, Chukchi and Khanty are not yet supported by Unicode.
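
As a rough illustration of what "not yet in Unicode" means for these two
ranges, the Python sketch below asks a given version of the Unicode
Character Database which of the code points are unassigned (general
category Cn). The helper name unassigned and the constant PROPOSED_RANGES
are made up for this example. Python's unicodedata module ships only a
Unicode 3.2 snapshot (ucd_3_2_0) alongside its current database, so 3.2 is
used here as an assumed stand-in for a standard that lacks the characters;
it is not the version that was current when this message was written.

```python
import unicodedata

# The two ranges named in the message, as inclusive (first, last) pairs.
PROPOSED_RANGES = [(0x04FA, 0x04FF), (0x0510, 0x0513)]


def unassigned(ranges, ucd=unicodedata):
    """Return the code points in `ranges` that `ucd` reports as unassigned.

    `ucd` is any object with a unicodedata-style category() function:
    the unicodedata module itself, or the unicodedata.ucd_3_2_0 snapshot.
    """
    return [
        cp
        for first, last in ranges
        for cp in range(first, last + 1)
        if ucd.category(chr(cp)) == "Cn"  # Cn = "Other, not assigned"
    ]


# Older snapshot (Unicode 3.2), used as a stand-in for a standard
# that does not yet contain the proposed characters.
missing_in_3_2 = unassigned(PROPOSED_RANGES, unicodedata.ucd_3_2_0)

# Whatever Unicode version the running Python interpreter ships with.
missing_now = unassigned(PROPOSED_RANGES)

print("Current database version:", unicodedata.unidata_version)
print("Unassigned in 3.2:", [f"U+{cp:04X}" for cp in missing_in_3_2])
print("Unassigned now:   ", [f"U+{cp:04X}" for cp in missing_now])
```

The same function can be pointed at the full alphabet of a given
orthography rather than at two ranges: any non-empty result means that
version of the standard cannot represent the orthography as written.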

> ... The point, anyway, is that "support to a language" can mean much
> more than just presence of all characters used in a language. It's
> also debatable, since people may disagree on what really belongs to a
> language, even at the character level. ...

True, but most languages have at least one official or semi-official
orthography, and if these orthographies include characters not in
Unicode, that is enough to show that Unicode does not "support" the
language.

-- 
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/


This archive was generated by hypermail 2.1.5 : Sat Sep 10 2005 - 12:15:16 CDT