Re: Level of Unicode support required for various languages

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue Oct 30 2007 - 18:25:47 CST

Next message: Mark E. Shoulson: "Re: Level of Unicode support required for various languages"

Previous message: vunzndi@vfemail.net: "Re: Level of Unicode support required for various languages"
Maybe in reply to: Timothy Armes: "Level of Unicode support required for various languages"
Next in thread: Kenneth Whistler: "Re: Level of Unicode support required for various languages"

James Kass said:

> >Yes, but remember that you have two PUA planes to use, planes 15 and
> >16. Unless you're anticipating more than 100K total, I think you'll
> >be OK.
>
> While I agree that the PUA might be easier to display initially,
> it should be noted that the advantages of IDSequences include
> the idea that they are *standard*

A crucial splitting of hairs is required here.

The IDCs (Ideographic Description Characters, U+2FF0..U+2FFB)
are standardized. And the unified ideographs and radical
symbol characters that can be used with them are also
standardized.

The IDSs (Ideographic Description Sequences) are decidedly
*NOT* standardized.

Which is part of the main point John Jenkins has been making.
All an IDS tells you is (roughly) what the intended appearance
is of some Han ideographic shape for a character. It tells
you nothing about the *identity* of that character, nor does
it tell you whether somebody else's related IDS is or is not
the "same" character.

*Instances* of IDSs have no status whatsoever in the standard.
All that has status is the *concept* of an IDS (and the
syntax for expressing them).
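
To make "the syntax for expressing them" concrete, here is a minimal
sketch of a well-formedness check for an IDS, covering only the twelve
IDCs named above (U+2FF0..U+2FFB). The function name is_well_formed_ids
is made up for this illustration. As a simplifying assumption, any
character that is not an IDC is accepted as an operand, and any length
restrictions the standard places on sequences are ignored.

```python
# Arity of each Ideographic Description Character, U+2FF0..U+2FFB.
# U+2FF2 and U+2FF3 take three operands; the others take two.
IDC_ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
IDC_ARITY["\u2FF2"] = 3
IDC_ARITY["\u2FF3"] = 3


def is_well_formed_ids(text):
    """Return True if text is exactly one syntactically complete IDS.

    Assumption of this sketch: every non-IDC character counts as a
    valid operand; no check is made that it is a unified ideograph
    or a radical.
    """
    def parse(pos):
        # Parse one component starting at pos.
        # Returns the index just past it, or -1 if it is incomplete.
        if pos >= len(text):
            return -1
        arity = IDC_ARITY.get(text[pos])
        if arity is None:
            return pos + 1  # a plain operand character
        pos += 1
        for _ in range(arity):
            pos = parse(pos)
            if pos < 0:
                return -1
        return pos

    return parse(0) == len(text)
```

A check like this can tell you that U+2FF0 U+5973 U+5B50 is a
syntactically complete sequence. It cannot tell you anything about
which character, if any, that sequence is meant to identify.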

In terms of information content, an IDS is one step up
from a PUA character. For a PUA character, in the absence
of a detailed mutual agreement, you know nothing about
the character other than it is a character. For an
IDS, you know the intended approximate shape of the
character and that the intent is to describe a Han
ideographic character (as opposed to an Ethiopic letter
or an unencoded Vai syllable). But in the absence of a
detailed mutual agreement, you can't even know if the
IDS is describing an unencoded character or an *encoded*
character, nor if it is encoded, which one.
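
A small illustration of that last point, as a sketch only. The two
sequences below are both plausible descriptions of the shape of an
already encoded character, U+6E58. The AGREED table is hypothetical:
it stands in for the "detailed mutual agreement" that the standard
itself does not supply.

```python
# Two plausible descriptions of the same shape, that of U+6E58.
ids_flat = "\u2FF0\u6C35\u76F8"                # U+2FF0 U+6C35 U+76F8
ids_nested = "\u2FF0\u6C35\u2FF0\u6728\u76EE"  # U+76F8 itself decomposed
encoded = "\u6E58"

# As plain text, nothing ties these three strings together.
same_as_each_other = ids_flat == ids_nested
same_as_encoded = ids_flat == encoded or ids_nested == encoded

# Hypothetical private agreement between two parties. Only a table
# like this, which lives outside the standard, gives either sequence
# an identity.
AGREED = {ids_flat: encoded, ids_nested: encoded}
resolved_same = AGREED[ids_flat] == AGREED[ids_nested]
```

Here same_as_each_other and same_as_encoded are both False, while
resolved_same is True only because the two parties wrote down the
mapping themselves.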

--Ken

> and that input of sequences
> using standard characters is already supported.
>
> Best regards,
>
> James Kass
>
>
>


This archive was generated by hypermail 2.1.5 : Tue Oct 30 2007 - 18:27:29 CST