From: Andrew C. West (email@example.com)
Date: Thu Jun 05 2003 - 05:23:57 EDT
On Wed, 4 Jun 2003 18:11:48 -0500 , "Mount, Rob (Robert F)" wrote:
> I am investigating differing behavior in various environments of the
> wide-character version of the C function isAlpha with respect to
> character U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK.
> The UNICODE documents seem ambiguous on this point: the General
> Category is "Lm" which, although informative instead of normative,
> would seem to imply that it is alphabetic; likewise
> DerivedCoreProperties-4.0.0.txt indicates that it is alphabetic; but
> PropList-4.0.0.txt contains two records - one indicating that it is
> a diacritic, one that indicates it is an extender.
U+30FC (KATAKANA-HIRAGANA PROLONGED SOUND MARK) is, I would say, identical in
function to U+02D0 (MODIFIER LETTER TRIANGULAR COLON) that is used to indicate a
long vowel in IPA. Both U+30FC and U+02D0 are signs that are appended to a
character representing a vowel to indicate that it is a long vowel sound.
Both U+30FC and U+02D0 have a General Category of "Lm" (Modifier_Letter), and in
PropList.txt are included under the Extender property. However only U+30FC is
also included under the Diacritic property. Likewise, U+1843 (MONGOLIAN LETTER
TODO LONG VOWEL SIGN), which has a similar function to U+30FC, is classified as
an Extender but not as a Diacritic.
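
As a rough check of the General Category values mentioned above, here is a minimal Python sketch using the standard `unicodedata` module. Note the assumptions: Python ships a much later version of the Unicode Character Database than 4.0.0, and Python's `str.isalpha()` is only a stand-in for the wide-character C function asked about, whose result depends on the platform and locale. The helper name `describe` is made up for this illustration.

```python
import unicodedata

# The three length marks discussed in this message.
MARKS = ["\u30fc", "\u02d0", "\u1843"]

def describe(ch):
    """Return (code point label, General Category, str.isalpha() result)."""
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), ch.isalpha())

rows = [describe(ch) for ch in MARKS]
for label, category, alpha in rows:
    # Python's str.isalpha() treats Lm, Lt, Lu, Ll and Lo as alphabetic.
    print(label, category, unicodedata.name(ch := chr(int(label[2:], 16))), alpha)
```

Python derives `isalpha()` from the General Category alone, so it does not consult the Extender or Diacritic properties at all; a C library may well decide differently.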
The definition of "Extender" in UCD.html is:
"Characters whose principal function is to extend the value or shape of a
preceding alphabetic character. Typical of these are length and iteration marks."
U+30FC, U+02D0 and U+1843 are indeed all "length marks", and are rightly
classified as Extenders.
But why then is U+30FC alone also classified as a Diacritic (according to
UCD.html, "Characters that linguistically modify the meaning of another character
to which they apply")? As far as I am aware, U+30FC does not "linguistically
modify the meaning of another character" other than to lengthen a preceding vowel.
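
To make the comparison concrete, here is a small Python sketch that collects property assignments from lines in the general `code point(s) ; Property # comment` layout used by PropList.txt. The `SAMPLE` lines are not copied from PropList-4.0.0.txt; they are illustrative records that simply restate the memberships described in this message, and the function name `parse_proplist` is hypothetical.

```python
def parse_proplist(lines):
    """Map each code point to the set of property names assigned to it."""
    props = {}
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop trailing comments
        if not data:
            continue  # skip blank and comment-only lines
        cp_field, prop = (part.strip() for part in data.split(";"))
        first, _, last = cp_field.partition("..")  # single value or range
        lo = int(first, 16)
        hi = int(last, 16) if last else lo
        for cp in range(lo, hi + 1):
            props.setdefault(cp, set()).add(prop)
    return props

# Illustrative records only (an assumption), mirroring the text above.
SAMPLE = [
    "# length marks discussed in this thread",
    "02D0 ; Extender",
    "1843 ; Extender",
    "30FC ; Extender",
    "30FC ; Diacritic",
]

props = parse_proplist(SAMPLE)
for cp in sorted(props):
    print(f"U+{cp:04X}", sorted(props[cp]))
```

Run against the real file, the same loop would show at a glance which of the three marks carry both properties.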