Re: Unicode for Malayalam Language.

From: Glenn Adams
Date: Fri Nov 21 1997 - 21:33:03 EST

I have reviewed your comments on the encoding of Malayalam script in Unicode
and ISO/IEC 10646 in:
I do not believe any action is required. In particular,

(1) Characters cannot be removed from the standard; you may choose, in your
user community, to deprecate the usage which decomposes U+0D4A through U+0D4C
into their component glyphic parts; however, a receiving agent should be prepared
to accept these as alternative representations. Personally, I agree with you
on this point, but they can't be removed now.
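
As a rough illustration of what "prepared to accept these as alternative
representations" can mean in practice, here is a minimal sketch in Python.
It assumes a receiving agent that compares text after Unicode normalization,
which is one possible approach rather than anything the standard requires.
The helper name same_text is hypothetical.

```python
import unicodedata

# Precomposed two-part vowel signs and their canonical decompositions,
# as recorded in the Unicode Character Database.
PAIRS = {
    "\u0D4A": "\u0D46\u0D3E",  # VOWEL SIGN O  = VOWEL SIGN E  + VOWEL SIGN AA
    "\u0D4B": "\u0D47\u0D3E",  # VOWEL SIGN OO = VOWEL SIGN EE + VOWEL SIGN AA
    "\u0D4C": "\u0D46\u0D57",  # VOWEL SIGN AU = VOWEL SIGN E  + AU LENGTH MARK
}

def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: treat canonically equivalent strings as equal."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

KA = "\u0D15"  # MALAYALAM LETTER KA, used only as a base consonant here

for composed, parts in PAIRS.items():
    # Decomposing the single code point yields the two-part sequence.
    assert unicodedata.normalize("NFD", composed) == parts
    # Either spelling after a consonant is accepted as the same text.
    assert same_text(KA + composed, KA + parts)
```

A sender in your community could still emit only the precomposed form; the
comparison above just keeps the receiver from rejecting the other spelling.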

(2) The 5 "partial" (half-consonant) forms and the 1 conjunct form are encoded
in Unicode as follows (the equivalent ISCII encoding is shown to the right):

Form        Unicode                 ISCII

LA(h)   =>  0D32 0D4D 200D          D1 E8 E9
            (LA, VIRAMA, ZWJ)       (LA, HALANT, NUKTA)

LLA(h)  =>  0D33 0D4D 200D          D2 E8 E9
            (LLA, VIRAMA, ZWJ)      (Hard LA, HALANT, NUKTA)

NNA(h)  =>  0D23 0D4D 200D          C1 E8 E9
            (NNA, VIRAMA, ZWJ)      (Hard NA, HALANT, NUKTA)

NA(h)   =>  0D28 0D4D 200D          C6 E8 E9
            (NA, VIRAMA, ZWJ)       (Soft NA, HALANT, NUKTA)

RA(h)   =>  0D30 0D4D 200D          CF E8 E9
            (RA, VIRAMA, ZWJ)       (RA, HALANT, NUKTA)

RRA(h)  =>  0D31 0D4D 0D31          D0 E8 D0
            (RRA, VIRAMA, RRA)      (Hard RA, HALANT, Hard RA)

The above ISCII encoding of these Malayalam forms is prescribed by
IS 13194:1991 (ISCII 91) on page 19, under "Inscript Overlay for
Malayalam". The Unicode encoding of half forms and conjunct consonants
and the relation to the ISCII encoding is documented in the Unicode
Standard, Version 2.0, pp. 6-35 and 6-36. See "Conjunct Consonants"
and "Explicit Half-Consonants".
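
For anyone who wants to generate the Unicode sequences rather than type
them, the following is a small sketch in Python. It looks each character up
by its Unicode name, so no code point is typed by hand. The function names
(letter, half_form, conjunct, hex_seq) are hypothetical, and the ISCII side
is left out because Python ships no ISCII codec.

```python
import unicodedata

VIRAMA = unicodedata.lookup("MALAYALAM SIGN VIRAMA")
ZWJ = unicodedata.lookup("ZERO WIDTH JOINER")

def letter(name: str) -> str:
    """Return the Malayalam letter with the given Unicode short name."""
    return unicodedata.lookup(f"MALAYALAM LETTER {name}")

def half_form(name: str) -> str:
    """Explicit half-consonant: consonant, VIRAMA, ZWJ."""
    return letter(name) + VIRAMA + ZWJ

def conjunct(first: str, second: str) -> str:
    """Conjunct consonant: consonant, VIRAMA, consonant."""
    return letter(first) + VIRAMA + letter(second)

def hex_seq(text: str) -> str:
    """Format a string as space-separated four-digit hex code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)

# The five half forms and the one conjunct discussed in point (2).
for name in ("LA", "LLA", "NNA", "NA", "RA"):
    print(f"{name}(h)", "=>", hex_seq(half_form(name)))
print("RRA RRA", "=>", hex_seq(conjunct("RRA", "RRA")))
```

Whether a given font actually renders these sequences as the intended forms
is a separate question that this sketch does not address.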

(3) Characters cannot be renamed in the standard. Additional information
on characters can be added to the character description information in
the Unicode Standard and/or in Annex P of ISO/IEC 10646. While from the
perspective of Malayalam, more indigenous names might have been preferred,
the current names are based on a naming unification amongst the primary
Indic scripts.

Glenn Adams

