South Asia Subcommittee Report
February 2, 2009
Recommendation: wait for the next UTC meeting to discuss this.
- wait for further input from the Government of India
- ask the Government of India to submit L2/09-012 to WG2
3.1 Malayalam Dot Reph
There is general consensus that a DOT REPH character should be encoded. The comments of the Government of India appear to result from a misunderstanding, and the concerns with regard to the keyboard layout can be accommodated adequately (noting that this is a rare character).
Recommendation: create a formal proposal for the encoding of a DOT REPH character, to be acted on at the May UTC meeting. This proposal should concentrate on justifying the need for the encoding of a separate character, explaining the similarity with <RA, VIRAMA>, and discussing the keyboard layout concerns.
3.2 Malayalam Chandrakkala
- all occurrences of the chandrakkala sign are represented using U+0D4D MALAYALAM SIGN VIRAMA, including when used for samvruthokaram
- if a chandrakkala is used in conjunction with a vowel sign (e.g. U+0D41 MALAYALAM VOWEL SIGN U), the vowel sign comes first in the representation: <..., U+0D41, U+0D4D>
- the Malayalam base characters (including vowel letters), digits, dash, dotted circle and no-break space are appropriate bases for VIRAMA.
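The ordering rule above can be illustrated with a minimal Python sketch. The helper names (samvruthokaram, vowel_sign_precedes_virama) and the choice of U+0D15 MALAYALAM LETTER KA as the sample base are assumptions made only for illustration; the check is a simple heuristic for this one vowel sign, not a full validator of Malayalam text.

```python
import unicodedata

VIRAMA = "\u0D4D"        # MALAYALAM SIGN VIRAMA (chandrakkala)
VOWEL_SIGN_U = "\u0D41"  # MALAYALAM VOWEL SIGN U
KA = "\u0D15"            # MALAYALAM LETTER KA (sample base, chosen for illustration)


def samvruthokaram(base: str) -> str:
    """Build <base, U+0D41, U+0D4D>: the vowel sign comes before the chandrakkala."""
    return base + VOWEL_SIGN_U + VIRAMA


def vowel_sign_precedes_virama(text: str) -> bool:
    """Heuristic: flag text in which VIRAMA is immediately followed by VOWEL SIGN U."""
    return (VIRAMA + VOWEL_SIGN_U) not in text


sequence = samvruthokaram(KA)
names = [unicodedata.name(ch) for ch in sequence]
for ch, name in zip(sequence, names):
    print(f"U+{ord(ch):04X} {name}")
```

Both characters keep their relative order under NFC normalization, so the recommended order has to be produced by the input method or text source rather than restored afterwards.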
3.3 Malayalam Sorting
Recommendation: change the DUCET in the Unicode Collation Algorithm (UCA) so that it accommodates the needs of the Malayalam language with as little tailoring as possible.
4.1 Annotation of U+0FD5
The requested annotation is present in FPDAM 6. No action.
4.2 Encoding of Vedic
- add a separate annotation “used for insertion of characters” to U+A8FA.
- change the name of U+1CD4 to VEDIC SIGN MID-CHARACTER SVARITA
L2/08-277R was already approved at the August UTC meeting and is in PDAM 7. No modifications are requested at this point.
- add the two characters proposed in L2/09-070; submit a specific proposal to WG2
- submit a revised 08-071 to WG2, with the following changes:
- §2: mention the two characters of L2/09-070
- §4: rephrase along the lines of “the usual mechanisms for hyphenation in Indic scripts apply to Kaithi”
- §5: the most common method to indicate an abbreviation is to use a circle-like shape, encoded as U+110BB KAITHI ABBREVIATION SIGN; other scribal traditions are considered to be idiosyncratic and out of scope for encoding
- §6: use the two already proposed characters U+110BC and U+110BD.