L2/18-207

TAMIL homograph sequences for addition to chapter 12

Submitted by: A. Freytag

Date: 2018-06-25

The following are being discussed as homograph variants for purposes of the Label Generation Rules for the DNS Root Zone for Tamil.

Unlike all the other "exact" lookalikes considered there, they appear not to be covered in Unicode 11.0.0 chapter 12. Their common feature is the presence of TAMIL LETTER LLA (U+0BB3).

These might be useful additions for some table in chapter 12. Generally, users not deeply familiar with a script would not expect that kind of presentation overlap. The overlap is of particular concern for identifiers in public zones (any place where many people can register a public identifier with minimal constraints) and, of course, of particular importance in the Root Zone of the DNS.

(Note that all other homograph cases for Tamil are documented in chapter 12.)

In addition, it might be useful to add to the data for UTS #39 whichever of these sequences does not exist there already.

 

1. TAMIL LETTER AU with TAMIL LETTER O followed by TAMIL LETTER LLA:

This variant pair involves the pure vowel TAMIL LETTER AU (ஔ U+0B94), which looks identical to the vowel + consonant combination TAMIL LETTER O + TAMIL LETTER LLA (ஒள U+0B92 U+0BB3). The two can be confused even by a careful observer and hence are being proposed as variants.

 

Variant 1          Variant 2

ஔ                  ஒள
U+0B94             U+0B92 U+0BB3

Table 15: Proposed Variants - Set 1
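
As a hedged illustration of why this pair needs to be listed explicitly, the following Python sketch uses the standard `unicodedata` module to show that the two variants are not canonically equivalent. U+0B94 decomposes canonically to U+0B92 U+0BD7 (TAMIL AU LENGTH MARK), not to U+0B92 U+0BB3, so normalization alone does not unify them. The helper names `describe` and `canonically_equivalent` are assumptions made for this example only.

```python
import unicodedata

AU = "\u0B94"           # TAMIL LETTER AU
O_LLA = "\u0B92\u0BB3"  # TAMIL LETTER O + TAMIL LETTER LLA


def describe(text):
    """Return 'U+XXXX NAME' for every code point in text (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


def canonically_equivalent(a, b):
    """True if a and b are identical after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(describe(AU))
    print(describe(O_LLA))
    # The homograph pair is NOT unified by normalization.
    print(canonically_equivalent(AU, O_LLA))
    # The canonical decomposition of AU uses the AU length mark instead.
    print(describe(unicodedata.normalize("NFD", AU)))
```

Because the check returns False for the pair in the table, a registry that relies only on normalization would treat the two labels as distinct.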


2. TAMIL VOWEL SIGN AU with TAMIL VOWEL SIGN E followed by TAMIL LETTER LLA:

This variant pair involves the split matra TAMIL VOWEL SIGN AU (ௌ U+0BCC), which has left-side and right-side catenators that attach to the preceding consonant. It looks identical to the combination of the matra TAMIL VOWEL SIGN E (U+0BC6) followed by the consonant TAMIL LETTER LLA (U+0BB3).

Variant 1          Variant 2

ௌ                  ெள
U+0BCC             U+0BC6 U+0BB3

Table 16: Proposed Variants - Set 2
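
The two proposed variant sets could be applied in a label comparison roughly as follows. This is a minimal sketch, not the Root Zone LGR mechanism itself: the names `VARIANT_SETS`, `fold_variants`, and `are_variants` are hypothetical, and the folded string is meant only as a comparison key, never as a form to display or register (for example, a legitimate O + LLA sequence inside a word is also folded).

```python
import unicodedata

# (single code point, lookalike sequence) pairs taken from the two tables above.
VARIANT_SETS = [
    ("\u0B94", "\u0B92\u0BB3"),  # LETTER AU     vs LETTER O + LETTER LLA
    ("\u0BCC", "\u0BC6\u0BB3"),  # VOWEL SIGN AU vs VOWEL SIGN E + LETTER LLA
]


def fold_variants(label):
    """Map each lookalike sequence to its single-code-point variant.

    Assumption: NFC is applied first so that canonically equivalent
    spellings compare equal before the variant folding is done.
    """
    label = unicodedata.normalize("NFC", label)
    for single, sequence in VARIANT_SETS:
        label = label.replace(sequence, single)
    return label


def are_variants(a, b):
    """True if a and b differ as strings but fold to the same key."""
    return a != b and fold_variants(a) == fold_variants(b)


if __name__ == "__main__":
    ka = "\u0B95"  # TAMIL LETTER KA, used here as an arbitrary base consonant
    print(are_variants("\u0B94", "\u0B92\u0BB3"))            # Set 1
    print(are_variants(ka + "\u0BCC", ka + "\u0BC6\u0BB3"))  # Set 2
    print(are_variants("\u0B92", "\u0B94"))                  # unrelated pair
```

Folding toward the single code point is an arbitrary choice for this sketch; folding in the other direction would give an equally valid comparison key.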