From: James Kass (email@example.com)
Date: Mon Nov 05 2007 - 04:49:05 CST
There does not appear to be a "rra" equivalent in Grantha, so
my speculation about the basis of formation of a possible
"shRii" ligature vs. the conventional "shrii" based on Grantha
letters seems unlikely.
> We are in the process of defining the standards for TAMIL (தமிழ்)
> CHARACTER CODE FOR INFORMATION INTERCHANGE in Sri Lanka.
> I would like to clarify a few things in summary.
> In summary, Unicode wanted to encode the Grantha Shri in Tamil
> in the following way (image1):
> 0BB6 + 0BCD + 0BB0 + 0BC0
> ஶ + ் + ர + ீ
> The old way is no longer suggested for future encoding:
> 0BB8 + 0BCD + 0BB0 + 0BC0
> What will be the encoding for image2?
> Since each new sequence added to the Unicode standard will "break"
> existing data, Unicode wanted to keep the encoding of the
> Grantha க்ஷ in Tamil in the following way:
> 0B95 + 0BCD + 0BB7
> க + ் + ஷ
For point (1), I think you are correct.
(1a) The preferred method of encoding "shrii" is
0BB6 + 0BCD + 0BB0 + 0BC0 (ஶ + ் + ர + ீ)
(1b) The old way for "shrii" is not recommended
for future text encoding, but is supported on Windows
starting with Windows 2000. Legacy data exists using
the old way, so the old way may be supported as long
as there is concern about backward-compatibility.
(1c) Many operating systems do not support the new way.
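The two spellings in (1a) and (1b) can be compared at the code point level. The following is only a minimal sketch using Python's standard unicodedata module; the constant names and the to_preferred helper are hypothetical names chosen for illustration, and whether legacy data should be converted at all is a separate decision.

```python
import unicodedata

# Preferred sequence from point (1a): SHA + VIRAMA + RA + II
SHRII_PREFERRED = "\u0BB6\u0BCD\u0BB0\u0BC0"
# Older sequence from point (1b): SA + VIRAMA + RA + II
SHRII_LEGACY = "\u0BB8\u0BCD\u0BB0\u0BC0"


def describe(text):
    """Return (hex code point, character name) pairs for each character."""
    return [(f"{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def to_preferred(text):
    """Hypothetical helper: rewrite the legacy spelling to the preferred one."""
    return text.replace(SHRII_LEGACY, SHRII_PREFERRED)


for code, name in describe(SHRII_PREFERRED):
    print(code, name)

# Normalization does not unify the two spellings, so a search for one
# will not match the other unless the data is converted explicitly.
nfc_legacy = unicodedata.normalize("NFC", SHRII_LEGACY)
nfc_preferred = unicodedata.normalize("NFC", SHRII_PREFERRED)
print(nfc_legacy == nfc_preferred)
```

Because the two sequences differ only in their first code point, searching and sorting will treat them as different text for as long as both spellings remain in circulation.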
(2) I don't know how or if the difference between image1
and image2 should be represented in plain text.
(2a) If image2 is only a variant of image1, then the difference
cannot be distinguished in plain text, and would require
rich text and a font change. (Using a variation sequence
here does not seem possible under the current language
of the standard because a variation selector character can
only be applied to a single base character.)
(2b) If image2 represents a different letter/ligature than
image1, then the difference should be distinguishable at
the character/plain text level, even if these different
letters/ligatures at some point became conflated with each
other and have been used interchangeably since then.
(2c) If image2 is a special form of image1 which represents
the god Luxmi and related concepts, and if image1 does not
represent the god Luxmi and related concepts, then image2
could be encoded as a symbol. Users may then wish to use
that symbol in running text in place of image1 wherever
they deem appropriate.
(3) Unicode wanted to keep the encoding of kssa as
0B95 + 0BCD + 0BB7 instead of adding a new character for
kssa because kssa can already be expressed in plain text
using Unicode. Adding a new character which can already
be expressed in Unicode would, for one thing, increase the
opportunities for spoofing/"phishing"/internet fraud.
(3a) Please see the page
for a more detailed explanation. The section under "Proposal
Guidelines" starting with "Often a proposed character can be
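The spoofing concern in point (3) can be illustrated with a small sketch. It assumes a purely hypothetical separately encoded kssa, represented here by the private-use code point U+E000; no such character exists in the standard, and the names below are invented for the example.

```python
import unicodedata

# kssa as already expressible in Unicode: KA + VIRAMA + SSA
KSSA_SEQUENCE = "\u0B95\u0BCD\u0BB7"
# HYPOTHETICAL: a private-use code point standing in for a separately
# encoded kssa letter. This is an assumption for illustration only.
KSSA_HYPOTHETICAL = "\uE000"


def same_text(a, b):
    """Compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The existing sequence is made of three ordinary Tamil characters.
names = [unicodedata.name(ch) for ch in KSSA_SEQUENCE]
print(names)

# A font could draw both strings identically, yet they compare unequal,
# which is the opening for look-alike names that point (3) describes.
print(same_text(KSSA_SEQUENCE, KSSA_HYPOTHETICAL))
```

With a single spelling for kssa, two strings that look the same are also the same sequence of code points, so this kind of mismatch does not arise.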
When I spoke of "breaking" applications for many users
when new sequences are added, I meant sequences like:
TAMIL LETTER SHRII;0BB6 0BCD 0BB0 0BC0
...which Peter Constable mentioned as a provisional named
sequence on 2006/07/26 on the public Unicode list.
The proposal for letter SHA
( http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2617.pdf )
mentions that "... SHA may also form ligatures in combination with
MA, YA, [R]RA, and VA. However, these ligatures are archaic and
are not widely recognized. Contemporary publications use only
Since the SHA character isn't well supported yet, and since
a combination like 0BB6 + 0BCD (ஶ்) displays on this system
with a dotted circle, any problem with existing data display
suddenly changing from contemporary disjointed forms into
archaic ligatures seems unlikely. (There probably isn't too
much existing data using the SHA character.)
But, if archaic ligatures exist for other letter combinations
(with letters other than SHA), and they become "named
sequences", then existing data would result in a "broken"
I apologize for my lengthy answers. Many of the concepts
involved are complex. I hope this is truly helpful. I'm also
sorry that I don't know the answer to point (2).