Date: Fri Nov 07 2003 - 05:34:05 EST
Peter Jacobi wrote,
> So, which codepoint sequence will imply the disjoint form and
> which will imply the ligated form? If 'Indic unification' still
> holds, the conjunct form always is the default and the disjoint
> form needs ZWNJ.
> IMHO this doesn't fit actual Tamil use well and raises a lot of
> practical problems.
> Either there must be an accepted list of these ligatures (but
> lists of archaic usage tend to grow), or one is bound to put a
> preemptive ZWNJ after every SHA VIRAMA in modern use, to prevent
> conjunct consonant forming.
> If this archaic ligature problem extends to other grantha
> consonants, even more preemptive ZWNJs are necessary for
> contemporary Tamil.
The Unicode string U+0BB2, U+0BC8 will display differently, depending
on which font is used. (லை)
Code2000 will display an old-fashioned ligature glyph, Latha will
show a more modern alternative, and TabAvarangal2
( http://www.geocities.com/avarangal )
will render the string in a proposed Tamil script-reform style.
Yet, the underlying encoded character string is constant.
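To make the point about the constant underlying string concrete, here is a minimal Python sketch that lists the code points in that string. The helper name `describe` is made up for illustration; the character names come from Python's standard `unicodedata` module. Whatever glyph a given font draws, this listing stays the same.

```python
import unicodedata

# The two-code-point string discussed above: LA followed by vowel sign AI.
TAMIL_LAI = "\u0BB2\u0BC8"

def describe(text):
    """Return (code point, character name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# The listing depends only on the encoded string, never on the display font.
for code_point, name in describe(TAMIL_LAI):
    print(code_point, name)
```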
It may be possible and desirable to treat these archaic ligature
forms similarly. Fonts designed for modern Tamil simply won't
include these archaic ligature glyphs, so it shouldn't be necessary
to insert ZWNJs all over the place in existing files.
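For comparison, the preemptive approach Peter describes would amount to a pass like the one sketched below over existing text. This is only an illustration under assumptions: it takes U+0BB6 as the code point for Tamil SHA (an assumption here, since SHA is the character under discussion), and the function name `block_sha_conjuncts` is hypothetical.

```python
import re

SHA = "\u0BB6"     # assumed code point for Tamil SHA
VIRAMA = "\u0BCD"  # TAMIL SIGN VIRAMA
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER

# Match SHA + VIRAMA only where a ZWNJ does not already follow.
_SHA_VIRAMA = re.compile(f"{SHA}{VIRAMA}(?!{ZWNJ})")

def block_sha_conjuncts(text):
    """Insert a ZWNJ after every SHA + VIRAMA that lacks one."""
    return _SHA_VIRAMA.sub(SHA + VIRAMA + ZWNJ, text)
```

Running the function twice gives the same result as running it once, so it would at least be safe to apply to files that already contain some ZWNJs. The point stands, though, that with font-level control of the archaic glyphs such a pass should not be needed.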
Anyone seeking to reproduce a Tamil classic would need to specify
an appropriate font which includes the archaic ligatures. Users
whose systems lacked the appropriate font would still be able
to read the document, however.
IMHO, it's important to preserve options for users to explicitly
control ligation in plain text. With these archaic Tamil ligatures,
an author *may* elect to insert ZWNJs and other appropriate
formatting characters to preserve such distinctions where
I'm still concerned about the SHRII ligature encoding, though.
Of course, it makes sense to treat the ligature as a conjunct
of SHA + RA + II, but SA + RA + II seems to have been the
"official" way to encode the ligature, so the proposed change
will break existing implementations.
It might be best to add the new SHA character without changing
the existing SHRII encoding (SA + RA + II).
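The compatibility worry can be illustrated with a short Python sketch. It assumes that both spellings carry a VIRAMA (U+0BCD) between the first consonant and RA, which the shorthand "SA + RA + II" leaves implicit, and that SHA sits at U+0BB6. The name `same_text` is hypothetical. The two sequences share no canonical equivalence, so software that searches or compares text using one spelling will not match the other.

```python
import unicodedata

VIRAMA = "\u0BCD"  # TAMIL SIGN VIRAMA (assumed present in both spellings)
RA = "\u0BB0"      # TAMIL LETTER RA
II = "\u0BC0"      # TAMIL VOWEL SIGN II

SHRII_WITH_SA = "\u0BB8" + VIRAMA + RA + II   # existing spelling, SA first
SHRII_WITH_SHA = "\u0BB6" + VIRAMA + RA + II  # proposed spelling, SHA first (assumed code point)

def same_text(a, b):
    """Compare two strings after canonical (NFC) normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# Normalization does not bridge the two spellings, so existing data
# encoded with SA would not match a search for the SHA form.
print(same_text(SHRII_WITH_SA, SHRII_WITH_SHA))
```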