Re: [indic] Re: Feedback on PR-104

From: Sinnathurai Srivas (sisrivas@blueyonder.co.uk)
Date: Fri Sep 07 2007 - 08:31:42 CDT


    Peter,

    >>
    I will not debate the assertion that /ksh/ in Tamil phonology is a single phoneme.

    >
    Does this mean you will at least be ready to analyse whether ksh is two consonants (half plus half matrai) while x is one consonant (half matrai)?

    >>
    In Unicode, the entity of Tamil *writing* “க்ஷ” is given an encoded representation as a sequence of three encoding elements: 0B95 + 0BCD + 0BB7.

    >
    It is wrong.
    We use two consonant forms and x form.
    ie, we use actual k.sh form and x form.

    When it is written as two consonant forms, it automatically equates to Devanagari conjuncts and to Latin/English conjuncts such as KSH, PR, TR etc.
    Tamil deals with half Matrai (half consonants) in this way. (Note: Matrai in Tamil is not the same as Matra in Devanagari)

    Ie, you definitely NEED NOT DO ANYTHING to get Tamil conjunct equivalent.

    >
    When you say "Tamil *writing* “க்ஷ” is given ... 0B95 + 0BCD + 0BB7":
    What happens to the split form?
    What happens to the X form?
    And why is it given conjunct treatment, when the natural conjunct equivalent is 0B95 + 0BCD + 0BB7 without any glyph shape variation?

    லக்ஷ்மி, லக்.ஷ்மி, றிக்.ஷா
    Luxmi, Lukshmi, Riksha

    We use all of these forms.
    The conjunct is already taken care of by the advanced usage of anti-virama. There is no conjunct in Tamil of the kind Unicode is trying to impose.
    There is x form in Tamil.

    Additional References:
    P5: Devanagari conjuncts: http://unicode.org/~emuller/iwg/p5/index.html
    Some common Devanagari conjuncts have shapes that differ vastly from the combinations of their parts, and users do not understand how to represent text that renders with those conjuncts.

    Question:

    Can you let me know whether Devanagari does not use both two-consonant forms and conjunct forms interchangeably? I'm asking because even if you mistakenly impose conjuncts on Tamil, Tamil would then use both the two-consonant forms and the conjunct form of the same characters.

    Note: Even if your answer is yes, Tamil does not need complicated conjunct processing, as the functionality is already taken care of by the advanced usage of anti-Virama (puLLI).

    P5:

    There is no evidence that those letters represent anything more than what can already be represented with sequences. The sequences for those three conjuncts are already recorded in the draft NamedCompositeEntities.txt:

    >

    This citing of evidence by Unicode is untrue.

    see

    லக்ஷ்மி, லக்.ஷ்மி, றிக்.ஷா
    Luxmi, Lukshmi, Riksha

    And you are only trying to impose one and only one single character x as conjunct in Tamil. You could not even find another example of conjunct in Tamil to enforce your deliberate modification of Tamil writing system.

    Atomic Character. A character that is not decomposable.

    In Tamil, the forms in use are X and ksh. Do you mean ksh is not valid? Do you state that Tamil should not use ksh in its decomposed form? Does this mean Unicode wants to change Tamil Grammar for writing?

    Are you able to grasp the significance that Tamil has Grammar for writing and Tamil has Grammar for Alphabet? Unicode is dealing with writing. Probably the languages that you are familiar with may not use Grammar for writing. But Tamil does, and I think Unicode needs to give due consideration to such a differing scenario.

    You also wrote
    >>
    but because of pre-existing standards

    >
    No, not because of pre-existing standards, but because of Unicode's pretended misunderstanding and deliberate mistreatment of Tamil. This is a deliberate attempt to make Tamil toe your line.

    You know very well that puLLI is not the same as Virama. But you want to impose the functionality of Virama on Tamil, like dictators. You know very well that conjunct-like results are easily obtained by Virama, and yet you want to IMPOSE NON-EXISTENT conjunct theories on Tamil.

    We have enough onslaught to change Tamil by outside forces. We do not want another attempt by Unicode to change and undermine Tamil by forcing it to suit line by line of alien theories.

    Even Alphabet in Tamil defines symbols for places of articulation. Unicode defines Alphabet every other way except the Tamil way. If asked, you will say we do not define Alphabet, but represent graphic forms.

    There is no conjunct in Tamil.

    Sinnathurai

      ----- Original Message -----
      From: Peter Constable
      To: Unicode Mailing List ; indic@unicode.org
      Sent: 19 August 2007 14:31
      Subject: RE: [indic] Re: Feedback on PR-104

      Srivas:

       

      You’ve been writing a number of messages with statements like “Unicode breaks Tamil Grammar and linguistic science” or “Unicode does not understand pulli… or virama either”. I believe you have a mistaken expectation.

       

      In your message below, you say that a spectrum analysis will reveal that there is no such thing as a conjunct and that /ksh/ is two consonants. Spectral analyses and Unicode have no connection whatsoever. A spectrum analysis is applied to a wave phenomenon in nature. In this case, you are relating it to speech utterances. Unicode does not deal with speech sounds. Rather, Unicode pertains to the realm of graphic characters – the visual realm, not the audio realm.

       

      The notion of conjunct may not be relevant to phonetics and phonology, but it certainly is relevant in descriptions of writing, and even more relevant in descriptions of computer implementations of text.

       

      I will not debate the assertion that /ksh/ in Tamil phonology is a single phoneme. But note that Unicode does not encode phonemes. It encodes abstract text units that can be used to provide an encoded representation of text. In Unicode, the entity of Tamil *writing* “க்ஷ” is given an encoded representation as a sequence of three encoding elements: 0B95 + 0BCD + 0BB7. That encoded representation is not intended to be a commentary on Tamil grammar or phonology. It is nothing more than an encoded representation of the writing element for use in electronic text.
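
For readers who want to inspect that encoded representation, here is a minimal Python sketch. It assumes the sequence in question is TAMIL LETTER KA, TAMIL SIGN VIRAMA (the pulli) and TAMIL LETTER SSA; the names `KSSA` and `describe` are invented for this illustration and are not part of any standard.

```python
import unicodedata

# Assumed sequence for the written unit discussed above:
# KA (U+0B95) + pulli/virama (U+0BCD) + SSA (U+0BB7).
KSSA = "\u0B95\u0BCD\u0BB7"


def describe(text):
    """Return (code point, Unicode character name) pairs, one per encoded element."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for code_point, name in describe(KSSA):
    print(code_point, name)
```

The listing shows only which encoding elements make up the stored text. Whether a font draws them as one ligated shape or as two separate letters is decided at rendering time, not by the encoding.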

       

      In this case, the elements in Unicode’s encoded representation for an entity of Tamil writing, “க்ஷ”, do not correspond in a one-to-one manner with the Tamil phoneme represented by that written entity. But there is no requirement that Unicode does so. The only requirement on Unicode is that it be able to represent that written entity and to distinguish it from other written entities.

       

      Now, Unicode *could* have used a different encoded representation in which there is a one-to-one relationship; but because of pre-existing standards, it doesn’t. That doesn’t mean that Unicode is wrong, or that it breaks linguistic science. It is what it is, and it is perfectly adequate for its intended purposes in relation to representation of Tamil *text*.

       

      As long as you expect Unicode to reflect Tamil phonology or grammar, you will be disappointed, and you will have misunderstood the purpose. It is a wrong expectation.

       

       

      Peter

       

       

      From: indic-bounce@unicode.org [mailto:indic-bounce@unicode.org] On Behalf Of Sinnathurai Srivas
      Sent: Thursday, August 16, 2007 5:02 PM
      To: N. Ganesan
      Cc: Unicode Mailing List; indic@unicode.org
      Subject: [indic] Re: Feedback on PR-104

       

      There is no conjunct in Tamil. There is no conjunct in science. Run a spectrum analysis and it will reveal that there is no such thing as a conjunct. ksh is two consonants, which will naturally occur with vowels. x is another one. So Tamil can write what Islam or English or Sanskrit tries to write.

       

      There is no conjunct in Tamil. Unicode breaks Tamil Grammar and linguistic science.

       

      "N. Ganesan" <naa.ganesan@gmail.com> wrote:

    > There is no conjunct in Tamil

        The Grantha conjunct, k.ss exists in Tamil like all other Indian scripts.
        It is made up of two consonants, KA & SSA, hence a conjunct.
        We use ZWNJ to break the conjunct for Islamic, English loan words
        into Tamil.
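
The ZWNJ usage mentioned above can be sketched in Python as follows. This is an illustration under stated assumptions: ZWNJ (U+200C) is placed directly after the pulli (U+0BCD) to request the non-conjunct form, and the helper name `break_kssa_conjunct` is hypothetical. Whether the split form is actually displayed depends on the font and shaping engine in use.

```python
ZWNJ = "\u200C"   # ZERO WIDTH NON-JOINER
KA = "\u0B95"     # TAMIL LETTER KA
PULLI = "\u0BCD"  # TAMIL SIGN VIRAMA (pulli)
SSA = "\u0BB7"    # TAMIL LETTER SSA


def break_kssa_conjunct(text):
    """Insert ZWNJ after the pulli in every KA + pulli + SSA sequence (hypothetical helper)."""
    return text.replace(KA + PULLI + SSA, KA + PULLI + ZWNJ + SSA)


# LA, KA, pulli, SSA, pulli, MA, vowel sign I: the word written with the conjunct sequence.
word = "\u0BB2\u0B95\u0BCD\u0BB7\u0BCD\u0BAE\u0BBF"
split_word = break_kssa_conjunct(word)

print([f"U+{ord(ch):04X}" for ch in split_word])
```

The two strings differ by exactly one code point, so both the conjunct form and the split form can be stored and told apart in plain text.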

        Pl. refer to TUS for the conjunct k.ss in India's scripts.

        N. Ganesan

       

        


    This archive was generated by hypermail 2.1.5 : Fri Sep 07 2007 - 08:35:46 CDT