Re: Visarga, ardhavisarga and anusvara -- combining marks or not?

From: verdy_p (verdy_p@wanadoo.fr)
Date: Tue Sep 08 2009 - 03:03:47 CDT

    > "Asmus Freytag" wrote:
    > > On 9/7/2009 6:13 PM, verdy_p wrote:
    > > > "Peter Constable" wrote:
    > > >
    > > >> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of Shriramana Sharma
    > > >>> To my mind, a combining mark is *usually* (though not always) something that
    > > >>> qualifies what is represented by a base character.
    > > >>>
    > > >> Nothing in Unicode dictates what function in relation to reading or linguistic
    > > >> interpretation a combining mark should have.
    > > >>
    > > >
    > > > Yes, but I still think that the main justification of the classification of a character as a combining mark
    > > > (spacing or not) must be looked for within collation, i.e. the top-level of analysis of text, where some
    > > > differences are considered less essential and then not given primary weights for searches, sorting,
    > > There are plenty of languages where there's no primary difference
    > > between some otherwise ordinary letters when it comes to sorting.

    Well, here is another way to think about the "letter or combining mark" nature of the three marks discussed
    initially in this thread:

    Would you write the visarga, ardhavisarga and anusvara marks at all in an Indic crossword game, or in an Indic
    version of Scrabble (which cannot have a lot of distinct tiles)?

    It's highly probable that they won't be written at all, because they won't match between horizontal and vertical
    words, and because crosswords use a simplified script in which these signs are treated as less important. In fact
    it's also probable that there will not be any occurrence of other vowel signs or viramas, and that the
    differences between dead and live consonants will not be used either, so there is no need to exhibit the
    differences between the various forms of dead consonants (in conjuncts, half forms or halant forms), or to exhibit
    the reph forms implied by the presence of virama after ra in most Indic scripts.
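    As a rough illustration of that "simplified script" idea, here is a minimal Python sketch that reduces a word to
    the letters a crossword might keep, by dropping every character whose General_Category is a mark (Mn, Mc or Me).
    The function name crossword_letters is hypothetical, and using the general category as the filter is only an
    assumption for illustration; a real game could well choose its tiles differently.

```python
import unicodedata

def crossword_letters(word: str) -> str:
    """Keep only non-mark characters (hypothetical crossword simplification).

    Assumption: every character with General_Category Mn, Mc or Me
    (vowel signs, virama, anusvara, visarga, ...) is simply dropped.
    """
    # NFD first, so precomposed letters (e.g. nukta forms) are handled
    # the same way as their decomposed spellings.
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )

# "संस्कृत": SA, ANUSVARA, SA, VIRAMA, KA, VOWEL SIGN VOCALIC R, TA
sanskrit = "\u0938\u0902\u0938\u094D\u0915\u0943\u0924"
print(crossword_letters(sanskrit))
```

    With this filter the anusvara, the virama and the vowel sign all disappear, and only the four consonant letters
    remain, which is roughly what a grid ignoring these signs would have to match.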

    But I may be wrong, and some crosswords may still attempt to match the vowels or the absence of their implicit
    vowels, without trying to match the variant forms of dead consonants (halant forms would be used throughout the
    grid, where appropriate, without creating conjuncts or using reph and half forms, and the additional visarga,
    ardhavisarga and anusvara would also be missing). I'm not sure, however, that crossword games even exist using
    Indic scripts; maybe they exist but only use full syllabic clusters (but then words would become very short and
    the grid would probably contain too many black cells, reducing the interest of the game).
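    To give an idea of how short words become when each cell holds a full syllabic cluster, here is a minimal Python
    sketch of a deliberately naive splitter. It is not an implementation of the Unicode text segmentation rules: it
    assumes Devanagari input, attaches every mark to the preceding cluster, and attaches a consonant to the previous
    cluster only when it directly follows the virama (U+094D). ZWJ/ZWNJ and other scripts are not handled, and the
    name syllabic_clusters is hypothetical.

```python
import unicodedata

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA

def syllabic_clusters(word: str) -> list[str]:
    """Naive split of a Devanagari word into syllabic clusters (sketch only)."""
    clusters: list[str] = []
    prev = ""
    for ch in word:
        is_mark = unicodedata.category(ch).startswith("M")
        # Marks extend the current cluster; so does a letter right after a virama.
        if clusters and (is_mark or prev == VIRAMA):
            clusters[-1] += ch
        else:
            clusters.append(ch)
        prev = ch
    return clusters

# "संस्कृत" is 7 code points but only 3 clusters under these assumptions.
sanskrit = "\u0938\u0902\u0938\u094D\u0915\u0943\u0924"
print(len(sanskrit), len(syllabic_clusters(sanskrit)))
```

    A seven-character word shrinking to three cells is the kind of effect that would leave many short words and
    black cells in a grid.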

    [OT]
    Well, I searched a bit on the Internet, and found this free online game (programmed in Java) using conjuncts (with
    freely playable vowel signs and halant signs, named 'swar' there, that you can add to your small list of playable
    tiles using only full consonants) in a Devanagari version of a Scrabble-like game for Hindi:

    http://www.imgnsoft.com/sabdvyauh/SabdvyaUh-H.html

    (Note that it does not use the Unicode encoding for handling words and letters, but some hacked version of ASCII
    and some "high bytes" characters. I'm not sure that this is actually an ISCII encoding; the specific Devanagari
    font it uses is not Unicode-encoded. There's also a problem in the licensing terms, given the borrowed GPL code
    that it uses in its installed version; however, a part of the Java source code is provided online.)

    The author admits that the principle of crossing meaningful words makes it difficult to create a playable game,
    including when he tried to adapt it to Indic scripts other than Devanagari. Needless to say, controlling the
    various forms of dead consonants is ignored here, where these letters are combined to create conjuncts (so there's
    no use of ZWJ here, they are just displayed in their default form), and he had to limit the complexity of
    consonant conjuncts. It's also quite difficult to put words on the grid that will cross several words (as they
    will most often fill only 2 to 4 squares per word, it will be hard to get score differences, since randomness in
    tile selection will play a greater role than with alphabetic scripts), and the game can easily become locked
    before all playable tiles are put on the grid (this can greatly restrict the Hindi vocabulary that can actually
    be played in this game).
    [/OT]


