Re: Hebrew script in IDN (was Exemplar Characters)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat Nov 19 2005 - 06:56:55 CST


    From: "Richard Wordingham" <richard.wordingham@ntlworld.com>
    > Well, that rules out about half the words in Burmese! I suppose there's
    > the work around of replacing the virama - U+1039 U+200C ('VIRAMA' ZWNJ) -
    > by U+1039 U+005F ( 'VIRAMA' LOW LINE) - extremely unnatural for a
    > language that doesn't have spaces between words.

    Is the space separation really a problem for IDN usage, where it is arguable
    that explicit word separation is effectively needed, at least to avoid
    collisions in the name space?

    After all, the normal space is also forbidden in Latin domain names, so we
    use a hyphen: this hyphen does not have the traditional semantics found in
    normal language (where it is used for compound words), but it is a syntactic
    feature that decomposes labels into lists of non-compound word tokens to be
    used in domain names.

    What I mean is: does Burmese need ZWNJ in the *middle* of a word, or only
    between words to avoid collisions with the next word? If it occurs in the
    middle of a word, does it create a sort of compound word which would be
    interpreted differently if the word were broken into two tokens separated by
    a space? If this does not change the semantics, then even that ZWNJ can be
    excluded from IDN: you can use the syntactic ASCII hyphen to separate the two
    tokens, instead of using ZWNJ.
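
    To make the suggestion concrete, here is a minimal Python sketch of the
    substitution. The function name is hypothetical and this is not part of
    any IDN specification or library; it only assumes that a label can be
    split at each ZWNJ (U+200C) and the resulting tokens rejoined with the
    ASCII hyphen. Whether that preserves the meaning in Burmese is exactly
    the open question above.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def replace_zwnj_with_hyphen(label: str) -> str:
    """Split a label at each ZWNJ and rejoin the tokens with ASCII hyphens.

    Hypothetical helper for illustration only: empty tokens (from a leading,
    trailing, or doubled ZWNJ) are dropped so that no stray hyphen appears.
    """
    tokens = [token for token in label.split(ZWNJ) if token]
    return "-".join(tokens)


# Example: MYANMAR LETTER KA, VIRAMA (U+1039), ZWNJ, MYANMAR LETTER KA
sample = "\u1000\u1039\u200c\u1000"
converted = replace_zwnj_with_hyphen(sample)
```

    A label without any ZWNJ passes through unchanged, so the helper could be
    applied to every label before any further IDN processing.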


