Re: non-latin hyphenation?

From: Timothy Partridge (timpart@perdix.demon.co.uk)
Date: Mon Apr 06 1998 - 14:09:29 EDT


In message <9804061705.AA06296@unicode.org> you recently said:

> Perhaps some of you know, or know of a good source for this info:
>
> The background:
>
> In most languages that use Latin script, words may be broken (at
> certain allowed places) at the end of a line. We call this
> hyphenation, and the hyphen character is usually displayed at the end
> of the line to indicate the word is broken.
>
> I know that in some non-Latin scripts, words can be broken across a
> line but no symbol is used to indicate this. I believe many languages
> based on Ethiopic script behave this way.
>
> The question:
>
> What other conventions exist for denoting an unusual (e.g.,
> middle-of-word) line break? In particular, are there any natural
> languages that denote a word break by a mark at the beginning of the
> *second* line (rather than at the end of the first)?

Tibetan see page 6-59 of the Unicode Standard (three occurrences of the normal
end of word indicator)

Thai uses hyphen in same way as English (The Thai System of Writing, Mary R. Haas).

Japanese no indication, but prohibits line breaks at certain places reducing
ambiguity (Understanding Japanese Information Processing, Ken Lunde).

Italian (I know it's a Latin script!) Underline last character on line
instead of using hyphen (ECMA Standard 48 / ISO 6429). I have never seen
this used, but I live in England.

Arabic I have never seen anything resembling a hyphen in samples, but don't
know any language using this script. I suspect that kashida (stretching of
words using lengthened horizontal lines) is used to avoid anything as ugly
as a break inside a word.

    Tim

-- 
Tim Partridge. Any opinions expressed are mine only and not those of my employer


