Re: Zero Width Word Boundary

From: Doug Ewell (
Date: Thu Jan 29 2009 - 22:59:26 CST

    ɹɐzlnƃ ɟıʇɐ <atif dot gulzar at gmail dot com> wrote:

    > I have checked and could not find any Unicode character for word
    > separator (zero width space as WORD separator). This character/code is
    > needed for languages where space is not used as word separator. The
    > available zero width characters are incapable to address this issue.
    > e.g.
    > U+200B Zero Width Space: This character is intended for line break
    > control (In Lao language lines can be broken at syllable levels, Lao
    > uses U+200B to mark syllable boundaries).
    > ...

    According to Section 11.1 on Thai in TUS 5.0 (p. 376), and Section 16.2
    on layout controls (p. 535), U+200B ZERO WIDTH SPACE is the right
    character for marking word boundaries in languages like Thai which don't
    use visible spaces between words. I don't see why this would be
    different for Lao.
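
    As a rough illustration of that usage (a minimal sketch, not anything
    taken from TUS itself): the helper names join_words and split_words
    are hypothetical, and the two Thai words are only sample data. It
    shows U+200B carried in plain text as an invisible word-boundary
    marker and then recovered again.

```python
import unicodedata

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

def join_words(words):
    """Join already-segmented words with an invisible ZWSP between them."""
    return ZWSP.join(words)

def split_words(text):
    """Recover the words by splitting on ZWSP, dropping empty pieces."""
    return [w for w in text.split(ZWSP) if w]

# Sample data (an assumption for illustration): "language" + "Thai".
words = ["ภาษา", "ไทย"]
marked = join_words(words)

print(unicodedata.name(ZWSP))   # the character's formal Unicode name
print(split_words(marked))      # the original word list
```

    Note that Python's bare str.split() does not treat U+200B as
    whitespace, so the separator has to be passed explicitly as above.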

    Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14