Re: Questions on ZWNBS

From: Chris Jacobs (chris.jacobs@freeler.nl)
Date: Sat Aug 02 2003 - 15:27:33 EDT

    ----- Original Message -----
    From: "Theodore H. Smith" <delete@elfdata.com>
    To: <unicode@unicode.org>
    Sent: Saturday, August 02, 2003 12:32 PM
    Subject: Questions on ZWNBS

    > Hi list,
    >
    > I have some questions on the ZWNBS. While I don't actually need this
    > myself, someone I know needs this.
    >
    > > Where? Specifically, where does it say FEFF shouldn't be in a string?

    It does not say that.

    > > Certainly, FEFF shouldn't be considered a BOM anywhere but at the start
    > > of a string, but does it say you just can't use that value? And if so,
    > > how are you supposed to use a ZWNBSP?!
    > I'm thinking that 0xFEFF shouldn't be in a UTF16BE string, except at
    > the start right?

    Wrong!

    U+FEFF has two different uses: ZWNBS and BOM.

    In a UTF-16BE string (and also in a UTF-16LE string) it is _always_ a ZERO
    WIDTH NO-BREAK SPACE, and _never_ a BOM, regardless of whether it is at the
    beginning of the file or not.

    Not that there is much use for a ZWNBS at the beginning of a file, but
    suppose that you have a routine that removes BOMs at the beginning of
    files. Then it should _not_ remove a ZWNBS at the beginning of a UTF-16BE
    text, even though a ZWNBS there makes no sense.
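
    As an illustration of the distinction above, here is a minimal Python
    sketch. It relies on the standard "utf-16" and "utf-16-be" codecs; the
    helper name strip_leading_bom is hypothetical and only shows one way such
    a routine might decide when a leading U+FEFF may be removed.

```python
import codecs

# Bytes FE FF followed by U+0041 ("A") in big-endian order.
data = codecs.BOM_UTF16_BE + "A".encode("utf-16-be")

# Plain "utf-16": the leading FE FF is read as a BOM. It selects
# big-endian byte order and is not part of the decoded text.
as_utf16 = data.decode("utf-16")

# "utf-16-be": the byte order is already fixed by the encoding name,
# so the same two bytes decode to the character U+FEFF and are kept.
as_utf16be = data.decode("utf-16-be")


def strip_leading_bom(text: str, encoding: str) -> str:
    """Hypothetical helper: drop a leading U+FEFF only for plain UTF-16.

    For UTF-16BE and UTF-16LE a leading U+FEFF is a ZWNBS, not a BOM,
    so the text is returned unchanged.
    """
    name = codecs.lookup(encoding).name  # normalizes e.g. "UTF-16BE"
    if name == "utf-16" and text.startswith("\ufeff"):
        return text[1:]
    return text
```

    With this sketch, strip_leading_bom(as_utf16be, "UTF-16BE") leaves the
    leading U+FEFF in place, because in that encoding it is a character and
    not a byte order mark.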

    > For other kinds of UTF, I'm not sure if it is allowed or not. I know it
    > is allowed in UTF16LE. although discouraged.
    >
    > Instead of "can't use ZWNBS", I think that char is discouraged. Where
    > is the rule that discourages it?

    The use of U+FEFF as ZWNBS is afaik not discouraged.

    As for the use of UTF-16 with a BOM, I cannot cite a rule which
    discourages it, but it is something I would expect to be discouraged. Using
    UTF-16BE or UTF-16LE instead is much simpler.



    This archive was generated by hypermail 2.1.5 : Sat Aug 02 2003 - 16:04:49 EDT