RE: Unicode "no-op" Character?

From: Shawn Steele via Unicode <unicode_at_unicode.org>
Date: Sat, 22 Jun 2019 23:56:11 +0000

Assuming you were using any of those characters as "markup", how would you know when they were intentionally in the string and not part of your marking system?
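
A minimal sketch of that ambiguity, assuming U+FEFF is the marker and that cleanup is a plain strip of every occurrence (both choices are assumptions for illustration, not anything specified in this thread):

```python
# Assumed marker: U+FEFF, one of the candidates mentioned in the thread.
MARK = "\ufeff"

# A string whose author deliberately placed U+FEFF between "foo" and "bar".
original = "foo\ufeffbar"

# The marking system inserts its own marker after the first character.
marked = original[:1] + MARK + original[1:]

# Cleanup before presenting to the user removes every marker it finds,
# because nothing distinguishes the two occurrences.
cleaned = marked.replace(MARK, "")

print(repr(original))
print(repr(cleaned))
```

The cleaned string has lost the intentional U+FEFF as well, so it no longer round-trips to the original.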

-----Original Message-----
From: Unicode <unicode-bounces_at_unicode.org> On Behalf Of Richard Wordingham via Unicode
Sent: Saturday, June 22, 2019 4:17 PM
To: unicode_at_unicode.org
Subject: Re: Unicode "no-op" Character?

On Sat, 22 Jun 2019 17:50:49 -0400
Sławomir Osipiuk via Unicode <unicode_at_unicode.org> wrote:

> If faced with the same problem today, I’d probably just go with U+FEFF
> (really only need a single char, not a whole delimited substring) or a
> different C0 control (maybe SI/LS0) and clean up the string if it
> needs to be presented to the user.

You'd really want an intelligent choice between U+FEFF (ZWNBSP), or
better U+2060 (WJ), and U+200B (ZWSP).

> I still think an “idle”/“null tag”/“noop” character would be a neat
> addition to Unicode, but I doubt I can make a convincing enough case
> for it.

You'd still only be able to insert it between characters, not between code units, unless you were using UTF-32.
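
A rough illustration of the code-unit point, using U+2060 as a stand-in for the hypothetical no-op character (the names NOOP, units and broken are made up for this sketch):

```python
# U+2060 WORD JOINER stands in for a hypothetical "no-op" character.
NOOP = "\u2060"

# U+00E9 is one character but two code units (bytes) in UTF-8: C3 A9.
text = "\u00e9"
units = text.encode("utf-8")

# Insert the encoded no-op between the two code units of that character.
broken = units[:1] + NOOP.encode("utf-8") + units[1:]

# The lead byte C3 is no longer followed by its continuation byte,
# so the sequence is not well-formed UTF-8 any more.
try:
    broken.decode("utf-8")
    still_valid = True
except UnicodeDecodeError:
    still_valid = False

# Inserting at a character boundary is fine.
ok = (text + NOOP + text).encode("utf-8").decode("utf-8")

# In UTF-32 each character is exactly one 4-byte code unit, so every
# code-unit boundary is also a character boundary.
utf32_units = len(text.encode("utf-32-le")) // 4

print(still_valid, len(ok), utf32_units)
```

The same applies to UTF-16 for characters outside the BMP, where the insertion would land between the two halves of a surrogate pair.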

Richard.
Received on Sat Jun 22 2019 - 18:56:38 CDT
