Re: Corner cases (was: Re: UTF-16 Encoding Scheme and U+FFFE)

From: Steffen Nurpmeso
Date: Fri, 06 Jun 2014 13:14:47 +0200

"Doug Ewell" wrote:
 |Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
 |> Not necessarily true.
 |> [602 words]
 |This has nothing to do with the scenario I described, which involved
 |removing a "BOM" from the start of an arbitrary fragment of data,
 |thereby corrupting the data because the "BOM" was actually a ZWNBSP.
 |If you have an arbitrary fragment of data, don't fiddle with it.
 |If you know enough about the data to fiddle with it safely, it's not

E.g., on the all-UTF-8 Plan 9 research operating system:

  ?0[9front.update_bomb_git]$ git ls-files --with-tree=master --|wc -l
  ?0[9front.update_bomb_git]$ git grep -lI "`print '\ufeff'`" master|wc -l
  ?0[9front.update_bomb_git]$ git grep -lI "`print '\ufeff'`" master
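To make the quoted point concrete, here is a minimal sketch in Python. It is an illustration only, not code from this thread. The names strip_bom, contains_feff and the at_stream_start flag are hypothetical. The sketch assumes the caller knows whether the text really begins a stream. If it does not know, the safe choice is to leave U+FEFF alone, since in the middle of data it may be a ZWNBSP.

```python
BOM = "\ufeff"               # U+FEFF: a BOM at stream start, ZWNBSP elsewhere
BOM_UTF8 = BOM.encode("utf-8")  # the three bytes EF BB BF


def strip_bom(text: str, at_stream_start: bool) -> str:
    """Drop a leading U+FEFF only when the text is known to start a stream.

    For an arbitrary fragment (at_stream_start=False) the text is returned
    untouched, because a leading U+FEFF may be real content.
    """
    if at_stream_start and text.startswith(BOM):
        return text[len(BOM):]
    return text


def contains_feff(data: bytes) -> bool:
    """Report whether UTF-8 encoded data contains U+FEFF anywhere.

    A rough analogue of grepping a tree for the character; it does not
    say whether a given occurrence is a BOM or a ZWNBSP.
    """
    return BOM_UTF8 in data


if __name__ == "__main__":
    fragment = "\ufeffabc"
    print(repr(strip_bom(fragment, at_stream_start=True)))
    print(repr(strip_bom(fragment, at_stream_start=False)))
    print(contains_feff("a\ufeffb".encode("utf-8")))
```

The flag is explicit on purpose: the decision to strip belongs to whoever knows where the data came from, not to the function that happens to see its first character.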

