Re: Corner cases (was: Re: UTF-16 Encoding Scheme and U+FFFE)

From: Steffen Nurpmeso <sdaoden_at_yandex.com>
Date: Fri, 06 Jun 2014 13:14:47 +0200

"Doug Ewell" <doug_at_ewellic.org> wrote:
 |Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
 |> Not necessarily true.
 |>
 |> [602 words]
 |
 |This has nothing to do with the scenario I described, which involved
 |removing a "BOM" from the start of an arbitrary fragment of data,
 |thereby corrupting the data because the "BOM" was actually a ZWNBSP.
 |
 |If you have an arbitrary fragment of data, don't fiddle with it.
 |
 |If you know enough about the data to fiddle with it safely, it's not
 |arbitrary.

Yeah!
E.g., on the all-UTF-8 Plan 9 research operating system, 12 of the
44983 files in the tree contain U+FEFF somewhere:

  ?0[9front.update_bomb_git]$ git ls-files --with-tree=master --|wc -l
     44983
  ?0[9front.update_bomb_git]$ git grep -lI "`print '\ufeff'`" master|wc -l
        12
  ?0[9front.update_bomb_git]$ git grep -lI "`print '\ufeff'`" master
  master:9front.hg/lib/font/bit/MAP
  master:9front.hg/lib/glass
  master:9front.hg/sys/lib/troff/font/devutf/0100to25ff
  master:9front.hg/sys/lib/troff/font/devutf/C
  master:9front.hg/sys/lib/troff/font/devutf/CW
  master:9front.hg/sys/lib/troff/font/devutf/H
  master:9front.hg/sys/lib/troff/font/devutf/LucidaSans
  master:9front.hg/sys/lib/troff/font/devutf/PA
  master:9front.hg/sys/lib/troff/font/devutf/R
  master:9front.hg/sys/lib/troff/font/devutf/R.nomath
  master:9front.hg/sys/src/ape/lib/utf/runetype.c
  master:9front.hg/sys/src/libc/port/runetype.c
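
To make the "don't fiddle with it" rule concrete, here is a minimal
Python sketch, not anyone's actual implementation. The names
strip_leading_bom, is_stream_start and count_feff are hypothetical,
and the input is assumed to be valid UTF-8. A leading U+FEFF is only
dropped when the caller states that the bytes really are the start of
a stream; a fragment is passed through untouched, so a ZWNBSP
survives.

```python
# UTF-8 encoding of U+FEFF (BOM / ZERO WIDTH NO-BREAK SPACE).
BOM_UTF8 = b"\xef\xbb\xbf"


def strip_leading_bom(data: bytes, is_stream_start: bool) -> bytes:
    """Drop a UTF-8 signature only if the caller knows data starts a stream.

    is_stream_start is a hypothetical flag: the caller, not this function,
    must know whether the bytes are a whole stream or an arbitrary fragment.
    """
    if is_stream_start and data.startswith(BOM_UTF8):
        return data[len(BOM_UTF8):]
    # Arbitrary fragment (or no signature): leave it alone.
    return data


def count_feff(data: bytes) -> int:
    """Count U+FEFF occurrences anywhere in data (assumes valid UTF-8)."""
    return data.count(BOM_UTF8)


if __name__ == "__main__":
    whole = "\ufeffhello".encode("utf-8")
    fragment = "\ufeffworld".encode("utf-8")  # U+FEFF here is data, not a signature
    print(strip_leading_bom(whole, is_stream_start=True))
    print(strip_leading_bom(fragment, is_stream_start=False))
    print(count_feff(whole + fragment))
```

The flag is deliberately explicit: nothing in the bytes themselves can
tell a signature from a ZWNBSP, so the decision has to come from
whoever knows where the data came from.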

--steffen
_______________________________________________
Unicode mailing list
Unicode_at_unicode.org
http://unicode.org/mailman/listinfo/unicode
Received on Fri Jun 06 2014 - 06:16:08 CDT
