Re: UTF-8N?

From: Robert A. Rosenberg (bob.rosenberg@digitscorp.com)
Date: Fri Jun 23 2000 - 13:50:21 EDT

Next message: Robert A. Rosenberg: "RE: How to distinguish UTF-8 from Latin-* ?"
Previous message: Robert A. Rosenberg: "RE: UTF-8 BOM Nonsense"
Maybe in reply to: Masahiko Maedera: "UTF-8N?"
Next in thread: John Cowan: "Re: UTF-8N?"

At 10:54 PM 06/22/2000 -0800, Doug Ewell wrote:
>Now that Unicode plans to deprecate the use of U+FEFF as ZWNBSP,
>programs that *expect* UTF-8 instead of SBCS will be able to throw away
>an initial U+FEFF with even greater confidence. It may even be possible
>for operating system developers to build this in at the OS level: open
>a UTF-8 text file; read characters; if the very first character in the
>file was U+FEFF then eat it. Applications would never even see it.
>How cool would that be?

It would be very UNCool unless the application can tell the operating
system that it wants this done. Otherwise the application has no way of
KNOWING that the edited stream the operating system passes it IS UTF-8
(and was identified as such by the deleted BOM) and not some other
character set that the program will fail on if it tries to parse it as
UTF-8. Letting the application SEE the BOM acts as a sanity check.
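
To make the opt-in idea concrete, here is a minimal Python sketch of what
such an interface might look like. The function name read_text and the
strip_bom flag are hypothetical, invented for illustration; no operating
system is claimed to offer this. The point is only that the caller asks
for the stripping and is still told whether a BOM was present.

```python
import codecs

def read_text(raw: bytes, strip_bom: bool = False):
    """Decode raw bytes as UTF-8 and report whether a BOM was present.

    Hypothetical helper: the BOM is removed only when the caller opts in
    via strip_bom, and the second return value always says whether the
    input started with the UTF-8 BOM (EF BB BF).
    """
    saw_bom = raw.startswith(codecs.BOM_UTF8)
    if saw_bom and strip_bom:
        # The caller asked for the BOM to be eaten before decoding.
        raw = raw[len(codecs.BOM_UTF8):]
    # Strict decoding: raises UnicodeDecodeError on bytes that are not
    # well-formed UTF-8, e.g. most Latin-1 text with accented letters.
    text = raw.decode("utf-8")
    return text, saw_bom
```

With strip_bom left at False the application still sees U+FEFF as the
first character and can use it as the sanity check; with strip_bom set
to True it gets clean text plus a flag saying the BOM was there.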


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:04 EDT