RE: The future of UTF-8

From: Paul Dempsey (Exchange) (paulde@exchange.microsoft.com)
Date: Thu Jul 22 1999 - 23:39:52 EDT


> -----Original Message-----
> From: Gianni Mariani [mailto:gianni@corp.webtv.net]
>
> The issue I have with BOMs is that if I have 2 "plain text"
> files and I do this kind of operation:
>
> type appendfile >> oldfile
>
> It's not guaranteed to work unless the consuming application
> processes multiple BOMs ...

This is not guaranteed to work because the command processor that's
doing "type" with redirection doesn't know about the file formats.
It's the command processor that's defective, NOT the use of BOM/file
signature.

It is a trivial matter to write a process that correctly concatenates files
with BOMs. I'm sure that someone on this list can promptly cough up a few
lines of Perl that do it.
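
For illustration, here is a minimal sketch of such a process, in Python
rather than Perl. The names decode_with_bom and concat_files are
hypothetical, and so are two assumptions: a file without a BOM is treated
as UTF-8, and the output is always written as UTF-8 with one BOM.

```python
import codecs

# UTF-32 BOMs come first because the UTF-32 LE BOM begins with the
# same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_with_bom(data, default="utf-8"):
    """Decode bytes by their leading BOM, if any, and drop the BOM.

    `default` is an assumption used only when no BOM is found.
    """
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return data[len(bom):].decode(encoding)
    return data.decode(default)


def concat_files(paths, out_path):
    """Write the text of every input file to out_path as UTF-8 with one BOM."""
    with open(out_path, "wb") as out:
        out.write(codecs.BOM_UTF8)  # the only BOM in the output
        for path in paths:
            with open(path, "rb") as f:
                out.write(decode_with_bom(f.read()).encode("utf-8"))
```

Calling concat_files(["oldfile", "appendfile"], "newfile") would then give
one file with a single signature at the front, even if the two inputs were
saved in different Unicode encoding forms.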

Your argument is not much different from expecting to be able to do a
byte-wise concatenation of a Shift-JIS file with a codepage 1252 (Windows
Western) file. These are both "plain text" files, but it fails miserably.
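
A small Python sketch of that failure, using made-up sample strings and a
hypothetical helper try_decode: whichever single encoding the consumer
picks for the joined bytes, it does not get the original text back.

```python
def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in `encoding`."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Sample text, chosen only for illustration.
japanese = "日本語"
western = "café"

# Byte-wise concatenation of two "plain text" files in different encodings.
combined = japanese.encode("shift_jis") + western.encode("cp1252")

# A consumer has to pick one encoding for the whole stream. Each choice
# either fails outright or produces text that differs from the original.
as_shift_jis = try_decode(combined, "shift_jis")
as_cp1252 = try_decode(combined, "cp1252")

print(repr(as_shift_jis))
print(repr(as_cp1252))
```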

I think that transparent byte-wise concatenation of files is a minor
consideration when designing the file format.

--- Paul


