Re: Byte-order markers question

From: Will Howery (whowery@ix.netcom.com)
Date: Fri Jun 06 1997 - 02:40:46 EDT


Cary,

At 04:42 PM 6/5/97 -0700, you wrote:
>I have a question about byte-order markers which I think I know the answer
>to, but I wanted to run it by the discussion group. The question is,
>suppose I have a file composed not of one large stream of text, but multiple
>separate strings, i.e. strings contained within separate text boxes, such as
>a desktop publishing file, would then the byte-order indicator, which is
>also an indication that the text is Unicode as I understand it, be necessary
>at the front of each string? I would guess one of two options:
>
>1. yes, it is required in front of each *separate* string
>2. yes or no, depending on the application
>
>Comments? I hope the question is clear enough.

This question really implies a higher-level protocol for the storage of
text. If the text in question is going to be displayed in boxes etc., then
there would also be additional encoding to denote the boxes, format, and
position of the text. The higher-level protocol would most likely prescribe
the type of text along with the position and format information. I would
assume that all text inside this protocol would be in Unicode; if you had
mixed text, then the BOM could be useful as a marker for the Unicode strings.
But why mix the text? You have to process it anyway to apply the format, and
Unicode is designed to eliminate many of the hoops we used to jump through
in handling multilingual text.
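
As a rough illustration of the mixed-text case, here is a minimal Python
sketch that checks whether a stored string begins with a 16-bit Unicode BOM.
The function name detect_byte_order is hypothetical, and the sketch assumes
the strings are raw bytes in a 16-bit Unicode encoding form; it is not taken
from any particular application.

```python
import codecs

def detect_byte_order(data: bytes):
    """Return 'big' or 'little' if data starts with a 16-bit BOM, else None.

    Hypothetical helper: assumes data holds one stored string as raw bytes.
    """
    if data.startswith(codecs.BOM_UTF16_BE):   # bytes FE FF
        return "big"
    if data.startswith(codecs.BOM_UTF16_LE):   # bytes FF FE
        return "little"
    return None                                # no BOM: not marked as Unicode
```

A string with no BOM returns None, in which case the higher-level protocol
would have to say what the text is.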

In an application such as Ventura Publisher before version 7, where the text
was stored in files external to the application, the BOM would be required.
But if the text is stored inside the application's own file structure, it
would be more efficient to convert the text to a single byte order when it is
read in and to record any differences along with the format and placement
information required by the higher-level protocol.
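
The second approach might look roughly like the Python sketch below. The
TextBox record, its fields, and the import_external_text function are all
hypothetical, and the choice of big-endian as the internal byte order is an
assumption made only for the example.

```python
import codecs
from dataclasses import dataclass

@dataclass
class TextBox:
    """Hypothetical record in the application's own file structure."""
    x: int
    y: int
    text_utf16be: bytes  # always stored big-endian, with no BOM

def import_external_text(data: bytes, default_order: str = "big") -> bytes:
    """Normalize external 16-bit Unicode text to big-endian without a BOM.

    The BOM, if present, decides the byte order; otherwise default_order
    (an assumption supplied by the caller) is used.
    """
    if data.startswith(codecs.BOM_UTF16_BE):
        text = data[2:].decode("utf-16-be")
    elif data.startswith(codecs.BOM_UTF16_LE):
        text = data[2:].decode("utf-16-le")
    elif default_order == "big":
        text = data.decode("utf-16-be")
    else:
        text = data.decode("utf-16-le")
    return text.encode("utf-16-be")  # the "utf-16-be" codec writes no BOM

# One box, filled from a little-endian external file that carries a BOM.
box = TextBox(x=10, y=20, text_utf16be=import_external_text(b"\xff\xfeA\x00"))
```

With this layout no per-string BOM is needed inside the file, since the
container format itself fixes the byte order of every stored string.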

Will Howery
idioma Ltd.


