Fwd: Re: Byte Order Marks

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Thu Apr 19 2001 - 17:00:33 EDT

Next message: Edward Cherlin: "Unicode motivation/horror stories (was RE: benefits of unicode)"
Previous message: Yves Arrouye: "RE: Byte Order Marks"

>Date: Thu, 19 Apr 2001 12:59:43 -0700
>To: Tomas McGuinness <tomas.mcguinness@cmg.nl>
>From: Asmus Freytag <asmusf@ix.netcom.com>
>Subject: Re: Byte Order Marks
>
>At 02:58 PM 4/19/01 +0200, you wrote:
>>If it's absent, is it safe to assume any particular order (i.e. big- or
>>little-endian)?

The default order is big-endian, but I wouldn't call that a 'safe'
assumption. In the most general case I would attempt autorecognition when
the data is unlabelled. Where a particular protocol's specification states
that the default order SHALL apply in the unlabelled case, the assumption
becomes that much stronger, of course.
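
To make the unlabelled case concrete, here is a minimal sketch in Python.
The function name guess_utf16_byte_order and the zero-byte counting
heuristic are my own assumptions for illustration, not something the
standard prescribes; the heuristic only helps for text with many code
points below U+0100, and anything else simply falls back to the
big-endian default.

```python
def guess_utf16_byte_order(data: bytes) -> str:
    """Return 'big' or 'little' as a best guess for UTF-16 data.

    Hypothetical helper: the zero-byte heuristic is an assumption
    chosen for illustration, not a rule from the Unicode Standard.
    """
    # A byte order mark (U+FEFF) at the start labels the data explicitly.
    if data[:2] == b"\xfe\xff":
        return "big"
    if data[:2] == b"\xff\xfe":
        return "little"

    # Unlabelled case: for code points below U+0100 the high byte is
    # zero. In big-endian data it sits at even offsets, in little-endian
    # data at odd offsets.
    even_zeros = data[0::2].count(0)
    odd_zeros = data[1::2].count(0)
    if even_zeros > odd_zeros:
        return "big"
    if odd_zeros > even_zeros:
        return "little"

    # No evidence either way (e.g. pure Han text): use the default
    # order, which is a fallback and not a safe conclusion.
    return "big"
```

For example, guess_utf16_byte_order("hello".encode("utf-16-le")) returns
'little', while text with no zero bytes at all gives the heuristic nothing
to work with and gets the default.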

A./

PS: As an aside, the SCSU encoder can be used to do this form of
autorecognition. If text shows much better compression in one byte order
than the other, that byte order is overwhelmingly likely to be the true
one. The exception would be strings of pure Han ideographs. For these it's
necessary to

This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:17:16 EDT