Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8 from Alastair Houghton via Unicode on 2017-05-16 (Unicode Mail List Archive)

From: Alastair Houghton via Unicode <unicode_at_unicode.org>
Date: Tue, 16 May 2017 17:13:33 +0100

On 16 May 2017, at 17:07, Hans Åberg <haberg-1_at_telia.com> wrote:
>
>>>> HFS(+), NTFS and VFAT long filenames are all encoded in some variation on UCS-2/UTF-16. ...
>>>
>>> The filesystem directory is using octet sequences and does not bother passing over an encoding, I am told. Someone could remember one that used UTF-16 directly, but I think it may not be current.
>>
>> No, that’s not true. All three of those systems store UTF-16 on the disk (give or take).
>
> I am not speaking about what they store, but how the filesystem identifies files.

Well, quite clearly none of those systems treat the UTF-16 strings as binary either - they’re case insensitive, so how could they? HFS+ even normalises strings using a variant of a frozen version of the normalisation spec.
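
As a rough illustration of why a case-insensitive, normalising filesystem cannot be comparing names as raw binary, here is a minimal Python sketch. It is not the actual HFS+, NTFS or VFAT algorithm: the helper names (binary_equal, folded_key, folded_equal) are hypothetical, and Python's current NFD and casefold() tables are assumed as stand-ins for the frozen, filesystem-specific tables those systems really use.

```python
import unicodedata


def binary_equal(a: str, b: str) -> bool:
    # Compare the raw UTF-16 code units, as a purely "binary" lookup would.
    return a.encode("utf-16-le") == b.encode("utf-16-le")


def folded_key(name: str) -> str:
    # Assumed stand-in for a filesystem lookup key: decompose, case-fold,
    # then decompose again in case folding produced composed characters.
    decomposed = unicodedata.normalize("NFD", name)
    return unicodedata.normalize("NFD", decomposed.casefold())


def folded_equal(a: str, b: str) -> bool:
    # Two names identify the same file if their lookup keys match.
    return folded_key(a) == folded_key(b)


# The same visible name, once precomposed in mixed case and once
# decomposed (base letter + combining acute accent) in upper case.
composed = "R\u00e9sum\u00e9.txt"
decomposed = "RE\u0301SUME\u0301.TXT"

print(binary_equal(composed, decomposed))
print(folded_equal(composed, decomposed))
```

The two names differ as UTF-16 code unit sequences, yet they produce the same key once decomposed and case-folded, which is the kind of identification a byte-for-byte comparison cannot provide.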

Kind regards,

Alastair.

--
http://alastairs-place.net

Received on Tue May 16 2017 - 11:13:54 CDT