Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

From: Alastair Houghton via Unicode <>
Date: Tue, 16 May 2017 17:13:33 +0100

On 16 May 2017, at 17:07, Hans Åberg <> wrote:
>>>> HFS(+), NTFS and VFAT long filenames are all encoded in some variation on UCS-2/UTF-16. ...
>>> The filesystem directory is using octet sequences and does not bother passing over an encoding, I am told. Someone could remember one that used UTF-16 directly, but I think it may not be current.
>> No, that’s not true. All three of those systems store UTF-16 on the disk (give or take).
> I am not speaking about what they store, but how the filesystem identifies files.

Well, quite clearly none of those systems treat the UTF-16 strings as binary either - they’re case insensitive, so how could they? HFS+ even normalises strings using a variant of a frozen version of the normalisation spec.
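
To illustrate the point, here is a minimal Python sketch of why a case-insensitive, normalising filesystem cannot be comparing names as raw UTF-16. The `filename_key` helper is hypothetical, and it uses standard NFD plus Unicode case folding as a stand-in; it is not the actual HFS+, NTFS or VFAT comparison algorithm.

```python
import unicodedata


def filename_key(name: str) -> str:
    """Hypothetical comparison key: decompose (standard NFD), then case-fold.

    This only approximates what a normalising, case-insensitive
    filesystem does; real filesystems use their own tables.
    """
    return unicodedata.normalize("NFD", name).casefold()


precomposed = "\u00c5berg.txt"   # U+00C5 as a single code point
decomposed = "A\u030aberg.TXT"   # "A" + U+030A combining ring, different case

# The UTF-16 code units differ, so a binary comparison sees two names...
assert precomposed.encode("utf-16-be") != decomposed.encode("utf-16-be")

# ...but the normalising, case-insensitive key treats them as the same file.
assert filename_key(precomposed) == filename_key(decomposed)
```

Under these assumptions the two names are different octet sequences but identify the same file, which is the distinction being argued above.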

Kind regards,


Received on Tue May 16 2017 - 11:13:54 CDT
