From: Julian Bradfield (firstname.lastname@example.org)
Date: Tue Dec 15 2009 - 13:17:23 CST
>On 12/15/2009 2:31 AM, Julian Bradfield wrote:
>> On 2009-12-14, Michael Everson <email@example.com> wrote:
>>> On 14 Dec 2009, at 20:56, Julian Bradfield wrote:
>> As Asmus has pointed out, the question then is, do you ask users to
>> understand this, and magically know that two apparently different
>> strings are actually the same?
>This is where the disconnect is, and where you may be misquoting me. The
>typical user knows a writing system but not the code sequence.
>Programmers have tools that make code sequences visible to them, so they
>can distinguish them. Correctly formatted and displayed, ordinary users
>cannot tell the difference between alternative code sequences for the
>same abstract character. That is as it should be, because what is
>encoded is the abstract character.
Yes - but how many users can distinguish the different abstract
characters (Latin) o, (Greek) ο and (Cyrillic) о ? I certainly
can't. Is this inherently different from the distinction between
precomposed and combining characters?
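
For what it's worth, here is a minimal Python sketch of how the two cases differ mechanically. It uses only the standard `unicodedata` module; the variable names and the `nfc` helper are illustrative and not from this thread. Normalization unifies the precomposed and combining spellings of one abstract character, but it leaves the three look-alike letters distinct. It says nothing about what users perceive, only about what the Standard's normalization forms do.

```python
import unicodedata

# Three visually similar letters from different scripts.
latin_o = "\u006F"     # LATIN SMALL LETTER O
greek_o = "\u03BF"     # GREEK SMALL LETTER OMICRON
cyrillic_o = "\u043E"  # CYRILLIC SMALL LETTER O

# Two code sequences for one abstract character, "e with acute".
precomposed = "\u00E9"  # LATIN SMALL LETTER E WITH ACUTE
combining = "e\u0301"   # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT


def nfc(s: str) -> str:
    """Return s in Normalization Form C (hypothetical helper name)."""
    return unicodedata.normalize("NFC", s)


# The raw code sequences differ, but normalization makes them equal.
raw_equal = precomposed == combining
same_after_nfc = nfc(precomposed) == nfc(combining)

# The three "o" letters stay distinct under normalization,
# even under the more aggressive compatibility form NFKC.
distinct_nfc = len({nfc(c) for c in (latin_o, greek_o, cyrillic_o)})
distinct_nfkc = len(
    {unicodedata.normalize("NFKC", c) for c in (latin_o, greek_o, cyrillic_o)}
)

for ch in (latin_o, greek_o, cyrillic_o):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
print("precomposed == combining:", raw_equal)
print("equal after NFC:", same_after_nfc)
print("distinct o letters after NFC/NFKC:", distinct_nfc, distinct_nfkc)
```

So a filename comparison that normalizes first would treat the two "é" spellings as the same name, while the three "o" letters would still need a separate confusables-style check.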
>Unix users have inherited the mess created by the design approach that
>was based on "character set independence". That approach seemed a nice,
>value-neutral way to handle competing character sets, until it became
>clear that it would in many instances lead to the creation of
>effectively uninterpretable byte-streams. Hence Unicode. But all of that
>is, of course, history.
I wonder why we didn't settle on ISO 2022-encoded filenames before
Unicode came along? Just because of the overhead? Or just because of
the timeline of non-ASCII use of computers?
>How the encoding relates an abstract character to code sequence(s), on
>the other hand, is well defined in the Standard.
But the definition of abstract character doesn't necessarily match
what users think!