Re: Why Work at Encoding Level? from Daniel Bünzli on 2015-10-21 (Unicode Mail List Archive)

From: Daniel Bünzli <daniel.buenzli_at_erratique.ch>
Date: Wed, 21 Oct 2015 14:16:07 +0100

On Wednesday, 21 October 2015 at 04:37, Mark Davis ☕️ wrote:
> If you're not, the question is relevant.

I'm not disputing the question, I'm disputing trying to give it a defined answer. Even if your strings are UTF-16 based, these problems can be solved by providing proper abstractions at the library level and asking clients to handle the problem *once*, when they inject UTF-16 strings into the abstraction, which can then operate in a "clean" world where these questions do not arise.
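To make the "handle it once at the boundary" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function name `scalars_of_utf16` is hypothetical, and replacing lone surrogates with U+FFFD is just one possible policy (a library could equally reject the input). Past this single entry point, everything works on Unicode scalar values and never sees a surrogate.

```python
from typing import Iterable, List, Optional

REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER


def scalars_of_utf16(units: Iterable[int]) -> List[int]:
    """Decode UTF-16 code units into a list of Unicode scalar values.

    Hypothetical boundary function: ill-formed input (lone surrogates)
    is dealt with here, once, by substituting U+FFFD. This policy is an
    assumption of the sketch, not something mandated by the standard.
    """
    out: List[int] = []
    pending_high: Optional[int] = None  # high surrogate awaiting its low half
    for u in units:
        if not 0 <= u <= 0xFFFF:
            raise ValueError(f"not a 16-bit code unit: {u!r}")
        if pending_high is not None:
            if 0xDC00 <= u <= 0xDFFF:
                # Well-formed pair: combine into a supplementary scalar value.
                out.append(0x10000 + ((pending_high - 0xD800) << 10) + (u - 0xDC00))
                pending_high = None
                continue
            # The pending high surrogate was not followed by a low one.
            out.append(REPLACEMENT)
            pending_high = None
        if 0xD800 <= u <= 0xDBFF:
            pending_high = u
        elif 0xDC00 <= u <= 0xDFFF:
            out.append(REPLACEMENT)  # low surrogate with no preceding high
        else:
            out.append(u)  # BMP code unit that is already a scalar value
    if pending_high is not None:
        out.append(REPLACEMENT)  # input ended in the middle of a pair
    return out


def is_scalar_value(cp: int) -> bool:
    """True for code points in U+0000..U+10FFFF excluding surrogates."""
    return 0 <= cp <= 0x10FFFF and not 0xD800 <= cp <= 0xDFFF
```

Code downstream of `scalars_of_utf16` can then rely on the invariant that every element satisfies `is_scalar_value`, so questions about unpaired surrogates simply do not come up there.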

Besides, programming languages do evolve, and one should at least make sure that new languages provide adequate abstractions for handling Unicode text. Looking at the recent batch of new languages, I don't think this is happening. I'm sure language designers are keen on taking off-the-shelf designs for this rather than getting into the details, but I would say that TUS, by defining notions of Unicode strings at the encoding level, is not doing a very good job of providing one.

FWIW, when I got into the standard around 2008 by reading that thick hard copy of TUS 5.0, it took me quite some time to actually understand and uncover the real structure behind Unicode, which is the scalar values.

Best,

Daniel
Received on Wed Oct 21 2015 - 08:17:37 CDT
