Re: Concise term for non-ASCII Unicode characters

From: Daniel Bünzli <daniel.buenzli_at_erratique.ch>
Date: Tue, 29 Sep 2015 20:27:28 +0100

On Tuesday, 29 September 2015 at 19:50, Ken Whistler wrote:
> I agree that "scalar values greater than U+007F" doesn't just trip off the tongue,
> and while technically accurate, it is bad terminology -- precisely because it
> begs the question "wtf are 'scalar values'?!" for the average engineer.

And an average engineer knows how to look up definitions, and this one is precise and exceptionally well defined in the Unicode glossary, in stark contrast to the shady (and, for the newbie, deceiving) notion of "character" that you use subsequently in your message.

This is not "bad terminology"; it is *precise* terminology, and it is what I would like to see used in protocols and standards.
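
To show how directly the precise wording maps onto a check, here is a minimal sketch in Python. The function names are hypothetical, chosen only for illustration; the ranges follow the Unicode glossary definition of a scalar value (any code point except the surrogates U+D800 to U+DFFF).

```python
def is_scalar_value(cp: int) -> bool:
    """True if cp is a Unicode scalar value, i.e. a code point
    in U+0000..U+10FFFF that is not a surrogate (U+D800..U+DFFF)."""
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


def is_non_ascii_scalar_value(cp: int) -> bool:
    """True if cp is a scalar value greater than U+007F."""
    return is_scalar_value(cp) and cp > 0x7F
```

With this, U+00E9 qualifies, U+0041 does not (it is ASCII), and U+D800 does not (it is a code point but not a scalar value).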

Many programmers I talk to are confused by Unicode because their notion of Unicode "character" is a chaotic mix of scalar values, code points and their various *encodings* (i.e. byte level considerations).

Introducing more terminology to talk about that confused idea of Unicode is not going to help. Educating people about the difference between scalar values, code points and their various encodings will.
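
As an illustration of that difference, here is a small Python sketch. The helper names are hypothetical; the sketch relies only on the built-in `chr` and `str.encode`, and on the assumption that Python's strict UTF codecs refuse to encode surrogate code points.

```python
def encodings(scalar_value: int) -> dict:
    """Encode one scalar value in three encoding forms.
    The scalar value is the same in each case; only the bytes differ."""
    s = chr(scalar_value)
    return {
        "UTF-8": s.encode("utf-8"),
        "UTF-16BE": s.encode("utf-16-be"),
        "UTF-32BE": s.encode("utf-32-be"),
    }


def is_encodable(code_point: int) -> bool:
    """True if the code point can be encoded as UTF-8.
    Surrogates are code points but not scalar values, so they fail."""
    try:
        chr(code_point).encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# U+00E9 is one scalar value, but 2, 2 and 4 bytes respectively.
e_acute = encodings(0x00E9)

# U+1F600 is still one scalar value, but 4 bytes in every form here;
# in UTF-16 it takes two 16-bit code units (a surrogate pair).
emoji = encodings(0x1F600)
```

Here `is_encodable(0xD800)` is false while `is_encodable(0x00E9)` is true: the first is a code point only, the second is also a scalar value. The byte counts belong to the encodings, not to the scalar value itself.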

Best,

Daniel
Received on Tue Sep 29 2015 - 14:28:40 CDT