graphemes (was: "textels")

From: Janusz S. Bień <jsbien_at_mimuw.edu.pl>
Date: Mon, 19 Sep 2016 08:40:05 +0200

On Sun, Sep 18 2016 at 21:40 CEST, christoph.paeper_at_crissov.de writes:
> Janusz S. Bien <jsbien_at_mimuw.edu.pl>:
>>
>> From the Unicode glossary:
>>
>>> Grapheme. (1) A minimally distinctive unit of writing in the context of a particular writing system.[...] (2) What a user thinks of as a character.
>>
>>> User-Perceived Character. What everyone thinks of as a character in their script.
>>
>> […] the definitions are language/locale dependent.
>
> A writing system is (usually) language-dependent, a script is not,
> although some scripts have been used exclusively (or prominently) in a
> single writing system with a single language.

It depends, of course, on what exactly you mean by script, and on which
meaning of the term is intended in the definition of User-Perceived
Character. But "a user" is definitely language/locale dependent :-)

> So definition (1) of ‘grapheme’ would be appropriate for linguistics,
> (2) maybe for typography and computer science, but it’s extremely
> vague.

I think that 'grapheme' (2) in the present wording is simply
incorrect. I suspect it is not used in the standard at all.

Searching the Unicode site I found only one use of 'grapheme' alone:

http://www.unicode.org/L2/L2000/00274-N2236-grapheme-joiner.htm

        Graphemes are sequences of one or more encoded characters that
        correspond to what users think of as characters.

I guess the intention of 'grapheme' (2) was to describe it without any
reference to computer encoding, which is definitely an extremely
difficult task.
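
To illustrate the encoding-based wording quoted above ("sequences of one
or more encoded characters"), here is a minimal Python sketch. The helper
name approx_clusters is hypothetical, and the rule it applies (attach each
combining mark to the preceding base character) is only a rough
approximation, not the segmentation algorithm of the standard, and it
ignores language/locale entirely.

```python
import unicodedata

def approx_clusters(text):
    """Group each base character with the combining marks that follow it.

    Rough approximation for illustration only (hypothetical helper);
    it does not implement the standard's segmentation rules.
    """
    clusters = []
    for ch in text:
        # combining() returns a non-zero class for combining marks
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

decomposed = "Bien\u0301"    # 'n' followed by U+0301 COMBINING ACUTE ACCENT
precomposed = "Bie\u0144"    # U+0144 LATIN SMALL LETTER N WITH ACUTE

# Number of encoded characters differs between the two spellings ...
print(len(decomposed), len(precomposed))
# ... while the number of approximate clusters is the same.
print(len(approx_clusters(decomposed)), len(approx_clusters(precomposed)))
```

Both spellings give the same number of clusters although they differ in
the number of encoded characters. A locale-dependent unit, such as a
digraph that users of one language count as a single letter, would not be
recognized by a rule like this.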

Best regards

Janusz

-- 
                           ,   
Prof. dr hab. Janusz S. Bien -  Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bien - University of Warsaw (Formal Linguistics Department)
jsbien@uw.edu.pl, jsbien@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/
Received on Mon Sep 19 2016 - 01:40:20 CDT
