Re: The rules of encoding (from Re: Missing geometric shapes)

From: Mark E. Shoulson
Date: Thu, 08 Nov 2012 19:39:53 -0500

On 11/08/2012 01:48 AM, William_J_G Overington wrote:
> Michael Everson wrote:
>> ... collect examples of these in print ...
> Mark E. Shoulson wrote:
>> We don't encode "it would be nice/useful." We encode *characters*, glyphs that people use (yes, I know I conflated glyphs and characters there.)
> ...
>> Unicode isn't a system for encoding ratings. It's a system for encoding what people write and print.
> I have at various times, as research has progressed, deposited with the British Library pdf documents that I have produced and published and I have deposited with the British Library TrueType fonts that I have produced and published and I have received email receipts for them.
> Some of the pdf publications contain new symbols, used intermixed with text in a plain text situation. I have used Private Use Area encodings for the symbols.
> Yet the publications have not been published in hardcopy form.

I think you may be taking me too literally. A PDF document that is
essentially a proxy for a printed page (only cheaper to copy and
produce) would count, to me, as usage "in print." I don't make the
rules, but I think some of the Unicoders who do would agree. The charge
that the rules are "out of date" because they demand usage is not
accurate, and the distinction between print and electronic usage is a
red herring.

I have long complained about another writing system which I felt had
trouble being encoded due to chicken-and-egg issues (Klingon), but even
so people have been using it in the PUA; see
(now defunct, apparently, but the site is still there), and the KLI's
collection of Qo'noS QonoS is available in Latin letters or in pIqaD in PUA.

I agree that there is something to the charge of chicken-and-egg issues
with encoding writing systems (you can't write it until it's encoded,
you can't encode it until it's written), but the problem probably lies
in how much usage has to be demonstrated, not in the requirement that
there be SOME usage.

I stand by it: we don't encode what would be cool to have. We encode
what people *use*.

Received on Thu Nov 08 2012 - 18:41:25 CST
