RE: Request for Information

From: Whistler, Ken
Date: Thu, 24 Jul 2014 17:45:25 +0000

Fantasai asked:

> I would like to request that Unicode include, for each writing system it
> encodes, some information on how it might justify.

Following up on the comment and examples provided by Richard
Wordingham, I'd like to emphasize a relevant point:

Scripts may be used for *multiple* (different) writing systems.

Rules for justification of text are aspects of writing systems,
orthographies, and typographical conventions -- and are not
inherent properties of scripts.

So while there may be strong tendencies for certain scripts to
fall into certain typographical practices, including behavior for
text justification, I don't think that information is inherent
to scripts per se. And it would be misleading and gardenpathy
for the Unicode Standard to try to treat justification as
somehow inhering in scripts.

Note also that there are many cases where even the edge cases of
script identity are in dispute -- where one script's behavior
bleeds into another's historically, or where it is unclear
whether certain elements are borrowed from another script into
a certain orthography, or are nativized elements borrowed
from another script *into* a script (thereby requiring
separate encoding).

I think it would make more sense to turn fantasai's query on its
head, as it were: First categorize what kinds of systems of
justification there are, and then start filling in, from best
understood out to the fringes of knowledge of practice, what
writing systems (using what script or combination of scripts)
are attested as regularly using each system. Lacunae are
inevitable, however.
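
As a rough sketch of what that inverted table might look like, here
is a small Python example. Everything in it is an assumption made for
illustration: the category names, the WritingSystem type, the
record() and systems_for() helpers, and the three sample entries are
hypothetical placeholders, not vetted data and not anything the
Unicode Standard defines. The point is only the shape: the key is a
justification system, the members are writing systems (a language
plus a script), and an empty answer is a legitimate result.

```python
from collections import defaultdict
from typing import Dict, NamedTuple, Set


class WritingSystem(NamedTuple):
    """A writing system identified by language and script, not script alone."""
    language: str  # language subtag, e.g. "en" (illustrative)
    script: str    # ISO 15924 script code, e.g. "Latn" (illustrative)


# Justification system -> writing systems attested as regularly using it.
# The category names used below are hypothetical labels, not a standard list.
ATTESTED: Dict[str, Set[WritingSystem]] = defaultdict(set)


def record(system: str, ws: WritingSystem) -> None:
    """Record that a writing system is attested as using a justification system."""
    ATTESTED[system].add(ws)


def systems_for(ws: WritingSystem) -> Set[str]:
    """Return the justification systems attested for ws (empty set = lacuna)."""
    return {system for system, members in ATTESTED.items() if ws in members}


# Illustrative placeholder entries only; real entries would need sources.
record("inter-word spacing", WritingSystem("en", "Latn"))
record("kashida elongation", WritingSystem("ar", "Arab"))
record("inter-character spacing", WritingSystem("ja", "Jpan"))

print(systems_for(WritingSystem("en", "Latn")))
# A writing system with no recorded practice yields an empty set.
print(systems_for(WritingSystem("xx", "Latn")))
```

Note that the lookup takes a writing system, so two entries sharing
a script code can legitimately give different answers, and a script
code by itself is not enough to ask the question.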

I think it is simply a mistake to infer, from a query on the Script
property value of a character, what justification rule should
apply to it in text.

Note also that for many scripts there is no established modern
typographical practice, so their justification rules are basically
unknown, or the question is meaningless. Modern typographers
setting old material will eventually make up the rules, and those
will *become* the answer, but the Unicode Consortium cannot
look at pictures of fragmentary Byzantine seals or fragments of
papyri and *determine* what some normative (or even informative)
property of justification should be for the script in such a


