RE: But E0000 Custom Language Tags Are Actually Required For Use By Unicode

From: Peter Constable (petercon@microsoft.com)
Date: Wed Mar 02 2005 - 17:04:30 CST

Next message: Peter Kirk: "Re: Unicode Stability"

Previous message: Kenneth Whistler: "Re: teh marbuta"
Maybe in reply to: UList@dfa-mail.com: "But E0000 Custom Language Tags Are Actually *Required* For Use By Unicode"
Next in thread: UList@dfa-mail.com: "Re: But E0000 Custom Language Tags Are Actually *Required* For Use By Unicode"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Mail actions: [ respond to this message ] [ mail a new topic ]

> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]
> On Behalf Of UList@dfa-mail.com

> 4. The sole correct way to access this local variation of the Cyrillic
> script is with a "language tag".

There is not *one, sole* correct way. There's absolutely nothing wrong
with someone creating a font that is specific to Serbian.

> 5. The local versions of the Greek script are *identical* in nature to
> this local version of the Cyrillic script. They are particular, local
> styles of the same international script system, as used by different
> Ancient Greek states. They are in fact referred to by the technical
> term "epichoric", meaning literally, "local".

The fact that there are typographic variations is common to both cases.
It is *not* obvious that they are otherwise identical, however. I know
that the Serbian italic variants are a conventional distinction from
Russian and other languages that is needed across all Serbian users. In
the Greek case, all I know is that there are a number of variations in
palaeographic texts. I have no idea to what extent sets of conventions
(iso-glyph categories) can be defined across all palaeographic Greek
texts, or to what extent sets of conventions can be defined in terms of
what modern users (the palaeographers of today) are looking for. It may
seem obvious to you that these two situations are identical, but that
won't be obvious to me until I have seen input from the Greek
palaeographic community as a whole.

> 7. The only correct way to access local variations of the Greek script
> is with "language tags".

Another invalid assumption (see #4).

> 9. The only correct way to access local variations of scripts,
> including variations of Cyrillic script, of Greek script, of Berber
> script, etc., is with "language tags".

Yet another invalid assumption.

> 10. *But* I have previously demonstrated, fairly obviously, that it is
> hardly practical for Microsoft to add long lists of OpenType "language
> tags" for something as obscure as extinct local variations of Greek
> script. It is certainly not practical for Microsoft to add lists of
> every possible local variation of every obscure script such as Berber.
>
> 11. *Therefore*, some kind of "custom language tag" system is a
> *requirement*, for Unicode to function as it is claimed it is
> *intended* to function.

An invalid conclusion based on invalid premises. You might be able to
make a case that a custom language tagging system might be a useful
thing to do (on which I will reserve judgment for now), but it is *not*
the case that this is needed for Unicode to function as it is claimed it
is intended to function.

> 12. This is not an obscure, personal desire of mine. It is an essential
> and inherent component of the approach Unicode itself has created (but
> perhaps failed to think through to its conclusion).

At the moment, I see no indication that it is anything more than a
personal desire.

> 13. Unicode has in fact created exactly this custom language tag
> system with the E0000 block.
> [LANGUAGE][x][-][custom_language_name][END LANGUAGE]. But then this
> system has been "strongly disrecommended" and therefore is not likely
> to be implemented by font technologies.

This is complete nonsense. Unicode encoded characters that can be used
as metadata would be, but in the absence of markup mechanisms. They have
been called "language tag" characters because they were requested (and
reluctantly granted) specifically for a protocol that wanted to add
language identification to text, but that does not entail that they must
be used for that purpose only -- it doesn't specify anything about
*where* they should be used or with what meaning. This does not imply
that Unicode has created a language tagging system, however, and much
less a custom language tag system.
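
For readers unfamiliar with the block under discussion, here is a minimal sketch of how a tag string maps onto the E0000 tag characters. It relies only on two facts from the Unicode code charts: U+E0001 is LANGUAGE TAG, and the tag characters U+E0020..U+E007E mirror printable ASCII at a fixed offset of 0xE0000. The function names and the tag value "x-epichoric" are hypothetical, chosen purely for illustration; nothing here implies that any font technology interprets such a sequence.

```python
LANGUAGE_TAG = 0xE0001  # U+E0001 LANGUAGE TAG
TAG_OFFSET = 0xE0000    # tag characters U+E0020..U+E007E mirror ASCII 0x20..0x7E


def encode_language_tag(tag: str) -> str:
    """Return LANGUAGE TAG followed by the tag-character form of `tag`."""
    if not all(0x20 <= ord(c) <= 0x7E for c in tag):
        raise ValueError("tag must consist of printable ASCII characters")
    return chr(LANGUAGE_TAG) + "".join(chr(TAG_OFFSET + ord(c)) for c in tag)


def decode_tag_run(text: str) -> str:
    """Recover the ASCII form of any tag characters in `text`, skipping the rest."""
    return "".join(
        chr(ord(c) - TAG_OFFSET) for c in text if 0xE0020 <= ord(c) <= 0xE007E
    )


# "x-epichoric" is a made-up private-use style tag, used only as sample input.
encoded = encode_language_tag("x-epichoric")
print([f"U+{ord(c):04X}" for c in encoded])
print(decode_tag_run(encoded))
```

The sketch shows only that the characters can carry an arbitrary ASCII string; what a receiving protocol does with that string, if anything, is outside what the character encoding itself defines.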

> 14. THEREFORE, in order to make it actually possible to use Unicode's
> *own* stated and vigorously defended philosophy on the sole correct
> means of accessing local script variants -- for local script variants
> which are too obscure to receive official language tags -- Unicode
> must do one of the following:

I don't think there's any need to comment on these conclusions when the
argumentation that led to them is flawed.

> - you may have noticed from this discussion that it seems "(local)
> script tags" are more appropriate than "language tags" for all these
> matters, including Serbian 't';

You may have noticed that I clarified from the outset (when you first
asked about this on the OpenType list) that OpenType "Language-System"
tags are not the same as language tags.
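
To make the distinction concrete, here is a rough sketch of the kind of lookup an application might perform to get from a language tag on the text to an OpenType language-system tag in a font. "SRB " and "RUS " are registered OpenType language-system tags (four characters, space-padded); the table contents, the function name, and the idea that an application does the mapping this way are assumptions for illustration only.

```python
from typing import Optional

# Illustrative subset only: keys are primary language subtags as found in
# text metadata, values are four-character OpenType language-system tags.
LANGSYS_FOR_LANGUAGE = {
    "sr": "SRB ",  # Serbian
    "ru": "RUS ",  # Russian
}


def langsys_tag(language_tag: str) -> Optional[str]:
    """Map a language tag to a language-system tag, or None if unmapped."""
    primary = language_tag.split("-")[0].lower()
    return LANGSYS_FOR_LANGUAGE.get(primary)


print(langsys_tag("sr"))     # a language-system tag exists for Serbian
print(langsys_tag("ru-RU"))  # region subtag is ignored in this sketch
print(langsys_tag("grc"))    # no entry in this table, so None
```

The two identifiers live in different namespaces: one describes the content, the other selects typographic behaviour in a font, and the mapping between them is a separate step that some piece of software has to supply.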

Peter Constable


This archive was generated by hypermail 2.1.5 : Wed Mar 02 2005 - 17:05:36 CST
