Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

From: Denis Jacquerye <moyogo_at_gmail.com>
Date: Fri, 5 Jul 2013 08:04:35 +0100

On Thu, Jul 4, 2013 at 12:07 PM, Michael Everson <everson_at_evertype.com> wrote:
> On 4 Jul 2013, at 03:56, "Phillips, Addison" <addison_at_lab126.com> wrote:
>
>> I don't disagree with the potential need for changing the decomposition. That discussion seems clear and is only muddled by talking about variant, language sensitive rendering. That isn't the only consideration, right?
>
> No, Addison, we can't change the decomposition. That would invalidate all the data everywhere in Latvia.
>
>> I disagree that language tagging is not a valid means of getting language specific shaping (which could solve a specific problem). This is hardly confined to CJK or Latvian. Minority languages can, in fact, take advantage of it, within reason (documentation is a problem and it presupposes that glyph support is available). In fact, in some ways, language based glyph selection is possibly easier to achieve because the number of implementations is relatively small.
>
> The problem is in pretending that a cedilla and a comma below are equivalent because some script fonts in France or Turkey routinely use some sort of undifferentiated tick for ç. :-)

Sure, they are not equivalent, but stop pretending this happens only in
some script fonts. The page http://typophile.com/node/49347 has plenty
of examples that are not in script fonts. In some languages the cedilla
can have a shape similar to that of a comma; that is a fact. Any native
speaker will tell you that the comma-like form and other forms are
acceptable. Just look at lemonde.fr or zaman.com.tr: both very popular
newspapers use webfonts with a non-classic cedilla. Le Monde uses TheMix
(and in print it uses TheAntiqua, with its comma-like cedilla), and
Zaman uses a custom font with an attached tick-like cedilla.
These forms are not the majority, but they are frequent enough.

> As far as I can see the only solution is:
>
> Mandate that only the comma-below shape is appropriate for Ḑḑ Ģģ Ķķ Ļļ Ņņ Ŗŗ despite their decomposition to cedilla.
> Encode a set of undecomposable Dd Gg Kk Ll Nn Rr with invariant cedilla for display of that glyph with those base letters.
>
> The only strangeness here is that D̦d̦ G̦g̦ K̦k̦ L̦l̦ N̦n̦ R̦r̦ with genuine combining comma below are confusable with the Latvian/Livonian letters, but that is already the case.
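
For anyone who wants to check the decompositions under discussion, here
is a rough sketch using Python's standard unicodedata module. The names
LATVIAN_LETTERS, canonical_decomposition and uses_cedilla are made up
for this illustration; only unicodedata.normalize comes from the
library. It shows that the precomposed letters decompose to a base
letter plus U+0327 COMBINING CEDILLA, and that a base letter followed by
U+0326 COMBINING COMMA BELOW is not canonically equivalent to them.

```python
import unicodedata

COMBINING_CEDILLA = "\u0327"
COMBINING_COMMA_BELOW = "\u0326"

# The twelve precomposed letters quoted above (D, G, K, L, N, R with
# cedilla, upper and lower case), written as escapes so the source does
# not depend on how the mail client normalizes text.
LATVIAN_LETTERS = (
    "\u1E10\u1E11\u0122\u0123\u0136\u0137"
    "\u013B\u013C\u0145\u0146\u0156\u0157"
)


def canonical_decomposition(ch):
    """Return the canonical (NFD) decomposition of a string."""
    return unicodedata.normalize("NFD", ch)


def uses_cedilla(ch):
    """True if the canonical decomposition contains U+0327."""
    return COMBINING_CEDILLA in canonical_decomposition(ch)


# Print each letter's code point next to its decomposed sequence.
for ch in LATVIAN_LETTERS:
    nfd = canonical_decomposition(ch)
    print(f"U+{ord(ch):04X}", "->", " ".join(f"U+{ord(c):04X}" for c in nfd))

# A base letter with a genuine combining comma below does not compose to
# the precomposed Latvian letter, so the two stay distinct after NFC.
comma_below_g = "G" + COMBINING_COMMA_BELOW
precomposed_g = "\u0122"
same = (
    unicodedata.normalize("NFC", comma_below_g)
    == unicodedata.normalize("NFC", precomposed_g)
)
print("canonically equivalent:", same)
```

This only inspects the character data; it says nothing about which
glyph a given font draws for either sequence, which is the rendering
question being argued here.
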
>
>> None of this addresses the problem of plain text representation or the potential need to represent what are apparently different characters with a single encoding. But if it is just presentation we're talking about... how does this differ from, for example, Serbian vs Russian?
>
> What, the italic lowercase т? That is really not comparable to this issue.
>
> Michael Everson * http://www.evertype.com/

--
Denis Moyogo Jacquerye
African Network for Localisation http://www.africanlocalisation.net/
Nkótá ya Kongó míbalé --- http://info-langues-congo.1sd.org/
DejaVu fonts --- http://www.dejavu-fonts.org/
Received on Fri Jul 05 2013 - 02:09:15 CDT
