Re: Why people still want to encode precomposed letters

From: Karl Pentzlin
Date: Sun Nov 23 2008 - 16:48:19 CST

    On Sunday, 23 November 2008 at 22:01, philip chastney wrote:

    pc> A couple of quick questions. First, about how long would the list of
    pc> combinations be?
    pc> if we take 32-ish Latin characters, 24 Greek and 36-ish Cyrillic
    pc> characters, and double that for upper and lower case, we have 144 potential base characters
    pc> Combining Diacritical Marks (0300~036F) lists 112 characters
    pc> ...
    pc> we can refine that figure
    pc> Latin characters use about 40 marks, Greek perhaps half-a-dozen
    pc> (if we count the cases where 2 marks are used) and Cyrillic about 12
    pc> ( 32 × 40 ) + ( 24 × 6 ) + ( 32 × 12 ) = 1808 potential
    pc> combinations per case, which gives us a tighter limit of 3,600 combinations

    If you take into account that:
    - a lot of people (e.g. linguists and writers of North American indigenous
      languages) routinely attach 3 diacritical marks to a single base letter,
    - there are "double diacritics" which attach to arbitrary pairs of base letters,
    - there may eventually be "triple diacritics" which attach to arbitrary
      triplets of base letters,
    this number gets somewhat higher.
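
    To put a rough figure on "somewhat higher", here is a minimal sketch that
    reproduces the quoted per-case estimate and then allows up to three marks
    per base letter. The counting model is an assumption made only for
    illustration: each base letter may carry any unordered set of 1 to
    max_marks distinct marks drawn from its script's set. The function name
    and table are hypothetical, and double or triple diacritics spanning
    several base letters are not modelled at all.

```python
from math import comb

# (base letters per case, marks in use) per script, taken from the quoted estimate
SCRIPTS = {"Latin": (32, 40), "Greek": (24, 6), "Cyrillic": (32, 12)}


def combinations_per_case(max_marks=1):
    """Count base+mark combinations for one case (upper or lower).

    Assumption: a base letter may carry any unordered set of
    1..max_marks distinct marks from its script's mark set.
    """
    total = 0
    for bases, marks in SCRIPTS.values():
        # number of distinct mark sets of size 1..max_marks
        stacks = sum(comb(marks, k) for k in range(1, max_marks + 1))
        total += bases * stacks
    return total


if __name__ == "__main__":
    print(combinations_per_case(1))  # the quoted single-mark figure
    print(combinations_per_case(3))  # same model with up to three marks
```

    With max_marks=1 this gives the 1808 quoted above. Under the stated
    assumption, allowing three marks pushes the per-case count into the
    hundreds of thousands, dominated by the Latin term.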

    - Karl Pentzlin
