Re: Tentative Definition of Casefolding

From: Mark Davis
Date: Wed Jun 14 2006 - 14:03:34 CDT


    There was discussion of this in the UTC a few meetings ago, and as I recall,
    the conclusion was that it was more appropriate for CLDR, since it is a
    language-sensitive issue. (However, I don't think the discussion was
    captured in the minutes.)

    In any event, I filed a placeholder bug at Participants in
    this email discussion can file a Reply there with any comments or


    On 6/14/06, Keutgen, Walter wrote:
    > Jeroen,
    > I understood that your contribution was informative; so was mine. It is
    > Richard who is searching for title-casing rules. I.e. he wants to avoid
    > changing "ff" at the beginning of a word, because some English aristocratic
    > names begin with "ff". And he used the Dutch "IJ" example, which you
    > contributed, for that purpose. And he also considered the case change of
    > proper nouns generally.
    > Title case applied to a whole sentence as a title, in the way I have seen
    > it in some American or English texts, just capitalizes the 1st letter of
    > each word. As we do not follow this usage in continental Europe, we would
    > select the words manually and we would have no problem with surnames,
    > provided we know the rules, which I did not for the Dutch letter. Is Mr.
    > "van Oostergem" a noble or not? Of course, software that correctly
    > title-cases a word in Dutch should know the "IJ" rule. I believe word
    > processors implement this rule as a typing aid.
    > You wrote: "Realistically we should be using the special glyph, but
    > almost everyone I know doesn't even realise we have a specialised glyph for
    > this". Authors care about letters, not glyphs; for them "ij" no longer
    > exists. People who reproduce text as artwork do care about glyphs.
    > Collation sequences vary with time. A German dictionary of 1962 shows
    > the handwritten "alphabet" both in Sütterlin and Latin script with "ä", "ö"
    > and "ü" after the "z", as in the Scandinavian collation sequences, whereas
    > the dictionary itself sorts these letters like "a", "o" and "u"
    > respectively, and just after them when the full words otherwise collide.
    > Does "ij" in the place of "y" imply that there was no "y"? Did the Dutch
    > not just put a diaeresis on the "y"?
    > If true, this is interesting. In the French-speaking part of Belgium,
    > people tend to write their names of Dutch or Flemish origin with a "y" instead
    > of "ij". In handwriting "ÿ" would look the same as "ij". When I say
    > people, one must know that long ago only the teachers, priests, civil
    > registrars and army secretaries wrote names authoritatively, and sometimes they
    > made mistakes, "corrected" or just transcribed what they heard. So my
    > mother's first name on the birth certificate, a document one does not read
    > too often, is "Catharina" whilst she was taught at school to write
    > "Katharina" – her teachers did not read the birth certificate. She was not
    > aware of "Catharina" until a civil registrar stopped her when signing
    > "Katharina ...". Similarly, the father of a colleague of 11 years ago
    > suddenly got a letter from the ministry saying that he had illegally changed his
    > name to "Pattyn" and urging him to revert to "Pattijn", including reprinting
    > his business stationery!
    > Best regards
    > Walter
    > -----Original Message-----
    > From: Jeroen Ruigrok/asmodai
    > Sent: Wednesday, 14 June 2006 18:37
    > To: Keutgen, Walter
    > Cc: Richard Wordingham
    > Subject: Re: Tentative Definition of Casefolding
    > Walter,
    > -On [20060614 18:16], Keutgen, Walter wrote:
    > >Surnames with words beginning with a lower case letter would not have this
    > >letter capitalized, except as 1st letter of the title being a sentence of its
    > >own.
    > You are, I think, forgetting about the case where you have a Johan van
    > Oostergem and a letter will address him as:
    > Geachte heer Van Oostergem [...].
    > The lower case start of the surname is in fact uppercased. Not sure if you
    > meant that with title as well, if so, mea culpa.
    > >Is it really the aim of Unicode to cover this all for the benefit of some
    > >universal routine? I doubt.
    > I doubt that too. My information was more informative I guess. ;)
    > >If I remember well what I have heard from a now retired Dutch colleague and
    > >read elsewhere, before the spelling reform of 1946-1947, the ligature "ij"
    > >[ει] was a letter on its own, between "i" and "j" in the collation sequence.
    > I can show you old 19th century dictionary files where the ij ligature
    > (derived from ii) was in the place in the collation sequence where nowadays y
    > is situated (u v w x y z).
    > >The ligatures exist in Unicode, U+0132 (IJ) and U+0133 (ij). Like the French
    > >"Œ" and "œ", they were not present on the typewriter. The decision was taken
    > >to write henceforth "ij" and "IJ". The "IJ" instead of "Ij" in title case
    > >could be the result of a victory of traditionalists (like the ending "isch"
    > >instead of "is").
    > Some philologist would need to verify this, but it might also have to do with
    > the double i which lies at the origin of ij.
    > But I wonder how much of that is relevant to this discussion. I do love to
    > hear of answers though. ;)
    > --
    > Jeroen Ruigrok van der Werven / asmodai
    > イェルーン ラウフロック ヴァン デル ウェルヴェン
    > If I am telling you the Truth now, do you believe it..?
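
The Dutch "IJ" rule discussed in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `titlecase_word` and `titlecase` and the `lang` parameter are hypothetical and belong to no Unicode or CLDR API, and the per-word approach is the naive "capitalize every word" style described above. It does not handle the other cases raised in the thread, such as "ff" names or "van" surnames.

```python
import re


def titlecase_word(word: str, lang: str = "") -> str:
    """Title-case one word; hypothetical helper for illustration only.

    With lang == "nl", a leading digraph "ij" becomes "IJ" (both letters
    capitalized). Otherwise only the first character is title-cased.
    """
    if not word:
        return word
    lower = word.lower()
    # Dutch tailoring: the two-letter sequence i + j at the start of a word.
    if lang == "nl" and lower.startswith("ij"):
        return "IJ" + lower[2:]
    # Default behaviour: title-case the first character only. The single
    # ligature character U+0133 maps to U+0132 here without special handling.
    return lower[0].title() + lower[1:]


def titlecase(text: str, lang: str = "") -> str:
    """Apply titlecase_word to every run of letters in text."""
    return re.sub(
        r"[^\W\d_]+",
        lambda m: titlecase_word(m.group(0), lang),
        text,
    )


print(titlecase("het ijsselmeer"))        # language-neutral default
print(titlecase("het ijsselmeer", "nl"))  # with the Dutch rule
```

Passing the language explicitly reflects the point made at the top of this message: the rule is language-sensitive, so a language-neutral default cannot apply it safely to every text containing "ij".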

    This archive was generated by hypermail 2.1.5 : Wed Jun 14 2006 - 14:16:38 CDT