RE: Tentative Definition of Casefolding

From: Keutgen, Walter
Date: Wed Jun 14 2006 - 12:49:09 CDT


    I understood that your contribution was informative; so was mine. It is Richard who is looking for title-casing rules: he wants to prevent "ff" at the beginning of a word from changing, because some English aristocratic names begin with "ff". He used the Dutch "IJ" example, which you contributed, for that purpose, and he also considered the case change of proper nouns generally.

    Title case applied to a whole sentence used as a title, in the way I have seen it in some American or English texts, just capitalizes the first letter of each word. As we do not follow this usage in continental Europe, we would select the words manually and would have no problem with surnames, provided we know the rules, which I did not for the Dutch letter. Is Mr. "van Oostergem" a noble or not? Of course, software that correctly title-cases a Dutch word should know the "IJ" rule. I believe word processors implement this rule as a typing aid.
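The per-word rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `dutch` flag and the `exceptions` set are hypothetical, words are taken to be separated by single spaces, and real title-casing would need proper word segmentation and locale data.

```python
def title_case_word(word: str, dutch: bool = False) -> str:
    """Capitalize the first letter of a word.

    With dutch=True (a hypothetical flag), a leading "ij" is treated
    as one letter and becomes "IJ" rather than "Ij".
    """
    if not word:
        return word
    if dutch and word[:2].lower() == "ij":
        return "IJ" + word[2:].lower()
    return word[0].upper() + word[1:].lower()


def title_case(text: str, dutch: bool = False,
               exceptions: frozenset = frozenset()) -> str:
    """Title-case every space-separated word of text.

    Words listed in `exceptions` (an assumed, manually maintained set,
    e.g. surnames that must keep a lower-case start) are left untouched.
    """
    return " ".join(
        w if w in exceptions else title_case_word(w, dutch)
        for w in text.split(" ")
    )


print(title_case("ijsselmeer en ijs", dutch=True))
print(title_case("letter to ffoulkes", exceptions=frozenset({"ffoulkes"})))
```

The exception set stands in for the manual selection of words mentioned above: the routine cannot know by itself which surnames keep a lower-case initial.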

    You wrote: "Realistically we should be using the special glyph, but almost everyone I know doesn't even realise we have a specialised glyph for this". Authors care about letters, not glyphs; for them the "ij" ligature no longer exists. People involved in reproducing text as artwork do care about it.

    Collation sequences vary with time. A German dictionary of 1962 shows the handwritten "alphabet" in both Sütterlin and Latin script with "ä", "ö" and "ü" after the "z", as in the Scandinavian collation sequences, whereas the dictionary itself sorts these letters like "a", "o" and "u" respectively, placing them just after the unaccented letter when the words are otherwise identical.
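The two orderings just described can be compared with a small Python sketch. It is an approximation under stated assumptions, not a real collation algorithm: the key function names are hypothetical, only the letters a-z plus "ä", "ö", "ü" are handled, and "ß" and other characters are ignored.

```python
import unicodedata


def dictionary_key(word: str):
    """Sort umlauts like their base letters; on a tie the plain form comes first."""
    lowered = unicodedata.normalize("NFC", word.lower())
    decomposed = unicodedata.normalize("NFD", lowered)
    # Primary key: the word with combining marks (the umlaut dots) removed.
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Secondary key: the accented word itself, which sorts after the plain one.
    return (base, lowered)


# Assumed alphabet with the umlauts placed after "z".
AFTER_Z = "abcdefghijklmnopqrstuvwxyzäöü"
RANK = {c: i for i, c in enumerate(AFTER_Z)}


def after_z_key(word: str):
    """Sort with "ä", "ö", "ü" as separate letters following "z"."""
    lowered = unicodedata.normalize("NFC", word.lower())
    return [RANK.get(c, len(AFTER_Z)) for c in lowered]


words = ["zug", "ofen", "öl", "apfel", "ärger", "bar", "bär"]
print(sorted(words, key=dictionary_key))
# ['apfel', 'ärger', 'bar', 'bär', 'ofen', 'öl', 'zug']
print(sorted(words, key=after_z_key))
# ['apfel', 'bar', 'bär', 'ofen', 'zug', 'ärger', 'öl']
```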

    Does "ij" in the place of "y" imply that there was no "y"? Did the Dutch not simply put a diaeresis on the "y"?
    If true, this is interesting. In the French-speaking part of Belgium, people tend to write their names of Dutch or Flemish origin with a "y" instead of "ij". In handwriting, "ÿ" would look the same as "ij". When I say people, one must know that long ago only teachers, priests, civil registrars and army secretaries wrote names authoritatively, and sometimes they made mistakes, "corrected" names, or just transcribed what they heard. So my mother's first name on her birth certificate, a document one does not read too often, is "Catharina", whilst she was taught at school to write "Katharina"; her teachers had not read the birth certificate. She was not aware of "Catharina" until a civil registrar stopped her when she was signing "Katharina ...". Similarly, the father of a colleague of 11 years ago suddenly got a letter from the ministry saying that he had illegally changed his name to "Pattyn"; he was urged to revert to "Pattijn", including reprinting his business forms!

    Best regards


    -----Original Message-----
    From: Jeroen Ruigrok/asmodai
    Sent: Wednesday, 14 June 2006 18:37
    To: Keutgen, Walter
    Cc: Richard Wordingham
    Subject: Re: Tentative Definition of Casefolding


    -On [20060614 18:16], Keutgen, Walter wrote:
    >Surnames with words beginning with a lower case letter would not have this
    >letter capitalized, except as 1st letter of the title being a sentence of its

    You are, I think, forgetting about the case where you have a Johan van
    Oostergem and a letter will address him as:

    Geachte heer Van Oostergem [...].

    The lower case start of the surname is in fact uppercased. Not sure if you
    meant that with title as well, if so, mea culpa.

    >Is it really the aim of Unicode to cover this all for the benefit of some
    >universal routine? I doubt.

    I doubt that too. My information was more informative I guess. ;)

    >If I remember well what I have heard from a now retired Dutch colleague and
    >read elsewhere, before the spelling reform of 1946-1947, the ligature "ij"
    >[ει] was a letter on its own, between "i" and "j" in the collation sequence.

    I can show you old 19th century dictionary files where the ij ligature
    (derived from ii) was in the place in the collation sequence where nowadays y
    is situated (u v w x y z).

    >The ligatures exist in UNICODE, U+0132 (IJ) and U+0133 (ij). Like the French
    >"Œ" and "œ", they were not present in the typewriter. The decision was taken
    >to write henceforth "ij" and "IJ". The "IJ" instead of "Ij" in title case
    >could be the result of a victory of traditionalists (like the ending "isch"
    >instead of "is").

    Some philologist would need to verify this, but it might also have to do with
    the double i which lies at the origin of ij.
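For reference, the behaviour of the two ligature code points mentioned above can be inspected with Python's standard `unicodedata` module and built-in string methods. This is only a quick check of what the Unicode character data give for U+0132 and U+0133; it makes no claim about what Dutch orthography requires.

```python
import unicodedata

CAPITAL_IJ = "\u0132"  # LATIN CAPITAL LIGATURE IJ
SMALL_IJ = "\u0133"    # LATIN SMALL LIGATURE IJ

# Character names as recorded in the Unicode character database.
print(unicodedata.name(CAPITAL_IJ))
print(unicodedata.name(SMALL_IJ))

# The ligatures form an ordinary case pair, so simple case mapping
# already yields the single capital ligature, not "Ij".
print(SMALL_IJ.upper() == CAPITAL_IJ)
print(CAPITAL_IJ.casefold() == SMALL_IJ)
print((SMALL_IJ + "sselmeer").title() == CAPITAL_IJ + "sselmeer")

# Written as two separate letters, default title-casing capitalizes only the "i".
print("ijsselmeer".title())

# Compatibility normalization replaces each ligature by two plain letters.
print(unicodedata.normalize("NFKC", CAPITAL_IJ))
print(unicodedata.normalize("NFKC", SMALL_IJ))
```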

    But I wonder how much of that is relevant to this discussion. I do love to
    hear of answers though. ;)

    Jeroen Ruigrok van der Werven <asmodai(-at-)> / asmodai
    イェルーン ラウフロック ヴァン デル ウェルヴェン
    If I am telling you the Truth now, do you believe it..?

    This archive was generated by hypermail 2.1.5 : Wed Jun 14 2006 - 13:07:50 CDT