RE: Titlecasing words starting with numeric glyphs and period as word separator

From: Koji Ishii (
Date: Wed Mar 02 2011 - 02:00:18 CST


    Thank you everyone for the responses.

    I was not aware that titlecasing was this complicated when I started the discussion here and on the CSS ML, sorry about that, but I now understand it much better thanks to everyone.

    The text-transform property[1] was designed in CSS1 as a very limited, dumb and simple feature, so what we want here is to improve its international support using what Unicode has defined while keeping the scope of the feature intact. It may not be as useful as many would expect, but it is still usable in some cases, and can protect existing pages from breaking.

    The scope of the feature, at least as of now, doesn't cover deciding between "of" and "Of", which requires language-dependent dictionaries.

    For cases like "a.m.", "'Tis", "L'Arbre", "!Kung", or "49ers", the spec takes a slightly different approach from Section 5.18, Case Mapping, in Unicode. As you can see, the spec says "first *character* of each word" and recommends UAX #29 for separating words, but allows implementations to change the word separation to optimize for titlecasing. For instance, U+002E FULL STOP and U+0027 APOSTROPHE are MidNumLet in UAX #29, but implementations can treat them as word separators for titlecasing, which solves all the cases listed above. Doing so may have undesired side effects, I'm not sure, but the point is that in the text-transform CSS property, the exact behavior for punctuation is UA dependent.
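
    To make the difference concrete, here is a rough Python sketch of the "first character of each word" rule. It is not the spec's algorithm and not real UAX #29 segmentation: the two regular expressions are a simplified stand-in I am assuming for illustration, and the name capitalize_words and the mid_separates flag are hypothetical. With mid_separates=False, "." and "'" between letters or digits keep a word together (roughly what MidNumLet does); with mid_separates=True they act as word separators.

```python
import re

# Simplified stand-in for word segmentation (an assumption, NOT full UAX #29):
# a word is a run of letters/digits. [^\W_] matches Unicode alphanumerics.
WORD_MID_SEPARATES = re.compile(r"[^\W_]+")
# Variant where '.' and "'" between alphanumerics join a word (MidNumLet-like).
WORD_MID_JOINS = re.compile(r"[^\W_]+(?:[.'][^\W_]+)*")


def capitalize_words(text, mid_separates=True):
    """Titlecase only the first character of each word; leave the rest as-is.

    Hypothetical helper for illustration. Characters without a titlecase
    mapping (digits, for example) are left unchanged.
    """
    pattern = WORD_MID_SEPARATES if mid_separates else WORD_MID_JOINS

    def fix(match):
        word = match.group(0)
        # str.title() on a single character applies its titlecase mapping.
        return word[0].title() + word[1:]

    return pattern.sub(fix, text)


if __name__ == "__main__":
    for sample in ["a.m.", "'tis", "l'arbre", "!kung", "49ers", "don't"]:
        print(sample,
              capitalize_words(sample, mid_separates=False),
              capitalize_words(sample, mid_separates=True))
```

    In this sketch, treating the two characters as separators turns "a.m." into "A.M." and "l'arbre" into "L'Arbre", while "49ers" is unchanged either way because "4" has no titlecase form. It also turns "don't" into "Don'T", which is one example of the kind of side effect mentioned above.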

    I'm not suggesting or requesting anything here. I posted the message to let you know about the current activities in CSS, and to let you know that if Unicode defines something new, I will try to incorporate it into the text-transform property.


