Operational Properties for Action 115A008

L2/09-219
Subject: Operational Properties for Action 115A008
Date: 2009-05-09
From: Mark Davis
To: UTC


I had the following action from the UTC:

115    A008    Mark Davis    Produce updated proposal for the "operationally X-cased" properties, with more background.    L2/08-157    2008-05-20    2008-05-20

Here is the proposal.

DerivedCoreProperties.txt

Add the following 7 properties (the short name is in parentheses).

# Derived Property:   Cased (Cased)
#  As defined by Unicode Standard Definition D120
#  C is defined to be cased if it has the Lowercase or Uppercase property or has a General_Category value of Titlecase_Letter.

0041..005A    ; Cased # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
0061..007A    ; Cased # L&  [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
00AA          ; Cased # L&       FEMININE ORDINAL INDICATOR
00B5          ; Cased # L&       MICRO SIGN
00BA          ; Cased # L&       MASCULINE ORDINAL INDICATOR
00C0..00D6    ; Cased # L&  [23] LATIN CAPITAL LETTER A WITH GRAVE..LATIN CAPITAL LETTER O WITH DIAERESIS
...
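The excerpts in this proposal all share one line layout: a code point or a range, a semicolon, the property name, and a trailing comment. As a rough illustration of how a consumer might read such lines, here is a minimal sketch in Python. The names PropertyRange and parse_property_line are hypothetical and are not part of any Unicode tooling.

```python
import re
from typing import NamedTuple, Optional


class PropertyRange(NamedTuple):
    start: int  # first code point in the range (inclusive)
    end: int    # last code point in the range (inclusive)
    name: str   # property name, for example "Cased"


# Matches "0041..005A    ; Cased" as well as "00AA          ; Cased".
_LINE = re.compile(r"^([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;\s*(\w+)")


def parse_property_line(line: str) -> Optional[PropertyRange]:
    """Parse one data line; return None for blank, comment, or elided lines."""
    data = line.split("#", 1)[0].strip()  # drop the trailing comment
    match = _LINE.match(data)
    if match is None:
        return None
    start = int(match.group(1), 16)
    # A single code point is treated as a range of length one.
    end = int(match.group(2), 16) if match.group(2) else start
    return PropertyRange(start, end, match.group(3))
```

Comment lines and the "..." elision markers used in the excerpts simply yield None.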

# Derived Property:   Case_Ignorable (CI)
#  As defined by Unicode Standard Definition D121
#  C is defined to be case-ignorable if
#    Word_Break(C) = MidLetter or MidNumLet, or
#    General_Category(C) = Nonspacing_Mark (Mn), Enclosing_Mark (Me), Format (Cf), Modifier_Letter (Lm), or Modifier_Symbol (Sk).

0027          ; Case_Ignorable # Po       APOSTROPHE
002E          ; Case_Ignorable # Po       FULL STOP
003A          ; Case_Ignorable # Po       COLON
005E          ; Case_Ignorable # Sk       CIRCUMFLEX ACCENT
0060          ; Case_Ignorable # Sk       GRAVE ACCENT
00A8          ; Case_Ignorable # Sk       DIAERESIS
...
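The case-ignorable definition above combines a Word_Break test with a General_Category test. The following Python sketch shows the shape of that check. Python's standard library exposes General_Category but not Word_Break, so the set MID_LETTER_OR_MIDNUMLET is a hypothetical, deliberately partial stand-in that contains only the three punctuation characters visible in the excerpt; a real derivation would read the full Word_Break data.

```python
import unicodedata

# Hypothetical, partial stand-in for Word_Break = MidLetter or MidNumLet.
# It lists only the three Po characters shown in the excerpt above.
MID_LETTER_OR_MIDNUMLET = {0x0027, 0x002E, 0x003A}

# General_Category values named in the definition: Mn, Me, Cf, Lm, Sk.
IGNORABLE_CATEGORIES = {"Mn", "Me", "Cf", "Lm", "Sk"}


def is_case_ignorable(ch: str) -> bool:
    """Approximate the case-ignorable test for a single code point."""
    return (ord(ch) in MID_LETTER_OR_MIDNUMLET
            or unicodedata.category(ch) in IGNORABLE_CATEGORIES)
```

Under these assumptions, is_case_ignorable("'") and is_case_ignorable("^") are both true, while is_case_ignorable("a") is false.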

# Derived Property:   Operationally_Lowercased (OLC)
#  As defined by Unicode Standard Definition D124
#  isLowercase(X) is true when toLowercase(Y) = Y, where Y = NFD(X)

0000..001F    ; Operationally_Lowercased # Cc  [32] <control-0000>..<control-001F>
0020          ; Operationally_Lowercased # Zs       SPACE
0021..0023    ; Operationally_Lowercased # Po   [3] EXCLAMATION MARK..NUMBER SIGN
0024          ; Operationally_Lowercased # Sc       DOLLAR SIGN
0025..0027    ; Operationally_Lowercased # Po   [3] PERCENT SIGN..APOSTROPHE
0028          ; Operationally_Lowercased # Ps       LEFT PARENTHESIS
0029          ; Operationally_Lowercased # Pe       RIGHT PARENTHESIS
002A          ; Operationally_Lowercased # Po       ASTERISK
002B          ; Operationally_Lowercased # Sm       PLUS SIGN
...

# Derived Property:   Operationally_Uppercased (OUC)
#  As defined by Unicode Standard Definition D125
#  isUppercase(X) is true when toUppercase(Y) = Y, where Y = NFD(X)

0000..001F    ; Operationally_Uppercased # Cc  [32] <control-0000>..<control-001F>
0020          ; Operationally_Uppercased # Zs       SPACE
0021..0023    ; Operationally_Uppercased # Po   [3] EXCLAMATION MARK..NUMBER SIGN
0024          ; Operationally_Uppercased # Sc       DOLLAR SIGN
0025..0027    ; Operationally_Uppercased # Po   [3] PERCENT SIGN..APOSTROPHE
0028          ; Operationally_Uppercased # Ps       LEFT PARENTHESIS
0029          ; Operationally_Uppercased # Pe       RIGHT PARENTHESIS
...

# Derived Property:   Operationally_Titlecased (OTC)
#  As defined by Unicode Standard Definition D126
#  isTitlecase(X) is true when toTitlecase(Y) = Y, where Y = NFD(X)

0000..001F    ; Operationally_Titlecased # Cc  [32] <control-0000>..<control-001F>
0020          ; Operationally_Titlecased # Zs       SPACE
0021..0023    ; Operationally_Titlecased # Po   [3] EXCLAMATION MARK..NUMBER SIGN
0024          ; Operationally_Titlecased # Sc       DOLLAR SIGN
0025..0027    ; Operationally_Titlecased # Po   [3] PERCENT SIGN..APOSTROPHE
...

# Derived Property:   Operationally_Casefolded (OCF)
#  As defined by Unicode Standard Definition D127
#  isCasefolded(X) is true when toCasefold(Y) = Y, where Y = NFD(X)

0000..001F    ; Operationally_Casefolded # Cc  [32] <control-0000>..<control-001F>
0020          ; Operationally_Casefolded # Zs       SPACE
0021..0023    ; Operationally_Casefolded # Po   [3] EXCLAMATION MARK..NUMBER SIGN
0024          ; Operationally_Casefolded # Sc       DOLLAR SIGN
0025..0027    ; Operationally_Casefolded # Po   [3] PERCENT SIGN..APOSTROPHE
0028          ; Operationally_Casefolded # Ps       LEFT PARENTHESIS
0029          ; Operationally_Casefolded # Pe       RIGHT PARENTHESIS
002A          ; Operationally_Casefolded # Po       ASTERISK
002B          ; Operationally_Casefolded # Sm       PLUS SIGN
...
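The four operational definitions above all have the same form: a string has the property when the corresponding case operation leaves it unchanged. The Python sketch below illustrates this, assuming that Y stands for NFD(X) as in the Unicode Standard's case detection definitions. Python's str.lower, str.upper, and str.casefold stand in for toLowercase, toUppercase, and toCasefold; str.title is only an approximation of toTitlecase and is adequate here mainly for single code points. Results depend on the Unicode version of the Python runtime, which may differ from the version this proposal targets.

```python
import unicodedata


def _nfd(x: str) -> str:
    """Y = NFD(X), assumed to be the form the definitions operate on."""
    return unicodedata.normalize("NFD", x)


def is_lowercase(x: str) -> bool:
    y = _nfd(x)
    return y.lower() == y


def is_uppercase(x: str) -> bool:
    y = _nfd(x)
    return y.upper() == y


def is_titlecase(x: str) -> bool:
    y = _nfd(x)
    return y.title() == y  # str.title only approximates toTitlecase


def is_casefolded(x: str) -> bool:
    y = _nfd(x)
    return y.casefold() == y
```

This is why the excerpts begin with controls, spaces, and punctuation: characters with no case mappings are unchanged by every operation, so they satisfy all four properties at once.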

# Derived Property:   Operationally_Cased (OC)
#  As defined by Unicode Standard Definition D128
#  isCased(X) is true when isLowercase(X) is false, or isUppercase(X) is false, or isTitlecase(X) is false

0041..005A    ; Operationally_Cased # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
0061..007A    ; Operationally_Cased # L&  [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
00B5          ; Operationally_Cased # L&       MICRO SIGN
00C0..00D6    ; Operationally_Cased # L&  [23] LATIN CAPITAL LETTER A WITH GRAVE..LATIN CAPITAL LETTER O WITH DIAERESIS
00D8..00F6    ; Operationally_Cased # L&  [31] LATIN CAPITAL LETTER O WITH STROKE..LATIN SMALL LETTER O WITH DIAERESIS
00F8..0137    ; Operationally_Cased # L&  [64] LATIN SMALL LETTER O WITH STROKE..LATIN SMALL LETTER K WITH CEDILLA
0139..018C    ; Operationally_Cased # L&  [84] LATIN CAPITAL LETTER L WITH ACUTE..LATIN SMALL LETTER D WITH TOPBAR
...
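The isCased definition can be read as "at least one case operation changes the string". A minimal Python sketch follows, under the same assumptions as the definitions it builds on: Y = NFD(X), and Python's built-in string methods as stand-ins for the Unicode case operations (str.title being only an approximation of toTitlecase).

```python
import unicodedata


def is_cased(x: str) -> bool:
    """True when x fails at least one of the three stability checks."""
    y = unicodedata.normalize("NFD", x)  # assumed: Y = NFD(X)
    is_lower = y.lower() == y
    is_upper = y.upper() == y
    is_title = y.title() == y  # approximation of toTitlecase
    return not (is_lower and is_upper and is_title)
```

This also shows how Operationally_Cased differs from Cased. U+00AA FEMININE ORDINAL INDICATOR appears in the Cased excerpt but has no case mappings, so is_cased("\u00aa") is false, which is consistent with its absence from the Operationally_Cased excerpt. U+00B5 MICRO SIGN has an uppercase mapping, so is_cased("\u00b5") is true.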

DerivedNormalizationProperties.txt

Add the following 2 properties:

# Derived Property:   CaseCompatIgnorableFold (CCIF)
#  Defined by applying case folding, removing Default_Ignorable_Code_Point characters, and then normalizing to NFKC, repeating these steps until the result no longer changes

#  All code points not explicitly listed for CaseCompatIgnorableFold
#  have a value equal to the code point.

0041  ; CaseCompatIgnorableFold; 0061           # L&  LATIN CAPITAL LETTER A
0042  ; CaseCompatIgnorableFold; 0062           # L&  LATIN CAPITAL LETTER B
0043  ; CaseCompatIgnorableFold; 0063           # L&  LATIN CAPITAL LETTER C
0044  ; CaseCompatIgnorableFold; 0064           # L&  LATIN CAPITAL LETTER D
0045  ; CaseCompatIgnorableFold; 0065           # L&  LATIN CAPITAL LETTER E
0046  ; CaseCompatIgnorableFold; 0066           # L&  LATIN CAPITAL LETTER F
0047  ; CaseCompatIgnorableFold; 0067           # L&  LATIN CAPITAL LETTER G
...
005A  ; CaseCompatIgnorableFold; 007A           # L&  LATIN CAPITAL LETTER Z
00A0  ; CaseCompatIgnorableFold; 0020           # Zs  NO-BREAK SPACE
00A8  ; CaseCompatIgnorableFold; 0020 0308      # Sk  DIAERESIS
00AA  ; CaseCompatIgnorableFold; 0061           # L&  FEMININE ORDINAL INDICATOR
00AD  ; CaseCompatIgnorableFold;                # Cf  SOFT HYPHEN
00AF  ; CaseCompatIgnorableFold; 0020 0304      # Sk  MACRON
00B2  ; CaseCompatIgnorableFold; 0032           # No  SUPERSCRIPT TWO
00B3  ; CaseCompatIgnorableFold; 0033           # No  SUPERSCRIPT THREE
00B4  ; CaseCompatIgnorableFold; 0020 0301      # Sk  ACUTE ACCENT
...
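The mapping described above (case fold, remove default ignorable code points, normalize to NFKC, then repeat) can be sketched in Python as a loop that runs until the string stops changing. The set DEFAULT_IGNORABLE is a hypothetical, partial stand-in for the Default_Ignorable_Code_Point property, which the Python standard library does not expose; the function names are likewise illustrative only.

```python
import unicodedata

# Hypothetical, partial stand-in for Default_Ignorable_Code_Point.
DEFAULT_IGNORABLE = (
    {0x00AD, 0x034F, 0xFEFF}
    | set(range(0x200B, 0x2010))   # U+200B..U+200F
    | set(range(0x2060, 0x2070))   # U+2060..U+206F
    | set(range(0xFE00, 0xFE10))   # U+FE00..U+FE0F
)


def _fold_once(s: str) -> str:
    """One pass: case fold, drop ignorables, then apply NFKC."""
    folded = s.casefold()
    kept = "".join(c for c in folded if ord(c) not in DEFAULT_IGNORABLE)
    return unicodedata.normalize("NFKC", kept)


def case_compat_ignorable_fold(s: str) -> str:
    """Repeat the pass until the result is stable."""
    while True:
        result = _fold_once(s)
        if result == s:
            return s
        s = result
```

With these assumptions the sketch reproduces the rows in the excerpt: U+00A0 maps to U+0020, U+00A8 maps to U+0020 U+0308, U+00AA maps to U+0061, and U+00AD maps to the empty string.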

# Derived Property:   CaseCompatIgnorableFolded (isCCIF)
#  True for a code point cp when cp = CaseCompatIgnorableFold(cp)

0000..001F    ; CaseCompatIgnorableFolded # Cc  [32] <control-0000>..<control-001F>
0020          ; CaseCompatIgnorableFolded # Zs       SPACE
0021..0023    ; CaseCompatIgnorableFolded # Po   [3] EXCLAMATION MARK..NUMBER SIGN
0024          ; CaseCompatIgnorableFolded # Sc       DOLLAR SIGN
0025..0027    ; CaseCompatIgnorableFolded # Po   [3] PERCENT SIGN..APOSTROPHE
0028          ; CaseCompatIgnorableFolded # Ps       LEFT PARENTHESIS
0029          ; CaseCompatIgnorableFolded # Pe       RIGHT PARENTHESIS
002A          ; CaseCompatIgnorableFolded # Po       ASTERISK
...

Text

Add references to these properties under the corresponding definitions in the Unicode Standard, and in UAX #31.