L2/03-103

Title: Remove overlap between Default_Ignorable_Code_Point and Other_Default_Ignorable_Code_Point
Source: Markus Scherer
Date: March 4, 2003

Default_Ignorable_Code_Point has a formula that includes general categories, hardcoded ranges, and Other_Default_Ignorable_Code_Point. The formula elements overlap with Other_Default_Ignorable_Code_Point.

For other properties XYZ, their formulas combine Other_XYZ mutually exclusively with general categories etc.

The proposal is to clean up both the formula of Default_Ignorable_Code_Point and the contents of Other_Default_Ignorable_Code_Point to avoid overlap and hardcoded ranges, without changing the contents of Default_Ignorable_Code_Point.

a) Change the formula for Default_Ignorable_Code_Point from

# Derived Property: Default_Ignorable_Code_Point
# Generated from <2060..206F, FFF0..FFFB, E0000..E0FFF>
# + Other_Default_Ignorable_Code_Point + (Cf + Cc + Cs - White_Space)

by removing the hardcoded ranges to

# Derived Property: Default_Ignorable_Code_Point
# Generated from
# Other_Default_Ignorable_Code_Point + (Cf + Cc + Cs - White_Space)

b) Change the contents of Other_Default_Ignorable_Code_Point from

00AD ; Other_Default_Ignorable_Code_Point # Cf SOFT HYPHEN
034F ; Other_Default_Ignorable_Code_Point # Mn COMBINING GRAPHEME JOINER
115F..1160 ; Other_Default_Ignorable_Code_Point # Lo [2] HANGUL CHOSEONG FILLER..HANGUL JUNGSEONG FILLER
180B..180D ; Other_Default_Ignorable_Code_Point # Mn [3] MONGOLIAN FREE VARIATION SELECTOR ONE..MONGOLIAN FREE VARIATION SELECTOR THREE
200B ; Other_Default_Ignorable_Code_Point # Zs ZERO WIDTH SPACE
2060..2063 ; Other_Default_Ignorable_Code_Point # Cf [4] WORD JOINER..INVISIBLE SEPARATOR
2064..2069 ; Other_Default_Ignorable_Code_Point # Cn [6]
206A..206F ; Other_Default_Ignorable_Code_Point # Cf [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
3164 ; Other_Default_Ignorable_Code_Point # Lo HANGUL FILLER
FE00..FE0F ; Other_Default_Ignorable_Code_Point # Mn [16] VARIATION SELECTOR-1..VARIATION SELECTOR-16
FFA0 ; Other_Default_Ignorable_Code_Point # Lo HALFWIDTH HANGUL FILLER
FFF0..FFF8 ; Other_Default_Ignorable_Code_Point # Cn [9]
FFF9..FFFB ; Other_Default_Ignorable_Code_Point # Cf [3] INTERLINEAR ANNOTATION ANCHOR..INTERLINEAR ANNOTATION TERMINATOR
E0000 ; Other_Default_Ignorable_Code_Point # Cn
E0001 ; Other_Default_Ignorable_Code_Point # Cf LANGUAGE TAG
E0002..E001F ; Other_Default_Ignorable_Code_Point # Cn [30]
E0020..E007F ; Other_Default_Ignorable_Code_Point # Cf [96] TAG SPACE..CANCEL TAG
E0080..E00FF ; Other_Default_Ignorable_Code_Point # Cn [128]
E0100..E01EF ; Other_Default_Ignorable_Code_Point # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
E01F0..E0FFF ; Other_Default_Ignorable_Code_Point # Cn [3600]

by removing code points with Cf, Cc, Cs to

034F ; Other_Default_Ignorable_Code_Point # Mn COMBINING GRAPHEME JOINER
115F..1160 ; Other_Default_Ignorable_Code_Point # Lo [2] HANGUL CHOSEONG FILLER..HANGUL JUNGSEONG FILLER
180B..180D ; Other_Default_Ignorable_Code_Point # Mn [3] MONGOLIAN FREE VARIATION SELECTOR ONE..MONGOLIAN FREE VARIATION SELECTOR THREE
200B ; Other_Default_Ignorable_Code_Point # Zs ZERO WIDTH SPACE
2064..2069 ; Other_Default_Ignorable_Code_Point # Cn [6]
3164 ; Other_Default_Ignorable_Code_Point # Lo HANGUL FILLER
FE00..FE0F ; Other_Default_Ignorable_Code_Point # Mn [16] VARIATION SELECTOR-1..VARIATION SELECTOR-16
FFA0 ; Other_Default_Ignorable_Code_Point # Lo HALFWIDTH HANGUL FILLER
FFF0..FFF8 ; Other_Default_Ignorable_Code_Point # Cn [9]
E0000 ; Other_Default_Ignorable_Code_Point # Cn
E0002..E001F ; Other_Default_Ignorable_Code_Point # Cn [30]
E0080..E00FF ; Other_Default_Ignorable_Code_Point # Cn [128]
E0100..E01EF ; Other_Default_Ignorable_Code_Point # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
E01F0..E0FFF ; Other_Default_Ignorable_Code_Point # Cn [3600]

c) Add to the documentation of Other_Default_Ignorable_Code_Point in PropList.txt:

"Among other code points, Other_Default_Ignorable_Code_Point contains unassigned code points so that Default_Ignorable_Code_Point contains all of the ranges <2060..206F, FFF0..FFFB, E0000..E0FFF>."

Note: This does not change the contents of Default_Ignorable_Code_Point.

----

Background - existing Unicode 4 beta properties:

* DerivedCoreProperties.txt:

# Derived Property: Default_Ignorable_Code_Point
# Generated from <2060..206F, FFF0..FFFB, E0000..E0FFF>
# + Other_Default_Ignorable_Code_Point + (Cf + Cc + Cs - White_Space)

0000..0008 ; Default_Ignorable_Code_Point # Cc [9] ..
000E..001F ; Default_Ignorable_Code_Point # Cc [18] ..
007F..0084 ; Default_Ignorable_Code_Point # Cc [6] ..
0086..009F ; Default_Ignorable_Code_Point # Cc [26] ..
00AD ; Default_Ignorable_Code_Point # Cf SOFT HYPHEN
034F ; Default_Ignorable_Code_Point # Mn COMBINING GRAPHEME JOINER
0600..0603 ; Default_Ignorable_Code_Point # Cf [4] ARABIC NUMBER SIGN..ARABIC SIGN SAFHA
06DD ; Default_Ignorable_Code_Point # Cf ARABIC END OF AYAH
070F ; Default_Ignorable_Code_Point # Cf SYRIAC ABBREVIATION MARK
115F..1160 ; Default_Ignorable_Code_Point # Lo [2] HANGUL CHOSEONG FILLER..HANGUL JUNGSEONG FILLER
17B4..17B5 ; Default_Ignorable_Code_Point # Cf [2] KHMER VOWEL INHERENT AQ..KHMER VOWEL INHERENT AA
180B..180D ; Default_Ignorable_Code_Point # Mn [3] MONGOLIAN FREE VARIATION SELECTOR ONE..MONGOLIAN FREE VARIATION SELECTOR THREE
200B ; Default_Ignorable_Code_Point # Zs ZERO WIDTH SPACE
200C..200F ; Default_Ignorable_Code_Point # Cf [4] ZERO WIDTH NON-JOINER..RIGHT-TO-LEFT MARK
202A..202E ; Default_Ignorable_Code_Point # Cf [5] LEFT-TO-RIGHT EMBEDDING..RIGHT-TO-LEFT OVERRIDE
2060..2063 ; Default_Ignorable_Code_Point # Cf [4] WORD JOINER..INVISIBLE SEPARATOR
2064..2069 ; Default_Ignorable_Code_Point # Cn [6]
206A..206F ; Default_Ignorable_Code_Point # Cf [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
3164 ; Default_Ignorable_Code_Point # Lo HANGUL FILLER
D800..DFFF ; Default_Ignorable_Code_Point # Cs [2048]
FE00..FE0F ; Default_Ignorable_Code_Point # Mn [16] VARIATION SELECTOR-1..VARIATION SELECTOR-16
FEFF ; Default_Ignorable_Code_Point # Cf ZERO WIDTH NO-BREAK SPACE
FFA0 ; Default_Ignorable_Code_Point # Lo HALFWIDTH HANGUL FILLER
FFF0..FFF8 ; Default_Ignorable_Code_Point # Cn [9]
FFF9..FFFB ; Default_Ignorable_Code_Point # Cf [3] INTERLINEAR ANNOTATION ANCHOR..INTERLINEAR ANNOTATION TERMINATOR
1D173..1D17A ; Default_Ignorable_Code_Point # Cf [8] MUSICAL SYMBOL BEGIN BEAM..MUSICAL SYMBOL END PHRASE
E0000 ; Default_Ignorable_Code_Point # Cn
E0001 ; Default_Ignorable_Code_Point # Cf LANGUAGE TAG
E0002..E001F ; Default_Ignorable_Code_Point # Cn [30]
E0020..E007F ; Default_Ignorable_Code_Point # Cf [96] TAG SPACE..CANCEL TAG
E0080..E00FF ; Default_Ignorable_Code_Point # Cn [128]
E0100..E01EF ; Default_Ignorable_Code_Point # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
E01F0..E0FFF ; Default_Ignorable_Code_Point # Cn [3600]

# Total code points: 6283

* PropList.txt:

00AD ; Other_Default_Ignorable_Code_Point # Cf SOFT HYPHEN
034F ; Other_Default_Ignorable_Code_Point # Mn COMBINING GRAPHEME JOINER
115F..1160 ; Other_Default_Ignorable_Code_Point # Lo [2] HANGUL CHOSEONG FILLER..HANGUL JUNGSEONG FILLER
180B..180D ; Other_Default_Ignorable_Code_Point # Mn [3] MONGOLIAN FREE VARIATION SELECTOR ONE..MONGOLIAN FREE VARIATION SELECTOR THREE
200B ; Other_Default_Ignorable_Code_Point # Zs ZERO WIDTH SPACE
2060..2063 ; Other_Default_Ignorable_Code_Point # Cf [4] WORD JOINER..INVISIBLE SEPARATOR
2064..2069 ; Other_Default_Ignorable_Code_Point # Cn [6]
206A..206F ; Other_Default_Ignorable_Code_Point # Cf [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
3164 ; Other_Default_Ignorable_Code_Point # Lo HANGUL FILLER
FE00..FE0F ; Other_Default_Ignorable_Code_Point # Mn [16] VARIATION SELECTOR-1..VARIATION SELECTOR-16
FFA0 ; Other_Default_Ignorable_Code_Point # Lo HALFWIDTH HANGUL FILLER
FFF0..FFF8 ; Other_Default_Ignorable_Code_Point # Cn [9]
FFF9..FFFB ; Other_Default_Ignorable_Code_Point # Cf [3] INTERLINEAR ANNOTATION ANCHOR..INTERLINEAR ANNOTATION TERMINATOR
E0000 ; Other_Default_Ignorable_Code_Point # Cn
E0001 ; Other_Default_Ignorable_Code_Point # Cf LANGUAGE TAG
E0002..E001F ; Other_Default_Ignorable_Code_Point # Cn [30]
E0020..E007F ; Other_Default_Ignorable_Code_Point # Cf [96] TAG SPACE..CANCEL TAG
E0080..E00FF ; Other_Default_Ignorable_Code_Point # Cn [128]
E0100..E01EF ; Other_Default_Ignorable_Code_Point # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
E01F0..E0FFF ; Other_Default_Ignorable_Code_Point # Cn [3600]

# Total code points: 4150
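
The note that the proposal leaves Default_Ignorable_Code_Point unchanged can be checked mechanically. The following is a minimal Python sketch, not part of the proposal itself. It transcribes the current Other_Default_Ignorable_Code_Point entries (with the general categories shown in the listing) and compares the old and new formulas. The names OLD_OTHER, HARDCODED, and expand are illustrative. It assumes that none of the removed Cf code points has White_Space, and it uses only the removed code points as a stand-in for the term (Cf + Cc + Cs - White_Space), since the rest of that term is identical on both sides of the comparison.

```python
# (first, last, general category), transcribed from the current
# Other_Default_Ignorable_Code_Point listing in PropList.txt.
OLD_OTHER = [
    (0x00AD, 0x00AD, "Cf"), (0x034F, 0x034F, "Mn"),
    (0x115F, 0x1160, "Lo"), (0x180B, 0x180D, "Mn"),
    (0x200B, 0x200B, "Zs"), (0x2060, 0x2063, "Cf"),
    (0x2064, 0x2069, "Cn"), (0x206A, 0x206F, "Cf"),
    (0x3164, 0x3164, "Lo"), (0xFE00, 0xFE0F, "Mn"),
    (0xFFA0, 0xFFA0, "Lo"), (0xFFF0, 0xFFF8, "Cn"),
    (0xFFF9, 0xFFFB, "Cf"), (0xE0000, 0xE0000, "Cn"),
    (0xE0001, 0xE0001, "Cf"), (0xE0002, 0xE001F, "Cn"),
    (0xE0020, 0xE007F, "Cf"), (0xE0080, 0xE00FF, "Cn"),
    (0xE0100, 0xE01EF, "Mn"), (0xE01F0, 0xE0FFF, "Cn"),
]

# Hardcoded ranges in the current Default_Ignorable_Code_Point formula.
HARDCODED = [(0x2060, 0x206F), (0xFFF0, 0xFFFB), (0xE0000, 0xE0FFF)]

# General categories that the formula already includes directly.
GC_IN_FORMULA = {"Cf", "Cc", "Cs"}


def expand(ranges):
    """Return the set of code points covered by (first, last, ...) tuples."""
    return {cp for first, last, *_ in ranges for cp in range(first, last + 1)}


old_other = expand(OLD_OTHER)
# Step b): drop every entry whose general category is Cf, Cc, or Cs.
new_other = expand(r for r in OLD_OTHER if r[2] not in GC_IN_FORMULA)
removed = old_other - new_other
hardcoded = expand(HARDCODED)

# Stand-in for (Cf + Cc + Cs - White_Space). Assumption: the removed
# code points are not White_Space, so they all belong to this term.
gc_term = removed

old_result = hardcoded | old_other | gc_term   # current formula
new_result = new_other | gc_term               # proposed formula

print(len(old_other), len(new_other), len(removed))  # 4150 4039 111
print(hardcoded <= old_other)    # hardcoded ranges are redundant
print(old_result == new_result)  # derived property is unchanged
```

The first count agrees with the "Total code points: 4150" line of the current PropList.txt listing. Because the hardcoded ranges are a subset of the current Other_Default_Ignorable_Code_Point, step a) removes nothing from the derived property, and because every code point removed in step b) is Cf, it is still contributed by the general category term.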