L2/12-141
Source: Mark Davis
Date: April 27, 2012
Subject: Script / Script Extensions cleanup


Normally, a character that would otherwise have the Script property value Common is given the value Inherited if it is GC=Mark. There are a few outliers, listed below.
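
The rule of thumb above can be sketched in a few lines. This is only an illustration, not how the UCD is actually generated: `default_script` and its `base_script` parameter are hypothetical names, and the General_Category values come from whatever Unicode version the local Python `unicodedata` module ships with.

```python
import unicodedata


def default_script(cp: int, base_script: str = "Common") -> str:
    """Hypothetical helper applying the rule of thumb.

    base_script is the value the character would get if it were not a mark.
    """
    # Every Mark subcategory (Mn, Mc, Me) starts with "M".
    is_mark = unicodedata.category(chr(cp)).startswith("M")
    if base_script == "Common" and is_mark:
        return "Inherited"
    return base_script


# U+0301 COMBINING ACUTE ACCENT is a mark, so the rule gives Inherited.
print(default_script(0x0301))
# U+002E FULL STOP is punctuation, so it stays Common.
print(default_script(0x002E))
# U+1D165 MUSICAL SYMBOL COMBINING STEM is a mark, so the rule would
# give Inherited, although it is listed below as an outlier.
print(default_script(0x1D165))
```

Under this sketch, the outliers are the characters whose current Script value differs from what the helper returns.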

I propose the following changes:

Common

A. U+1CE1,F2,F3 be changed to sc=Deva. If we know now that they are commonly used with other Indic scripts, or we get that information later, we can change them to sc=Inherited & scx={Deva Gujr Guru Kthi Takr} (or whatever that list would be).

B. U+1D165,66,6D-6F,70-72 be changed to sc=Inherited. While these are special cases, there isn't any good reason for them to be Common instead of Inherited.
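
The two alternatives in item A can be modelled as a pair of values. The sketch below is illustrative only: the names `script_extensions`, `option_now` and `option_later` are hypothetical, and it assumes the usual UCD convention that a character with no explicit Script_Extensions entry has an scx set containing just its Script value. The script list is the tentative one from item A.

```python
from typing import FrozenSet, Optional, Tuple

# (sc, explicit scx or None when no Script_Extensions entry is given)
Assignment = Tuple[str, Optional[FrozenSet[str]]]


def script_extensions(assignment: Assignment) -> FrozenSet[str]:
    """Return the effective scx set, defaulting to {sc} (assumed convention)."""
    sc, scx = assignment
    return scx if scx is not None else frozenset({sc})


# Alternative 1: assign Devanagari outright.
option_now: Assignment = ("Deva", None)

# Alternative 2: keep the mark Inherited and list the scripts it is used with.
option_later: Assignment = (
    "Inherited",
    frozenset({"Deva", "Gujr", "Guru", "Kthi", "Takr"}),
)

for label, option in (("now", option_now), ("later", option_later)):
    print(label, option[0], sorted(script_extensions(option)))
```

With the first alternative a script-aware process would associate the character with Devanagari only; with the second it would accept the character in any of the listed scripts.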


Inherited

For each of the following, if we know that a character is used with other scripts, we can add those scripts now or later.

C. U+0342-45, U+1DC0,C1 be changed to sc=Greek.

D. U+0363-6F be changed to sc=Latn.

E. U+0485,86 be changed to sc=Cyrillic.

F. U+065F be changed to sc=Arabic.

G. U+0951,52 be changed to sc=Devanagari.

H. U+1CD0-D2, D4-DF, E0, E2-E8, ED, F4 be handled as in A.


gc=Format

I. U+06DD be changed to sc=Arabic.
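
The items above abbreviate code points, for example "U+1CE1,F2,F3" or "U+0342-45". The following sketch expands that shorthand into full code points so the lists can be checked against the background data. The function name `expand_shorthand` is hypothetical, and the reading of the notation is an assumption: an abbreviated piece replaces the trailing hex digits of the most recent code point written with "U+", and an abbreviated range end replaces the trailing digits of the range start.

```python
from typing import List


def expand_shorthand(spec: str) -> List[int]:
    """Expand e.g. 'U+1CD0-D2, D4-DF' into a list of code points."""
    code_points: List[int] = []
    base = ""  # hex digits of the most recent code point written with 'U+'
    for piece in spec.replace(" ", "").split(","):
        if not piece:
            continue
        start_txt, _, end_txt = piece.partition("-")
        if start_txt.upper().startswith("U+"):
            start_full = start_txt[2:]
            base = start_full
        else:
            if not base:
                raise ValueError("abbreviated piece before any U+ code point")
            # Replace the trailing digits of the base with the abbreviation.
            keep = max(0, len(base) - len(start_txt))
            start_full = base[:keep] + start_txt
        if end_txt:
            # Replace the trailing digits of the range start.
            keep = max(0, len(start_full) - len(end_txt))
            end_full = start_full[:keep] + end_txt
        else:
            end_full = start_full
        first, last = int(start_full, 16), int(end_full, 16)
        if last < first:
            raise ValueError(f"descending range in {piece!r}")
        code_points.extend(range(first, last + 1))
    return code_points


print([f"U+{cp:04X}" for cp in expand_shorthand("U+1CE1,F2,F3")])
# prints ['U+1CE1', 'U+1CF2', 'U+1CF3']
print(len(expand_shorthand("U+1CD0-D2, D4-DF, E0, E2-E8, ED, F4")))
# prints 25
```

Under this reading, item H covers 25 code points, which matches the number of Vedic Extensions characters listed under Script_Extensions=Inherited in the background data.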

====================

Background data

Script_Extensions=Common

Vedic Extensions — Tone mark for the Atharvaveda
U+1CE1 ( ᳡ ) VEDIC TONE ATHARVAVEDIC INDEPENDENT SVARITA

Vedic Extensions — Ardhavisarga
U+1CF2 ( ᳲ ) VEDIC SIGN ARDHAVISARGA
U+1CF3 ( ᳳ ) VEDIC SIGN ROTATED ARDHAVISARGA

Musical Symbols — Stems
U+1D165 ( 𝅥 ) MUSICAL SYMBOL COMBINING STEM
U+1D166 ( 𝅦 ) MUSICAL SYMBOL COMBINING SPRECHGESANG STEM

Musical Symbols — Augmentation dot
U+1D16D ( 𝅭 ) MUSICAL SYMBOL COMBINING AUGMENTATION DOT

Musical Symbols — Flags
U+1D16E ( 𝅮 ) MUSICAL SYMBOL COMBINING FLAG-1
U+1D16F ( 𝅯 ) MUSICAL SYMBOL COMBINING FLAG-2
U+1D170 ( 𝅰 ) MUSICAL SYMBOL COMBINING FLAG-3
U+1D171 ( 𝅱 ) MUSICAL SYMBOL COMBINING FLAG-4
U+1D172 ( 𝅲 ) MUSICAL SYMBOL COMBINING FLAG-5

Script_Extensions=Inherited

Combining Diacritical Marks — Additions for Greek
U+0342 ( ͂ ) COMBINING GREEK PERISPOMENI
U+0343 ( ̓ ) COMBINING GREEK KORONIS
U+0344 ( ̈́ ) COMBINING GREEK DIALYTIKA TONOS
U+0345 ( ͅ ) COMBINING GREEK YPOGEGRAMMENI

Combining Diacritical Marks Supplement — Used for Ancient Greek
U+1DC0 ( ᷀ ) COMBINING DOTTED GRAVE ACCENT
U+1DC1 ( ᷁ ) COMBINING DOTTED ACUTE ACCENT

Combining Diacritical Marks — Medieval superscript letter diacritics
U+0363 ( ͣ ) COMBINING LATIN SMALL LETTER A
U+0364 ( ͤ ) COMBINING LATIN SMALL LETTER E
U+0365 ( ͥ ) COMBINING LATIN SMALL LETTER I
U+0366 ( ͦ ) COMBINING LATIN SMALL LETTER O
U+0367 ( ͧ ) COMBINING LATIN SMALL LETTER U
U+0368 ( ͨ ) COMBINING LATIN SMALL LETTER C
U+0369 ( ͩ ) COMBINING LATIN SMALL LETTER D
U+036A ( ͪ ) COMBINING LATIN SMALL LETTER H
U+036B ( ͫ ) COMBINING LATIN SMALL LETTER M
U+036C ( ͬ ) COMBINING LATIN SMALL LETTER R
U+036D ( ͭ ) COMBINING LATIN SMALL LETTER T
U+036E ( ͮ ) COMBINING LATIN SMALL LETTER V
U+036F ( ͯ ) COMBINING LATIN SMALL LETTER X

Cyrillic — Historic miscellaneous
U+0485 ( ҅ ) COMBINING CYRILLIC DASIA PNEUMATA
U+0486 ( ҆ ) COMBINING CYRILLIC PSILI PNEUMATA

Arabic — Combining marks
U+065F ( ٟ ) ARABIC WAVY HAMZA BELOW

Devanagari — Vedic tone marks
U+0951 ( ॑ ) DEVANAGARI STRESS SIGN UDATTA
U+0952 ( ॒ ) DEVANAGARI STRESS SIGN ANUDATTA

Vedic Extensions — Tone marks for the Samaveda
U+1CD0 ( ᳐ ) VEDIC TONE KARSHANA
U+1CD1 ( ᳑ ) VEDIC TONE SHARA
U+1CD2 ( ᳒ ) VEDIC TONE PRENKHA

Vedic Extensions — Sign for Yajurvedic
U+1CD4 ( ᳔ ) VEDIC SIGN YAJURVEDIC MIDLINE SVARITA
U+1CD5 ( ᳕ ) VEDIC TONE YAJURVEDIC AGGRAVATED INDEPENDENT SVARITA
U+1CD6 ( ᳖ ) VEDIC TONE YAJURVEDIC INDEPENDENT SVARITA
U+1CD7 ( ᳗ ) VEDIC TONE YAJURVEDIC KATHAKA INDEPENDENT SVARITA
U+1CD8 ( ᳘ ) VEDIC TONE CANDRA BELOW
U+1CD9 ( ᳙ ) VEDIC TONE YAJURVEDIC KATHAKA INDEPENDENT SVARITA SCHROEDER
U+1CDA ( ᳚ ) VEDIC TONE DOUBLE SVARITA
U+1CDB ( ᳛ ) VEDIC TONE TRIPLE SVARITA
U+1CDC ( ᳜ ) VEDIC TONE KATHAKA ANUDATTA
U+1CDD ( ᳝ ) VEDIC TONE DOT BELOW
U+1CF4 ( ᳴ ) VEDIC TONE CANDRA ABOVE

Vedic Extensions — Tone marks for the Satapathabrahmana
U+1CDE ( ᳞ ) VEDIC TONE TWO DOTS BELOW
U+1CDF ( ᳟ ) VEDIC TONE THREE DOTS BELOW

Vedic Extensions — Tone mark for the Rigveda
U+1CE0 ( ᳠ ) VEDIC TONE RIGVEDIC KASHMIRI INDEPENDENT SVARITA

Vedic Extensions — Diacritics for visarga
U+1CE2 ( ᳢ ) VEDIC SIGN VISARGA SVARITA
U+1CE3 ( ᳣ ) VEDIC SIGN VISARGA UDATTA
U+1CE4 ( ᳤ ) VEDIC SIGN REVERSED VISARGA UDATTA
U+1CE5 ( ᳥ ) VEDIC SIGN VISARGA ANUDATTA
U+1CE6 ( ᳦ ) VEDIC SIGN REVERSED VISARGA ANUDATTA
U+1CE7 ( ᳧ ) VEDIC SIGN VISARGA UDATTA WITH TAIL
U+1CE8 ( ᳨ ) VEDIC SIGN VISARGA ANUDATTA WITH TAIL

Vedic Extensions — Marks of nasalization
U+1CED ( ᳭ ) VEDIC SIGN TIRYAK

General_Category=Format

Arabic — Koranic annotation sign
U+06DD ( ۝ ) ARABIC END OF AYAH



See also http://unicode.org/Public/6.1.0/ucd/ScriptExtensions.txt