Database missing/erroneous information

From: J Decker via Unicode
Date: Wed, 12 Jul 2017 06:35:02 -0700

I started looking more deeply at the JavaScript specification. Identifiers
are defined as starting with a character that has the ID_Start property and
continuing with characters that have the ID_Continue property.
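
For reference, the rule above can be sketched with Unicode property escapes in a regular expression. This is only a rough approximation, assuming an engine with ES2018 `\p{...}` support; it ignores `\uXXXX` escapes and reserved words, and `looksLikeIdentifier` is a hypothetical helper name, not anything from the specification.

```javascript
// Rough shape of an ECMAScript IdentifierName:
//   start:    ID_Start, "$" or "_"
//   continue: ID_Continue, "$", ZWNJ (U+200C) or ZWJ (U+200D)
// Assumption: \uXXXX escapes and reserved words are not handled here.
const IDENTIFIER_NAME = /^[$_\p{ID_Start}][$\u200C\u200D\p{ID_Continue}]*$/u;

// Hypothetical helper, for illustration only.
function looksLikeIdentifier(text) {
  return IDENTIFIER_NAME.test(text);
}

console.log(looksLikeIdentifier("a1")); // true: digit is allowed after the start
console.log(looksLikeIdentifier("1a")); // false: digit is not ID_Start
```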
I grabbed the XML database (ucd.all.grouped.xml), in which I was able to
find the IDS and IDC flags (also OIDS, OIDC, XIDS, and XIDC, whose meaning
I'm not entirely sure of).

I then started filtering to find characters that are NOT IDS|IDC....

Even simple characters like the digits 0x30-0x39 are marked with IDS='N' but
have no [OX]IDC flags specified. Is a missing flag assumed to be N or Y? The
documentation on the XML file format doesn't specify. I see 'ID_Continue
characters include ID_Start characters, plus characters '
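
One possible reading, based on how UAX #42 describes the grouped flavor of the XML files: a char element takes any attribute it does not set itself from its enclosing group element, so a missing flag would be neither N nor Y by default but whatever the group says. A minimal sketch of that lookup, using plain objects in place of parsed XML; `resolveChar` is a hypothetical helper and the group values are illustrative, not copied from ucd.all.grouped.xml.

```javascript
// Hypothetical helper: attributes on the char record override the
// defaults supplied by its enclosing group.
function resolveChar(groupAttrs, charAttrs) {
  return { ...groupAttrs, ...charAttrs };
}

// Illustrative values only (assumed, not taken from the real file).
const group = { IDS: "Y", IDC: "Y", XIDS: "Y", XIDC: "Y" };
const digitFour = { cp: "0034", gc: "Nd", IDS: "N", XIDS: "N" };

const resolved = resolveChar(group, digitFour);
console.log(resolved.IDS, resolved.IDC); // N Y
```

Under that reading, the raw char records need to be merged with their group before filtering on IDS or IDC.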

Most languages do support identifiers like a1, a2, etc., so certainly digits
should have IDC even though they're not IDS.
Are there characters that are IDS without being IDC? There are certainly
characters that are IDC without IDS.
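
Both questions can be checked independently of the XML file by asking a JavaScript engine directly. The sketch below assumes ES2018 property escapes and reflects whatever Unicode version the engine ships; `countDifferences` is a hypothetical name. Since ID_Continue is derived as ID_Start plus further categories, the first count would be expected to come out as zero.

```javascript
const ID_START = /^\p{ID_Start}$/u;
const ID_CONTINUE = /^\p{ID_Continue}$/u;

// Hypothetical helper: walk every code point and count the two kinds
// of mismatch between ID_Start and ID_Continue.
function countDifferences() {
  let startOnly = 0;    // ID_Start but not ID_Continue
  let continueOnly = 0; // ID_Continue but not ID_Start
  for (let cp = 0; cp <= 0x10ffff; cp++) {
    const ch = String.fromCodePoint(cp);
    const isStart = ID_START.test(ch);
    const isContinue = ID_CONTINUE.test(ch);
    if (isStart && !isContinue) startOnly++;
    if (!isStart && isContinue) continueOnly++;
  }
  return { startOnly, continueOnly };
}

console.log(countDifferences());
```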

Some examples:

found char { cp: '0034', na: 'DIGIT FOUR', gc: 'Nd', nt: 'De', nv:
'4', bc: 'EN', lb: 'NU', sc: 'Zyyy', scx: 'Zyyy', Alpha: 'N', Hex:
'Y', AHex: 'Y', IDS: 'N', XIDS: 'N', WB: 'NU', SB: 'NU', Cased: 'N',
 CWCM: 'N', InSC: 'Number' }

(This one lists IDC but not IDS; since it says 'digit' I assume it is a
number type and should not be IDS.)
found char { cp: '0F32', na: 'TIBETAN DIGIT HALF NINE', gc: 'No', nt:
'Nu', nv: '17/2', Alpha: 'N', IDC: 'N', XIDC: 'N', SB: 'XX', InSC:
'Number' }

This one might not be IDS, but it is IDC?
found char { cp: '203F',
  na: 'UNDERTIE',
  gc: 'Pc',
  IDC: 'Y',
  XIDC: 'Y',
  Pat_Syn: 'N',
  WB: 'EX' }

This one is sort of IDS (it has OIDS) but not IDC?
found char { cp: '309B', na: 'KATAKANA-HIRAGANA VOICED SOUND MARK', gc:
'Sk', dt: 'com', dm: '0020 3099', bc: 'ON', lb: 'NS', sc: 'Zyyy',
 scx: 'Hira Kana', Alpha: 'N', Dia: 'Y', OIDS: 'Y', XIDS: 'N', XIDC:
'N', WB: 'KA', SB: 'XX', NFKC_QC: 'N', NFKD_QC: 'N', XO_NFKC: 'Y',
 XO_NFKD: 'Y', CI: 'Y', CWKCF: 'Y', NFKC_CF: '0020 3099', vo: 'Tu' }
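
As a cross-check on the four records above, the same code points can be classified by a JavaScript engine. This is a sketch assuming ES2018 property escapes; `classify` is a hypothetical helper, and the results depend on the Unicode version built into the engine rather than on the XML file.

```javascript
// Code points taken from the records quoted above.
const samples = ["0034", "0F32", "203F", "309B"];

// Hypothetical helper: report the four identifier properties for one
// code point given as a hex string.
function classify(hex) {
  const ch = String.fromCodePoint(parseInt(hex, 16));
  return {
    cp: hex,
    IDS: /^\p{ID_Start}$/u.test(ch),
    IDC: /^\p{ID_Continue}$/u.test(ch),
    XIDS: /^\p{XID_Start}$/u.test(ch),
    XIDC: /^\p{XID_Continue}$/u.test(ch),
  };
}

for (const hex of samples) {
  console.log(classify(hex));
}
```

Any attribute that is absent from a raw record but reported here would be a candidate for a value supplied elsewhere in the file, for example by the enclosing group.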
Received on Wed Jul 12 2017 - 08:35:34 CDT