> For the confusables, the presumption is that implementations have already either normalized the input to NFKC or have rejected input that is not NFKC.

I agree with that as well. However, the data is inconsistent: it includes some of these fullwidth Latin characters but not all of them. We should have either none or all of them, and I would be tempted to remove the ones currently in the set. Anyone concerned about confusability ought to apply NFKC first (or make sure that the target repertoire is stable under NFKC).
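To illustrate the point, here is a minimal sketch using Python's standard `unicodedata` module. The helper names `is_nfkc_stable` and `unstable_chars` are made up for this example and are not part of any confusables tooling. It shows that fullwidth Latin letters fold to their ASCII counterparts under NFKC, so they would never reach a confusable check that runs after normalization.

```python
import unicodedata


def is_nfkc_stable(text: str) -> bool:
    """Return True if NFKC normalization leaves the text unchanged."""
    return unicodedata.normalize("NFKC", text) == text


def unstable_chars(repertoire: str) -> list[str]:
    """List the characters of a repertoire that NFKC would change."""
    return [ch for ch in repertoire if not is_nfkc_stable(ch)]


# U+FF41..U+FF43 are FULLWIDTH LATIN SMALL LETTER A, B, C.
fullwidth = "\uFF41\uFF42\uFF43"
folded = unicodedata.normalize("NFKC", fullwidth)

print(folded == "abc")                 # fullwidth letters fold to ASCII
print(is_nfkc_stable(fullwidth))       # the fullwidth string is not stable
print(unstable_chars("a\uFF41b"))      # only the fullwidth letter is flagged
```

A check like `unstable_chars` could also be run over the confusables data itself to find entries that only matter for input that has not been normalized.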

