NFD normalisation test

From: spir ([email protected])
Date: Sat Feb 06 2010 - 05:56:53 CST

Next message: spir: "Re: NFD normalisation test"

Previous message: Curtis Clark: "Re: Last speaker of Bo died"
Next in thread: spir: "Re: NFD normalisation test"
Reply: spir: "Re: NFD normalisation test"
Maybe reply: Kenneth Whistler: "Re: NFD normalisation test"

Hello,

I have a bunch of questions on the topic.

The provided test data holds a huge list of specific and generic cases, about 11500 of which are Hangul ones.
-1- Why so many? Is it necessary to test all of these? I guess, for instance, that if a function correctly transforms 1, 2 or 3 Hangul LVT syllables, then it correctly transforms all of them, no?
-2- Since Hangul codes are normalised algorithmically (as opposed to through a mapping), shouldn't they be in a separate part?
-3- What are the specific cases (part 0), and why are they kept apart?
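
To make the "algorithmic" point of question 2 concrete, here is a minimal sketch of the arithmetic Hangul decomposition. The constants are the ones given in the Unicode Standard for conjoining jamo; the function name decompose_hangul is only an illustrative choice, not something taken from the test data or from any library.

```python
import unicodedata

# Constants for arithmetic Hangul syllable decomposition (Unicode Standard, ch. 3).
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT   # 11172 precomposed syllables in total


def decompose_hangul(ch):
    """Return the jamo sequence (L V or L V T) for one precomposed syllable.

    Hypothetical helper name; any character outside the syllable block
    is returned unchanged.
    """
    s_index = ord(ch) - S_BASE
    if not 0 <= s_index < S_COUNT:
        return ch
    lead = L_BASE + s_index // N_COUNT
    vowel = V_BASE + (s_index % N_COUNT) // T_COUNT
    trail = T_BASE + s_index % T_COUNT
    result = chr(lead) + chr(vowel)
    if trail != T_BASE:           # trail == T_BASE means "no trailing consonant"
        result += chr(trail)
    return result


if __name__ == "__main__":
    # Compare against Python's built-in normaliser over the whole block.
    mismatches = [
        cp for cp in range(S_BASE, S_BASE + S_COUNT)
        if decompose_hangul(chr(cp)) != unicodedata.normalize("NFD", chr(cp))
    ]
    print(len(mismatches), "mismatches")
```

Since the whole block is covered by three integer divisions, an exhaustive check of this kind is cheap, which may be one reason the test data can afford to list every syllable.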

I also wonder about the source codes to be normalised.
-4- Does each code / group of codes represent a whole, consistent, "user-perceived character"?
-5- Would their concatenation build a valid character string (text)?
-6- Should the (NFD) normalisation of this text be equal to the concatenation of the individually normalised cases?

My intention is to do the following; please tell me whether it makes sense:
* Build separate test sets for specific / hangul / generic cases (done).
* Select all specific, and N randomly-chosen hangul and generic cases.
* From the complete data, run and check case-by-case normalisation, using the given assertions c3 == NFD(c1) == NFD(c2) == NFD(c3) and c5 == NFD(c4) == NFD(c5).
* Using only source and NFD data columns, run and check complete text normalisation.
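
The per-case part of this plan could be sketched roughly as below. It assumes the test data uses semicolon-separated columns of hexadecimal code points, with "#" comments and "@Part" header lines. The embedded SAMPLE lines are illustrative lines in that format, not a copy of the real file. The names parse_cases, check_nfd, pick_cases, failing_cases and reference_nfd are hypothetical, and reference_nfd (Python's built-in normaliser) merely stands in for the function actually under test. The whole-text check of the last step is not covered here.

```python
import random
import unicodedata

# Illustrative lines in the assumed format: c1;c2;c3;c4;c5; # comment
SAMPLE = """\
@Part0 # Specific cases
1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE
@Part1 # Character by character test
AC01;AC01;1100 1161 11A8;AC01;1100 1161 11A8; # HANGUL SYLLABLE GAG
"""


def parse_cases(text):
    """Turn test-data lines into (c1, c2, c3, c4, c5) tuples of strings."""
    cases = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop the trailing comment
        if not line or line.startswith("@"):   # skip blanks and part headers
            continue
        fields = line.split(";")[:5]
        cases.append(tuple(
            "".join(chr(int(cp, 16)) for cp in field.split())
            for field in fields
        ))
    return cases


def check_nfd(case, nfd):
    """Apply the two NFD assertions quoted in the plan to one case."""
    c1, c2, c3, c4, c5 = case
    return (c3 == nfd(c1) == nfd(c2) == nfd(c3)
            and c5 == nfd(c4) == nfd(c5))


def pick_cases(cases, n, seed=0):
    """Choose at most n cases at random; the seed keeps runs repeatable."""
    return random.Random(seed).sample(cases, min(n, len(cases)))


def failing_cases(cases, nfd):
    """Return the cases for which the NFD assertions do not hold."""
    return [case for case in cases if not check_nfd(case, nfd)]


def reference_nfd(s):
    # Stand-in for the implementation under test (an assumption).
    return unicodedata.normalize("NFD", s)


if __name__ == "__main__":
    selected = pick_cases(parse_cases(SAMPLE), n=2)
    print(len(failing_cases(selected, reference_nfd)), "failing cases")
```

Fixing the random seed means a failing run can be reproduced with the same selection of cases.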

________________________________

la vita e estrany

http://spir.wikidot.com/

This archive was generated by hypermail 2.1.5 : Sat Feb 06 2010 - 06:02:54 CST