Curtis Clark wrote:
> In biological Latin, "v" and "u" cannot map to the same
> codepoint. As long as scientific names obey a set of codified
> grammatic rules, they are "set as spelt", and all future
> authors must obey the original spelling. No
> algorithmic unification of "v" and "u" (or "i" and "j") could
> ever work.

This is correct. In fact, I said that the text should also be tagged with the desired "spelling style" (a.k.a. "sublanguage").

It is the same problem for many other languages, including English. For a spellchecker, it is not enough to know that the language is English ("en"): it also needs to know the national "sublanguage": "UK", "US", etc.
Without this information the spellchecker cannot properly deal with spelling variants like program/programme, serialize/serialise, color/colour, etc.

Latin should not be any different. For language-aware tools to work properly, it is not enough to say "it's Latin"; we need things like "Latin(Traditional)", "Latin(Biological)", "Latin(Scholastic)", "Latin(Vatican's)", "Latin(Asterix's)", etc.
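
To make the idea concrete, here is a rough sketch in Python of a tool that consults the spelling-style tag before it unifies "v"/"u" and "i"/"j". The tag strings are simply the labels used above, not registered language tags, and the table and the function name search_key are made up for illustration. Which styles actually permit folding is an assumption on my part.

```python
# Hypothetical table: which spelling styles allow u/v and i/j to be unified.
# The tag names and the True/False values are assumptions for illustration.
FOLDING_ALLOWED = {
    "Latin(Traditional)": True,   # assumed: variants are interchangeable
    "Latin(Biological)": False,   # names are "set as spelt", never folded
}


def search_key(text, style):
    """Return a loose-matching key for text under the given spelling style."""
    key = text.lower()
    # Unknown or untagged styles are treated conservatively: no folding.
    if FOLDING_ALLOWED.get(style, False):
        key = key.replace("v", "u").replace("j", "i")
    return key


# Under the traditional style, the two spellings compare equal.
same = search_key("Iuvenis", "Latin(Traditional)") == search_key("Juvenis", "Latin(Traditional)")

# Under the biological style, the original spelling is preserved.
different = search_key("Ulva", "Latin(Biological)") != search_key("Vlva", "Latin(Biological)")
```

Text that is merely tagged "Latin", with no style, falls through to the conservative branch, which is the safe default when the tool cannot tell whether the original spelling is significant.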

_ Marco
