Re: Specification of Encoding of Plain Text

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Fri, 13 Jan 2017 09:02:32 +0000

On Thu, 12 Jan 2017 21:03:29 +0100
Mark Davis ☕️ <mark_at_macchiato.com> wrote:

> Latin is not a complex script,...

Unlike the common script, which notably has U+2044 FRACTION SLASH.

The statement that Latin is not complex is actually dubious from a
typographical point of view.

> ...so it was only an illustration.

But it's good for looking for the non-obvious issues.

> A more serious effort would look at some of the issues from
> http://unicode.org/reports/tr29/, for example.

I don't think we want to have to repeat them all for each script.
Putting common-script punctuation and numbers in each script's regex
will make it more obscure, and possibly harder to maintain.
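
To make the maintainability point concrete, here is a minimal Python
sketch. It is not anything proposed in this thread. The names COMMON,
script_pattern, LATIN and GREEK, and the tiny character ranges, are
assumptions chosen purely for illustration; real per-script definitions
would be far larger. It shows the common-script characters kept in one
shared fragment instead of being repeated in every script's regex.

```python
import re

# Hypothetical shared fragment: a handful of Common-script characters
# (ASCII digits, U+2044 FRACTION SLASH, space, basic punctuation).
# Illustrative only; the real set is much larger.
COMMON = r"[0-9\u2044 .,;:!?()-]"

def script_pattern(letters: str) -> re.Pattern:
    """Build a regex accepting runs of one script's letters mixed
    with the shared Common-script fragment."""
    return re.compile(rf"(?:{letters}|{COMMON})+")

# Rough, assumed letter ranges; these blocks are not exact script sets.
LATIN = script_pattern(r"[A-Za-z\u00C0-\u024F]")
GREEK = script_pattern(r"[\u0370-\u03FF]")

print(bool(LATIN.fullmatch("1\u20442 cup")))   # Latin text with a fraction
print(bool(GREEK.fullmatch("αβγ 1\u20442")))   # Greek text with the same fraction
print(bool(LATIN.fullmatch("αβγ")))            # Greek letters under the Latin pattern
```

With this shape, a change to the common-script set is made once in
COMMON rather than in each per-script pattern.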

Richard.
Received on Fri Jan 13 2017 - 03:03:10 CST
