RE: Last Call: Language Tagging in Unicode Plain Text to Proposed

From: John Clews
Date: Sun Jul 12 1998 - 08:21:59 EDT

"Chen, Qifan" writes:
> Patrik,
> I have one small comment on the proposal: why does the set include only
> TAGGED versions of ASCII characters? Why can't tags be composed of
> non-ASCII characters?
> It seems that we need only two special characters, BEGIN TAG and CANCEL
> TAG. All other UNICODE characters (except, of course, those two) should
> be allowed inside a tag. From a language-processing point of view, this
> is no more complicated than the proposed approach.

If this were possible, and only two special characters were required,
could those two characters be allocated in the SPECIALS collection at
the end of the BMP? Would that cause additional problems that I have not
considered? (Subsequent emails do, however, list further reasons why the
contents of tags should also be distinct characters.)
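
To make the trade-off concrete, here is a minimal sketch of the two designs
being compared. It assumes the code points later published in RFC 2482 and
the Unicode Tags block (U+E0001 LANGUAGE TAG, U+E0020..U+E007E tag
characters, U+E007F CANCEL TAG) for the tagged-ASCII design. For the
two-delimiter alternative it uses two private-use code points as purely
hypothetical stand-ins, and it assumes the second delimiter closes the tag;
neither assumption comes from the proposal itself. All function names are
illustrative.

```python
# Sketch only: compares the two tagging designs discussed in this thread.
LANGUAGE_TAG = 0xE0001   # U+E0001 LANGUAGE TAG (RFC 2482 / Unicode Tags block)
TAG_OFFSET = 0xE0000     # tag character = ASCII code point + 0xE0000

# Hypothetical stand-ins for the two-delimiter alternative. These are
# private-use code points, NOT real allocations in the Specials block.
BEGIN_TAG_ALT = "\uE000"
CANCEL_TAG_ALT = "\uE001"


def encode_tagged_ascii(lang):
    """Encode a language tag such as 'fr' using dedicated tag characters."""
    return chr(LANGUAGE_TAG) + "".join(chr(ord(c) + TAG_OFFSET) for c in lang)


def strip_tagged_ascii(text):
    """Stateless filter: every tag character identifies itself by its range."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


def strip_delimited(text):
    """Stateful filter: characters inside a tag look like ordinary text,
    so the filter must remember whether it is currently inside a tag."""
    out, in_tag = [], False
    for c in text:
        if c == BEGIN_TAG_ALT:
            in_tag = True
        elif c == CANCEL_TAG_ALT:
            in_tag = False
        elif not in_tag:
            out.append(c)
    return "".join(out)


tagged = encode_tagged_ascii("fr") + "bonjour"
delimited = BEGIN_TAG_ALT + "fr" + CANCEL_TAG_ALT + "bonjour"
```

With complete input, both filters recover "bonjour". If the first character
of each stream is lost, strip_tagged_ascii(tagged[1:]) still returns
"bonjour", whereas strip_delimited(delimited[1:]) returns "frbonjour",
because the tag contents can no longer be told apart from the text.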

Looking forward to any comments

Best wishes

John Clews

John Clews (Chair of ISO/TC46/SC2: Conversion of Written Languages)

SESAME Computer Projects, 8 Avenue Road, Harrogate, HG2 7PG, U.K. Tel: +44 (0) 1423 888 432
