RE: Unicode Sets in 'Unicode Regular Expressions'

From: Phillips, Addison <>
Date: Tue, 27 May 2014 22:36:04 +0000

A "Unicode set" in this context means "a set of code points". This is discussed in section 1.2:

This is done by providing syntax for sets of characters based on the Unicode character properties, and allowing them to be mixed with lists and ranges of individual code points.

More generally, there is no term "Unicode set" defined, although it is referred to in places such as RL1.3 as a shorthand. It merely means "the set of all code points selected" (by whatever selection, subtraction, intersection, or differencing has been applied beginning from the Universal Character Set as a whole). Or at least this is how I have always read it.
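
As a rough illustration of that reading (a sketch only, not anything UTS#18 prescribes), a "Unicode set" can be modelled as a plain set of integer code points, with intersection and subtraction as ordinary set operations. The helper name code_points_with_category and the 0x0250 scan limit are my own assumptions, chosen to keep the example small:

```python
import unicodedata

# Hypothetical helper (not from UTS#18): collect the code points below
# `limit` whose General_Category value starts with `prefix`.
def code_points_with_category(prefix, limit=0x0250):
    return {
        cp for cp in range(limit)
        if unicodedata.category(chr(cp)).startswith(prefix)
    }

# A property-based set, a range-based set, and an explicit list,
# all represented the same way: as sets of code points.
letters = code_points_with_category("L")
ascii_range = set(range(0x00, 0x80))
vowels = {ord(c) for c in "aeiouAEIOU"}

# Intersection: letters that are also in the ASCII range.
ascii_letters = letters & ascii_range

# Subtraction: ASCII letters with the vowels removed.
ascii_consonants = ascii_letters - vowels

print(len(ascii_letters), len(ascii_consonants))
```

Under this model the result of any combination is again just a set of code points, which is all I take "Unicode set" to mean in RL1.3.
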
> -----Original Message-----
> From: Unicode [] On Behalf Of Richard
> Wordingham
> Sent: Tuesday, May 27, 2014 3:18 PM
> To:
> Subject: Unicode Sets in 'Unicode Regular Expressions'
>
> UTS#18 'Unicode Regular Expressions' Version 17 Requirement RL1.3
> 'Subtraction and Intersection' talks of Unicode sets.  What is the relevant
> definition of a 'Unicode set'? Is it a finite set of non-empty strings?  Other
> possibilities that occur to me, depending on context, include sets of codepoints
> and sets of indecomposable codepoints.
> Richard.
> _______________________________________________
> Unicode mailing list
Received on Tue May 27 2014 - 17:38:11 CDT
