L2/02-087

Title: Explicit Line Break Property Proposal
Date: January 15, 2002
Source: Kent Karlsson

For discussion at UTC#90/L2 187.

The UCD already contains a property "White_Space" intended to list all
characters that should be characterised as "whitespace" in programming
language syntaxes. However, not all whitespace characters are created
equal: some are regularly excluded from string and character literals,
namely those that (are known to) cause an explicit line break. Instead
one must use "escapes" for them (\n etc.).

The same characters are also used in many programming languages to
terminate to-end-of-line comments.

However, language syntax definitions often omit some of the explicit
line break characters.

To list these characters explicitly in the Unicode Standard, for easy
reference, just as the whitespace characters are, add a new property
(similar to White_Space): Explicit_Line_Break.

000A..000D ; Explicit_Line_Break # Cc [4] <control>..<control>
0085 ; Explicit_Line_Break # Cc <control>
2028 ; Explicit_Line_Break # Zl LINE SEPARATOR
2029 ; Explicit_Line_Break # Zp PARAGRAPH SEPARATOR

All of these already have the White_Space property.
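
As an illustration only, the following Python sketch shows how a lexer
might consult the proposed property. The names EXPLICIT_LINE_BREAK,
is_explicit_line_break, comment_end and has_raw_line_break are
hypothetical and not part of any existing library; the code point set
is copied from the list above.

```python
# Hypothetical sketch: the code points proposed for Explicit_Line_Break.
EXPLICIT_LINE_BREAK = frozenset(
    {0x000A, 0x000B, 0x000C, 0x000D, 0x0085, 0x2028, 0x2029}
)


def is_explicit_line_break(ch: str) -> bool:
    """Return True if the single character ch is in the proposed set."""
    return len(ch) == 1 and ord(ch) in EXPLICIT_LINE_BREAK


def comment_end(text: str, start: int) -> int:
    """Index of the first explicit line break at or after start.

    Returns len(text) if the to-end-of-line comment runs to end of input.
    """
    for i in range(start, len(text)):
        if is_explicit_line_break(text[i]):
            return i
    return len(text)


def has_raw_line_break(literal_body: str) -> bool:
    """True if a string literal body contains an unescaped line break."""
    return any(is_explicit_line_break(ch) for ch in literal_body)
```

A lexer built this way would end a comment at LINE SEPARATOR or NEL as
well as at LF or CR, and would reject a string literal containing any of
the seven characters unescaped. Python's own str.isspace() reports True
for each of them, which agrees with the remark about White_Space.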