The Unicode Consortium Discussion Forum (CLOSED)


The forum has been closed, but prior postings are accessible for reading.

All times are UTC - 6 hours [ DST ]


Forum rules


Use this forum for technical discussion of UAXes 11, 14, 15, 24, 29, 31, 34, 42, and 44. Technical discussion of UTSes 6, 10, 18, 22, 39, and 46. Technical discussion of UTRs 16, 17, 20, 23, 25, 26, 33, and 36, as well as the related properties and files in the Unicode Character Database.



 Post subject: Tracking changes in the block (aka block name) property
PostPosted: Thu Dec 16, 2010 9:29 pm 
One place where block names are used is as part of regular expressions. (See UTS#18: Unicode Regular Expressions for more details).

Almost every version of Unicode adds new characters in new blocks. Occasionally the names of some blocks have changed in the character names list, or an alternate name has been recognized (over time this has affected about a dozen blocks out of well over a hundred).

What should an implementer do to track these changes?


 Post subject: Re: Tracking changes in the block (aka block name) property
PostPosted: Thu Dec 16, 2010 10:58 pm 
Whenever Unicode changes the preferred name for a property or property value, the old alias is maintained. It is strongly recommended that all programs (and other standards) accept all of the aliases as equivalent.

The full list of aliases is found in:

http://unicode.org/Public/UNIDATA/PropertyAliases.txt
http://unicode.org/Public/UNIDATA/PropertyValueAliases.txt

(The above links are to the latest versions of these files; there are also specific versioned files for Unicode 6.0.0, 5.2.0, ...)

So for the block property, if you look at http://unicode.org/Public/UNIDATA/Prope ... liases.txt, you'll find under Block (blk) the following three cases where there are multiple names for a block:

blk; n/a ; Arabic_Presentation_Forms_A ; Arabic_Presentation_Forms-A

blk; n/a ; Basic_Latin ; ASCII

blk; n/a ; Greek_And_Coptic ; Greek
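To make the "accept all aliases as equivalent" recommendation concrete, here is a minimal Python sketch that turns lines like the three above into a lookup table. The function name parse_block_aliases and the choice of the first long name as the canonical one are my own assumptions for illustration, not something the data file prescribes.

```python
# Sample lines copied from the post; a real program would read the full
# PropertyValueAliases.txt file instead.
SAMPLE = """\
blk; n/a ; Arabic_Presentation_Forms_A ; Arabic_Presentation_Forms-A
blk; n/a ; Basic_Latin ; ASCII
blk; n/a ; Greek_And_Coptic ; Greek
"""


def parse_block_aliases(text):
    """Map every alias of a block to that block's first-listed long name.

    Hypothetical helper: treats field 3 as the canonical name (an assumption).
    """
    lookup = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        if fields[0] != "blk" or len(fields) < 3:
            continue  # not a block alias line
        preferred = fields[2]
        for name in fields[1:]:
            if name != "n/a":  # "n/a" marks a missing short name
                lookup[name] = preferred
    return lookup


block_aliases = parse_block_aliases(SAMPLE)
```

With this table, a regular expression engine could resolve either ASCII or Basic_Latin to the same block before looking up its range.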

When a new Unicode version is released and you update your software, simply diff the new and old versions of these alias files and add support for any new names for existing blocks.
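The comparison of alias files could be sketched along these lines in Python. The two excerpts are made-up stand-ins for an old and a new PropertyValueAliases.txt (Hypothetical_New_Block is not a real block), and the rule "two entries describe the same block if they share at least one name" is an assumption of this sketch.

```python
# Illustrative excerpts, NOT verbatim copies of any released file.
OLD = """\
blk; n/a ; Greek
blk; n/a ; Basic_Latin ; ASCII
"""
NEW = """\
blk; n/a ; Greek_And_Coptic ; Greek
blk; n/a ; Basic_Latin ; ASCII
blk; n/a ; Hypothetical_New_Block
"""


def alias_groups(text):
    """Return one set of names per 'blk' line, ignoring the 'n/a' placeholder."""
    groups = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("#", 1)[0].split(";")]
        if len(fields) >= 3 and fields[0] == "blk":
            groups.append({n for n in fields[1:] if n != "n/a"})
    return groups


def diff_aliases(old_text, new_text):
    """Return (names added to known blocks, blocks with no known name)."""
    old_groups = alias_groups(old_text)
    added_names = {}  # one old name -> sorted list of newly added names
    new_blocks = []   # alias groups that share no name with the old file
    for group in alias_groups(new_text):
        match = next((old for old in old_groups if old & group), None)
        if match is None:
            new_blocks.append(sorted(group))
        elif group - match:
            added_names[sorted(match)[0]] = sorted(group - match)
    return added_names, new_blocks


added_names, new_blocks = diff_aliases(OLD, NEW)
```

Because old aliases are retained, matching on a shared name is enough to recognize an existing block even when its preferred name has changed.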

To find the ranges for new blocks, check
http://unicode.org/Public/UNIDATA/Blocks.txt
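As a rough sketch of that check, the following Python reads the "start..end; name" lines of Blocks.txt and reports ranges present in a newer file but not in an older one. The excerpts are shortened for illustration, and the function names parse_blocks and find_new_blocks are my own.

```python
import re

# Shortened excerpts in the Blocks.txt layout; a real program would load
# the complete old and new files.
OLD = """\
0000..007F; Basic Latin
0370..03FF; Greek and Coptic
"""
NEW = """\
0000..007F; Basic Latin
0370..03FF; Greek and Coptic
0840..085F; Mandaic
"""

BLOCK_LINE = re.compile(r"^([0-9A-F]{4,6})\.\.([0-9A-F]{4,6});\s*(.+?)\s*$")


def parse_blocks(text):
    """Return a list of (start, end, name) tuples, skipping comments."""
    blocks = []
    for line in text.splitlines():
        match = BLOCK_LINE.match(line.split("#", 1)[0].strip())
        if match:
            start, end, name = match.groups()
            blocks.append((int(start, 16), int(end, 16), name))
    return blocks


def find_new_blocks(old_text, new_text):
    """Blocks whose range does not appear in the old file.

    Compares by range rather than by name, since names may gain aliases.
    """
    old_ranges = {(start, end) for start, end, _ in parse_blocks(old_text)}
    return [b for b in parse_blocks(new_text) if (b[0], b[1]) not in old_ranges]


new_blocks = find_new_blocks(OLD, NEW)
```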

(Thanks to Mark Davis for much of this info).


 
 Post subject: Re: Tracking changes in the block (aka block name) property
PostPosted: Thu Dec 16, 2010 11:17 pm 
Now that leads to the next question:

Do the ranges for Unicode blocks ever change?

The answer is "no." Once defined, they are stable.

However, I find no formal guarantee of this stability in the Unicode Stability Policy, and some very early versions of Unicode did change blocks and block ranges. At this time, that is only of historical interest.


 
 Post subject: Re: Tracking changes in the block (aka block name) property
PostPosted: Thu Dec 16, 2010 11:25 pm 
If I don't want to download all existing versions of the Unicode character database, is there a quick way to find out whether a given block was defined in a given version?

Here's how I would do that. There's a file http://unicode.org/Public/UNIDATA/DerivedAge.txt which lists all characters added for each version.

If you correlate that information with the list of block ranges in http://unicode.org/Public/UNIDATA/Blocks.txt you can determine the lowest version number for which any characters are defined for a given block range. That is the version at which the block was added to the standard.

A small perl script should do the trick.
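For anyone who prefers Python to Perl, the correlation described above might look roughly like this. The excerpts are simplified illustrations in the layout of Blocks.txt and DerivedAge.txt rather than verbatim copies, and the helper names are hypothetical.

```python
# Simplified, illustrative excerpts; real code would read the full files.
BLOCKS = """\
0000..007F; Basic Latin
0840..085F; Mandaic
"""
AGES = """\
0000..01F5 ; 1.1
0840..085B ; 6.0
085E       ; 6.0
"""


def parse_ranges(text):
    """Parse 'XXXX..YYYY ; value' or 'XXXX ; value' lines into tuples."""
    rows = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip comments
        if not line:
            continue
        code_range, value = [part.strip() for part in line.split(";", 1)]
        first, _, last = code_range.partition("..")
        rows.append((int(first, 16), int(last or first, 16), value))
    return rows


def version_key(version):
    """Compare versions numerically, so that '10.0' sorts after '6.0'."""
    return tuple(int(part) for part in version.split("."))


def block_first_versions(blocks_text, ages_text):
    """Map each block name to the lowest age of any character inside it."""
    ages = parse_ranges(ages_text)
    result = {}
    for b_start, b_end, name in parse_ranges(blocks_text):
        versions = [v for a_start, a_end, v in ages
                    if a_start <= b_end and a_end >= b_start]  # ranges overlap
        result[name] = min(versions, key=version_key) if versions else None
    return result


first_versions = block_first_versions(BLOCKS, AGES)
```

A block that has a range but no assigned characters yields None here, since this method can only date a block by the characters it contains.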


 