Did you click the ICU link? It will direct you to http://icu-project.org
Instead of reading those .TXT files, I would suggest using either the TR#22 format or the ICU UCM format. TR#22 was designed to provide an XML interchange format for mapping tables.
Hope this helps,
Steven (IBM and ICU project)
This was very helpful. I was (eventually) able to find the Subversion URL and check out the 'icu' repository. That information was surprisingly difficult to find. Clicking the "Source Code Repository" link on the front page of http://site.icu-project.org/ just takes you to the web-based (Trac) browser. The actual Subversion URLs should be easier to find.
So I filed bug http://bugs.icu-project.org/trac/ticket/9949 and fixed that link so that it points to a repository help page (as the sidebar actually does) instead of directly to Trac. That whole section duplicates the sidebar and should probably go. Feel free to reply on the bug if you have further comments. By the way, in the Trac browser, there's a "Subversion Location" link at the top that takes you to SVN. I filed another ticket, http://bugs.icu-project.org/trac/ticket/9950 , about making commits against a ticket easier to find.
Also, it took a very, very long time to check out the code.
Well, sure, if you check out the entire repo, including all subprojects and every tagged version, which someone did a few hours ago (perhaps you). Which SVN URL did you check out?
It would be wise to modernize and switch to using git. Not just for performance reasons but also to gain all the benefits that come with distributed source control.
Yeah, I've come to not hate DVCS. But from where I sit, my challenge is migrating a pretty large and complicated set of codebases, including tools supporting all phases of release and change control management, automated build, client support, website updates, etc. It's nontrivial. I wrote the tooling which integrates commits with bugs and vice versa, and I ported it from previous code and bug systems (CVS/Jitterbug).
git-svn does seem promising. One possibility would be a read-only mirror on the website that could be pulled from, maintained on the server side.
Sorry Sarasvati, I know this is way off topic for a Unicode forum. But all of the above (including the CVS/Jitterbug part) applies to CLDR and other Unicode tooling.
Lastly, the link to icu-project.org should be placed directly inside this: http://www.unicode.org/Public/MAPPINGS/ ... sions.html
The path from that document to icu-project.org is a bit convoluted: you must first load this page: http://www-01.ibm.com/software/globaliz ... index.html
(That link is titled, "International Components for Unicode (ICU)")
Way down at the bottom of that page is a tiny, three-character link to site.icu-project.org. There's really no point in sending people to that ibm.com page when all of the same information can be found on the direct link to site.icu-project.org. It is an unnecessary (and confusing) link-in-the-middle.
The previous link is a page about IBM vendor mappings. The www-01 page is an IBM page. I don't see a problem with the page, but I will try to get that link made more obvious.
For reference, I'm almost done writing a script that converts .ucm files to .py encoding modules. To my amazement, such a script hasn't already been written (I couldn't find anything on Google).
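For anyone curious what such a conversion involves, here is a rough sketch of the core step, not the actual script mentioned above. It assumes a single-byte .ucm table and keeps only the roundtrip entries (precision flag |0); the sample data and the function names are made up for illustration. The resulting 256-character table is used the same way the charmap-based modules in Python's standard `encodings` package use theirs.

```python
import codecs
import re

# Hypothetical sample in the style of an ICU .ucm file (not a real table).
SAMPLE_UCM = r'''<code_set_name> "sample-sbcs"
<uconv_class> "SBCS"
<subchar> \x1A
CHARMAP
<U0041> \x41 |0
<U0042> \x42 |0
<U20AC> \x80 |0
<U00E9> \xE9 |0
<UFF21> \x41 |1
END CHARMAP
'''

# One mapping line: <UXXXX> \xNN[\xNN...] |flag
ENTRY = re.compile(
    r'^<U([0-9A-Fa-f]{4,6})>\s+((?:\\x[0-9A-Fa-f]{2})+)\s+\|(\d)'
)


def build_decoding_table(ucm_text):
    """Return a 256-character decoding table from single-byte UCM text."""
    # U+FFFE marks "undefined" in charmap decoding tables.
    table = ['\ufffe'] * 256
    in_charmap = False
    for line in ucm_text.splitlines():
        line = line.strip()
        if line == 'CHARMAP':
            in_charmap = True
            continue
        if line == 'END CHARMAP':
            break
        if not in_charmap:
            continue
        match = ENTRY.match(line)
        # Keep only roundtrip mappings (|0); skip fallbacks and other flags.
        if not match or match.group(3) != '0':
            continue
        byte_seq = bytes.fromhex(match.group(2).replace('\\x', ''))
        if len(byte_seq) != 1:
            continue  # multi-byte entries are out of scope for this sketch
        table[byte_seq[0]] = chr(int(match.group(1), 16))
    return ''.join(table)


decoding_table = build_decoding_table(SAMPLE_UCM)
encoding_table = codecs.charmap_build(decoding_table)


def decode(data, errors='strict'):
    return codecs.charmap_decode(data, errors, decoding_table)[0]


def encode(text, errors='strict'):
    return codecs.charmap_encode(text, errors, encoding_table)[0]
```

A real converter would also have to handle multi-byte and stateful tables, the other precision flags, and the substitution character, none of which this sketch attempts.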
Have you considered using the UTR#22 format? It is at least a more standardized format, and it can be parsed as XML.
It would be interesting to see whether your encoding modules pass the ICU converter tests, from both a performance and a compatibility standpoint.
When I'm done with it, I'll upload it to GitHub along with all the .ucm files pre-converted and ready to go. It will be a lot simpler to use than having to deal with the binary/compiled/not-compatible-with-Python3 PyICU module.
Please file an ICU bug so we can at least link to your program. Have you contacted the PyICU module maintainers (Andi)? I don't use it heavily, but I haven't had trouble getting PyICU to work on v2.x.