Re: ISO 8859-11 (Thai) cross-mapping table

From: John Cowan (jcowan@reutershealth.com)
Date: Tue Oct 08 2002 - 08:44:59 EDT

Next message: Elliotte Rusty Harold: "Re: ISO 8859-11 (Thai) cross-mapping table"

Previous message: Elliotte Rusty Harold: "Re: ISO 8859-11 (Thai) cross-mapping table"
In reply to: Elliotte Rusty Harold: "Re: ISO 8859-11 (Thai) cross-mapping table"
Next in thread: Elliotte Rusty Harold: "Re: ISO 8859-11 (Thai) cross-mapping table"
Reply: Elliotte Rusty Harold: "Re: ISO 8859-11 (Thai) cross-mapping table"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Mail actions: [ respond to this message ] [ mail a new topic ]

Elliotte Rusty Harold scripsit:

> The Verifier class has a similar issue, though there it's a case of
> determining whether or not any given character is a legal XML
> character/name character/name-start character/ etc. This is done with
> a trick introduced in JDOM where the code looks like this:

The underlying data structure here is called a "range table", and is
a list of ranges in codepoint order, expressed thus:

        start of first range
        end of first range + 1
        start of second range
        end of second range + 1

etc. etc. What you are doing is equivalent to a linear search over
this range followed by loop unrolling. However, you can do better,
especially in complex cases, with a *binary* search over the range
followed by loop unrolling. The trick here is that if the binary
search returns an even value, it succeeds; an odd value fails.

But I agree that testing ASCII first is wise. Mozilla bypasses
its general algorithm in the ASCII/Latin-1 case, providing a fast
256-element lookup table and then an extremely compact (but somewhat
slower than the above) data structure for the rest of Unicode.

-- 
John Cowan                                jcowan@reutershealth.com
At times of peril or dubitation,          http://www.ccil.org/~cowan
Perform swift circular ambulation,        http://www.reutershealth.com
With loud and high-pitched ululation.

Next message: Elliotte Rusty Harold: "Re: ISO 8859-11 (Thai) cross-mapping table"
Previous message: Elliotte Rusty Harold: "Re: ISO 8859-11 (Thai) cross-mapping table"
In reply to: Elliotte Rusty Harold: "Re: ISO 8859-11 (Thai) cross-mapping table"
Next in thread: Elliotte Rusty Harold: "Re: ISO 8859-11 (Thai) cross-mapping table"
Reply: Elliotte Rusty Harold: "Re: ISO 8859-11 (Thai) cross-mapping table"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Mail actions: [ respond to this message ] [ mail a new topic ]

This archive was generated by hypermail 2.1.5 : Tue Oct 08 2002 - 09:35:43 EDT