Mark Davis 🚙 wrote:
>> I suspect the punycode goal is to take a wide character set into a
>> restricted character set, without caring much on resulting string
>> length; if the original string happens to be in other character set
>> than the target restricted character set, then the string length
>> increases too much to be of interest in the SMS discussion.
>
> That is not correct. One of the chief reasons that punycode was
> selected was the reduction in size.
But certainly the main motivation behind the development of Punycode, or any of the ACEs (ASCII-Compatible Encodings) that came before it, was to provide a compact encoding given the constraints of the set of characters allowed in domain names. The extensibility of the algorithm to target character sets of different sizes was definitely an advantage.
> Tests with the idnbrowser are not relevant. As I said:
>
>> In that form, it uses a smaller number of
>> bytes per character, but a parameterization allows use of all byte
>> values.
>
> That is, the parameterization of punycode for IDNA is restricted to
> the 36 IDNA values per byte, thus roughly 5 bits. When you
> parameterize punycode for a full 8 bits per byte, you get considerably
> different results.
Not to say this isn’t so, but can you point to a tool or site where a user can type a string and see the output with different parameterizations? Pretty much all of the “Convert to Punycode” pages I see are only able to convert to the IDNA target.
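For the IDNA parameterization, at least, the sizes are easy to measure. The following is a minimal sketch, assuming Python 3 and its built-in "punycode" codec, which as far as I know implements only the RFC 3492 parameters and does not expose other target character sets. The helper name encoded_sizes is made up for illustration.

```python
import math


def encoded_sizes(text: str) -> dict:
    """Return the byte length of `text` under a few encodings.

    Hypothetical helper for illustration only. "punycode" here is the
    fixed IDNA parameterization (36 digit values), not a general Bootstring.
    """
    return {
        "punycode": len(text.encode("punycode")),
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-be")),  # big-endian, no BOM
    }


# Bits carried per output byte when only 36 digit values (a-z, 0-9) are used.
bits_per_digit = math.log2(36)

sample = "bücher"
print(sample.encode("punycode"))   # b'bcher-kva'
print(encoded_sizes(sample))       # {'punycode': 9, 'utf-8': 7, 'utf-16': 12}
print(round(bits_per_digit, 2))    # 5.17
```

This says nothing about an 8-bit parameterization; it only shows the baseline that such a tool would be compared against.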
--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell

Received on Sat Apr 28 2012 - 13:54:06 CDT