Re: minimizing size (was Re: allocation of Georgian letters)

From: Douglas Davidson ([email protected])
Date: Thu Feb 07 2008 - 11:31:41 CST

Next message: Sinnathurai Srivas: "Re: minimizing size (was Re: allocation of Georgian letters)"

Previous message: John H. Jenkins: "Re: minimizing size (was Re: allocation of Georgian letters)"
In reply to: Michael S. Kaplan: "Re: minimizing size (was Re: allocation of Georgian letters)"
Next in thread: Sinnathurai Srivas: "Re: minimizing size (was Re: allocation of Georgian letters)"
Reply: Sinnathurai Srivas: "Re: minimizing size (was Re: allocation of Georgian letters)"

On Feb 7, 2008, at 3:22 AM, Michael S. Kaplan wrote:

> Having flown halfway around the world to talk to people who for
> whatever reasons, both valid and invalid (and not really
> distinguishing which is which on their list of concerns), are
> unhappy with a language encoding that in their view doubles or worse
> the amount of bytes used to store their language in Unicode, I can
> tell you that this is a very real concern on some people's minds.
>
> True or false, it is on their minds. They can all add and multiply,
> and it certainly looks like a 2x or 3x situation to them.
>
> And we get a lot further by acknowledging their concerns and then
> showing them that they have less to be concerned about than they
> think, in the end, than we ever would by telling them they are
> wrong, wrong, wrong.

One mitigating factor is that many document formats have at least an
option to employ some form of compression. For example, both OOXML
and ODF are zip-archived XML, which means that most text will usually
end up being compressed. If one is concerned about sending HTML over
the wire, then one can use HTTP compression. Obviously these are
general-purpose compression algorithms, not text-specific ones, but
they still should be able to help. Actually, in most XML and HTML
documents, a large proportion of the characters are ASCII markup
anyway, so the overall expansion is not going to be 2x or 3x in the
first place. Furthermore, in many cases the size of the text in any
form is less significant than the size of other data such as images.
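
As a rough illustration of the points above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the word list, the paragraph sizes, the `<p class="body">` markup, and the use of the character count as a stand-in for a hypothetical one-byte-per-letter legacy encoding. It uses zlib, which implements the same DEFLATE algorithm used by zip archives and by HTTP gzip/deflate compression. Shuffled words are not real prose, so the compressed sizes are only indicative.

```python
import random
import zlib

# Hypothetical sample: a few common Georgian words shuffled into
# pseudo-paragraphs. Every Georgian letter takes 3 bytes in UTF-8.
WORDS = ["საქართველო", "თბილისი", "ქართული", "ენა", "წიგნი",
         "წყალი", "პური", "ღვინო", "მთა", "ზღვა", "მზე", "სახლი"]


def make_paragraphs(count: int, words_per_paragraph: int, seed: int = 0) -> list[str]:
    """Build `count` space-separated pseudo-paragraphs from WORDS."""
    rng = random.Random(seed)
    return [" ".join(rng.choice(WORDS) for _ in range(words_per_paragraph))
            for _ in range(count)]


def measure(s: str) -> dict[str, int]:
    """Return character count, UTF-8 byte count, and DEFLATE-compressed size."""
    raw = s.encode("utf-8")
    return {"chars": len(s), "utf8": len(raw),
            "deflate": len(zlib.compress(raw, 9))}


paragraphs = make_paragraphs(50, 40)

# The same text as bare lines and wrapped in light ASCII markup.
plain = measure("\n".join(paragraphs))
html = measure("<html><body>\n"
               + "\n".join(f'<p class="body">{p}</p>' for p in paragraphs)
               + "\n</body></html>")

for name, m in (("plain", plain), ("html", html)):
    print(f"{name}: {m['chars']} chars, {m['utf8']} UTF-8 bytes "
          f"({m['utf8'] / m['chars']:.2f}x), {m['deflate']} bytes after DEFLATE")
```

The ratio of UTF-8 bytes to characters stays below 3 even for the bare text, because spaces and line breaks are single bytes, and it drops further once ASCII markup is added. Heavier markup than the single tag per paragraph assumed here would lower it more.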

Douglas Davidson


This archive was generated by hypermail 2.1.5 : Thu Feb 07 2008 - 11:33:50 CST