Re: Getting A Newb Started

From: Jeroen Ruigrok van der Werven (asmodai@in-nomine.org)
Date: Tue Jul 08 2008 - 00:51:27 CDT


    -On [20080707 21:52], William J Poser (wjposer@ldc.upenn.edu) wrote:
    >There seem to be religious views on this question, but my own practice is
    >to use UTF-32 internally in almost all cases. Yes, it takes more memory
    >than UTF-8, but the modest additional memory usage doesn't really matter
    >much. On the other hand, dealing with UTF-32 is much easier and less error
    >prone than dealing with UTF-8. Every four bytes is a character. You can do
    >simple array arithmetic, simple calculations of how much memory you need
    >to allocate, etc.
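
    The fixed-width arithmetic described in the quote can be sketched in a
    few lines of Python. This is only an illustration: the sample string and
    the helper code_point_at are made up for this example, and "character"
    here means one code point, which is not always one user-perceived
    character.

```python
# Sample text with ASCII, Latin-1 and CJK code points (illustrative only).
text = "na\u00efve \u65e5\u672c\u8a9e"

utf32 = text.encode("utf-32-le")  # fixed width: 4 bytes per code point, no BOM
utf8 = text.encode("utf-8")       # variable width: 1 to 4 bytes per code point


def code_point_at(buf, i):
    """Hypothetical helper: fetch the i-th code point from a UTF-32-LE
    buffer using plain offset arithmetic."""
    return int.from_bytes(buf[4 * i : 4 * i + 4], "little")


n = len(text)              # number of code points
needed_bytes = 4 * n       # allocation size is a simple multiplication
seventh = chr(code_point_at(utf32, 6))
```

    With UTF-8 the same lookup requires scanning from the start of the
    buffer, because the byte offset of code point i depends on the widths of
    everything before it.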

    We recently tested this with Trac, comparing Python builds that use 2-byte
    and 4-byte internal string storage. The additional memory consumption with
    4-byte storage was less than 5% for this web application. And since it is
    an issue tracker with an integrated wiki, it uses a lot of strings.
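
    For a rough feel of the per-string cost, here is a small sketch that
    compares encoded sizes. The sample string and the function name
    encoded_sizes are assumptions for illustration; this is not how the Trac
    measurement was done, and it only looks at string payloads, not at total
    process memory.

```python
def encoded_sizes(text):
    """Return the number of bytes needed to store text in each encoding
    (little-endian variants are used so that no BOM is counted)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Hypothetical ticket summary, pure ASCII.
sample = "Ticket #42: wiki page renders incorrectly"
sizes = encoded_sizes(sample)

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

    For ASCII text the 4-byte form is twice the size of the 2-byte form, yet
    the application as a whole grew by less than 5%, which suggests that
    string payloads were only a small share of its total memory.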

    -- 
    Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
    イェルーン ラウフロック ヴァン デル ウェルヴェン
    http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
    Man is the Dream of the dolphin...
    

