Re: Characters

From: William_J_G Overington (wjgo_10009@btinternet.com)
Date: Mon Feb 14 2011 - 02:39:09 CST


    Thank you to Alex Plantema <alex.plantema@xs4all.nl> for the links.
     
    http://www.unicode.org/mail-arch/unicode-ml/y2011-m02/0092.html
     
    The analogy with the mobiles is very helpful. I feel that I learned a lot from that.
     
    I am interested in art so I searched about the mobiles as well.
     
    I found the following on the web.
     
    http://www.nga.gov/exhibitions/caldwel.shtm
     
    http://www.nga.gov/collection/calderinfo.shtm
     
    From there, a link in the left margin labelled Reinstallation Views leads to a sequence of images that show the huge size of the mobile.
     
    http://en.wikipedia.org/wiki/Alexander_Calder
     
    Since the analogy with the mobiles was so helpful, I started to wonder what analogy I could use to think about adaptive Huffman coding.
     
    For that, I thought in terms of the arch in the holodeck in Star Trek: The Next Generation, in the episode "Elementary, Dear Data". The idea is that the arch can be used to change the parameters governing a simulation from within the simulation.
     
    Going back to the mathematical toy compression encodings earlier in this thread, I wondered how one could have adaptive Huffman coding in a mathematical toy compression encoding.
     
    The first encoding used 0 for U+0020 SPACE and 1 plus 16 bits for everything else.
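     
    As an illustration only, the first encoding could be sketched in Python as follows. The function names encode_v1 and decode_v1 are hypothetical, the bit stream is represented as a string of '0' and '1' characters for readability, and the sketch assumes every character fits in sixteen bits.

```python
def encode_v1(text: str) -> str:
    """Toy encoding 1: '0' for U+0020 SPACE, '1' plus 16 bits otherwise."""
    bits = []
    for ch in text:
        code = ord(ch)
        if code > 0xFFFF:
            # Assumption of this sketch: only 16-bit code points are handled.
            raise ValueError("character does not fit in sixteen bits")
        if code == 0x0020:
            bits.append("0")                         # SPACE costs one bit
        else:
            bits.append("1" + format(code, "016b"))  # flag bit, then 16 bits
    return "".join(bits)


def decode_v1(bits: str) -> str:
    """Inverse of encode_v1 for a well-formed bit string."""
    out = []
    i = 0
    while i < len(bits):
        if bits[i] == "0":
            out.append(" ")
            i += 1
        else:
            out.append(chr(int(bits[i + 1:i + 17], 2)))
            i += 17
    return "".join(out)
```

    With this sketch, a space costs one bit and any other character costs seventeen bits.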
     
    The second encoding used 00 for U+0020 SPACE and 01 for another character and 1 plus 16 bits for everything else.
     
    http://www.unicode.org/mail-arch/unicode-ml/y2011-m02/0089.html
     
    Let us suppose now that the way to encode the file is to start with sixteen bits representing the space character and then sixteen bits for whichever character is to be encoded as 01 in the compressed format. After that, the 00, 01 and "1 then 16 bits" encoding is used throughout the file.
     
    Suppose that the decoding software stores the value U+0020 in a character variable named z00 and the character that is encoded as 01 in a character variable named z01. For example, z01 might contain the value U+0065.
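     
    A minimal sketch of this header-based form of the second encoding, in Python, might look like the following. The names encode_v2 and decode_v2 are hypothetical, the bit stream is again a string of '0' and '1' characters, and all characters are assumed to fit in sixteen bits. The variables z00 and z01 play the role of the two character variables described above.

```python
def encode_v2(text: str, second: str = "e") -> str:
    """Header of two 16-bit values, then '00', '01' or '1' plus 16 bits."""
    z00, z01 = " ", second
    for ch in text + second:
        if ord(ch) > 0xFFFF:
            # Assumption of this sketch: only 16-bit code points are handled.
            raise ValueError("character does not fit in sixteen bits")
    bits = [format(ord(z00), "016b"), format(ord(z01), "016b")]  # header
    for ch in text:
        if ch == z00:
            bits.append("00")
        elif ch == z01:
            bits.append("01")
        else:
            bits.append("1" + format(ord(ch), "016b"))
    return "".join(bits)


def decode_v2(bits: str) -> str:
    """Read z00 and z01 from the header, then decode the rest."""
    z00 = chr(int(bits[0:16], 2))   # expected to be U+0020
    z01 = chr(int(bits[16:32], 2))  # for example U+0065
    out = []
    i = 32
    while i < len(bits):
        if bits[i] == "1":
            out.append(chr(int(bits[i + 1:i + 17], 2)))
            i += 17
        else:
            out.append(z00 if bits[i + 1] == "0" else z01)
            i += 2
    return "".join(out)
```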
     
    Suppose that a third encoding is the same as the second encoding but has an additional feature. If, while decoding, the character U+FDEF, which is a noncharacter, is produced, it is not passed through to the output file of uncompressed Unicode characters. Instead it acts as a Load Immediate command: the next sixteen bits in the input stream are loaded into the z01 register. This changes the character that 01 decodes to, and the new value stays in effect until another U+FDEF changes it again.
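     
    A decoder for this third encoding could be sketched in Python as follows. This is only a sketch under assumptions: the names decode_v3, b16 and LOAD_IMMEDIATE are hypothetical, the bit stream is a string of '0' and '1' characters, the stream starts with the two sixteen-bit header values described earlier, and U+FDEF is treated as a command however it is produced.

```python
LOAD_IMMEDIATE = "\uFDEF"  # noncharacter used as the "arch button"


def b16(ch: str) -> str:
    """Helper for building test streams: one character as 16 bits."""
    return format(ord(ch), "016b")


def decode_v3(bits: str) -> str:
    """Toy encoding 3: as encoding 2, but U+FDEF reloads the z01 register."""
    z00 = chr(int(bits[0:16], 2))
    z01 = chr(int(bits[16:32], 2))
    out = []
    i = 32
    while i < len(bits):
        if bits[i] == "1":
            ch = chr(int(bits[i + 1:i + 17], 2))
            i += 17
        elif bits[i + 1] == "0":
            ch = z00
            i += 2
        else:
            ch = z01
            i += 2
        if ch == LOAD_IMMEDIATE:
            # Not written to the output: load the next 16 bits into z01.
            z01 = chr(int(bits[i:i + 16], 2))
            i += 16
        else:
            out.append(ch)
    return "".join(out)
```

    For example, a stream consisting of the header for SPACE and U+0065, then 01, then 1 followed by the sixteen bits of U+FDEF and the sixteen bits of U+0061, then 01, 00, 01, would decode under this sketch so that the first 01 yields "e" and the later ones yield "a".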
     
    http://www.unicode.org/charts/PDF/UFB50.pdf
     
    As there are thirty-two noncharacters in that collection, the method could be expanded to produce a more advanced type of compression format with a more complex decoding tree that also uses more "arch buttons" so as to include the capability to restructure the decoding tree.
     
    William Overington
     
    14 February 2011
     


