From: William_J_G Overington (firstname.lastname@example.org)
Date: Thu Jun 04 2009 - 04:13:49 CDT
On Wednesday 3 June 2009, Kenneth Whistler <email@example.com> wrote:
> William Overington suggested:
> > The suggestion of using b64-encoded binary data could be adapted
> > by placing a Unicode U+FFFC OBJECT REPLACEMENT CHARACTER in front
> > of the b64-encoded binary data. That way, the parameter passing
> > would always be in Unicode characters and the presence of a U+FFFC
> > character would indicate that subsequent characters in the
> > parameter should be interpreted as b64-encoded binary data.
> It may perhaps be belaboring the obvious, but U+FFFC OBJECT
> REPLACEMENT CHARACTER is not defined that way, and would not
> indicate that (or anything else) about subsequent characters
> in a string parameter.
Ken is correct.
> Any attempt to use U+FFFC in that way would be very unlikely to
> be interpreted as such by any Unicode-conformant system,
> in fact is nothing more than an arbitrary attempt to
> a text convention which would consist of a higher-level
Well, not quite arbitrary. The problem is to develop a demonstration of a new idea of passing objects using a text parameter. Ruszlán Gaszanov asked "What's wrong with passing b64-encoded binary data?" I replied that "Passing b64-encoded binary data could be ambiguous as to whether it was text or b64-encoded binary data." and suggested a way in which either text or b64-encoded binary data could be passed as a parameter.
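The ambiguity can be illustrated with a minimal Python sketch. The function name could_be_base64 is hypothetical and is used only for this illustration; the point is simply that some ordinary words are also well-formed b64 strings, so a receiver cannot tell from the characters alone which interpretation was intended.

```python
import base64
import binascii


def could_be_base64(s: str) -> bool:
    """Return True if s is a well-formed b64 string (hypothetical helper)."""
    try:
        # validate=True rejects characters outside the b64 alphabet
        base64.b64decode(s, validate=True)
        return True
    except (binascii.Error, ValueError):
        # binascii.Error: bad alphabet or padding; ValueError: non-ASCII input
        return False


# "Mail" is an ordinary English word, yet it is also valid b64.
print(could_be_base64("Mail"))
# A phrase containing a comma and a space is not valid b64.
print(could_be_base64("Hello, world"))
```

A parameter such as "Mail" therefore needs some out-of-band or in-band signal before a receiver can know whether to decode it.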
The Unicode Standard includes the following document.
The document has the following on page 26.
U+FFFC. The U+FFFC object replacement character is used as an insertion point for objects located within a stream of text. All other information about the object is kept outside the character data stream. Internally it is a dummy character that acts as an anchor point for the object’s formatting information. In addition to assuring correct placement of an object in a data stream, the object replacement character allows the use of general stream-based algorithms for any textual aspects of embedded objects.
So, my suggestion needs to be altered so that the parameter passing mechanism, upon detecting a U+FFFC character, places all of the characters that follow the U+FFFC character into a separate storage place. The passed parameter is then true Unicode that may, but need not, contain a U+FFFC character.
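The altered suggestion might be sketched in Python as follows. This is only an illustration under stated assumptions: the name split_parameter is hypothetical, the sketch assumes at most one object per parameter, and it assumes that everything after the first U+FFFC is b64-encoded binary data. It is a private higher-level convention, not behaviour defined by the Unicode Standard.

```python
import base64
from typing import Optional, Tuple

OBJECT_REPLACEMENT = "\uFFFC"  # U+FFFC OBJECT REPLACEMENT CHARACTER


def split_parameter(param: str) -> Tuple[str, Optional[bytes]]:
    """Split a parameter into its text part and a separately stored object.

    Hypothetical convention: characters after the first U+FFFC are treated
    as b64-encoded binary data and kept outside the text.
    """
    head, sep, tail = param.partition(OBJECT_REPLACEMENT)
    if not sep:
        # No U+FFFC present: the whole parameter is ordinary text.
        return param, None
    # Decode the trailing characters into the separate storage place.
    payload = base64.b64decode(tail, validate=True)
    # The text keeps U+FFFC as the anchor point for the object.
    return head + OBJECT_REPLACEMENT, payload


encoded = base64.b64encode(b"\x89PNG").decode("ascii")
text, obj = split_parameter("logo: " + OBJECT_REPLACEMENT + encoded)
print(repr(text), obj)
```

With this arrangement the text that remains is plain Unicode, with the U+FFFC character acting only as an insertion point, and the object data is held outside the character data stream.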
> One could equally well (and probably with equal outcome) suggest
> that a U+25E7 SQUARE WITH LEFT HALF BLACK character would indicate
> that subsequent characters in a parameter should be interpreted
> as b64-encoded binary data.
Well, no, because the suggestion of using U+FFFC does have a clue for humans as to what might be meant.
> Or for that matter, that
> characters in a string should be interpreted as a chocolate
> cookie recipe.
Well, U+003C LESS-THAN SIGN gets used for many purposes in some documents.
4 June 2009