The most recent proposal for embedded language identifiers would seem to   
fit the need.  For texts that change languages often (such as lexicons
and dictionaries), the embedded identifiers may bloat the text considerably,
depending on the identifier used for each language.  In that case the
proposal would be a good solution for portability but not necessarily for
storage.  Aside from that,
the proposal has tremendous merit.  The codepoints chosen for the 16 bits
might be better placed at the end of the private use area, if they are to
be there at all.  Our display engine already uses that area for characters
not included in Unicode 1.1.  Your effort is appreciated.  I
am looking forward to the final solution.

Chris McGuire
Development: Logos Research Systems, Inc.