RE: Help in a HURRY !!!!!!!!!!!!!!!!!!!!!!!

From: Yves Arrouye (yves@realnames.com)
Date: Tue May 15 2001 - 00:39:08 EDT


To go with Lukas's Perl code, here is a C version using ICU (not really
tested either), to give him a choice. There is no error checking etc.; it is
just to give the idea. If you want UTF-16 you'll need to use the macros in
unicode/utf16.h to generate surrogate pairs properly.
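
For reference, the arithmetic behind a surrogate pair can be sketched in
plain C as below. This is only an illustration of the computation, not the
ICU macros themselves; to_surrogates is a hypothetical helper name, and it
assumes the input is a supplementary code point (U+10000..U+10FFFF).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of ICU: split a supplementary code point
   (U+10000..U+10FFFF) into a UTF-16 lead/trail surrogate pair. */
void to_surrogates(uint32_t c, uint16_t *lead, uint16_t *trail) {
    assert(c >= 0x10000 && c <= 0x10FFFF);
    c -= 0x10000;                               /* 20 bits remain */
    *lead  = (uint16_t)(0xD800 + (c >> 10));    /* high 10 bits */
    *trail = (uint16_t)(0xDC00 + (c & 0x3FF));  /* low 10 bits */
}
```

For example, U+1F600 comes out as the pair 0xD83D, 0xDE00. Code points below
U+10000 are a single UTF-16 unit and need no such split.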

#include <stdio.h>
#include <string.h> /* strlen */
#include <unicode/utf8.h>

#define LINE_BUF_SIZE 80 /* Whatever. */

int main(void) {
    char buf[LINE_BUF_SIZE];

    while (fgets(buf, sizeof(buf), stdin)) {
        int32_t i;
        int32_t len = (int32_t) strlen(buf);

        if (len > 0 && buf[len - 1] == '\n') {
            buf[--len] = 0; /* We don't want that one in the output. */
        }

        for (i = 0; i < len;) {
            int32_t c;

            UTF8_NEXT_CHAR_UNSAFE(buf, i, c);
            /* As in Lukas's code, use entities only above ASCII. */
            if (c < 0x80) {
                putchar(c);
            } else {
                printf("&#%ld;", (long) c);
            }
        }
        putchar('\n'); /* Separate lines; will produce white space in HTML. */
    }
    return 0;
}
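
If ICU is not at hand, the same idea can be sketched with a hand-rolled
decoder in place of the ICU macro. This is a swapped-in technique, not what
the code above does: next_utf8_unsafe and escape_non_ascii are hypothetical
helper names, and like the UNSAFE macro the decoder assumes well-formed
UTF-8 and does no real validation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: decode one UTF-8 sequence starting at s[*i],
   advance *i past it, and return the code point. Assumes well-formed
   input; it only stops early if it hits the terminating NUL. */
uint32_t next_utf8_unsafe(const char *s, size_t *i) {
    unsigned char b = (unsigned char)s[(*i)++];
    uint32_t c;
    int trail;

    if (b < 0x80)      { return b; }                 /* ASCII */
    else if (b < 0xE0) { c = b & 0x1F; trail = 1; }  /* 2-byte lead */
    else if (b < 0xF0) { c = b & 0x0F; trail = 2; }  /* 3-byte lead */
    else               { c = b & 0x07; trail = 3; }  /* 4-byte lead */

    while (trail-- > 0 && s[*i] != '\0') {
        c = (c << 6) | ((unsigned char)s[(*i)++] & 0x3F);
    }
    return c;
}

/* Hypothetical helper: copy `in` to `out`, replacing each non-ASCII
   character with a decimal numeric character reference. Stops early
   rather than overflow `out`. Returns the number of bytes written,
   not counting the terminating NUL. */
size_t escape_non_ascii(const char *in, char *out, size_t outsz) {
    size_t len, i = 0, n = 0;

    assert(in != NULL && out != NULL);
    if (outsz == 0) {
        return 0;
    }
    len = strlen(in);
    while (i < len) {
        uint32_t c = next_utf8_unsafe(in, &i);
        char tmp[16];
        int w = c < 0x80
            ? snprintf(tmp, sizeof tmp, "%c", (int)c)
            : snprintf(tmp, sizeof tmp, "&#%lu;", (unsigned long)c);

        if (n + (size_t)w >= outsz) {
            break; /* no room left for this piece plus the NUL */
        }
        memcpy(out + n, tmp, (size_t)w);
        n += (size_t)w;
    }
    out[n] = '\0';
    return n;
}
```

Called on the UTF-8 bytes for "café" (0x63 0x61 0x66 0xC3 0xA9), this writes
"caf&#233;" to the output buffer.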

Hope this helps,
YA


