Re: Question about foreign/compound characters.



Thanks.

1) Would you supply the lower-case tables as well?

2) Is there a program around that does this, or one that can
easily be cannibalized to do it? (It seems a bit beyond DCL.)

(Maybe Larry has a TECO routine that does it in one command.)



From: JF Mezei <jfmezei.spamnot@xxxxxxxxxxxxx>
Date: 03/21/2007 06:01 PM
To: Info-VAX@xxxxxxxxxxxx
Subject: Re: Question about foreign/compound characters.

norm.raphael@xxxxxxxxx wrote:
Is there a way to get this and others (like <n><~>) back to their
vanilla English spelling equivalents so the processing is not
messed up?

You need to translate it yourself.

Assuming ISO-8859-1:

The first table uppercases characters, retaining their accents.

The second table preserves case, but converts each character to the
nearest non-accented character (there are a few that don't map properly).


/* Indexed by byte value (0-255); index with (unsigned char)c. */
unsigned char char88591_upcase_simple[256] = {
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f,
' ', '!', '"', '#', '$', '%', '&', '\'',
'(', ')', '*', '+', ',', '-', '.', '/',
'0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', ':', ';', '<', '=', '>', '?',
'@', 'A', 'B', 'C', 'D', 'E', 'F', 'G',
'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O',
'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
'X', 'Y', 'Z', '[', '\\', ']', '^', '_',
'`', 'A', 'B', 'C', 'D', 'E', 'F', 'G',
'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O',
'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
'X', 'Y', 'Z', '{', '|', '}', '~', 0x7f,
0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f,
0xa0, '¡', '¢', '£', '¤', '¥', '¦', '§',
'¨', '©', 'ª', '«', '¬', '­', '®', '¯',
'°', '±', '²', '³', '´', 'µ', '¶', '·',
'¸', '¹', 'º', '»', '¼', '½', '¾', '¿',
'À', 'Á', 'Â', 'Ã', 'Ä', 'Å', 'Æ', 'Ç',
'È', 'É', 'Ê', 'Ë', 'Ì', 'Í', 'Î', 'Ï',
'Ð', 'Ñ', 'Ò', 'Ó', 'Ô', 'Õ', 'Ö', '×',
'Ø', 'Ù', 'Ú', 'Û', 'Ü', 'Ý', 'Þ', 'ß',
'À', 'Á', 'Â', 'Ã', 'Ä', 'Å', 'Æ', 'Ç',
'È', 'É', 'Ê', 'Ë', 'Ì', 'Í', 'Î', 'Ï',
'Ð', 'Ñ', 'Ò', 'Ó', 'Ô', 'Õ', 'Ö', 0xF7,   /* 0xF7 is the division sign */
'Ø', 'Ù', 'Ú', 'Û', 'Ü', 'Ý', 'Þ', 0xFF }; /* 0xFF has no upper-case form in ISO-8859-1 */
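A lower-case counterpart to the table above does not have to be typed out by hand. The following is only a sketch, relying on the ISO-8859-1 layout in which every lower-case letter sits 0x20 above its upper-case form. The function name build_88591_lowcase is hypothetical and not part of the original post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): fill a 256-entry
   table that lower-cases ISO-8859-1 text while keeping accents. */
void build_88591_lowcase(unsigned char table[256])
{
    int c;

    assert(table != NULL);
    for (c = 0; c < 256; c++) {
        /* Upper-case letters are 0x41-0x5A (A-Z) and 0xC0-0xDE,
           except 0xD7, which is the multiplication sign. */
        int is_upper = (c >= 0x41 && c <= 0x5A) ||
                       (c >= 0xC0 && c <= 0xDE && c != 0xD7);

        /* Each lower-case form sits exactly 0x20 above its upper-case form. */
        table[c] = (unsigned char)(is_upper ? c + 0x20 : c);
    }
}
```

Everything outside those two ranges, including 0xDF (sharp s) and 0xFF, is passed through unchanged.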




/* Indexed by byte value (0-255); index with (unsigned char)c. */
unsigned char char88591_noaccent[256] = {
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f,
' ', '!', '"', '#', '$', '%', '&', '\'',
'(', ')', '*', '+', ',', '-', '.', '/',
'0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', ':', ';', '<', '=', '>', '?',
'@', 'A', 'B', 'C', 'D', 'E', 'F', 'G',
'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O',
'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
'X', 'Y', 'Z', '[', '\\', ']', '^', '_',
'`', 'a', 'b', 'c', 'd', 'e', 'f', 'g',
'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o',
'p', 'q', 'r', 's', 't', 'u', 'v', 'w',
'x', 'y', 'z', '{', '|', '}', '~', 0x7f,
0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f,
0xa0, '¡', '¢', '£', '¤', '¥', '¦', '§',
'¨', '©', 'ª', '«', '¬', '­', '®', '¯',
'°', '±', '²', '³', '´', 'µ', '¶', '·',
'¸', '¹', 'º', '»', '¼', '½', '¾', '¿',
'A', 'A', 'A', 'A', 'A', 'A', 'Æ', 'C',
'E', 'E', 'E', 'E', 'I', 'I', 'I', 'I',
'Ð', 'N', 'O', 'O', 'O', 'O', 'O', '×',
'O', 'U', 'U', 'U', 'U', 'Y', 'Þ', 'ß',
'a', 'a', 'a', 'a', 'a', 'a', 'æ', 'c',
'e', 'e', 'e', 'e', 'i', 'i', 'i', 'i',
'o', 'n', 'o', 'o', 'o', 'o', 'o', '÷',
'o', 'u', 'u', 'u', 'u', 'y', 'þ', 'y' };
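Either table is applied by using each byte of the text as an index into it. The sketch below shows one way to do that in place on a NUL-terminated string. The function name translate_string is hypothetical, and it assumes the table is passed as unsigned char.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: translate a NUL-terminated string in place
   through a 256-entry table such as the two above. */
void translate_string(char *s, const unsigned char table[256])
{
    assert(s != NULL && table != NULL);
    for (; *s != '\0'; s++) {
        /* Cast first: plain char may be signed, and a negative index
           would read outside the table for bytes 0x80-0xFF. */
        *s = (char)table[(unsigned char)*s];
    }
}
```

A call such as translate_string(line, char88591_noaccent) would then strip the accents from line. If the tables are declared as plain char, the second argument needs a cast to const unsigned char *.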



