Re: Find out which character set is used
- From: Måns Rullgård <mru@xxxxxxxxxxxxx>
- Date: Sun, 29 Jan 2006 22:18:20 +0000
thomas.mertes@xxxxxx writes:
> When a program is running inside a terminal emulator like
> xterm or Konsole, the emulator uses some character set like
> ISO Latin-1, ISO Latin-9 or UTF-8.
>
> As far as I know the use of UTF-8 can be recognized by
> examining the environment variables LC_ALL, LC_CTYPE
> and LANG for the string UTF-8.
>
> When UTF-8 is not used, something like Latin-1, Latin-9 or
> some other character set is used.
>
> My question is now:
> How can I find out which character set is used in a terminal?
man locale ("locale charmap" prints the character set name of the
current locale)
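A rough sketch of doing the same lookup from a program, in Python for
brevity. It assumes a POSIX system and that the terminal emulator really
uses the character set the locale advertises, which nothing guarantees.
The function names are made up for illustration. The first function
parses the environment by hand using the usual precedence (LC_ALL, then
LC_CTYPE, then LANG); the second asks the C library via
nl_langinfo(CODESET), which is generally the more reliable route.

```python
import locale
import os


def locale_charset_from_env(env):
    """Return the codeset suffix of the effective LC_CTYPE setting, or None."""
    # Precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    # Empty values count as unset.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            # Locale names look like language_TERRITORY.codeset@modifier.
            if "." in value:
                return value.split(".", 1)[1].split("@", 1)[0]
            # No explicit codeset: the locale database decides.
            return None
    return None


def terminal_charset():
    """Ask the C library which codeset the user's locale implies."""
    try:
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        pass  # unknown locale name: stay in the default "C" locale
    return locale.nl_langinfo(locale.CODESET)


if __name__ == "__main__":
    print("from environment:", locale_charset_from_env(os.environ))
    print("from C library:  ", terminal_charset())
```

The hand-parsed value can be None even when the C library reports a
definite codeset, because a locale name such as "de_DE" carries an
implicit default.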
> For X11 programs there is a similar problem:
> Which character set is used to encode file names in a
> directory?
The encoding is whatever the file was created with. There is no way
to tell. The best you can do is hope that it matches LC_CTYPE. If
you are ambitious, check for common patterns of the most likely
encodings. Of course, on short strings like filenames this is likely
to give many false matches.
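To illustrate the pattern check, here is a minimal sketch in Python. It
only distinguishes ASCII, valid UTF-8, and "something else". The
function name and the fallback parameter are hypothetical; in practice
the fallback would be whatever LC_CTYPE implies. It is a heuristic, not
a detector.

```python
def guess_name_encoding(raw, fallback="iso-8859-1"):
    """Guess how the bytes of one file name are encoded (heuristic only)."""
    # Strictest checks first: pure ASCII is valid in all the likely encodings.
    for encoding in ("ascii", "utf-8"):
        try:
            raw.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    # Any byte string decodes as Latin-1, so this is a guess, not a detection.
    return fallback


if __name__ == "__main__":
    import os

    # Passing bytes to os.listdir returns the raw, undecoded names.
    for raw in os.listdir(b"."):
        print(guess_name_encoding(raw), raw)
```

The false matches are easy to reproduce: the two bytes C3 A9 are "é" in
UTF-8 but also a perfectly legal Latin-1 name ("Ã©"), and the sketch
will report utf-8 either way.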
The easy way out is to require UTF-8, and let the user blame himself if
he chooses to use something else.
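If you go that route, a program can at least report the offending names
instead of displaying garbage. A small sketch, again in Python and with
a made-up function name, that takes raw byte names and returns those
that are not valid UTF-8:

```python
def non_utf8_names(raw_names):
    """Return the names (as bytes) that do not decode as UTF-8."""
    bad = []
    for raw in raw_names:
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(raw)
    return bad


if __name__ == "__main__":
    import os

    # Bytes in, bytes out: os.listdir does no decoding here.
    for raw in non_utf8_names(os.listdir(b".")):
        print("not UTF-8:", raw)
```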
--
Måns Rullgård
mru@xxxxxxxxxxxxx