Re: Any portable way to get a filename in UTF-8 or to get the FS encoding?



On Oct 8, 4:15 am, Timothy Madden <terminato...@xxxxxxxxx> wrote:
Fredrik Roubert wrote:
On Sun, 07 Oct 2007 22:22:12 +0300, Timothy Madden wrote:
[...]
A process that wants to interpret the bytes that make up a file name
must look at its environment for hints about which encoding the user
wants those file names to be interpreted in (e.g. the LC_* environment
variables). You can use the mbstowcs() library function to automatically
convert a string into a wide character string according to the encoding
specified by the current environment.

How about files from a remote file system? Then I am out of luck!

I usually connect through a VPN, at work, to my client's LAN. They use
Latin-1, I use Latin-2.

How can I tell that programmatically and portably? My app has to work
with files from both machines.

I would like a standard way to get that encoding, and the file system
should be the first to know about it.

I guess I will just have to rely on the user passing the encoding for
files whose names I process on the command line, or else assume the LC_*
default.
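
A rough sketch of that fallback, in Python rather than C: decode the raw
file-name bytes with whatever encoding the user passed, and otherwise use
the locale's preferred encoding (the LC_* default). The helper name
decode_name is made up for illustration, and nothing here detects the
encoding of a remote file system; it only applies a hint that somebody
supplied.

```python
import locale


def decode_name(raw, encoding=None):
    """Decode file-name bytes with an explicit encoding, else the locale's.

    decode_name is a hypothetical helper, not part of any library.
    """
    if encoding is None:
        # No hint from the user: fall back to the LC_* default.
        encoding = locale.getpreferredencoding(False)
    # errors="replace" substitutes U+FFFD for undecodable bytes
    # instead of raising UnicodeDecodeError.
    return raw.decode(encoding, errors="replace")


# The same byte is a different character in Latin-1 and in Latin-2.
raw = b"\xb3"
print(ascii(decode_name(raw, "iso-8859-1")))
print(ascii(decode_name(raw, "iso-8859-2")))
```

The two calls print different code points for the same byte, which is why
the name alone cannot tell you which machine's convention it follows.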

You could adopt a convention where the encoding is contained in the
filename itself. There's a scheme like this for email subject lines.
For example, I have a piece of spam in my inbox with a subject of
=?ISO-2022-JP?B?GyRCMnEwd0ApNVUxZyU1JSQbKEI=?= which I presume a smart
enough mail client would display as Japanese text. (Mine doesn't, but
I don't care because it's spam and I can't read Japanese anyway.)
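
For what it's worth, here is a small sketch of how such a subject can be
unpacked, using Python's standard email.header module. The string has the
form =?charset?encoding?data?=, so the charset travels with the text. The
function name decode_encoded_word is my own, and applying this to file
names is only the convention suggested above, not an existing standard.

```python
from email.header import decode_header


def decode_encoded_word(value):
    """Turn a header value with =?charset?B?...?= parts into text.

    decode_encoded_word is a hypothetical helper name.
    """
    parts = []
    for data, charset in decode_header(value):
        if isinstance(data, bytes):
            # The charset label comes from the encoded word itself.
            parts.append(data.decode(charset or "ascii", errors="replace"))
        else:
            # Plain, unencoded text is returned as str already.
            parts.append(data)
    return "".join(parts)


subject = "=?ISO-2022-JP?B?GyRCMnEwd0ApNVUxZyU1JSQbKEI=?="
text = decode_encoded_word(subject)
print(ascii(text))  # escaped form, so it prints on any terminal
```

One catch if you reuse this for file names: the base64 alphabet includes
"/", which cannot appear inside a file name on POSIX systems, so a real
scheme would need a different alphabet or a different transfer encoding.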
