Re: Any portable way get a filename in UTF-8 or to get the FS encoding ?



On Sun, 07 Oct 2007 22:22:12 +0300, Timothy Madden wrote:

> I have seen _wreaddir function in some implementations, but is there a
> portable way to get a file's name in UTF-8 or to get a file name in the
> underlying encoding of its file system and to get the encoding ?
>
> Are POSIX implementations required to convert the file name returned by
> readdir to the application's execution character set ?

POSIX does not specify an encoding for file names on any given file
system. To the system a file name is just a sequence of bytes, so
readdir() is not required to convert anything, and a user is free to
create file names using several different encodings even on the same
file system. (I actually have such a file system myself, where most file
names are encoded in UTF-8 but the file names in one directory are
encoded in ISO-8859-1.)

A process that wants to interpret the bytes that make up a file name
must look at its environment for hints about which encoding the user
wants those file names to be interpreted in (e.g. the LC_* environment
variables). Once the process has called setlocale(LC_ALL, "") to adopt
the locale from its environment, the mbstowcs() library function
converts a string into a wide character string according to that
locale's encoding.
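
As a rough sketch of the above, and not a definitive implementation, the
following C fragment decodes the names returned by readdir() with
mbstowcs(). The helper names decode_filename() and list_directory() and
the 1024-element buffer are assumptions made for illustration only. A
name that is not valid in the locale's encoding makes mbstowcs() return
(size_t)-1, which is the case you hit when a directory mixes encodings.

```c
#include <dirent.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: decode the bytes of a file name using the encoding
 * of the current locale. Returns the number of wide characters stored
 * (excluding the terminator), or (size_t)-1 if the bytes are not valid
 * in that encoding or the buffer has no room at all. */
size_t decode_filename(const char *name, wchar_t *out, size_t cap)
{
    if (cap == 0)
        return (size_t)-1;

    /* Convert at most cap - 1 wide characters so a terminator always fits. */
    size_t n = mbstowcs(out, name, cap - 1);
    if (n == (size_t)-1)
        return n;

    out[n] = L'\0';
    return n;
}

/* Hypothetical helper: print every name in a directory that can be decoded
 * in the current locale. Returns the number of names that could not be
 * decoded, or -1 if the directory could not be opened. */
int list_directory(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int undecodable = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        wchar_t wide[1024]; /* assumed large enough for this sketch */
        size_t n = decode_filename(entry->d_name, wide,
                                   sizeof wide / sizeof wide[0]);
        if (n == (size_t)-1)
            undecodable++;          /* bytes do not match the locale */
        else
            printf("%ls\n", wide);
    }

    closedir(dir);
    return undecodable;
}
```

A caller would first run setlocale(LC_ALL, "") so that the LC_*
variables take effect, then call list_directory("."). Without the
setlocale() call the program stays in the "C" locale and the conversion
does not follow the user's environment.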

Cheers // Fredrik Roubert

--
Dyre Halses gate 10 | +47 73568556 / +47 41266295
NO-7042 Trondheim | http://www.df.lth.se/~roubert/