Re: Script to strip illegal characters from files and directories?

Dave B wrote:
On Saturday 10 May 2008 11:56, Janis Papanagnou wrote:

No need to handle / in filenames; that's an illegal character in Unix
filenames. With ksh93 or bash you may want to try...

find . | while read -r f ; do mv -i "${f}" "${f//[:;*?\"<>|]}" ; done

where the characters are removed (as you seem to like) or try

find . | while read -r f ; do mv -i "${f}" "${f//[:;*?\"<>|]/_}" ; done

to replace the characters with an _ (which I think is better).
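
To make the difference between the two expansions concrete, here is a small sketch on a single made-up name (the sample string is only an illustration, not a file from the thread). It assumes bash or ksh93, since `${var//pattern/replacement}` is not POSIX sh.

```bash
#!/usr/bin/env bash
# Hypothetical sample name containing some of the characters in question.
f='file:2:bar?.txt'

# No replacement part: every matching character is deleted.
removed="${f//[:;*?\"<>|]}"

# With /_ : every matching character becomes an underscore.
replaced="${f//[:;*?\"<>|]/_}"

printf '%s\n' "$removed" "$replaced"
# prints:
#   file2bar.txt
#   file_2_bar_.txt
```

Note that the deleting form can map two different names onto the same result more easily than the replacing form, which is one more reason to prefer the underscore.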

The script will try to do "mv . ." first, which of course will fail.

Yes, but what is the problem; that an error message is displayed?

You should at least check that the new name differs from the old name, and
probably use "--" to indicate the end of the options to mv.

The first point is not necessary; you just prevent the message, again.
The second point is valid if you have filenames starting with a dash.

Furthermore, if a directory with strange characters is encountered first
(and find by default outputs directories first), then renaming the files
inside the directory will fail.

Right. Good point. It will be necessary to use the find option -depth.


If the structure is as follows:

./dir<>foo
./dir<>foo/file1**?
./dir<>foo/file:2:bar

Then "dir<>foo" will be renamed first, and subsequent attempts to
rename './dir<>foo/file1**?' and './dir<>foo/file:2:bar' to something else
will fail, since directory 'dir<>foo' does not exist anymore.
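
Putting the points from this exchange together (use -depth, skip unchanged names, pass "--" to mv), a minimal sketch might look like the following. The function name `sanitize_tree` is made up for illustration. It assumes bash and filenames without embedded newlines. It also rewrites only the last path component, so a parent directory that has not been renamed yet is never touched. Instead of `mv -i`, it skips existing targets, because inside the pipe any prompt from mv would read its answer from find's output.

```bash
#!/usr/bin/env bash
# sanitize_tree is a hypothetical helper name, not something from the thread.
# Replaces each of : ; * ? " < > | with _ in every name below the given root.
sanitize_tree() {
    local root=${1:-.}
    (
        # Work relative to the root so every path from find starts with "./".
        cd -- "$root" || exit 1
        # -depth prints a directory's contents before the directory itself,
        # so files are renamed while their parent path is still valid.
        find . -depth | while IFS= read -r path; do
            dir=${path%/*}      # parent part, left unchanged
            base=${path##*/}    # last component, the only part rewritten
            new="${base//[:;*?\"<>|]/_}"
            # Nothing to do if the name is already clean (covers "." too).
            [ "$new" = "$base" ] && continue
            # Do not overwrite an existing entry; report it and move on.
            if [ -e "$dir/$new" ]; then
                printf 'skipping %s: %s already exists\n' "$path" "$dir/$new" >&2
                continue
            fi
            # "--" ends option parsing; redundant here because paths start
            # with "./", but harmless and safer if the loop is adapted.
            mv -- "$path" "$dir/$new"
        done
    )
}
```

Called as `sanitize_tree /some/dir` on the example structure, this would rename the two files inside 'dir<>foo' first and the directory itself last.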

