Re: Script to strip illegal characters from files and directories?



On Saturday 10 May 2008 11:56, Janis Papanagnou wrote:

> No need to handle / in filenames; that's an illegal character in Unix
> filenames. With ksh93 or bash you may want to try...
>
> find . | while read -r f ; do mv -i "${f}" "${f//[:;*?\"<>|]}" ; done
>
> where the characters are removed (as you seem to like) or try
>
> find . | while read -r f ; do mv -i "${f}" "${f//[:;*?\"<>|]/_}" ; done
>
> to replace the characters by an _ (which I think is better).

The script will try to do "mv . ." first, which of course will fail.
You should at least check that the new name differs from the old name, and
probably use "--" to indicate the end of the options to mv.
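
A minimal sketch of those two fixes, assuming bash and filenames without
embedded newlines. The function name sanitize_flat and the variable bad are
made up for illustration. It only covers the two points above and is meant
for a single directory level:

```bash
#!/usr/bin/env bash
# sanitize_flat is a hypothetical helper name, not something from the thread.
sanitize_flat() {
    local top=${1:-.} bad='[:;*?"<>|]' f new
    # find's output arrives on fd 3, so stdin stays free for the prompt
    # that "mv -i" prints when the target already exists.
    while IFS= read -r f <&3; do
        new=${f//$bad/_}
        # Skip names that would not change (this includes "." itself).
        [ "$f" = "$new" ] && continue
        # "--" ends option parsing, so a name is never taken as an option.
        mv -i -- "$f" "$new"
    done 3< <(find "$top")
}
```

Call it as "sanitize_flat ." from the directory to be cleaned.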

Furthermore, if a directory with strange characters is encountered first
(and find by default outputs directories first), then renaming the files
inside the directory will fail.

If the structure is as follows:

dir<>foo
|
+------file1**?
\------file:2:bar

Then "dir<>foo" will be renamed first, and subsequent attempts to
rename './dir<>foo/file1**?' and './dir<>foo/file:2:bar' to something else
will fail, since directory 'dir<>foo' does not exist anymore.
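
One possible way around this, sketched under the same assumptions (bash, no
newlines in filenames): let find list a directory's contents before the
directory itself with -depth, and rewrite only the last path component. The
names sanitize_tree and bad are hypothetical:

```bash
#!/usr/bin/env bash
# sanitize_tree is a hypothetical helper name, not something from the thread.
sanitize_tree() {
    local top=${1:-.} bad='[:;*?"<>|]' f dir base new
    # -depth makes find print a directory's contents before the directory,
    # so children are renamed while their parent path is still valid.
    while IFS= read -r f <&3; do
        # Leave the starting point itself alone.
        [ "$f" = "$top" ] && continue
        dir=${f%/*}      # parent path, not touched at this step
        base=${f##*/}    # last component, the only part that is rewritten
        new=${base//$bad/_}
        [ "$base" = "$new" ] && continue
        mv -i -- "$f" "$dir/$new"
    done 3< <(find "$top" -depth)
}
```

On the tree above, "sanitize_tree ." should leave dir__foo containing
file1___ and file_2_bar.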

--
D.