Re: Grep and mv
From: Chris F.A. Johnson (cfajohnson_at_gmail.com)
Date: 07/15/05
- Next message: John L: "Re: Grep and mv"
- Previous message: WCB: "Re: Grep and mv"
- In reply to: WCB: "Re: Grep and mv"
- Next in thread: Robert Bonomi: "Re: Grep and mv"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Fri, 15 Jul 2005 16:35:20 -0400
On 2005-07-15, WCB wrote:
> Chris F.A. Johnson wrote:
[snip]
>>>> If it doesn't work, tell us EXACTLY what happens. Use set -x and
>>>> redirect stderr to a file.
>>>
>>> Cut and pasted. changed mod and owner.
>>> Loaded fresh test files in directory
>>>
>>> I ran this script.
>>> still it has trailing spaces.
>>
>> What is "it"?
>
> File names.
>
>>
>>> In the terminal, they show up as ?
>>
>> In what context do they (whatever "they" are) "show up"?
>
>
> If I type ls in the terminal, I see a ? at end of each file name.
What shows up? A space or a question mark?
> This is how a terminal shows a blank at end of a file name.
No, it isn't. It's something else, probably a carriage return (^M).
Please post the output of:
grep HCO xx01 | hexdump -C
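As a minimal sketch of what such a dump would reveal (the sample string is made up for illustration, and od is used as a portable stand-in for hexdump -C), a line ending in a carriage return has 0d as its last byte, where a space would be 20:

```sh
# Hypothetical sample: a title line that ends with a DOS carriage return.
line=$(printf 'HCO-BULLETIN-OF-4-APRIL-AD15\r')

# Take the final byte and print it in hex.
# Command substitution strips trailing newlines only, so the \r survives.
last_byte=$(printf '%s' "$line" | tail -c 1 | od -An -tx1 | tr -d ' \n')

printf 'last byte: 0x%s\n' "$last_byte"   # prints "last byte: 0x0d"
```

A terminal cannot display that byte inside a file name, which is why ls substitutes a ? for it.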
[snip]
> Each and every file name has a jolly little space appended
> at the end this way.
Every file name? Or just the ones you have created with the script?
>> and are seen boxes in properties.
Boxes are almost certainly not spaces.
[snip]
>>> listing in the terminal looks like
>>>
>>> HCO-BULLETIN-OF-10-MARCH-1965?
What is the output of:
ls HCO-BULLETIN-OF-10-MARCH-1965* | hexdump -C
>>> HCO-BULLETIN-OF-29-MARCH-1965?
>>> HCO-BULLETIN-OF-2-APRIL-AD15?
>>> HCO-BULLETIN-OF-4-APRIL-AD15?
>>> HCO-BULLETIN-OF-5-APRIL-1965?
>>> HCO-BULLETIN-OF-5-MARCH-1965?
>>> HCO-BULLETIN-OF-7-APRIL-AD15?
>>>
>>> Each file is a file, so its renaming a real file.
>>
>> What files did you have to start with?
>
> xx03, xx04, xx05 and so on.
>
> These files have been checked and run through dos2unix
> just in case.
> No control characters, not DOS not Mac, not Word,
> simple ascii. At the end of the name to be extracted,
> no spaces. cat -v shows a ^M at end of the string extracted.
> "^M", not " ^M". So our mystery space is not coming from
> there.
>
> cat -v xx03 show no other characters beyond ascii than ^M.
In other words, a DOS/Windows file, or a Mac file if there are no
linefeeds. Unix text files do not contain ^M.
> All other test files are likewise clean.
That's not clean.
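One way to check files for carriage returns without relying on how an editor displays them is to count the lines that contain one. This is only a sketch: the helper name count_cr_lines and the two throwaway files are made up for the demonstration.

```sh
CR=$(printf '\r')

# Print the number of lines in a file that contain a carriage return.
# "|| true" keeps the exit status clean when grep finds nothing.
count_cr_lines() {
    grep -c "$CR" "$1" || true
}

# Demonstration on two throwaway files (hypothetical names).
tmpdir=$(mktemp -d)
printf 'HCO BULLETIN\r\nbody\r\n' > "$tmpdir/dos.txt"
printf 'HCO BULLETIN\nbody\n' > "$tmpdir/unix.txt"

dos_count=$(count_cr_lines "$tmpdir/dos.txt")
unix_count=$(count_cr_lines "$tmpdir/unix.txt")

printf 'dos.txt:  %s line(s) with CR\n' "$dos_count"
printf 'unix.txt: %s line(s) with CR\n' "$unix_count"
rm -rf "$tmpdir"
```

Any count above zero means the file still has DOS line endings, whatever the editor shows.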
> My original file was extracted from a CD. It was one
> big text file I broke down into sections and then into
> individual files using csplit which had no
> problems with doing so. The Nano editor displays them
> with no artifacts or problems, nor does Kate nor Kwrite
> have problems. Cat -v shows no problems.
>
> So I am very definitely sure its not a buggered file
> nor word processor control characters that are at any
> way an issue.
>
> Now all I have to do is find how to make 4000 xxNN
> files have somewhat meaningful names.
>
> The files I am using here are clean and readable.
> My silly little grep script extracts the names as
> expected. No spaces or artifacts there showing
> up, so its again, not a problem with the files.
>
>
>> What files did you end up with?
>
>
> HCO-BULLETIN-OF-10-MARCH-1965?
> HCO-BULLETIN-OF-29-MARCH-1965?
> HCO-BULLETIN-OF-2-APRIL-AD15?
> HCO-BULLETIN-OF-4-APRIL-AD15?
> HCO-BULLETIN-OF-5-APRIL-1965?
> HCO-BULLETIN-OF-5-MARCH-1965?
> HCO-BULLETIN-OF-7-APRIL-AD15?
>
> They are files, I can cat them and read them.
>
>> Is that what you wanted? If not, what is different from what you
>> wanted? Please post the script you used, directly from the file.
>
> This script almost does the trick except for the spaces.
> Since cat -V shows the string extracted ends as an example,
>
> HCO-BULLETIN-OF-4-APRIL-AD15^M
EXACTLY! That's a DOS/Windows line ending. IT IS NOT A SPACE.
> The space is not coming from within the file with this string.
There is no space.
> So there should not be a trailing space to cut.
There isn't.
> So something seems to be ADDING a space.
There is no space.
> Since line 8 in your script cuts trailing spaces
> it seems a logical deduction that the space is added
> after line 8, somewhere.
It would be, if there were a space. There isn't; it's a carriage
return.
> I have no idea where that artifact
> is coming from.
From your DOS/Win file.
[snip]
> The other thing that weirds me out big time is, your
> mv "$i" "$f" seems to work.
> When I extract $f and $i, my echo "$f" >> F
> tests show repeatedly that I am extracting real
> data that mv should then use.
>
> Neither mv $i $f nor mv "$i" "$f" work, I just get
> different error messages.
>
> My script does one file and then gives a bunch of error
> messages.
>
>
> Just before that step, if I do echo $i >> I
> and echo $f >> F I show that I am getting
> the xxNN files and extracted name files OK
> to that point.
>
> It does not matter if I use mv "$i" "$f" or mv $i $f
>
> ****************
> #1/bin/bash
>
> # mover4
>
> for i in xx*
> do
>
> grep -m 1 HCO* $i > x
Not again! !@#%$#@. How many times do you need to be told?
QUOTE THE WILDCARD (*).
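A small sketch of why the quoting matters (the directory and file name here are made up). While no file matches, the shell passes HCO* through unchanged; once a file whose name starts with HCO exists, the unquoted pattern is replaced by that file name before grep ever sees it. That may be why the script handles one file and then fails, since grep would then find nothing and leave the destination name empty, though that is an inference from the listing rather than something verified.

```sh
orig_dir=$PWD
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# No file matches yet, so the pattern is passed through as typed.
set -- HCO*
before=$1

# After one rename there is a file whose name starts with HCO ...
: > HCO-BULLETIN-OF-5-MARCH-1965
# ... and the unquoted pattern now expands to that file name.
set -- HCO*
after=$1

# Quoted, the pattern always reaches the command unchanged.
set -- "HCO*"
quoted=$1

printf 'before: %s\nafter:  %s\nquoted: %s\n' "$before" "$after" "$quoted"

cd "$orig_dir" || exit 1
rm -rf "$tmpdir"
```

Note also that grep treats HCO* as a regular expression meaning "HC followed by any number of O", so a plain HCO is what is wanted here anyway.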
> sed 's/^ *//' x > y
> sed 's/ /-/g' y > z
>
> # cat x >> X
> # cat y >> Y
> # cat z >> Z
> # OK to here so far
>
> f=$(cat z)
> # echo $i >I
> # echo $f >F
> # mv $i $f
> mv "$i" "$f"
> done
>
> *****************************
>
> files to start
>
> xx03 xx10 xx100 xx101 xx102 xx103 xx104
>
> If echo "$f" >> I
> if echo "$i" >> F
>
> I
>
> xx104
> xx03
> xx10
> xx100
> xx101
> xx102
> xx103
> xx104
>
> F
>
> HCO-BULLETIN-OF-27-DECEMBER-1967
> HCO-BULLETIN-OF-5-MARCH-1965
> HCO-BULLETIN-OF-18-APRIL-AD15
> HCO-BULLETIN-OF-11-OCTOBER-1967
> HCO-BULLETIN-OF-9-NOVEMBER-1967
> HCO-POLICY-LETTER-OF-22-NOVEMBER-1967
> HCO-BULLETIN-OF-28-NOVEMBER-1967
> HCO-BULLETIN-OF-27-DECEMBER-1967
>
> OK, this works!
>
> ************************************
>
> Now mv "$i" "$f"
> Exactly as used in your script.
>
> results?
>
> ls ..
>
> HCO-BULLETIN-OF-5-MARCH-1965?* xx10* xx101* xx103* y
> x xx100 xx102* xx104*
> z
>
> It does not work like yours. And adds a space
> as a final insult.
> Which space is NOT at end of echo "$f" >> F names
No, but there's probably a ^M. Post the output of:
hexdump -C F
> This is bash 2.05 B patch level 0. As supplied by
> Mandrake 10.1.
>
> The error messages I get when running this script are:
> mv: cannot move `xx10' to 1`': No such file or directory
NO, IT'S NOT!!! PLEASE DO NOT RETYPE ERROR MESSAGES!!!
> This for all remaining xxNN files.
Then read the error message. Try to understand what it is telling
you.
> Why your mv works, and not mine is a mystery to me.
> But this is where the mystery space is coming from also,
> obviously since the one file that did work with my script
> has one also.
>
> It sure look like a mv bug to me.
>
> Next step, googling for info on possible broken mv.
Don't waste your time; mv is NOT broken.
Try this:
CR=$'\r'
for i in xx*
do
  f=$(grep -m 1 HCO "$i")
  if [ -z "$f" ]
  then
    printf "File: %s, no HCO found; skipping\n" "$i"
    continue
  fi
  f=${f%$CR}
  while :
  do
    case $f in
      \ *) f=${f# } ;;       ## remove leading space
      *\ ) f=${f% } ;;       ## remove trailing space
      *\ *) f=${f// /-} ;;   ## convert spaces to hyphens
      *--*) f=${f//--/-} ;;  ## convert multiple hyphens to a single hyphen
      *) break ;;            ## nothing more needs doing; exit the loop
    esac
  done
  mv "$i" "$f"
done
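To try the name clean-up without touching any files, the same idea can be wrapped in a function and fed a sample string. This is a sketch, not the script above: the function name clean_name and the sample line are made up, and it uses sed instead of bash's ${f//...} expansions so that it also runs in a plain POSIX shell.

```sh
CR=$(printf '\r')

# Turn a title line into a file name: drop one trailing carriage return,
# trim leading and trailing spaces, and collapse each run of spaces
# and/or hyphens into a single hyphen.
clean_name() {
    name=${1%"$CR"}
    printf '%s\n' "$name" |
        sed -e 's/^ *//' -e 's/ *$//' -e 's/[ -][ -]*/-/g'
}

# Hypothetical sample with leading spaces, a double space and a trailing CR.
sample=$(printf '  HCO BULLETIN OF  4 APRIL AD15\r')
result=$(clean_name "$sample")

printf '%s\n' "$result"   # prints HCO-BULLETIN-OF-4-APRIL-AD15
```

Once the printed names look right for a few real files, the function's output can be used as the destination for mv.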
--
Chris F.A. Johnson <http://cfaj.freeshell.org>
==================================================================
Shell Scripting Recipes: A Problem-Solution Approach, 2005, Apress
<http://www.torfree.net/~chris/books/cfaj/ssr.html>