Re: Command piping question
- From: Dan Foster <usenet@xxxxxxxxxxx>
- Date: Wed, 08 Feb 2006 16:37:34 -0600
In article <1139436456.277381.176770@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>, BD <bobby_dread@xxxxxxxxxxx> wrote:
> Basic ksh question here... there's a principle in command redirection
> which would make my life easier if I understood.
> Say I want to delete all files in a directory that have properties that
> I can see through an 'ls -al'.
> For example, If I wanted to show all files with a timestamp of 12:##, I
> could go
> ls -al |grep 12:
> Can I pipe the output somehow to delete files based on the same
> criteria, as in
> ls -ls |grep 12: >rm
> I know that's not correct, but can I accomplish this somehow with
> redirection or piping?
Yes, you can. One way -- but not the best way:
$ ls -al | grep '12:' | awk '{print $NF}' | xargs rm
The pipeline whittles the listing down to the matching lines, then
extracts the filename from each one, then tells rm to nuke the whole
pile of matching files in one swoop.
A safer method would be:
$ ls -al | awk '$8 ~ /^12:/ {print $NF}' | xargs rm
What is the output of the ls -al? Lines like:
-rw-r--r-- 1 root system 90 Jan 15 16:22 userfs.list.08
In awk, the default field separator (FS) is a single space, which awk
treats specially: fields are split on any run of spaces or tabs.
So you can see that there are 9 fields here.
The eighth field has the time.
The awk program is between the two ' ' quote marks.
$8 ~ /^12:/
means:
"If the 8th field matches something that starts with 12:"
and...
{print $NF}
means:
"...then print out the final field (the filename)"
$NF will always be the final field. Doesn't matter how many fields a
line may have... $NF is *ALWAYS* the last field on the line.
Last field here has the filename (as long as the name contains no
spaces; a name with spaces is split across several fields).
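
To check the field numbering yourself, here is a small sketch that feeds
the sample line from above to awk instead of a live ls. The line is
hard-coded and the variable names are just for illustration; your own ls
may lay out its columns differently.

```shell
# Sample ls -al line from above, hard-coded so this runs anywhere.
line='-rw-r--r-- 1 root system 90 Jan 15 16:22 userfs.list.08'

# NF is the number of fields, $8 is the time, $NF is the last field.
fields=$(printf '%s\n' "$line" | awk '{print NF, $8, $NF}')
echo "$fields"
```

This prints "9 16:22 userfs.list.08": nine fields, the time in field 8,
and the filename in the last field.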
The ls -l | awk method is better than ls -l | grep because awk checks
ONLY the time field... grep checks against the entire line, so it's not
as safe.
What if you had a filename like 12:05.txt (a legal filename) created at 9am?
grep would include it.
awk would not.
So awk is safer and more bulletproof in this situation.
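
A quick way to see the difference without touching real files is to run
both filters over a canned listing. This is only a sketch: the two
listing lines and the variable names are made up for the example, and
paste is used just to put the results on one line.

```shell
# Two made-up ls -al lines: the first file was modified at 09:00 but
# has "12:" in its name; the second really was modified at 12:30.
listing='-rw-r--r-- 1 root system 10 Feb 08 09:00 12:05.txt
-rw-r--r-- 1 root system 20 Feb 08 12:30 notes.txt'

# grep matches "12:" anywhere on the line, so both files are selected.
grep_hits=$(printf '%s\n' "$listing" | grep '12:' |
  awk '{print $NF}' | paste -s -d ' ' -)

# awk tests only the 8th field (the time), so only notes.txt is selected.
awk_hits=$(printf '%s\n' "$listing" |
  awk '$8 ~ /^12:/ {print $NF}' | paste -s -d ' ' -)

echo "grep: $grep_hits"
echo "awk:  $awk_hits"
```

The grep version lists both 12:05.txt and notes.txt; the awk version
lists only notes.txt.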
Why use xargs? Because rm takes the filenames to nuke as command-line
arguments and never reads them from the standard input (stdin) -- in
other words, it can't use the piped output.
Example:
$ echo foo.txt bar.txt | rm
will not work for this reason.
But:
$ echo foo.txt bar.txt | xargs rm
will work -- xargs collects the output from the pipe, then runs "rm
foo.txt bar.txt" with the collected filenames as its arguments.
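
If you want to try this safely, something like the following sketch does
it in a throwaway directory. It assumes your system has mktemp -d (most
do, but it is not guaranteed everywhere), and keep.txt is just an extra
file added to show that only the piped names get removed.

```shell
# Work in a throwaway directory so nothing real gets deleted.
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
touch foo.txt bar.txt keep.txt

# xargs reads the two names from the pipe and runs: rm foo.txt bar.txt
echo foo.txt bar.txt | xargs rm

# Record what survived, then clean up the scratch directory.
remaining=$(ls)
echo "left behind: $remaining"
cd / && rm -r "$scratch"
```

Only keep.txt is left behind, since it was never named on the pipe.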
xargs is also good because it can break down really long lists of
arguments (filenames in this case) into chunks that will not exceed the
maximum length of arguments for a single command.
What if you matched 2,000 filenames, but only about 500 of them fit on
a single rm command line? If you handed rm all 2,000 files at once, the
command would typically fail with an "argument list too long" style of
error before rm removed anything.
xargs avoids that problem by figuring out that it can stuff about X
arguments into one command... let's say, 500 arguments for rm. If it has
2,100 arguments... then it calls rm five times (500, 500, 500, 500, 100).
It runs rm fewer times instead of running rm once for every single file,
as 'find' with -exec rm {} \; does.
That 'find' approach would run rm 2,100 times for 2,100 files... xargs
might run rm 5 times for 2,100 files. Guess which is much faster to run?
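
The chunking can be watched directly with a harmless stand-in for rm.
In this sketch, echo takes the place of rm, and the -n 500 option forces
the 500-argument chunk size from the example above; left alone, xargs
picks its own limit from the system's maximum argument length. It
assumes an xargs (GNU and BSD both qualify) whose default size limit is
roomy enough that -n 500 is the only cap that matters.

```shell
# Generate 2,100 dummy arguments, one per line (the numbers 1..2100).
# echo stands in for rm, so every output line is one command that
# xargs ran; counting the lines counts the commands.
runs=$(awk 'BEGIN { for (i = 1; i <= 2100; i++) print i }' |
  xargs -n 500 echo | wc -l | tr -d ' ')
echo "commands run: $runs"
```

On such a system this reports 5 commands for the 2,100 arguments.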
It's not a big deal with a small directory, but if you use xargs, it'll
automatically be faster the day someone uses your code on a huge
directory without having to change a single thing.
So if you plan ahead and use ls|awk|xargs rm, your code will be more
likely to run correctly in odd situations, and will work fine on small,
medium sized, or huge directories.
This is also a good way to illustrate piping concepts.
-Dan