Re: trying to recursively get the files' owners and permissions as well as an md5sum of the data
- From: Stephane CHAZELAS <stephane_chazelas@xxxxxxxx>
- Date: Tue, 17 Jun 2008 16:59:51 +0000 (UTC)
2008-06-17, 00:21(-04), Albretch Mueller:
> Also, I have read somewhere that coding like this:[...]
> ~
> sh-3.1# md5sum `find . -type f -print0 | xargs -0`
> ~
> is better than doing it like:
> ~
> sh-3.1# find . -type f -print0 | xargs -0 md5sum
> ~
> I actually read what this guy said. (S)He didn't say "faster" or "less
> memory taxing", which are both measurable, but "better" because md5sum is
> loaded into memory only once
> ~
> I don't really know how the OS handles this, so I am asking
That's nonsense.
find ... -print0 | xargs -0 cmd
Tells find to output each file name followed by the NUL
character. The NUL character is the one character that cannot
occur in a file path on Unix. xargs -0 tells xargs to split its
input on the NUL character and then pass each element resulting
from the splitting to the command, so that cmd gets one argument
per file found by find, which is fine. The only improvement one
might suggest is to also use the -r option to xargs (also GNU
specific) so that it doesn't run cmd if its input is empty (if
find didn't find any file).
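As a rough sketch of that recommended form, assuming GNU find,
xargs and md5sum are available (the scratch directory and the file
names below are made up for the illustration):

```shell
# Work in a scratch directory so the demo touches no real data
# (directory and file names are hypothetical).
dir=$(mktemp -d)
printf 'hello\n' > "$dir/plain"
printf 'world\n' > "$dir/name with spaces"

# One argument per file, NUL-delimited; -r (GNU) skips md5sum on empty input.
sums=$(cd "$dir" && find . -type f -print0 | xargs -0 -r md5sum)
printf '%s\n' "$sums"

# With no file found, -r means md5sum is never started at all.
mkdir "$dir/empty"
none=$(cd "$dir/empty" && find . -type f -print0 | xargs -0 -r md5sum)

rm -rf "$dir"
```

The file name containing spaces reaches md5sum as a single
argument, and nothing is printed for the empty directory.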
sh-3.1# md5sum `find . -type f -print0 | xargs -0`
couldn't be more wrong.
Here, as the cmd is not provided, xargs calls the "echo" command
instead. So the files found by find will be passed as arguments
to echo. echo is a command that outputs its arguments separated
by the space character. It also performs some transformations on
those arguments, for instance it transforms the "\n" string into
a newline character.
Then that output of echo (there can be several instances of echo
called) is gathered by the shell (because of `...`) and stored
in memory. When xargs has finished, then the *shell* will split
all that output. The splitting in `...` is done by default on
spaces, tabs and newline characters. Then, for every word
resulting from that splitting, the shell performs globbing, that
is for every word that contains wildcard characters such as *,
?, [...], the shell will try to expand that to the matching
files relative to the current directory.
And then, it will pass that big list as arguments to the md5sum
command (and contrary to xargs, it will not work around the
limitation on the number of arguments).
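The two shell steps described above (splitting, then globbing) can
be observed in isolation with something like the following. It is
only a sketch, assuming a POSIX sh with the default IFS; the
scratch directory and the file names are hypothetical:

```shell
old=$PWD
dir=$(mktemp -d)
cd "$dir"
: > 'two words'
: > 'star'

# Unquoted `...` output is split on spaces, tabs and newlines.
set -- `printf '%s\n' 'two words'`
split_count=$#       # "two" and "words" arrive as separate words

# Each resulting word is then subject to globbing.
set -- `printf '%s\n' 's*'`
glob_result=$1       # the pattern is replaced by the matching file name

echo "$split_count $glob_result"

cd "$old"
rm -rf "$dir"
```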
As an example, if you do:
touch 'some
file with *a* newline character in it, \n and plenty of spaces'
find . -type f -print0 will output:
some<NL>file with *a* newline character in it, \n and plenty of spaces<NUL>
xargs -0
reading that will pass it as a single argument to echo:
some<NL>file with *a* newline character in it, \n and plenty of spaces
echo will output:
some<NL>file with *a* newline character in it, <NL> and plenty of spaces<NL>
`...` will split that into those elements:
1 some
2 file
3 with
4 *a*
5 newline
6 character
7 in
8 it,
9 and
10 plenty
11 of
12 spaces
The 4th one contains wildcards, so is subject to globbing. *a*
means any file name that contains "a". And the file happens to
match, so the list becomes:
1 some
2 file
3 with
4 some<NL>file with *a* newline character in it, \n and plenty of spaces
5 newline
6 character
7 in
8 it,
9 and
10 plenty
11 of
12 spaces
And those will be passed as arguments to md5sum.
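The example can be reproduced with a sketch along these lines,
which counts the arguments each form would hand to md5sum instead
of actually hashing anything. It assumes GNU find and xargs; the
count function is a hypothetical stand-in for md5sum. The exact
number for the broken form depends on whether the system's echo
expands "\n", so no figure is given for it here:

```shell
old=$PWD
dir=$(mktemp -d)
cd "$dir"

# The file name from the example: a real newline, a literal \n, a glob pattern.
touch 'some
file with *a* newline character in it, \n and plenty of spaces'

# Hypothetical stand-in for md5sum: report how many arguments were received.
count() { echo "$#"; }

# Broken form: echo's output is split and globbed by the shell.
bad=$(count `find . -type f -print0 | xargs -0`)

# Correct form: xargs passes the file name through as one argument.
good=$(find . -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

echo "broken form: $bad arguments, xargs form: $good argument"

cd "$old"
rm -rf "$dir"
```

The xargs form reports a single argument for the single file,
while the backtick form reports many more.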
--
Stéphane