Re: comp.unix.shell FAQ - Answers to Frequently Asked Questions

From: Stephane CHAZELAS (this.address_at_is.invalid)
Date: 07/30/04


Date: 30 Jul 2004 10:38:32 GMT

2004-07-29, 21:40(+00), Joe Halpin:
> The FAQ has been posted here, and is also available at
> http://home.comcast.net/~j.p.h/
[...]

Some comments.

Why not split it so that each part is less than 60kB?

[...]
> Version: $Id: cus-faq.html,v 1.7 2004/05/31 14:24:00 jhalpin Exp $

The text version is 1.6

Note that everywhere "zsh 4.1" is mentioned in that FAQ, it should be
replaced with "zsh 4.1 and above" (the latest version is 4.2.x).

> 2. How can I generate random numbers in shell scripts?
[...]
> c. Use a utility such as awk(1), which has random number generation
> included. This approach is the most portable between shells and
> operating systems.
>
> awk 'BEGIN {srand();print rand()}'
>
> However, note that if you call this line more than once within
> the same second, you'll get the same number you did the previous
> time.

That won't work with old awks, though, as they lack srand() and rand()
(same as in 6.h).

[...]
> 4. How can I remove whitespace characters within file names?
>
> File names in unix can contain all kinds of whitespace characters,
> not just spaces. The following examples only work with spaces,
> adjust accordingly.
>
> a. Use the substitution capabilities of awk, sed, et al.
[...]
> b. Use the substitution capabilities of the shell if it has
> them. Check the man page for your shell (probably under a
> section named something like "Parameter expansion") to see. For
> example:
>
> f=${filename// /_}
>
> With zsh:
>
> autoload -U zmv
> zmv '* *' '$f:gs/ /_/'

It should be noted that the zmv solution actually renames the files (it
calls mv internally and addresses several problems that may arise),
while the other solutions only update a variable (and from there,
renaming the files reliably may involve a quite complicated script).
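
To give an idea of what such a script involves, here is a minimal
sketch, assuming a POSIX sh. The function name rename_spaces is made up
for the illustration, it only handles space characters in the current
directory, and it is not what zmv does internally. It doesn't address
every problem zmv handles either (there is still a race between the
existence check and the mv, for instance).

```shell
# Hypothetical helper: rename every file in the current directory whose
# name contains a space, replacing each space with an underscore.
rename_spaces() {
  for f in ./*" "*; do
    # Skip the unexpanded pattern when nothing matches, but keep
    # dangling symlinks.
    [ -e "$f" ] || [ -L "$f" ] || continue
    new=$f
    # POSIX-only substitution: replace the first remaining space on each
    # pass until none is left.
    while case $new in (*" "*) true;; (*) false;; esac; do
      new=${new%%" "*}_${new#*" "}
    done
    # Never clobber an existing target.
    if [ -e "$new" ] || [ -L "$new" ]; then
      echo "skipping $f: $new already exists" >&2
      continue
    fi
    mv "$f" "$new"
  done
}
```

The ./ prefix keeps names starting with "-" from being taken as options
by mv.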

[...]
> 6. How do I do date arithmetic?
[...]
> g. Arbitrary date arithmetic
[...]
> Another possibility is given in the examples section, from
>
> http://groups.google.com/groups?selm=n6d6zalnpk.fsf%40ogion.it.jyu.fi

See also:

http://groups.google.com/groups?selm=slrnbvnhu2.3v.stephane.chazelas@spam.is.invalid

[...]
> 20. How do I reverse a file?
[...]
> Also, nl can be used as
>
> nl -ba -d'
> '
>
> i.e. NL as the delimiter.

> You may also be able to use nl with the -p option, which is POSIX,
> and widely available, going back to AT&T SysV/386 R3.2 from 1988
> or thereabouts.

That latter part is irrelevant to the question. -p is of no use here as
it doesn't prevent the page splitting.

> 21. how do I remove the last n lines?
[...]
> AWK solutions:
>

1
> awk 'NR<=(count-12)' count="`awk 'END{print NR}' file`" file

2
> awk 'NR>n{print a[NR%n]} {a[NR%n]=$0}' n=12 file
>
> awk 'BEGIN{n=12} NR>n{print a[NR%n]} {a[NR%n]=$0}' file

It should be noted that 2 is in every case much more efficient than
the other ones (it runs only one awk and reads the file only once).

1 (and the non-awk solutions) might only be preferred if <n> is
very big (worth several MB), as solution 2 holds <n> lines in
memory.

[...]
> sed -n -e :a -e '1,12{N;ba' -e '}' -e 'P;N;D' file
>
> The last solution is basically same algorithm as the rolling
> array awk solutions, and shares with them the advantage that
> the file is only read once - they will even work in a pipe.

But many sed implementations have a low limit on what they can
hold in the pattern space (in memory), so that one can't be used
for big values of <n>.

[...]
> Using GNU dd:
>
> ls -l file.txt | {
> IFS=" "
> read z z z z sz z
> last=`tail -10 file.txt | wc -c`
> dd bs=1 seek=`expr $sz - $last` if=/dev/null of=file.txt
> }
>
> This is different than other solutions in that, rather than
> creating a new, shorter file it overwrites the trailing lines
> in the original file with nulls.

No, it doesn't fill with 0. It truncate(2)s the file. It may
prove to be the most efficient solution for very big files (and
relatively small values of <n>), as it doesn't read the whole
file (just the trailing part) and just does an ftruncate system
call.

You could do the same with perl with a bit of programming.

[...]
> 25. How do I rename *.foo to *.bar?
[...]

The mmv, ren, rename and zmv solutions should be put first as they are
the simplest and most reliable ones.

For the other ones, a "use it at your own risk", "make a
backup of your data first" or "prepend echo to the moving
command first" warning would be welcome.

See also the "-n" option to zmv.

[...]
> for file in *.foo

for file in ./*.foo

(or use -- in every command below).

> do
> newfile=`basename "$file" .foo`.bar
> [ -f "$file" ] || continue
> [ -f "$newfile" ] && continue ## or deal with it another way

What if "$newfile" is a directory?

ls -d "$newfile" > /dev/null 2>&1 && continue

> mv "$file" `basename "$file" .foo`.bar

mv "$file" "$newfile"
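
Putting those corrections together gives something like the following.
It's a sketch, not a drop-in replacement: the function name foo_to_bar
is made up, and I've swapped the basename call for ${file%.foo} (a
POSIX parameter expansion), which keeps the ./ prefix on the target as
well.

```shell
# Hypothetical wrapper around the corrected loop: rename *.foo to *.bar
# in the current directory.
foo_to_bar() {
  for file in ./*.foo; do
    [ -f "$file" ] || continue
    # Assumption: ${file%.foo} stands in for the basename call, so the
    # "./" prefix is kept and the target can never look like an option.
    newfile=${file%.foo}.bar
    # Skip if the target exists in any form (file, directory, symlink...).
    ls -d "$newfile" > /dev/null 2>&1 && continue
    mv "$file" "$newfile"
  done
}
```

While testing, change mv to echo mv to see what would be done.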

> 26. How do I use shell variables in awk scripts
>
> Depending on the version of awk being used, either use the -v
> command line option,
>
> $ awk -v var=xxx '{print $0,var}'
>
> or add the variable after the command, as in
>
> $ awk '{print $0,var}' var=xxx file
>
> Note that using the latter syntax var will not be available in the
> BEGIN section.

Also beware that with both syntaxes, escape sequences such as \n and
\t in the value are expanded.

A third (and more reliable) solution is to use the special awk array
ARGV.
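
A minimal sketch of the ARGV approach, assuming a POSIX awk. The helper
name append_verbatim is made up for the example. The value is passed as
an ordinary argument, picked up in BEGIN (so it is available there too)
and then blanked so that awk doesn't try to treat it as a file name or
an assignment.

```shell
# Hypothetical helper: append the string given as $1, verbatim, to each
# line read from standard input.
append_verbatim() {
  awk '
    BEGIN {
      var = ARGV[1]   # taken as-is: no \n or \t processing
      ARGV[1] = ""    # empty operands are skipped, so awk reads stdin
    }
    { print $0, var }
  ' "$1"
}

printf 'x\n' | append_verbatim 'a\tb'   # prints: x a\tb
```

With -v var='a\tb' instead, the \t would come out as a TAB character.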

> 28. How do I get one character input from the user?
>
> In bash this can be done with the "-n" option to read.
> In ksh93 it's read -N
> In zsh it's read -k
>
> More portably:
>
> OLDSTTY=$(stty -g) # save our terminal settings
> stty cbreak # enable independent processing of each input character
> ONECHAR=$(dd bs=1 count=1 2>/dev/null) # read one byte from standard in
> stty $OLDSTTY # restore the terminal settings

stty "$OLDSTTY"

[...]
> 33. How do I split a pathname into the directory and file?
>
> The most portable way of doing this is to use the external
> commands dirname(1) and basename(1), as in
>
> pathname='/path/to/some/file'
> dir=`dirname $pathname`
> file=`basename $pathname`

dir=`dirname "$pathname"`
file=`basename "$pathname"`

> However, since this executes an external command, it's slower than
> using shell builtins (if your shell has them)

and fails if the {dir,base}name ends in NL characters.

> For ksh, bash, and
> POSIX shells the following will do the same thing more
> efficiently:

and zsh.

>
> pathname=/path/to/some/file
> file=${pathname##*/}
>
> To get the directory using the shell builtin, you should first
> ensure that the path has a '/' in it.
>
> case $pathname in
> */*) dir=${pathname%/*};;
> *) dir=''
> esac

In zsh (and csh, tcsh), you have

${pathname:h} (head) ${pathname:t} (tail).
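
About the trailing NL problem with dirname and basename mentioned
above: the usual workaround is to make the output of the command
substitution end in something other than a newline, then strip that
off. A minimal sketch, assuming a POSIX sh; the function name set_file
is made up for the example, and the same trick applies to dirname.

```shell
# Hypothetical helper: store the last component of $1 in the variable
# "file", keeping any trailing newlines it may contain.
set_file() {
  # The trailing "x" shields the newlines from command substitution,
  # which would otherwise strip all of them.
  file=$(basename "$1"; echo x)
  # Drop the "x" and the single newline that basename itself appended.
  file=${file%??}
}
```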

[...]
> 34. How do I make an alias take an argument?
>
> In Bourne-derived shells aliases cannot take arguments, so if you
> need to be able to do that, define a shell function rather than
> an alias.

In no shell, actually (in csh it's just a trick that makes it look like
it takes an argument). An alias is just... an alias.
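
For the record, the function version looks like this (greet is a
made-up name). Unlike with an alias, the argument can go anywhere in
the body.

```shell
# Hypothetical example: a function standing in for an "alias with an
# argument"; "$1" is the first argument passed to it.
greet() {
  printf 'Hello, %s!\n' "$1"
}

greet world   # prints: Hello, world!
```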

-- 
Stephane

