c.u.s. FAQ, take two

joe_at_invalid.address
Date: 07/21/03


Date: Mon, 21 Jul 2003 01:09:51 GMT

This is what I've got so far. If there are no objections, I'll see
about getting it accepted at comp.answers. Changes will always be
possible.

I removed some questions because they weren't really asked frequently
here, and nobody submitted an answer to them. I'll be happy to
consider objections to this. I'm an editor, not an author, so if
people here think I've removed something which should be here, let me
know.

As always, contributions or comments are welcome.

Joe

-- 
There are 10 kinds of engineers, those who understand binary and those
who don't.
Archive-name: comp-unix-shell-faq/faq/cus-faq
Version: $Id$
Maintained By: Joe Halpin <j.p.h@comcast.net>
This FAQ contains the answers to some Frequently Asked Questions
often seen in comp.unix.shell. It spells "unix" in lower case letters
to avoid arguments about whether or not Linux, FreeBSD, etc are
unix. That's not the point of this FAQ, and I'm ignoring the issue.
There are two levels of questions about shells.
One is the use of the shell itself as an interface to the operating
system. For example, "how do I run a program in the background, and go
on with other things?". Or "how do I set up environment variables
when I log in?".
The other level is how to write shell scripts. This often involves
having the shell execute unix utilities to perform part of the work
the shell script needs to accomplish, and requires knowledge of these
utilities, which isn't nominally in the scope of shell
programming. However, unless the question involves something other
than standard unix utilities, it should be included in this FAQ.
Standard unix utilities are defined by either POSIX or the Single Unix
Specification. These are now joined and are normally abbreviated as
"POSIX/SUS". This specification can be found at
http://www.opengroup.org/onlinepubs/007904975/toc.htm
The man pages found on that web page define standard behavior for any
given utility (or the shell itself). However, you should also check
the man page on your system for any utility or shell you need to
use. There isn't always a perfect correspondence between the standard
and a particular implementation (in fact, I'm not sure there's any
case in which they perfectly correspond).
Other good web sites that provide information about shells and shell
programming (including OS utilities) include:
http://www.shelldorado.com/
http://www.faqs.org/faqs/by-newsgroup/comp/comp.unix.shell.html
The answers given in this FAQ are provided with the best intentions,
but they may not be accurate for any particular shell/os. They may be
completely wrong. If you don't test the answers, that's a bug in your
procedures. There are no guarantees for the answers given here.
If you do test the answers and find a problem, please send email to
the maintainer, so it can be corrected. However, under no
circumstances will the maintainer be held liable for mistakes in this
FAQ. If the answers work for you, well and good. If not, please tell
me and I'll modify them appropriately so that this will be more
useful. If you don't agree with this, stop reading here.
A number of people have contributed to this FAQ, knowingly or
unknowingly. Some of the answers were taken from previous postings in
the group, and other people contributed questions and answers
directly to the maintainer, which you are welcome to do as well.
(For the group -- Please let me know if you'd like your name and/or
email address listed here when you contribute something)
CONTENTS:
0.  Glossary
1.  How can I send e-mails with attached files?
2.  How can I generate random numbers in shell scripts?
3.  How can I automatically transfer files using FTP with error checking?
4.  How can I remove whitespace characters within file names?
5.  How can I automate a telnet session?
6.  How do I do date arithmetic?
7.  Why did someone tell me to RTFM?
9.  How do I create a lock file?
10. How can (CRLF|LF|LFCR) be translated into one of the other?
11. How can a shell prompt be set up to change the title of xterm?
12. How do I batch an FTP download/upload?
13. How do I get the exit code of cmd1 in cmd1|cmd2?
14. Why do I get "script.sh: not found"?
15. Why doesn't echo do what I want?
16. How do I loop through files with spaces in their name?
17. How do I change my login shell?
18. When should I use a shell instead of perl/python/ruby/tcl...?
19. Why shouldn't I use csh?
20. How do I reverse a file?
21. How do I remove the last 10 lines?
22. How do I get file size, or file modification time?
23. How do I get a process id by process argv[0]?
24. How do I get a script to update my current environment?
25. How do I rename *.foo to *.bar?
26. How do I use shell variables in awk scripts?
27. How do I get input from the user with a timeout?
28. How do I get one character input from the user?
29. Why isn't my .profile read?
30. Why do I get "[5" not found in "[$1 -eq 2]"?
31. How do I exactly display the content of $var (with a \n appended)?
32. How do I exactly display the content of $var (without a \n
    appended)?
Appendix A: Some example scripts
Appendix B: References. These correspond with numbers in square
	    brackets (e.g. [1]) which may appear in the text.
======================================================================
ANSWERS
0. Glossary
   Google
      Google is one of the search engines on the Internet. It took
      over dejanews some years ago, and now is the standard reference
      when directing someone to a past thread on some topic. This is
      a very good place to start when researching a question about
      shell programming (and just about anything else).
        http://groups.google.com/advanced_group_search
   POSIX/SUS ("the standard")
      POSIX (Portable Operating System Interface) and SUS (Single Unix
      Specification) have been joined into one standard. This is what
      people usually mean when they refer to "the standard" in
      discussions about unix. When people in this group refer to the
      POSIX shell, they are talking about the shell prescribed by this
      specification. You can find this standard at 
        http://www.opengroup.org/onlinepubs/007904975/toc.htm
   portable
      The word "portable" means different things to different people,
      in different situations, which is to say, there isn't one
      definition of "portable".
      At one extreme, a portable script is one which will work under
      any shell, on any operating system. At this end of the spectrum,
      there is no such thing as a portable shell script (some
      operating systems don't even have shells). If we confine the
      operating system to unix (which would make sense since this is
      comp.unix.shell), the only truly portable scripts are those
      which make no use of builtin shell facilities or syntax, but
      which only call external utilities. For example
        echo Hello World
      would probably qualify. However, that doesn't do anyone much
      good.
      Given that there are probably few (if any) scripts which have to
      meet such a standard, a more frequent use of the word "portable"
      indicates the degree to which a script will run under different
      shells and/or different operating environments.
      For example, if you're writing an installation script for an
      application, and the platforms on which that application runs
      are defined, then the problem is pretty well bounded. The choice
      of shell is one which is available on all required platforms,
      and the syntax to be used is the smallest subset of all the
      variants of that shell on the target platforms.
      The degree to which your shell script needs to be portable has
      to be determined by you, or the requirements you've been given
      for the script.
   race condition
      This is a situation in which two entities (processes, threads,
      etc) are trying to access a shared resource, or perform the same
      action, and the result depends on the order of execution of the
      two entities.
   shebang
      This is the first line of a shell script, which indicates to the
      operating system which interpreter (shell) it should invoke to
      interpret the script. It has the form
      #!/path/to/shell [ argument ]
      where /path/to/shell might be /bin/sh, /usr/local/bin/bash, etc.
      This line is only interpreted by the operating system, and only
      when the script (test.sh) is executable and is run from the
      command line by typing its name.
      If it's run by typing
      $ sh test.sh
      then sh is run with the argument "test.sh". It then interprets
      test.sh. For sh, the shebang line is simply a comment, and is
      ignored.
   UUOC
   
      This is short for "Useless use of cat". It's used to point out
      that some example script has used cat when it could have used
      redirection instead. It's more efficient to redirect input than
      it is to spawn a process to run cat. For example
        $ cat file | wc -l
      runs two processes, one for cat and one for wc. This is less
      efficient than
        $ wc -l < file
1. How can I send e-mails with attached files?
   a. Use uuencode
   
      This is the simplest way to do this. For example
      $ uuencode surfing.jpeg surfing.jpeg | mail sylvia@home.com
      To send regular text as well
      $ (cat mailtext; uuencode surfing.jpeg surfing.jpeg) |
        mail sylvia@home.com
   b. Use MIME
      $ metasend -b -t john@friends.org -s "Hear our son!" \
    	-m audio/basic -f crying.au
      These examples are taken from
      http://www.shelldorado.com/articles/mailattachments.html which
      goes into much more detail about this.
2. How can I generate random numbers in shell scripts?
   This depends on the shell, and the facilities available from the
   OS. 
   a. Some shells have a variable called RANDOM, which evaluates to a
      different value every time you dereference it. If your shell has
      this variable,
      $ number=$RANDOM
      will produce a random number.
   b. Some systems have a /dev/urandom device, which generates a
      stream of bits. This can be accessed using the dd(1) utility. An
      example of this (from a more extensive discussion of different
      techniques at http://www.shelldorado.com/scripts/cmds/rand)
      n=`dd if=/dev/urandom bs=1 count=4 2>/dev/null | od -t u4 | \
      awk 'NR==1 {print $2}'` 
   c. Use a utility such as awk(1), which has random number generation
      included. This approach is the most portable between shells.
      awk 'BEGIN {srand();print rand()}'
      However, note that if you call this line more than once within
      the same second, you'll get the same number you did the previous
      time. 
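   As a sketch of combining (b) and (c): the function below seeds awk
   from /dev/urandom, so that calls made within the same second should
   still differ, and scales the result to a range. The name
   rand_between is made up for this example, and it assumes that
   /dev/urandom exists on your system.

```shell
# rand_between LOW HIGH -- print a random integer in LOW..HIGH
# (hypothetical helper; assumes /dev/urandom is available)
rand_between() {
  # read 4 bytes and print them as one unsigned decimal number
  seed=`dd if=/dev/urandom bs=4 count=1 2>/dev/null | od -An -t u4 | tr -d ' \n'`
  awk -v lo="$1" -v hi="$2" -v seed="$seed" \
    'BEGIN { srand(seed); print lo + int(rand() * (hi - lo + 1)) }'
}
```

   For example, "rand_between 1 6" prints a number from 1 to 6.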
3. How can I automatically transfer files using FTP with error
   checking?
    First, there are tools to do that: curl, wget, lftp, ncftp. But, they
    are generally not part of the base system (you need to install them).
    
    zsh (version 4 and above) provides a FTP facility, see
    "info -f zsh -n 'zsh/zftp Module'"
    
    #! /usr/bin/zsh
    zftp open host user passwd || exit
    zftp get /remote/file > /local/file; r=$?
    zftp close && exit r
    
    With your system "ftp" command, two ways:
    
    1- using "ftp -n". Without the -n option, ftp expects user interaction
    to enter the password, so you'd need to use "expect". With "-n", you
    provide the user and passwd as any other FTP command.
    
    #! /bin/sh
    ftp -n << EOF
    open ftp.domain.org
    user anonymous ${LOGNAME:-`who am i`}@
    binary
    get /remote/file /local/file
    bye
    EOF
    
    Error checking can't be done reliably this way (if "open" fails, the
    "user" command is still sent, even though it shouldn't be).
    
    2- using ~/.netrc
    
    If you put:
    
    <<
    machine ftp.domain.org
    login mylogin
    password mypasswd
    macdef init
      binary
      get /remote/file /local/file
      bye
    
    
    >>
    
    (with the trailing empty line) in your ~/.netrc (ensure it's not world
    readable) and then run "ftp ftp.domain.org", ftp will find the matching
    "machine" entry in your ~/.netrc and use the parameters provided there
    to make the ftp transaction.
    
    Those work at least on Linux, FreeBSD, Solaris, HPUX
4. How can I remove whitespace characters within file names?
   File names in unix can contain all kinds of whitespace characters,
   not just spaces. The following examples only work with spaces,
   adjust accordingly.
   a. Use the substitution capabilities of awk, sed, et al.
      f=`echo "$filename" | sed 's/ /_/g'` #replace with '_'
      f=`echo "$filename" | awk '{gsub(" ","_");print $0}'`
      f=`echo "$filename" | tr ' ' _`
      or, more efficiently (although not exactly a one-liner)
      f=`tr ' ' _ <<EOF
      $filename
      EOF
      `
      Add characters to the tr command line as needed (see the man
      page for tr to find out the available escape sequences).
   b. Use the substitution capabilities of the shell if it has
      them. Check the man page for your shell (probably under a
      section named something like "Parameter expansion") to see. The
      following works with bash, ksh93 and zsh (at least).
      f=${filename// /_}
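   To apply this to files on disk, the substitution can be wrapped in
   a loop. The following is only a sketch: the name despace_dir is
   made up, it handles spaces only, it assumes names don't end in a
   newline, and it skips a file if the new name is already taken.

```shell
# despace_dir DIR -- rename files in DIR, replacing spaces with "_"
# (hypothetical helper; existing targets are never overwritten)
despace_dir() {
  for f in "$1"/*' '*; do
    [ -e "$f" ] || continue          # the glob matched nothing
    base=`basename "$f"`
    new=`printf '%s\n' "$base" | tr ' ' _`
    [ -e "$1/$new" ] || mv -- "$f" "$1/$new"
  done
}
```

   For example, "despace_dir ." would rename "my file.txt" in the
   current directory to "my_file.txt".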
5. How can I automate a telnet session?
   This is outside the realm of shell programming, per se. You need
   a more special-purpose scripting language such as expect. See
   http://expect.nist.gov/ 
   Perl scripts can also do this with the Telnet module from CPAN.
6. How do I do date arithmetic?
   This often depends on exactly what you have in mind.
   a. Finding yesterday's date
      If you want to determine whether or not one file is older than
      another, you can (with bash, pdksh, ksh93) do
        $ [[ file1 -ot file2 ]] && echo file1 is older
      or you can use find to search a directory tree for files that
      are newer/older than some file:
        $ find . -name '*.c' -cnewer test.c
      The GNU version of date also has some nice features in this
      respect. For example
        To find yesterday's date
	  $ date --date yesterday
        To find tomorrow's date
	  $ date --date tomorrow
         See the man page for GNU date for other options. It can also
         provide dates more than one day in the past/future.
       However, daylight saving time can make this inaccurate if you
       do it at the right (wrong?) time of the year.
       If you don't have GNU date, playing with the TZ environment
       variable is also an option for this kind of thing. If you know
       the local timezone, you can adjust the TZ variable to make the
       computer think it's the time you want to discover. Assuming the
       timezone is CST6DST, to find yesterday's date:
         $ YESTERDAY=`TZ=CST24DST date +%b\ %d\ %Y`
       The offset is limited to 24:59:59, so it depends on the
       implementation whether it will work for times outside that
       period.
   b. Finding elapsed time
      If you want to find elapsed time, perhaps because you want to
      know when some operation has timed out, some shells (bash and
      ksh, at least) have a SECONDS variable which tells how many
      seconds have elapsed since the invocation of the shell, or since
      the last time it was set.
   c. Determining leap year
      A leap year is defined as a year which is evenly divisible by 4,
      unless it's evenly divisible by 100, unless it's also evenly
      divisible by 400. It gets worse than that, but this is as far as
      I go :-)
      One possibility for a ksh function to do this is
	isleap()
	{
	  y=$1
	  four=$(( $y % 4 ))
	  hundred=$(( $y % 100 ))
	  fourhundred=$(( $y % 400 ))
	  # leap if divisible by 400, or by 4 but not by 100
	  if [ $fourhundred -eq 0 ] ||
	     { [ $four -eq 0 ] && [ $hundred -ne 0 ]; }; then
	    echo leap
	  else
	    echo noleap
	  fi
	}
   d. Determining the day of the week for a given date.
      This algorithm is known as Zeller's congruence. An explanation
      of it is available from the Dictionary of Algorithms and Data
      Structures web page at NIST:
        http://www.nist.gov/dads/
      Also, a fuller explanation is available at
        http://www.merlyn.demon.co.uk/zeller-c.htm#ZC
      [ Examples ? I couldn't figure out how to make this work in
        bash, so if someone has an implementation it would be welcome ]
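      One possible implementation, offered as a sketch rather than a
      tested reference: the function below uses the Gregorian form of
      Zeller's congruence with POSIX shell arithmetic. The function
      name and argument order are made up for this example.

```shell
# dayofweek YEAR MONTH DAY -- print the weekday name (Gregorian calendar)
# Pass plain decimal numbers: a leading zero (e.g. 08) would be
# taken as octal by the shell arithmetic.
dayofweek() {
  y=$1 m=$2 d=$3
  # January and February count as months 13 and 14 of the previous year
  if [ "$m" -lt 3 ]; then
    m=$(( $m + 12 ))
    y=$(( $y - 1 ))
  fi
  k=$(( $y % 100 ))   # year within the century
  j=$(( $y / 100 ))   # zero-based century
  h=$(( ($d + 13 * ($m + 1) / 5 + $k + $k / 4 + $j / 4 + 5 * $j) % 7 ))
  # Zeller numbers the days from Saturday = 0
  case $h in
    0) echo Saturday ;;
    1) echo Sunday ;;
    2) echo Monday ;;
    3) echo Tuesday ;;
    4) echo Wednesday ;;
    5) echo Thursday ;;
    6) echo Friday ;;
  esac
}
```

      For example, "dayofweek 2003 7 21" prints Monday.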
   e. Arbitrary date arithmetic
      To do arbitrary date calculations is more complicated. One
      possibility is to call an external utility, or a program in
      another scripting language, which has this built in. For
      example, perl has wrappers for the unix time functions built in,
      so it can provide some relief in this regard. C programs can also
      be easily written to do date arithmetic (see the examples
      section). One thing to keep in mind, however, is that unix time
      functions are, strictly speaking, limited to the range of time
      between January 1 1970 at midnight UTC, and (where time_t is a
      signed 32-bit integer) 19 Jan 2038 at 03:14:07 UTC. C/Perl
      programs which calculate dates outside this range might work,
      or they might not; that depends on the implementation.
      To do arbitrary date arithmetic in the shell itself is also
      possible. 
        http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=2guis7.7d6.ln%40allenhome.sky.net&rnum=10
	[ waiting on feedback from another source about a more general
	example ]     
7. Why did someone tell me to RTFM?
   Because you didn't :-)
   RTFM is part of Usenet lingo, and means "Read The F-ing Manual".
   Generally people say this when someone asks a question that is
   asked so often, and is answered plainly in some relevant man page,
   that they're tired of seeing it asked.
   So RTFM, and the FAQs first before asking. Also, if you're new to
   the group, search Google Groups
   http://groups.google.com/advanced_group_search 
   before asking questions. And please don't post your homework
   questions to the group unless you've tried to figure them out, and
   have some specific questions. People will generally be happy to
   help you with your homework if you post what you've got and ask
   specific questions.
9. How do I create a lock file?
   Very carefully :-)
   The scheduler can stop one process in the middle of a non-atomic
   operation, and run another one, which wants to perform the same
   operation. The second one, having a full timeslice, might finish
   the operation. When control returns to the first process, confusion
   will reign.
   The trick is to do something atomic, so that this won't
   happen. There are a couple ways to do this. One is to create a
   directory instead of a file, the other is to create a symbolic
   link. Both operations are defined to be atomic by POSIX/SUS, by
   virtue of the fact that they both require calls to the
   corresponding system calls, which are atomic. Even here though,
   there is a window of opportunity for bad things to happen. There is
   some startup code in the command line version of mkdir(1) and
   ln(1), which could get interrupted before the system call is
   invoked. However, this is as close as we can get to perfection in a
   shell script.
   For the same reason, beware of testing for the existence of the
   lock file before trying to create it. That will just increase the
   window of possibility for a race condition.
   Be even more afraid of trying to create ANY kind of lock file on an
   NFS partition. NFS pretty much eliminates anything like atomicity.
   If you're going to create a lock file, make sure you're doing it on
   a local partition, such as /tmp.
   Netscape/Mozilla uses the symbolic link method for its lockfile (in
   spite of the fact that it creates it in the user's home directory,
   which may be NFS mounted). When it starts up it creates a file
   named for the IP address of the machine it's running on, and the
   pid of the creating process. Then it tries to create a symbolic
   link named "lock", which points to that file. If this symlink
   already exists, symlink(2) will return an error. In a script this
   would work something like
   touch /tmp/$$
   ln -s /tmp/$$ /tmp/lockfile 2>/dev/null
   if [ $? -ne 0 ];then
     echo lockfile already exists
     rm /tmp/$$
     exit 1
   else
     echo success
     rm /tmp/$$
   fi
   If you have procmail installed, another possibility is the
   lockfile(1) command that comes with it. Many of the same caveats
   given above apply to that as well though.
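   The directory method mentioned earlier can be sketched in the same
   way. The helper names and the lock path here are made up, and the
   same caveats (NFS, startup code in mkdir(1)) apply.

```shell
# acquire_lock DIR -- succeed only if we created DIR (i.e. got the lock)
# release_lock DIR -- give the lock back
# (hypothetical helper names; DIR should be on a local partition)
acquire_lock() {
  mkdir "$1" 2>/dev/null
}
release_lock() {
  rmdir "$1"
}

# Typical use ("/tmp/myscript.lock" is a made-up name):
#   acquire_lock /tmp/myscript.lock || { echo "already running" >&2; exit 1; }
#   trap 'release_lock /tmp/myscript.lock' 0
#   ... do the work ...
```

   Note that there is no separate test for the directory's existence:
   mkdir either creates it or fails, in one step.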
10. How can (CRLF|LF|LFCR) be translated into one of the other?
    Unix text files consist of lines delimited by an LF ("line-feed")
    character (ASCII 10). DOS uses the two characters CR LF ("carriage
    return", "line feed"; ASCII 13, 10) for the same purpose.
    To convert a DOS text into unix text format, the CR characters
    (control-M) at the end of a line have to be removed:
        sed 's/^M$//' dos.txt > unix.txt
    To create a DOS text file, the CR character should be added:
        sed 's/$/^M/' unix.txt > dos.txt
    Note that "^M" in this case is an embedded control character, (CR,
    ASCII 13). Many shells allow embedding control characters by
    entering ^V first (control-V), resulting in the sequence
        ^V^M
    for entering "^M".
    Note that zsh, bash or ksh93 allow for:
    sed $'s/$/\r/'
    There is one special case to be considered: DOS text files
    sometimes contain an explicit end-of-file character ^Z (ASCII 26),
    which has no correspondent character for unix text files, where
    the end-of-file condition is determined implicitly.
 
    [XXX Does anybody have an easy solution to handle this?
    sed '$s/^Z$//' may not work, because some sed implementations
    cannot handle lines that are not terminated by LF]
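    If it's acceptable to drop every CR and every ^Z in the file, and
    not only those at the end of a line or of the file, tr(1) is one
    alternative that doesn't care whether the last line is terminated.
    This is a sketch under that assumption; the function names are
    made up.

```shell
# dos2unix_sketch: DOS text on stdin, unix text on stdout.
# Deletes ALL CR (\r) and ^Z (\032) characters, wherever they are.
dos2unix_sketch() {
  tr -d '\r\032'
}

# unix2dos_sketch: unix text on stdin, DOS text on stdout.
unix2dos_sketch() {
  awk '{ printf "%s\r\n", $0 }'
}
```

    Use them as "dos2unix_sketch < dos.txt > unix.txt" and
    "unix2dos_sketch < unix.txt > dos.txt".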
11. How can a shell prompt be set up to change the title of xterm?
    http://cns.georgetown.edu/~ric/howto/Xterm-Title/Xterm-Title-singlepage.html
    Gives escape sequences for xterm. For example, to change the name
    of the current window to "XXX" (in bash), do
      $ echo -en "\033]2;XXX\007"
    See also "Why doesn't echo do what I want?"
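    Since echo's options vary between shells, the same escape sequence
    can be sent with printf(1) instead. A minimal sketch, with a
    made-up function name:

```shell
# set_xterm_title TEXT... -- emit the xterm "set window title" sequence
# (hypothetical helper name)
set_xterm_title() {
  printf '\033]2;%s\007' "$*"
}
```

    For example, "set_xterm_title XXX" has the same effect as the echo
    command shown. How to run it each time the prompt is displayed
    depends on your shell.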
12. How do I batch an FTP download/upload?
    The best way to handle this is with ncftpput and ncftpget which
    are part of the ncftp program.
    ncftpput -u username -p password somewebsite.com /pics *jpg
    The above usage of the username and password is not recommended
    though, as it will be seen by anyone using "ps" while the script
    is running. ncftp has a way to handle that as well. Just create a
    file with the information in the following format:
              host somewebsite.com
              user username
              pass password
    Then just use the -f option on the ncftp program (the host now
    comes from the file, so it is left off the command line):
    ncftpput -f /home/username/somefile /pics *jpg
    ncftp can be found at http://freshmeat.net/projects/ncftp/
    
    If you want to do this interactively, there's no need to keep the
    password in a file. For example, if you're building a program on
    one machine, but testing it on another, and you have to keep
    ftp'ing the files, you can cut down on typing by doing something
    like
    #!/bin/ksh
    ftp -n 47.103.130.123 <<EOF
    user <username>
    cd <path to build directory>
    prompt
    bin
    mget "$@"
    bye
    EOF
    The ftp program will automatically ask you for the password, then
    do the rest for you.
13. How do I get the exit code of cmd1 in cmd1|cmd2
    First, note that a non-zero exit code from cmd1 doesn't
    necessarily mean an error. This happens for instance in
    cmd1 | head -1
    you might observe a 141 (or 269 with ksh93) exit status of cmd1,
    but it's because cmd1 was interrupted by a SIGPIPE signal when
    "head -1" terminated after having read one line.
    To know the exit status of the elements of a pipeline
    cmd1 | cmd2 | cmd3
    a. with zsh:
       The exit codes are provided in the pipestatus special array.
       cmd1 exit code is in $pipestatus[1], cmd3 exit code in
       $pipestatus[3], so that $? is always the same as
       $pipestatus[-1].
    b. with bash:
       The exit codes are provided in the PIPESTATUS special array.
       cmd1 exit code is in ${PIPESTATUS[0]}, cmd3 exit code in
       ${PIPESTATUS[2]}, so that $? is always the same as
       ${PIPESTATUS[@]: -1}.
    c. with any other Bourne like shells
       You need to use a trick to pass the exit codes to the main
       shell.  You can do it using a pipe(2). Instead of running
       "cmd1", you run "cmd1; echo $?" and make sure $? makes its way
       to the shell.
       exec 3>&1
       eval `
         # now, inside the `...`, fd4 goes to the pipe
	 # whose other end is read and passed to eval;
	 # fd1 is the normal standard output preserved
	 # the line before with exec 3>&1
         exec 4>&1 >&3 3>&- 
	 {
	   cmd1 4>&-; echo "ec1=$?;" >&4
	 } | {
	   cmd2 4>&-; echo "ec2=$?;" >&4
	 } | cmd3
	 echo "ec3=$?;" >&4
       `
    d. with a POSIX shell
       You can use this function to make it easier:
       run() {
         j=1
         while eval "\${pipestatus_$j+:} false"; do
           unset pipestatus_$j
           j=$(($j+1))
         done
         j=1 com= k=1 l=
         for a; do
           if [ "x$a" = 'x|' ]; then
             com="$com { $l "'3>&-
                         echo "pipestatus_'$j'=$?" >&3
                       } 4>&- |'
             j=$(($j+1)) l=
           else
             l="$l \"\$$k\""
           fi
           k=$(($k+1))
         done
         com="$com $l"' 3>&- >&4 4>&-
                    echo "pipestatus_'$j'=$?"'
         exec 4>&1
         eval "$(exec 3>&1; eval "$com")"
         exec 4>&-
         j=1
         while eval "\${pipestatus_$j+:} false"; do
           eval "[ \$pipestatus_$j -eq 0 ]" || return 1
           j=$(($j+1))
         done
         return 0
       }
       
       use it as:
       
       run cmd1 \| cmd2 \| cmd3
       exit codes are in $pipestatus_1, $pipestatus_2, $pipestatus_3
14. Why do I get "script.sh: not found" 
    a. While script starts with "#!/bin/sh" (^M issue)
       That's the kind of error that occurs when you transfer a file
       by FTP from a MS Windows machine. On those systems, the line
       separator is the CRLF sequence, while on unix the line
       separator is LF alone, CR being just another ordinary character
       (the problem is that it is an invisible one on your terminal
       (where it actually moves the cursor to the beginning of the
       line) or in most text editors or pagers).
       So, if a MSDOS line is "#!/bin/sh", when on a Unix system, it
       becomes "#!/bin/sh<CR>" (other names for <CR> are \r, \015, ^M,
       <Ctrl-M>).
       So, if you run the file as a script, the system will look in
       /bin for an interpreter named "sh<CR>", and report it doesn't
       exist.
       $ sed -n 'l;q' < script.sh
       #!/bin/sh\r$
       shows you the problem ($ marks the end of line, \r is the CR
       character).
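       The same check can be scripted. This is a sketch with a
       made-up function name; it only looks at the first line of the
       file.

```shell
# has_cr_shebang FILE -- succeed if FILE's first line ends with a CR
# (hypothetical helper name)
has_cr_shebang() {
  cr=`printf '\r'`
  head -n 1 "$1" | grep "$cr\$" >/dev/null
}
```

       For example:
       has_cr_shebang script.sh && echo "script.sh has DOS line endings"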
    b. PATH issue
       Sometimes a shell is installed someplace other than /bin or
       /usr/bin. For example, a shell which was not part of the OS
       installation might be installed into /usr/local/bin. If the
       script was written on a machine which had ksh located in
       /usr/bin, but was run on a machine where ksh was located in
       /usr/local/bin, the shebang line would not resolve correctly.
       This is unlikely to occur when using sh. However, if the shell
       is bash, zsh, et al, it might be installed in different places
       on different machines.
       One way around this is to use the env command in the shebang
       line. So instead of 
       #!/bin/sh
       use
       #!/usr/bin/env sh
       Of course, env might itself live in some other directory than
       /usr/bin, but it's not likely.
15. Why doesn't echo do what I want?
    The echo command is not consistent from shell to shell. For
    example, some shells (bash, pdksh [,?]) use the following
    arguments
      -n suppress newline at the end of argument list
      -e interpret backslash-escaped characters
      -E disable interpretation of backslash-escaped characters, even
	 on systems where interpretation is the default.
    However, pdksh also allows using \c to disable a newline at the
    end of the argument list.
    POSIX only allows \c to be used to suppress newlines, and doesn't
    accept any of the above arguments.
    ksh88 and ksh93 leave the interpretation of backslash-escaped
    characters up to the implementation.
    [descriptions of behavior of other shells welcome]
    In short, you have to know how echo works in any environment you
    choose to use it in, and its use can therefore be problematic. If
    available, print(1) or printf(1) would be better.
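    As a sketch of what the printf(1) equivalents might look like, the
    three functions below stand in for plain echo, "echo -n" and
    "echo -e". The function names are made up for this example.

```shell
# Hypothetical replacements for the common echo variants.
say()   { printf '%s\n' "$*"; }   # prints its arguments literally
say_n() { printf '%s' "$*"; }     # same, without the trailing newline
say_e() { printf '%b\n' "$*"; }   # interprets backslash escapes
```

    Unlike echo, "say -n" prints "-n", and "say 'a\tb'" prints the
    backslash and the "t" unchanged, whichever shell runs it.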
16. How do I loop through files with spaces in their name?
    So, you're going to loop through a list of files? How is this list
    stored? If it's stored as text, there probably was already an
    assumption about the characters allowed in a filename. Every
    character except '\0' (NUL) is allowed in a file path on Unix.  So
    the only way to store a list of file names in a file is to
    separate them by a '\0' character (if you don't use a quoting
    mechanism as for xargs input).
    Unfortunately most shells (except zsh) and most standard unix text
    utilities (except GNU ones) can't cope with "\0"
    characters. Moreover, many tools, like "ls", "find", "grep -l"
    output a \n separated list of files. So, if you want to
    postprocess this output, the simplest thing is to assume that the
    filenames don't contain newline characters (but beware that once
    you make that assumption, you can't pretend anymore your code is
    reliable (and thus can't be exploited)).
    So, if you've got a newline separated list of files in a
    list.txt file, here are two ways to process it:
    1-
    while IFS= read -r file <&3; do
      something with "$file" # be sure to quote "$file"
    done 3< list.txt
    (if your read doesn't have the "-r" option, either make another
    assumption that filenames don't contain backslashes, or use:
    exec 3<&0
    sed 's/\\/&&/g' < list.txt |
    while IFS= read file; do
      something with "$file" <&3 3<&-
    done
    )
    2-
    IFS="
    " # set the internal field separator to the newline character
      # instead of the default "<space><tab><NL>".
    
    set -f # disable filename generation (or make the assumption that
           # filenames don't contain *, [ or ? characters (maybe more
           # depending on your shell)).
    
    for file in $(cat < list.txt); do
      something with "$file" # it's less a problem if you forget to
                             # quote $file here.
    done
    
    Now, beware that there are things you can do before building
    this list.txt. There are other ways to store filenames. For
    instance, you have the positional parameters.
    
    with:
    set -- ./*.txt
    
    you have the list of txt files in the current directory, and no
    problem with weird characters. Looping through them is just a
    matter of:
    
    for file
    do something with "$file"
    done
    
    You can also escape the separator. For instance, with
    
    find . -exec sh -c 'printf %s\\n "$1" | sed -n '"':1
      \$!{N;b1
      }
      s/|/|p/g;s/\n/|n/g;p'" '{}' '{}' \;
      
    instead of
    
    find . -print
    
    you have the same list of files except that the \n in filenames
    are changed to "|n" and the "|" to "|p". So that you're sure
    there's one filename per line and you have to convert back "|n"
    to "\n" and "|p" to "|" before referring to the file.
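    One way the conversion back might be done is sketched here. The
    function name and the "decoded" variable are made up. The "|n"
    sequences are converted first, because converting "|p" first
    could create a false "|n". The "echo x" keeps the command
    substitution from stripping newlines that end a file name.

```shell
# decode_name LINE -- undo the "|n" / "|p" escaping of one line and
# leave the result in the variable "decoded" (hypothetical names)
decode_name() {
  decoded=$(printf '%s\n' "$1" | sed 's/|n/\
/g;s/|p/|/g'; echo x)
  decoded=${decoded%??}   # drop the newline added by printf and the "x"
}
```

    It could be used in a loop such as

    while IFS= read -r line; do
      decode_name "$line"
      something with "$decoded"
    done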
17. How do I change my login shell?
    See  http://www.faqs.org/faqs/unix-faq/shell/shell-differences
    Unless you have a very good reason to do so, do not change root's
    default login shell. By "default login shell" is meant the shell
    recorded in /etc/passwd. Note that "I login as root but don't like
    the default shell" isn't a good reason.
    The default shell for root is one which will work in single user
    mode, when only the root partition is mounted. This is one of the
    contexts root works in, and the default shell must accommodate
    this. So if you change it to a dynamically linked shell which
    depends on libraries that are not in the root partition, you're
    asking for trouble.
    The safest way of changing root's shell is to login as root and
    then 
      # exec <your preferred shell with login flag>
      e.g.
      # exec ksh -l
    Another possibility is to add something to root's .profile or
    .login which checks to see if the preferred shell is runnable, and
    then execs it. This is more complicated and has more pitfalls than
    simply typing "exec <shell>" when you login though. For example,
    one of the libraries that the desired shell relies on might have
    been mangled, etc. One suggestion that has been made is
      if [ -x /usr/bin/ksh ]; then
        SHELL=/usr/bin/ksh; export SHELL
        ENV=/root/.kshrc; export ENV
        /usr/bin/ksh && exit
      fi
    This is safer than trying to exec the shell, and more convenient
    than typing "exec ksh".
18. When should I use a shell instead of perl/python/ruby/tcl...?
    a. Portability
       In many cases it can't be assumed that perl/python/etc are
       installed on the target machine. Many customer sites do not
       allow installation of such things. In cases like this, writing
       a shell script is more likely to be successful. In the extreme,
       writing a pure Bourne shell script is most likely to succeed.
    b. Maintainability
       If the script is one which serves some important purpose, and
       will need to be maintained after you get promoted, it's more
       likely that a maintainer can be found for a shell script than
       for other scripting languages (especially less used ones such
       as ruby, rexx, etc).
    c. Policy
       Sometimes you're just told what to use :-)
19. Why shouldn't I use csh?
    http://www.softlab.ntua.gr/facilities/documentation/unix/grymoire/CshTop10.txt
    http://www.faqs.org/faqs/unix-faq/shell/csh-whynot/
   
20. How do I reverse a file?
    Non-standard commands to do so are GNU tac and "tail -r". With
    sed, '1!G;h;$!d' works, but it is subject to sed's limit on the
    size of its hold space and is generally slow.
    
    The awk equivalent would be:
    
    awk '{l[n++]=$0}END{while(n--)print l[n]}'
    It stores the whole file in memory.
    
    The best approach in terms of efficiency, portability and resource
    consumption seems to be:
    
    cat -n | sort -rn | cut -f2-
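
    A small sketch of the pipeline above wrapped in a function, so it
    can be tried on a few lines of input. The name reverse_lines is
    made up for this example, and it assumes your cat accepts -n.

```shell
# reverse_lines is a hypothetical helper name; it reads standard input.
reverse_lines() {
  cat -n |        # prefix each line with its number and a tab
  sort -rn |      # sort on that number, highest first
  cut -f2-        # drop the number, keep everything after the first tab
}

printf '%s\n' one two three | reverse_lines
```

    The sample input comes out as three, two, one. Tabs inside a line
    survive, since cut only removes the first field.
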
    "cat -n" is not POSIX but appears to be fairly
    portable. Alternatives are "grep -n '^'" and "awk '{print NR,$0}'"
    (adjust the delimiter given to cut accordingly). "nl" can't be
    used as it processes page headings.
21. How do I remove the last 10 lines?
    a. The easy way.
       Use GNU head if you have it available; it accepts a negative
       line count, as in "head -n -10 file".
       
    First we need to tell the code how many lines we want to cut
    from the bottom of a file.
      X=10
    Then we can do this:
      head -n "$(( $(wc -l < file) - $X ))" file > $$ &&
      cat $$ > file && rm $$
       The breakdown:
       1) $(wc -l < file)
          Find out how many lines are in the file. We feed the file
          to wc on standard input because "wc -l file" prints
          something like "     38 file", and we only want the number.
       2) $(( $lines_in_file - $X ))
          Take the output from step one and do some math to find out
          how many lines we want to have when all is said and done.
       3) head -n $lines_when_said_and_done file
          Extract all but the unwanted lines from the file.
       4) > $$
          This puts those lines into a temp file that has the name of
          the pid of the current shell. (Don't capture them with an
          unquoted "echo $(...)": word splitting would join them all
          into a single line.)
       5) && cat $$ > file
          If everything has worked so far then cat the temp file into
          the original file. This is better than mv because it
          ensures that the permissions of the temp file do not
          override the perms of the original file.
       6) && rm $$
          Remove the temp file.
22. How do I get file size, or file modification time?
    ls will tell you all the things you want to know. From the man
    page for ls we learn that "ls -l" shows the file mode, the number
    of links to the file, the owner name, the group name, the size of
    the file (in bytes), the timestamp, and the filename. For the file
    size in human readable format, use the "-h" option if your ls has
    it (GNU ls does, for instance).
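
    When a script needs only the number, something along these lines
    may do. It is a sketch, not the one true way: sample.txt is a
    made-up file, mktemp -d is assumed to be available, and picking
    the fifth field of "ls -l" assumes the owner and group names
    contain no whitespace.

```shell
# sample.txt is a hypothetical file created only for this example.
tmp=$(mktemp -d) || exit 1
printf 'hello' > "$tmp/sample.txt"

# Size in bytes, by counting them; strip the padding some wc versions add.
size=$(wc -c < "$tmp/sample.txt" | tr -d '[:space:]')

# Size in bytes as reported by ls -l (fifth field), assuming the owner
# and group names contain no whitespace.
ls_size=$(ls -l "$tmp/sample.txt" | awk '{print $5}')

printf 'wc says %s bytes, ls says %s bytes\n' "$size" "$ls_size"
rm -rf "$tmp"
```

    Note that "wc -c" has to read the whole file, while ls only looks
    at the file's metadata.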
23. How do I get a process id by process argv[0]?
    This depends on the shell being used. sh and Bourne derived shells
    use '.' (period/dot) to indicate that the argument should be read
    into the current shell, while csh and csh derived shells use
    "source" for the same thing.
    For example, with ksh, this would source a file into the current
    environment. 
    $ . somefile
24. How do I get a script to update my current environment?
    Processes in unix cannot update the environment of the process
    that spawned them. Consequently you cannot run another process
    normally and expect it to do that, since it will be a child of the
    running process.
    Instead, you have to "source" the script. This means that you use
    whatever syntax your shell has to read the desired script into the
    current environment. 
    In Bourne derived shells (sh/ksh/bash/POSIX/etc) the syntax would
    be 
      $ . script
    In csh this would be 
      $ source script 
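
    The difference can be seen with a short experiment in a Bourne
    derived shell. This is only a sketch: setvar.sh and MYVAR are
    made-up names, and mktemp -d is assumed to be available.

```shell
# setvar.sh is a hypothetical script name used only for this example.
tmp=$(mktemp -d) || exit 1
printf 'MYVAR=from_script\n' > "$tmp/setvar.sh"

MYVAR=original
sh "$tmp/setvar.sh"            # runs in a child process
after_child=$MYVAR             # still "original"

. "$tmp/setvar.sh"             # read into the current shell
after_source=$MYVAR            # now "from_script"

printf 'after child: %s, after sourcing: %s\n' "$after_child" "$after_source"
rm -rf "$tmp"
```

    Running the script as a child leaves MYVAR alone; reading it in
    with "." changes it.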
25. How do I rename *.foo to *.bar?
   In ksh/bash (and other POSIX shells)
     $ for f in *.foo; do mv "$f" "${f%.*}.bar"; done
26. How do I use shell variables in awk scripts?
    Depending on the version of awk being used, either use the -v
    command line option, or add the variable assignment after the
    program, before the files it should apply to, as in
      $ awk '{print $0,var}' var=xxx file
    See the man page for awk on your system to see which is
    applicable.
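
    A sketch of both forms side by side, assuming an awk that
    supports -v (POSIX awk does; some very old ones don't). The
    names "name" and "var" are made up for the example.

```shell
# "name" is a hypothetical shell variable; "var" is the awk variable.
name='some value'

# With -v the assignment happens before the program starts,
# so it is visible even in a BEGIN block.
with_v=$(awk -v var="$name" 'BEGIN { print var }')

# An assignment operand takes effect when awk reaches it in the argument
# list, so it must come before the input it should apply to ("-" is
# standard input here).
with_operand=$(printf 'line\n' | awk '{ print $0, var }' var="$name" -)

printf '%s\n%s\n' "$with_v" "$with_operand"
```

    Both forms process backslash escape sequences in the value, so
    values containing backslashes need extra care.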
27. How do I get input from the user with a timeout?
    TBD
28. How do I get one character input from the user?
    TBD
29. Why isn't my .profile read?
    TBD
30. Why do I get "[5" not found in "[$1 -eq 2]"?
    Because you didn't RTFM :-)
    "[" is an alias for the "test" command. As such, it's called by a
    script like any other command (this applies even if test is
    builtin). Since the command line uses spaces to separate a command
    from its arguments, you have to put a space between '[' and its
    argument. So:
      $ [ -f xxx ] isn't the same as
      $ [-f xxx ]
    In the latter case, the shell will think that "[-f" is the
    command, not "[" with arguments "-f xxx ]".
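
    Applied to the test from the question, a correctly spaced version
    might look like this sketch (check_two is a made-up function
    name):

```shell
# check_two is a hypothetical function name.
check_two() {
  # "[" is the command; "$1", "-eq", "2" and the closing "]" are its
  # arguments, so each needs surrounding whitespace.
  if [ "$1" -eq 2 ]; then
    echo "equal to 2"
  else
    echo "not 2"
  fi
}

check_two 2
check_two 5
```

    Quoting "$1" also keeps the test from falling apart when the
    argument is empty.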
31. How do I exactly display the content of $var (with a \n appended).
    A: on POSIX systems or with shells with builtin printf (bash2,
    ksh93, zsh4.1, dash...)
    printf '%s\n' "$var"
    (apart from memory/environment full errors, this should be expected
    to work at least if $var is no longer than LINE_MAX, which is
    supposed to be at least _POSIX2_LINE_MAX == 2048; there are no
    hardcoded limits in the zsh/ksh/bash/dash builtins)
    ksh, zsh:
    print -r -- "$var"
    zsh:
    echo -E - "$var"
    Other Bourne-like shells:
    cat << EOF
    $var
    EOF
    (creates a temporary file and forks a process)
    expr "x$var" : 'x\(.*\)'
    (limited to 126 characters with some exprs, may return a
    non-null exit code).
    With ash:
    (unset a; ${a?$var}) 2>&1
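
    To see why plain echo is avoided here, consider values that some
    echo implementations treat as an option or as an escape sequence.
    A sketch with the printf form, using two made-up sample values:

```shell
# Two values that some echo implementations would alter.
var1='-n'
var2='a\tb'

out1=$(printf '%s\n' "$var1")
out2=$(printf '%s\n' "$var2")

printf '%s\n' "$out1" "$out2"
```

    Both values come back unchanged, since printf only interprets
    the format string, not the arguments given to %s.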
32. How do I exactly display the content of $var (without a \n
    appended). 
    printf %s "$var"      # posix
    print -rn -- "$var"   # zsh/ksh
    echo -nE - "$var"     # zsh
    awk 'NR>1{print ""}{printf("%s",$0)}' << EOF
    $var
    EOF
    (with awk limitations [line length and number of fields])
Appendix A: Examples
[ need some examples ]
Appendix B: References
  [1]  http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&threadm=5vbsra%241iu%40access1.digex.net&rnum=5&prev=/groups%3Fas_q%3D%26num%3D100%26as_scoring%3Dr%26hl%3Den%26ie%3DUTF-8%26oe%3DUTF-8%26btnG%3DGoogle%2BSearch%26as_epq%3D%26as_oq%3D%26as_eq%3D%26as_ugroup%3Dcomp.unix.shell%26as_usubject%3D%26as_uauthors%3DPaul%2Bhite%26as_umsgid%3D%26lr%3D%26as_drrb%3Dq%26as_qdr%3D%26as_mind%3D12%26as_minm%3D5%26as_miny%3D1981%26as_maxd%3D20%26as_maxm%3D7%26as_maxy%3D2003%26safe%3Dimages

