Re: [Long] about ksh93 (Was: Bourne Shell Programming on Windows)
From: Stephane CHAZELAS (stephane_chazelas_at_yahoo.fr)
Date: 06/26/03
- Next message: Stephane CHAZELAS: "Re: [Long] about ksh93 (Was: Bourne Shell Programming on Windows)"
- Previous message: Alexis Huxley: "Re: equivalent of 'VAR=VAL CMD' to unset VAR?"
- In reply to: Dan Mercer: "Re: [Long] about ksh93 (Was: Bourne Shell Programming on Windows)"
- Next in thread: Dan Mercer: "Re: [Long] about ksh93 (Was: Bourne Shell Programming on Windows)"
- Reply: Dan Mercer: "Re: [Long] about ksh93 (Was: Bourne Shell Programming on Windows)"
Date: 26 Jun 2003 09:29:41 GMT
Dan Mercer wrote:
[...]
>: 2- * is either '*' or a list of files
>:
>: echo *** title ***
>: will output "*** title ***" only if there's no file in the current
>: directory.
>:
>
> Yeah, but
>
> echo "*** title ***"
>
> won't have a problem. So how is this a problem for any but the uninitiated
> or ignorant?
Are you saying that David Korn is an uninitiated or ignorant ksh
programmer? I wouldn't have dared ;).
He wrote:
getopts '[-]' opt --???man
He quoted the first pattern but not the second, and didn't quote
"getopts" (I guess he knew it was not a pattern, but to be sure, it
wouldn't have hurt to write:
'getopts' '[-]' 'opt' '--???man'
)
>: for file in *
>: will loop through files in the current directory. But you'll have a
>: nasty effect if there's no file.
>
> Not really. Not if you do it right:
>
> for file in *
> do
> [[ -a $file ]] || continue
> ...
> done
You said it! So that is the right way to loop through the files in
the current directory? I don't find it very intuitive or
legible.
Note that in src/cmd/ksh93/OBSOLETE in ksh93 source
distribution, one can read:
3. The newtest ([[ ... ]]) operator -a file is obsolete.
Use -e instead.
Although I agree the -a name was more accurate: it doesn't check for
file existence, but for file accessibility.
So your loop goes through accessible files (which I agree
doesn't make much difference and is even preferable).
In zsh, by default, in
for file in *; do
...
done
if there's no file, you get an error message, and the loop body
is never run.
If you consider it normal for there to be no files,
you write:
for file in *(N); do
...
done
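In a POSIX shell, which has no (N) qualifier, the guard has to be written by hand. A minimal sketch, assuming a hypothetical helper named count_entries; the extra -L test is an addition of this sketch so that dangling symlinks are counted too:

```sh
# count_entries DIR: print the number of non-hidden entries in DIR.
# (count_entries is a hypothetical name used only for this illustration.)
count_entries() {
    n=0
    for file in "$1"/*; do
        # An unmatched pattern is passed through literally, so skip the
        # word unless it names something that really exists.
        [ -e "$file" ] || [ -L "$file" ] || continue
        n=$((n + 1))
    done
    printf '%s\n' "$n"
}
```

Calling count_entries on an empty directory prints 0 rather than counting the literal "*" word.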
>:
>: Worse:
>: rm file.[cC]
>: removes any of a file.c or file.C file if they exist. But, if they
>: don't, a file named "file.[cC]" might be unintentionally removed.
>
> But this is a shell feature, not a shell programming feature. Any way you
> approach globbing will have trade offs.
In zsh, if I run this and there's no matching file, rm is not
run and I get an error message, so there's no way I can delete a
file unintentionally.
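The zsh behaviour can be approximated in a Bourne-like shell, at some cost in legibility. The following is only a sketch under stated assumptions: rm_glob is a hypothetical helper, it takes the pattern as a quoted string, and it runs in a subshell so the IFS change does not leak.

```sh
# rm_glob PATTERN: remove the files matching PATTERN; fail when nothing
# matches instead of handing the literal pattern to rm.
# (rm_glob is a hypothetical helper, not a standard command.)
rm_glob() (
    pattern=$1
    IFS=                # no word splitting, filename generation only
    set -- $pattern     # deliberately unquoted
    if [ "$#" -eq 1 ] && [ "$1" = "$pattern" ]; then
        # The word came back unchanged: keep it only if it names an
        # existing file that genuinely matches the pattern (e.g. "*").
        case $1 in
            $pattern) [ -e "$1" ] || [ -L "$1" ] || set -- ;;
            *) set -- ;;
        esac
    fi
    if [ "$#" -eq 0 ]; then
        printf 'no match: %s\n' "$pattern" >&2
        exit 1
    fi
    rm -f -- "$@"
)
```

With this, rm_glob 'file.[cC]' leaves a file literally named "file.[cC]" alone and returns a non-zero status when neither file.c nor file.C exists.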
[...]
>: As an example http://www.research.att.com/~dgk/ksh/script/env (on
>: David Korn's home page) contains "getopts '[-]' opt --???man".
>: --???man is a wildcard and David forgot to quote it. So if there's a
>: file named "--batman" in the current directory, his script will be
>: broken.
>:
>: ksh93 introduced many new globbing operators. So, for each character
>: you type, you have to be careful it's not a globbing operator.
>
> ksh93 has extended globbing sequences. It's unlikely any would show
> up as file names. However, is it really asking much to quote strings?
The "unlikely" thing is a bit unpleasant to the ear of a
programmer, unless you want to do approximate programming.
>
>:
>: So, to be sure, you should quote every word.
>
> That's ridiculous. You should provide an example if you believe that to be true.
I provided one (courtesy of David Korn).
In a real programming language, you have literal strings,
literal integers, keywords, variable references, function calls,
and a way to differentiate each (strings inside quotes, function
calls followed by (...)...). You can't mistake a function for a
string or an integer for a keyword.
In shell, that's not the case. Everything is more or less mixed.
There's no clear boundary anywhere.
You may introduce one, by for example considering that a non
quoted word is either a keyword or a pattern, while a quoted one
is a string. So you could write things like:
for 'file' in *; do
'[' '-e' "$file" ']' || 'continue'
...
done
That would make the syntax a little more consistent and help avoid
problems in corner cases, but would make it even more illegible.
>
> But you can't quote
>: keywords. You have to perfectly know the syntax of ksh to be able to
>: use it reliably (did you know "while" was a keyword while "break"
>: was a builtin?).
>
> Of course - it takes an optional argument (the number of loops to exit).
> Admittedly, "break n" violates a lot of structured programming
> tenets, but it has its uses.
"time" and "!" take arguments too, in a way, and they are keywords.
> It has that many corner cases that its almost
>: impossible. Look at David Korn's himself ksh scripts on its home
>: page, most contain numerous flaws even the shortest ones.
>
> Again, a single example would be instructive.
Other examples from Mr Korn, again in his "env" script.
To clear the environment: "typeset +x $(typeset +x)"
It only clears environment variables that look like shell
variables, and not the "_" one (well, it's still exported to the
next command run).
Another one in the same file:
command unset $OPTARG 2> /dev/null
If the user thinks this env script is like the real env command,
he might want to unset the "*" variable.
In ksh, an unquoted variable may be expanded into a list of
files (unless, maybe, you know what's in it).
>:
>: 3- $var is either a pattern or a list or a string or combination of
>: those despite ksh93 has array type variables.
>:
>: ksh93 has arrays, you can do
>:
>: list=("file 1" "file 2")
>: for file in "${list[@]}"; do
>: something with "$file"
>: done
>:
>: So, the old Bourne
>: list="file 1,file 2" IFS=,
>: set -f
>: for file in $list; do
>: ...
>: done
>
> Why do you need to "set -f"?
Because I consider $list as a list of strings, not as a list of
patterns. In that case, if I change $list to:
list="file [1],file [2]"
I really mean the file named "file [1]" and I expect my script
to keep running correctly.
On the other hand, if I wanted to consider the list as a list of
patterns, I could have omitted the "set -f". But chances are I
would have preferred to do the filename generation at the
assignment level, not at the variable expansion one:
list=( "file "[1] "file [2]" *".txt" ) # in ksh
set -- "file "[1] "file [2]" *".txt"
IFS=","
list="$*" # in sh
>:
>: should no longer be necessary. But ksh decided to keep the bourne
>: compatibility. Consequence is that you need to quote every *string
>: type* variable so that it is not considered as a Bourne like list or
>: a globbing pattern (or you must issue a "IFS=; set -f" at the
>: beginning of the script).
>
> again, I don't get your point. Can you show an example of how this
> would be a problem?
See the OPTARG example above. And half the scripts posted in
this newsgroup. 95% of the time, people forget to quote
variable references.
>: Each time you want to split a string, you have to disable filename
>: generation (on by default).
>
> Again, I need to see an example of something broken.
For example, one of your posts:
Message-ID: <Cm6Fa.2548$fe.16173@twister.rdc-kc.rr.com>
function split {
# split Arrayname delimiter values
typeset _vname=${1:?} _IFS=${2:?}
shift 2
set -A $vname -- $*
}
Well, I guess you meant vname and IFS instead of _vname and _IFS.
What happens if I do:
'split' 'array' ',' '*,+,/,-'
?
Or
'split' 'array_name' '_' '1_2_3_4'
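Both calls go wrong for the reasons discussed above: "*" is expanded to file names, and with IFS set to "_" an unquoted array_name would itself be split. A hedged POSIX sketch of a splitter that avoids both problems; split_fields is a hypothetical name, and since POSIX sh has no arrays it prints one field per line instead of filling an array:

```sh
# split_fields DELIM STRING: print each field of STRING on its own line.
# (split_fields is a hypothetical helper; it runs in a subshell so the
# IFS and "set -f" changes do not leak into the caller.)
split_fields() (
    IFS=$1       # split on the caller-supplied delimiter only
    set -f       # disable filename generation before expanding unquoted
    set -- $2    # deliberately unquoted: splitting, but no globbing
    for field in "$@"; do
        printf '%s\n' "$field"
    done
)
```

For instance, split_fields , '*,+,/,-' yields the four operators as they are, whatever files exist in the current directory.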
[...]
>: 4- cryptic syntax.
>:
>: copying two arrays is:
>:
>: set -A new_array -- "${old_array[@]}"
>:
>: compare with rc:
>: new_array = $old_array
>
> Yes, but in ksh $array and ${array[0]} are synonymous. This allows you
> to do many things. Indeed, the standard method in ksh88 of not
> having ENV set for scripts depended on this.
Yes, $var is either a string, a list of patterns, the 0th element of an array...
That's not what I call consistent typing. Compare with rc:
Default (and only) variable type in rc is array:
var = 'string'
is actually a shortcut for
var = ('string')
It's the same in ksh93, but who knows it?
In rc, there's no word splitting of variable expansions; when you do
command $var
it's the same as ksh's:
command "${var[@]}"
In ksh, "command $var" passes the element at index 0 of $var, on
which filename generation, tilde expansion and word splitting
are performed.
In zsh, arrays and string variables are of different types.
var1="string" # defines a "scalar" type variable
var2=("string") # defines an "array" type variable
command $var1
# passes *one* "scalar" argument to command
command $var2
# passes as many arguments as there are elements in the array to
# command.
Don't you find that a more intuitive and consistent behavior?
[...]
> Gee, I did it for 11 years. Lots of dtksh scripts too, used by hundreds of users
> with no defects reported.
I'm sure *you* can write correct scripts. But admit that ksh93
doesn't "ROCK" (which is what I was replying to) and shouldn't
be recommended to a beginner, and certainly not for "programming".
>: 5- command substitution removes too many NLs
[...]
> This is documented. You have a choice - every trailing newline or none.
> Having every trailing newline would create more problems than having none.
So, you agree?
>: So, you'd think
>:
>: print -r -- "$(cmd)"
>: outputs the same as
>: cmd
>
> Not if you read the man page.
But "yes", if you'd expect ksh programming to be a bit intuitive
and consistent.
>: It doesn't, except when the value returned by the command doesn't
>: end with a newline character.
>:
>: To work around this, you have to do:
>: var=$(cmd; echo .)
>: var=${var%?} var=${var%?}
>
> I wouldn't know. In 11 years it never came up.
That's the "unlikely" argument again. As long as you don't write
software for power plants or CGIs, you may not care about
writing approximate software that works most of the time.
I wouldn't write power plant software in perl either. But many
write CGI scripts in perl, perl being a programming language
(with its flaws too, but far fewer than ksh).
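The stripping and the sentinel workaround can be seen side by side in a short sketch (the variable names are arbitrary; the sentinel here is printed without a newline, so a single strip is enough):

```sh
# Command substitution strips every trailing newline:
plain=$(printf 'a\n\n\n')

# Appending a sentinel character protects them; strip it afterwards.
exact=$(printf 'a\n\n\n'; printf .)
exact=${exact%.}

# Compare the lengths: $plain holds only "a", while $exact still
# holds the three newlines that followed it.
printf 'plain=%s exact=%s\n' "${#plain}" "${#exact}"
```

Note that in this form the exit status of the original command is lost, since the substitution returns the status of the final printf.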
>: 6- danger of using a shell both as a shell and as a programming
>: language.
[...]
>: In a shell, most forget to do
>: cd -- "$var" || exit
>: because they are used to only typing
>: cd $var
>: at the prompt and look at the result.
>
> You can't blame bad programming technique on anything but bad
> programmers. "Look at this stupid gun. I put it to my head and pull the trigger
> and ..."
>
> You can't both argue that ksh is not a programming language and then blame it
> for people using it in a manner not appropriate for a programming language.
Yes, but you and David Korn and I and everyone else fall into that
trap.
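As a reminder of what the careful version looks like in a script, here is a minimal sketch. clean_dir is a hypothetical helper; it does not address CDPATH or a directory argument of "-", which cd treats specially.

```sh
# clean_dir DIR: remove the non-hidden files directly inside DIR.
# (clean_dir is a hypothetical helper for this illustration.)
clean_dir() (
    # Abort the subshell if cd fails, so rm can never run in the
    # wrong directory.
    cd -- "$1" || exit
    # The ./ prefix keeps names such as "-y" from being read as options.
    rm -f -- ./*
)
```

Because the body runs in a subshell, the caller's working directory is unchanged, and a failed cd simply becomes the function's exit status.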
>: Moreover, commands are often made to be run interactively. The
>: information on failure is a text intended for a human on standard
>: error. So, generally, a script will be unable to parse that error
>: message to take appropriate action.
>:
>: That's one of the most important reasons why a shell shouldn't be
>: seen as a programming language.
>
> Oh gee, and there's no way we could capture stderr!
Yes, with difficulty, making scripts even more illegible: you'd have
to cope with message localization and write a complex parser for
every error. In other programming languages, library functions return
an errno suitable for analysis by a /program/ (and *then* you
display an error message to the human user). The shell's library
functions are commands whose output is intended directly for a
human user and only to a lesser extent for a program (the exit status
is often limited to success/error).
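To make the point concrete, this is roughly what capturing stderr looks like. It is a sketch only: run_quiet, err_text and err_status are hypothetical names, and LC_ALL=C is set merely to pin the language of the message, which still has no guaranteed format.

```sh
# run_quiet CMD...: run CMD, discard its stdout, and keep its stderr
# text in $err_text and its exit status in $err_status.
# (All three names are hypothetical, chosen for this illustration.)
run_quiet() {
    err_status=0
    # 2>&1 first sends stderr to the substitution, then >/dev/null
    # discards the command's normal output.
    err_text=$(LC_ALL=C "$@" 2>&1 >/dev/null) || err_status=$?
}
```

After run_quiet some_command, the script can act reliably on $err_status, while $err_text remains free-form text meant for a human.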
[...]
>: echo "${variable//\\/\\\\}"
>: (provided that for some obscure reason, for example the content of
>: the PATH variable, echo doesn't recognize options)
>
> echo is a shell builtin. What possible effect could the PATH variable have on it?
src/cmd/ksh93/bltins/print.c:
int B_echo(int argc, char *argv[],void *extra)
{
[...]
if(!bsd_univ)
return(b_print(0,argv,&prdata));
[...]
if(argv[1] && strcmp(argv[1],"-n")==0)
prdata.echon = 1;
Look at src/lib/libast/port/astconf.c
for how bsd_univ is determined
~$ PATH=/usr/ucb:/bin /usr/local/bin/ksh93 -c 'echo -n'
~$ PATH=/bin:/usr/ucb /usr/local/bin/ksh93 -c 'echo -n'
-n
~$
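A common way to sidestep this is to avoid echo for arbitrary data and use printf, whose format string is under the script's control. A minimal sketch; say is a hypothetical wrapper name:

```sh
# say ARGS...: print the arguments joined by spaces (the first
# character of the default IFS), followed by a newline. Neither "-n"
# nor backslashes get special treatment.
# (say is a hypothetical name for this illustration.)
say() {
    printf '%s\n' "$*"
}
```

Here say -n prints "-n" regardless of what PATH contains.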
[...]
> What if I want to remove a file name -P? We see that one often enough. Again,
> this is a tradeoff with using getopts and the inability of software to guess what
> you want.
Yes, that inability is specific to shell programming. In every
real language, there's no problem with
unlink("-P")
chdir("-P")
[...]
>: cd -- "$1" && rm -f -- *
>:
>: The problem is that some commands don't accept that syntax, and it
>: can't be applied to some others because they have not the shape:
>: command [options] arguments
>
> Now you are just getting silly.
Thanks, why? Because I try to imagine ways to do programming
with shells? I think I have to agree, that's a bit silly.
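For commands that do not accept "--", one workaround is to make sure a file name can never look like an option by prefixing relative names with "./". A sketch, with safe_path as a hypothetical helper name:

```sh
# safe_path NAME: print NAME in a form that cannot be mistaken for an
# option, by prefixing "./" when it starts with a dash.
# (safe_path is a hypothetical helper for this illustration.)
safe_path() {
    case $1 in
        -*) printf '%s\n' "./$1" ;;
        *)  printf '%s\n' "$1" ;;
    esac
}
```

This only covers arguments that are file names, which is part of the point being made: each command needs its own precautions.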
[...]
>: 11- behavior of a script depends on its name!
>:
>: cd /tmp
>: PATH=$PATH:
>: echo '#! /usr/local/bin/ksh93' > script.ksh
>: chmod 755 script.ksh
>: ln -s script.ksh ./--man
>: ln -s script.ksh ./-i
>:
>: "script.ksh", "-i" and "--man" are supposed to be the same script.
>: However, the first one does nothing, the second runs an interactive
>: shell, and the third one displays a short ksh93 man page!
>
> Again, is this really a problem?
Not really, except in the case of setuid scripts (check the
unix-faq). And you'll have the same problem with perl or python too;
that's more a shebang issue than a shell issue.
> Guess what, you can't name a script
> "while" or "test" or "for" etc. either.
Sure I can, why couldn't I?
That's yet another argument for always quoting every
argument.
If I write:
'while' 'arg1' 'arg2'
I kind of force the typing. I'm sure 'while' is taken as a
command argv[0]. Maybe in ksh93p, there will be a new "perhaps"
keyword, so you'll be happy that your scripts correctly written
as:
function 'perhaps' {
'typeset' 'IFS= '
'print' '-r' '--' "Maybe I should $*"
}
'perhaps' 'try another shell'
will continue to work.
[...]
>: 12- Documentation.
>:
>: ksh93 documentation is disseminated. The man page doesn't describe
>: half of the behavior. To know about a builtin, you have to run ksh93
>: and issue the (undocumented):
>
> It certainly is not undocumented. Under "BUILTIN COMMANDS" it clearly
> states to enter "cmd --man" to get a man page for the command.
Yes --man is. I missed that. "--api", "--author", "--html",
"--about", "--nroff", "--long", "--short", "--help", "--keys"...
are not.
>:
>: builtin --man
>: or
>: builtin --help
>: or
>: builtin --html
>: (yes all the documentation is in the binary ksh93 executable! This
>: certainly goes against most system documentation policy)
>
> Not against mine. I've always included documentation with everything
> I've written.
I generally include it in a man page. To learn about a piece of software,
I type "man <software>", not "<software> --man 2>&1 | ${PAGER-more}"
>:
>: That doesn't work with every builtin (echo, :),
>
> Of course not. Only with commands that already take arguments.
> Otherwise you would not be backwards compatible.
That's why I say that was a silly way to do it, and that bash's
help builtin was better.
> and for keywords
>
> keywords don't take arguments. They are part of the syntax.
Yes, same as above. In bash, you can do "help for".
>
>: (for, while...). At least, even if it's arguable to put the
>: documentation in the code, bash chose a dedicated builtin for that
>: (help) and has a full man/info/html page.
> different strokes for different folks
Yes, the poster I replied to said "ksh93 was far superior to
bash". Not on this.
>: 13- Poor interactive shell
>:
>: shell vocation is to be used at prompt. ksh93 is one of the most
>: poorly featured one: no programmed (or even programmable)
>: completion, no incremental search, poor history facilities, poor
>: extended key handling, poor prompt facilities
>
> No, different prompt facilities. You just don't know how to use them, apparently.
No, I didn't even try. I just had to compare zsh's prompt
expansion and prompt themes with ksh's.
>
> , no multiline command
>: editing,
>
> Of course there is. You can even choose the editor.
No, I meant a shell command line editor; you can't search the
shell history when you've started vi, for instance.
> no spelling correction, no rc file except via the dumb ENV
>
> What's so dumb about it?
Many reasons. First the name. Who can imagine a less specific
name than "ENV" for an environment variable? Any program could
have used this name for a variable used to set up its
environment. I would have expected at least KSH_ENV or
KSH93_ENV.
Its content is read by every shell (interactive ones and script
interpreters alike). By "rc" file, I meant a shell customization file.
If, at the prompt, I like the noclobber option and put it in the
ENV file, I'll break most scripts.
This variable is shared between two different purposes (more if you
consider that this variable is read by every POSIX compliant shell).
Why not use two separate mechanisms (look at zsh and its
zshenv and zshrc files)?
That's why you see ugly things like
export INTERACTIVE_ENV="$HOME/.kshrc"
export ENV='${INTERACTIVE_ENV[(Z$-=0)+(Z=1)-Z${-%%*i*}]}'
(which doesn't allow for an ENV for non-interactive shells).
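Another widespread workaround is to keep a single ENV file and test "$-" inside it. The fragment below is a sketch of what such a file might contain; shell_is_interactive is a hypothetical helper name, and the noclobber setting is just the example preference mentioned above.

```sh
# Hypothetical contents of the file named by $ENV.

# shell_is_interactive FLAGS: succeed if FLAGS (normally "$-")
# contains the "i" flag that marks an interactive shell.
shell_is_interactive() {
    case $1 in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

if shell_is_interactive "$-"; then
    set -o noclobber    # prompt-only preference; scripts never see it
fi
```

It works, but it still means every script pays for reading a file that mostly concerns the prompt.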
> At least ksh's policy doesn't violate
> structured programming tenets.
How can you speak of "structured" with such a fuzzy syntax...
[...]
>: for i in 1 2
>: {
>: print -r -- "$i"
>: }
>:
> cool
Isn't it? It's in every Bourne like shell.
[...]
>: 16- arithmetic expansion that depends on locale
>:
>: #! /usr/local/bin/ksh93 -
>: typeset -F pi=3.14159265359
>: echo "cos(pi/2) is $(( cos(pi/2) ))"
>:
>: when run in a french locale gives:
>:
>: ksh93: .[2]: typeset: 3.14159265359: arithmetic syntax error
>: because in french, the decimal point is a comma.
>
> Good to know.
And I didn't mention all the issues related to LC_COLLATE, LC_CTIME...
In other programming languages, you have to explicitly activate
localization. In shells, you have to disable it.
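Disabling it usually means forcing the C locale around the code that handles numbers. A sketch using the standard printf utility rather than ksh93 floating point arithmetic; fmt_fixed is a hypothetical helper name:

```sh
# fmt_fixed NUMBER: print NUMBER with two decimals, using "." as the
# decimal point whatever the caller's locale is.
# (fmt_fixed is a hypothetical helper; the subshell keeps the locale
# change from leaking into the caller.)
fmt_fixed() (
    LC_ALL=C
    export LC_ALL
    printf '%.2f\n' "$1"
)
```

A script that wants the same protection everywhere can instead set and export LC_ALL=C once at the top, at the cost of losing localized messages.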
>: 17- "export var" when var is unset doesn't put "var" in the environment.
>
> And it shouldn't. It only marks it for export. "export var=" will put it in the
> environment.
Yes, you're right even if that depends on how you interpret POSIX:
The shell shall give the export attribute to the variables
corresponding to the specified names, which shall cause them
to be in the environment of subsequently executed commands.
Most modern shells seem to agree with you.
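The behaviour is easy to check in whatever shell is at hand. A sketch, with in_env and demo_var as hypothetical names; the simple grep is adequate only because the name is a plain identifier:

```sh
# in_env NAME: succeed if NAME is present in the environment handed to
# child processes. (in_env is a hypothetical helper.)
in_env() {
    env | grep "^$1=" >/dev/null
}

unset demo_var
export demo_var          # export attribute only, no value yet
if in_env demo_var; then before=yes; else before=no; fi

demo_var=                # now it has an (empty) value
if in_env demo_var; then after=yes; else after=no; fi
```

In the shells that follow the majority interpretation, $before is "no" and $after is "yes".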
[...]
>: 19- arrays start at index 0
[...]
> Oh, c'mon - that's really picky.
But that's annoying. It's a small one among ksh's many
inconsistencies.
[...]
>: Compare with "es":
[..]
>: or zsh:
[...]
> So use them instead.
Of course I do. My point here is that no, we can't say
"KSH93 ROCKS". Use zsh instead for an interactive shell, and a
programming language for programming; ksh93 is good at
neither.
[...]
>: 24- exit status policy different from any other shell's
>:
>: $? contains 256 + signum when a command was killed instead of 128 +
>: signum in every other shell.
>
> Every other shell got it wrong?
No. I prefer the ksh93 way, but it breaks compatibility.
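A script that has to run under several shells can avoid hard-coding either offset by letting "kill -l" translate the status back into a signal name. A minimal sketch with arbitrary variable names:

```sh
# Run a child that kills itself with SIGTERM and record its status.
status=0
sh -c 'kill -TERM "$$"' || status=$?

# The status is 128 + signum in most shells (256 + signum in ksh93, as
# noted above); kill -l accepts such a status and prints the signal
# name without its SIG prefix.
if [ "$status" -gt 128 ]; then
    signame=$(kill -l "$status")
else
    signame=
fi
```

Comparing $signame with TERM then works without knowing which convention the running shell uses.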
[...]
> , kind of regular
>: expressions, math library function...), but they are brought over a
>: really broken basis. You don't build a spaceship on top of a bicycle
>: frame!
>
> But the Wright bros built an airplane that way. Look, you may not
> want to fly an airplane on a ksh script, but it's damn useful in a hundred other
> ways.
Yes, in response to "KSH93 ROCKS", I say no,
It's a really poor bicycle, use zsh instead.
It's a really broken airplane, use python instead.
-- Stéphane