Is there some restriction (or "kernel" parameter) on the depth of subshells (and functions in ksh) on AIX 5.3 64-bit, beyond which exit statuses switch between correct and incorrect from level to level?



I am seeing very alarming (not to say "shocking!") behaviour on a
customer's AIX (5.3, 64-bit; I haven't determined the exact BOS level
yet): a ksh script called from a deeper level of shell-script calls
inside our application fails to handle error levels correctly in two
different constructs. First it receives the return value of a
function wrongly, and later it gets the wrong error level after a
subshell exits, whereas the same script called directly from the
command line behaves perfectly.

I'll try to keep the description simple, but I would be VERY happy
for any ideas, and I can of course go into more detail later.
(I name the scripts here with an extension referring to their shell;
in real life they are all named *bat or have no extension.)

From the command line directly:

A.ksh calls
  B.ksh, which calls
    C.ksh, which calls
      D.ksh, which has a function
        f1(), which calls
          E.csh, which has as its last line
            exit 1
        The next line inside f1 is
            I=$?
        and normally I==1 there; the function goes on and returns
        (error level irrelevant here).
  Life goes on; B.ksh calls another script,
    M.csh, which calls the same substructure again (just one level
    deeper; in fact it is some log-writer module):
      C.ksh calls
        D.ksh, which has a function
          f1(), which calls
            E.csh, which has as its last line
              exit 1
          I=$?, which is also ==1.
  B.ksh then later calls its own function func1 with
      T=$(func1)
  where func1 echoes a text in one of its lines and does return 1 as
  its last line, so
      R=$?
  in the next line should be (and is, when calling A.ksh directly)
  1 as well.
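
To make the two constructs concrete, here is a minimal sketch of what D.ksh and B.ksh do. It is not the real code: `sh -c 'exit 1'` is only a stand-in I am assuming for E.csh, and the bodies of f1 and func1 are reduced to the lines that matter.

```shell
#!/bin/sh
# Construct 1 (as in D.ksh): a function calls an external script whose
# last line is "exit 1", then reads $? on the very next line.
# "sh -c 'exit 1'" is a hypothetical stand-in for the real E.csh.
f1() {
    sh -c 'exit 1'
    I=$?                 # expected: 1
    echo "f1: I=$I"
    return 0             # error level of f1 itself is irrelevant here
}

# Construct 2 (as in B.ksh): a function echoes some text and returns 1;
# the caller captures the text with $(...) and reads $? right after.
func1() {
    echo "some text"
    return 1
}

f1
T=$(func1)
R=$?                     # expected: 1, the return value of func1
echo "B: T=$T R=$R"
```

Called directly, this prints I=1 and R=1. On the customer's machine, inside the full call chain, the corresponding values in the real scripts come out as 0.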

When called from a surrounding application that is itself started from
the command line (consisting of some additional levels of shells [and
a Micro Focus COBOL program making a SYSTEM call of a script which
calls A.ksh]), the scripts produce the value 1 internally as well;
that is, E.csh does exit 1 in both cases and func1 does return 1.
BUT:
the first exit 1 of E.csh is received as 0 in D.ksh, whereas the
second (one level deeper [! not one level less but one level more !])
is received correctly as 1,
and the return 1 of func1 in B.ksh is misinterpreted as 0 (even though
it is higher in the call hierarchy than the E.csh exits anyway).

I debugged all relevant scripts (B.ksh, D.ksh and E.csh), including
the functions in B and D, to get those values, and I might as well
stop programming if I can't trust an exit or return any longer.

I can't reproduce the effect on my own AIX, so I have a real problem
comparing the configuration of the customer's machine with our
development machine, which is bad, I know (and worse, I am not root on
either), but I'm fishing for ideas.
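
One thing that could be run on both machines (directly and from inside the application) is a small depth probe like the sketch below. It is only an illustration: the name `probe` and the variable `MAXDEPTH` are made up for this example, and it nests plain subshells rather than separate script files, so it tests the "depth" idea only in its simplest form.

```shell
#!/bin/sh
# Hypothetical depth probe: nest subshells MAXDEPTH levels deep.
# The innermost level returns 1; every other level prints the status
# it received from the level below and passes it on unchanged.
MAXDEPTH=${MAXDEPTH:-8}          # assumed default, adjust as needed

probe() {
    depth=$1
    if [ "$depth" -ge "$MAXDEPTH" ]; then
        return 1                 # innermost level: the "exit 1"
    fi
    ( probe $((depth + 1)) )     # one more subshell level
    rc=$?
    echo "depth=$depth received=$rc"
    return $rc
}

probe 0
FINAL=$?
echo "final=$FINAL"
```

If every level reports received=1 and the last line says final=1, nesting depth alone is not the problem; a level that reports 0 would show where the status gets lost.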

If it failed in both cases (direct call of A.ksh and the more complex
way), I would think something is simply badly installed on the
customer's machine, but...
Needless to say, the same scripts worked well with both methods on a
former release (AIX 5.2.whatsoever...), and everything is
reproducible on the customer's machines (old: simple=ok, complex=ok;
new: simple=ok, complex=not ok). What's worse, the customer is already
using this new machine in production, so there is no way back and no
possibility of delaying the migration. They are working in that
environment, and some important parts of the application just fail!

Thank you for your patience; any comments are welcome!
bine
