Re: Interpreting program core dump in mdb
- From: Giorgos Keramidas <keramida@xxxxxxxxxxxxxxx>
- Date: Sat, 29 Mar 2008 17:05:23 +0200
On Thu, 27 Mar 2008 09:22:22 -0400, "Mr. Uh Clem" <uhclem@xxxxxxxxxxxxxxxxxx> wrote:
At $DAY_JOB, we've got a customer who has installed our product on a
Solaris 10 SPARC system and is getting a mysterious segmentation
violation in one of our background processes. Of course, this problem
does not occur on any of our in-house systems.
We did get the customer to send us a core file, but we aren't very handy
with the debug tools on Solaris.
# mdb prog core
Loading modules: [ libc.so.1 ld.so.1 ]
> ::stack
strncpy+0x5d0(20, 7182f4, 1b, 726f6f74, 0, 20)
secure+0x1b8(2e4088, b1978, c6068, 1f, 717298, 0)
process_request+0x41c(2e7d8, 1, c60e4, 1, 5750bc, 0)
open_socket+0x310(0, c8bf0, 5, 7efefeff, 81010100, ffbff9bc)
main+0x664(1, ffbffc1c, ffbffc24, c6000, c80fc, 3)
_start+0x108(0, 0, 0, 0, 0, 0)
I've googled up countless articles telling me that ::stack gets a
stack dump, but have yet to find one which tells me what the
values in the display **ARE**.
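As a rough guide to the display: each line of ::stack output is one stack frame, giving the function name, an offset into that function, and, in parentheses, the argument words mdb recovered for the frame, printed in hex. On SPARC these normally come from the six register-passed argument slots, so a function with fewer parameters shows leftover values, and in the topmost frame the registers may already have been modified by the function itself. Some words are pointers and some are plain data. The sketch below is one way to tell them apart: decode_word is a hypothetical helper (not part of mdb) that splits a 32-bit word into bytes in big-endian order, as on SPARC, and keeps the printable ASCII ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an mdb function: decode one 32-bit word
 * from a ::stack line into its four bytes, most significant byte
 * first (big-endian, as on SPARC). Printable ASCII bytes are copied
 * as-is; every other byte is shown as '.'.
 * out must have room for 5 bytes (4 characters plus the NUL).
 */
static void decode_word(uint32_t word, char out[5])
{
    assert(out != NULL);

    for (int i = 0; i < 4; i++) {
        unsigned char c = (unsigned char)(word >> (24 - 8 * i));
        out[i] = (c >= 0x20 && c <= 0x7e) ? (char)c : '.';
    }
    out[4] = '\0';
}
```

Applied to the fourth word of the strncpy frame, 726f6f74, this yields "root", which suggests that the word holds user-name text rather than an address. A small value such as 1b decodes to "....", which is more consistent with a count or length.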
It looks like the daemon is overrunning a buffer inside strncpy().
Tracking down this sort of memory corruption can be tricky if it happens
in a child process (forking daemon), but you can use the libumem library
and mdb to debug this.
Early on, the program calls secure(), which is linked from a different .o file:
char user_name[USER_LENGTH + 1]; /* global in .c containing secure */
secure(host)
char *host;
{
    ...
    struct passwd *pw;
    ...
    pw = getpwuid(getuid());
    if (pw != NULL)
        strncpy(user_name, pw->pw_name, sizeof(user_name)-1);
We seem to blow up on trying to move the user name from pw->pw_name,
which is very strange given that pw is supposed to point to static
space allocated by getpwuid().
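While hunting for the real cause, it may help to make the copy itself more defensive. The excerpt checks pw but not pw->pw_name, and because strncpy() does not add a terminator when the source fills the buffer, user_name is terminated only because the global array starts out zeroed and the last byte is never written. The following is a minimal sketch, not the product's actual code: copy_user_name is a hypothetical helper, and the buffer size in the usage note stands in for USER_LENGTH + 1, whose value is not given in the post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst (dstsize bytes in total),
 * truncating if necessary and always NUL-terminating the result.
 * Returns 0 on success, or -1 if src is NULL (for example, a passwd
 * entry whose name pointer is not usable).
 */
static int copy_user_name(char *dst, size_t dstsize, const char *src)
{
    /* A missing or zero-sized destination is a programming error. */
    assert(dst != NULL && dstsize > 0);

    if (src == NULL)
        return -1;

    strncpy(dst, src, dstsize - 1);
    dst[dstsize - 1] = '\0';    /* do not rely on the buffer being zeroed */
    return 0;
}
```

In secure() this would be called as copy_user_name(user_name, sizeof(user_name), pw->pw_name) after the pw != NULL check. It will not fix corruption that happens elsewhere, but a -1 return would at least turn a bad name pointer into a reportable error instead of a crash inside strncpy().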
Is it possible that you have corrupted the stack elsewhere?
You can try enabling the debugging and auditing features of libumem.so
by running your program inside an mdb session, after setting up the
environment like this:
$ UMEM_DEBUG=default ; export UMEM_DEBUG
$ UMEM_LOGGING=transaction ; export UMEM_LOGGING
$ LD_PRELOAD=libumem.so.1 ; export LD_PRELOAD
$ mdb a.out
Then when inside mdb, set up a breakpoint at _exit and run the program:
> ::sysbp _exit
> ::run
After it crashes, load libumem.so and try the memory allocation tricks
described at:
http://developers.sun.com/solaris/articles/libumem_library.html