Re: system crash tickled by 450.status-security (fwd)

From: Jorn Argelo (jorn_at_wcborstel.nl)
Date: 01/28/04

  • Next message: Chris Pressey: "Re: showing total/free memory"
    To: juhlig@parc.com
    Date: Wed, 28 Jan 2004 22:44:19 +0100
    
    

    Well, I recall FreeBSD 5.1 having problems with the RAID controller
    used in the PE 2650 (a Dell PERC 3/Di or something, wasn't it?). I
    don't know how it behaves with 4.9 though; I never tried that.

    We were running Nagios and MRTG, both monitoring tools, on that box.
    It had to collect about 5 or 6 SNMP checks plus several port checks
    from about 175 servers, so it was under quite a load. This
    frequently resulted in a complete system crash.

    Unfortunately I can't give you a real solution. The funny thing was,
    I tried upgrading it to FreeBSD 5.1-CURRENT, but that didn't work
    at all. So I reinstalled RELEASE, recompiled the kernel with the
    same configuration file I had used before, and suddenly it was all
    fine. It now has an uptime of 31 days.

    I know this message isn't going to help you much, but I thought
    it might be useful to know that you are not the only one having
    problems with the Dell PowerEdge 2650.

    Cheers,

    Jorn

    On Wed, 28 Jan 2004 13:27:29 PST, John Uhlig <juhlig@parc.com> wrote:

    >
    > We are running FreeBSD 4.9 on two Dell PowerEdge 2650s as fileservers,
    > each with 1 TB of RAID disk space. Both crash and reboot every few
    > days at approx. 3:15 AM. It appears that the systems are running the
    > /etc/periodic/daily/450.status-security script when the crash occurs.
    > Running the daily cron jobs more frequently induces the crash more often.
    >
    > We have a kernel core dump and have included some of the gdb output
    > below. I would appreciate any pointers or suggestions that can help
    > us resolve this problem.
    >
    > thanks,
    > John Uhlig
    >
    > ===================================================================
    > uname output
    > ====================================================================
    > platoon# uname -a
    > FreeBSD platoon.parc.xerox.com 4.9-RELEASE-p1 FreeBSD 4.9-RELEASE-p1 #0:
    > Wed Jan 28 08:45:33 PST 2004
    > juhlig@platoon.parc.xerox.com:/usr/obj/usr/src/sys/PARCGBNIC.debg i386
    >
    > =================================================================
    > initial gdb output
    > ==================================================================
    > SMP 4 cpus
    > IdlePTD at phsyical address 0x0051f000
    > initial pcb at physical address 0x0044e560
    > panicstr: page fault
    > panic messages:
    > ---
    > Fatal trap 12: page fault while in kernel mode
    > mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
    > fault virtual address = 0xbfc00000
    > fault code = supervisor write, page not present
    > instruction pointer = 0x8:0xc0356149
    > stack pointer = 0x10:0xffbe1e04
    > frame pointer = 0x10:0xffbe1e10
    > code segment = base 0x0, limit 0xfffff, type 0x1b
    > = DPL 0, pres 1, def32 1, gran 1
    > processor eflags = interrupt enabled, resume, IOPL = 0
    > current process = 701 (sed)
    > interrupt mask = none <- SMP: XXX
    > trap number = 12
    > panic: page fault
    > mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
    > boot() called on cpu#0
    >
    > syncing disks... 52
    > done
    > Uptime: 20m43s
    > amr0: flushing cache...done
    >
    > dumping to dev #aacd/0x40001, offset 5243136
    >
    > ===================================================================
    > List code at instruction pointer address
    > ====================================================================
    > (kgdb) list *0xc0356149
    > 0xc0356149 is in pmap_qenter (/usr/src/sys/i386/i386/pmap.c:848).
    > 843 void
    > 844 pmap_qenter(vm_offset_t va, vm_page_t *m, int count)
    > 845 {
    > 846 while (count-- > 0) {
    > 847 pt_entry_t *pte = vtopte(va);
    > 848 *pte = VM_PAGE_TO_PHYS(*m) | PG_RW | PG_V | pgeflag;
    > 849 #ifdef SMP
    > 850 cpu_invlpg((void *)va);
    > 851 #else
    > 852 invltlb_1pg(va);
    >
    > =====================================================================
    > backtrace
    > =====================================================================
    > (kgdb) backtrace
    > #0 dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
    > #1 0xc01d85c3 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:316
    > #2 0xc01d8a1c in poweroff_wait (junk=0xc03d3819, howto=-1069731121)
    >     at /usr/src/sys/kern/kern_shutdown.c:595
    > #3 0xc035a4d8 in trap_fatal (frame=0xffbe1dc4, eva=3217031168)
    >     at /usr/src/sys/i386/i386/trap.c:974
    > #4 0xc035a169 in trap_pfault (frame=0xffbe1dc4, usermode=0, eva=3217031168)
    >     at /usr/src/sys/i386/i386/trap.c:867
    > #5 0xc0359cdb in trap (frame={tf_fs = 24, tf_es = -67108848, tf_ds = 134545424,
    >     tf_edi = -67978584, tf_esi = 0, tf_ebp = -4317680, tf_isp = -4317712, tf_ebx = 3,
    >     tf_edx = -1043397044, tf_ecx = 0, tf_eax = 1122230275, tf_trapno = 12, tf_err = 2,
    >     tf_eip = -1070243511, tf_cs = 8, tf_eflags = 66054, tf_esp = 134606848,
    >     tf_ss = 134606848}) at /usr/src/sys/i386/i386/trap.c:466
    > #6 0xc0356149 in pmap_qenter (va=0, m=0xfbf2baa8, count=4)
    >     at /usr/src/sys/i386/i386/pmap.c:848
    > #7 0xc01e91fe in pipe_build_write_buffer (wpipe=0xfbf2ba80, uio=0xffbe1ed0)
    >     at /usr/src/sys/kern/sys_pipe.c:594
    > #8 0xc01e93c4 in pipe_direct_write (wpipe=0xfbf2ba80, uio=0xffbe1ed0)
    >     at /usr/src/sys/kern/sys_pipe.c:709
    > #9 0xc01e9766 in pipe_write (fp=0xcb801000, uio=0xffbe1ed0, cred=0xc875cc00, flags=0,
    >     p=0xfc001080) at /usr/src/sys/kern/sys_pipe.c:827
    > #10 0xc01e7ae9 in dofilewrite (p=0xfc001080, fp=0xcb801000, fd=1, buf=0x805b000,
    >     nbyte=16384, offset=-1, flags=0) at /usr/src/sys/sys/file.h:163
    > #11 0xc01e79a2 in write (p=0xfc001080, uap=0xffbe1f80)
    >     at /usr/src/sys/kern/sys_generic.c:329
    > #12 0xc035a809 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 134590464,
    >     tf_esi = 672071960, tf_ebp = -1077937248, tf_isp = -4317228, tf_ebx = 672072428,
    >     tf_edx = 672071960, tf_ecx = 0, tf_eax = 4, tf_trapno = 7, tf_err = 2,
    >     tf_eip = 672025636, tf_cs = 31, tf_eflags = 663, tf_esp = -1077937292, tf_ss = 47})
    >     at /usr/src/sys/i386/i386/trap.c:1175
    > #13 0xc034517b in Xint0x80_syscall ()
    > #14 0x280e2902 in ?? ()
    > #15 0x280e2871 in ?? ()
    > #16 0x280df756 in ?? ()
    > #17 0x28088fb5 in ?? ()
    > #18 0x804b81f in ?? ()
    > #19 0x804a926 in ?? ()
    > #20 0x8048f96 in ?? ()
    >
    > ================================================================================
    >
    >
    >
    >
    > _______________________________________________
    > freebsd-questions@freebsd.org mailing list
    > http://lists.freebsd.org/mailman/listinfo/freebsd-questions
    > To unsubscribe, send any mail to
    > "freebsd-questions-unsubscribe@freebsd.org"
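
A note on reading the numbers in the quoted dump: gdb prints several trap-frame fields as signed decimals, which hides how they line up with the hex addresses in the panic message. The sketch below converts them and checks three things: that tf_eip is the reported instruction pointer, that eva is the reported fault address, and that a 16384-byte pipe write corresponds to count=4 pages. It also computes where the page-table entry for va=0 would live. The page size, PTE size, and the PTmap base of 0xbfc00000 are assumptions about the FreeBSD 4.x i386 layout, not something stated in the thread, and the helper names are made up for illustration.

```python
# Cross-check the numbers in the quoted panic output and backtrace.
# Assumptions (not stated in the thread): 4 KB pages, 4-byte PTEs, and a
# recursive page-table map (PTmap) based at 0xbfc00000 on FreeBSD 4.x i386.

PAGE_SIZE = 4096          # assumed i386 page size
PTE_SIZE = 4              # assumed size of one page-table entry (non-PAE)
PTMAP_BASE = 0xBFC00000   # assumed base of the recursive PTE map


def to_u32(value: int) -> int:
    """Reinterpret a signed 32-bit value printed by gdb as unsigned."""
    return value & 0xFFFFFFFF


def pte_address(va: int) -> int:
    """Address of the PTE that maps `va` under the assumed PTmap layout."""
    return PTMAP_BASE + (va // PAGE_SIZE) * PTE_SIZE


# Values copied from the quoted output.
tf_eip = -1070243511      # frame #5, trap frame
eva = 3217031168          # frames #3 and #4
nbyte = 16384             # frame #10, dofilewrite
qenter_va = 0             # frame #6, pmap_qenter(va=0, ...)

fault_pc = to_u32(tf_eip)
fault_pte = pte_address(qenter_va)
pages = nbyte // PAGE_SIZE

print(f"tf_eip     = {fault_pc:#010x}")
print(f"eva        = {eva:#010x}")
print(f"PTE(va=0)  = {fault_pte:#010x}")
print(f"pages      = {pages}")
```

If the layout assumption holds, the fault address is exactly the page-table entry for virtual address 0, which would be consistent with pmap_qenter being handed va=0 instead of a valid kernel address for the pipe's direct-write buffer. That is only a reading of the dump, not a confirmed diagnosis.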


