Re: sysctl lock, system lockup

From: Tim Robbins (tim_at_robbins.dropbear.id.au)
Date: 05/31/04

    Date: Mon, 31 May 2004 14:22:34 +1000
    To: Don Bowman <don@sandvine.com>
    
    

    On Sun, May 30, 2004 at 11:31:14PM -0400, Don Bowman wrote:
    > From: Tim Robbins [mailto:tim@robbins.dropbear.id.au]
    > > On Sun, May 30, 2004 at 10:18:34PM -0400, Don Bowman wrote:
    > > > From: Tim Robbins [mailto:tim@robbins.dropbear.id.au]
    > > > > On Sun, May 30, 2004 at 04:35:55PM -0400, Don Bowman wrote:
    > > > > > From: Don Bowman [mailto:don@sandvine.com]
    > > > > > > On the console i ran 'top', but it wouldn't start,
    > > > > > > giving:
    > > > > > >
    > > > > > > load: 0.00 cmd: top 4282 [sysctl lock] 0.00u 0.00s 0% 180k
    > > > > > >
    > > > > > > as the status. I can't ^C it, can't ssh in.
    > > > > > > can still ping the device.
    > > > > > >
    > > > > > > It was doing a background fsck from an earlier hang.
    > > > > > >
    > > > > > > i have called panic from db, not sure if the core will
    > > > > > > work properly or not.
    > > > > >
    > > > > > As a followup... i did get a vmcore, and matching kernel.debug,
    > > > > > if someone can suggest what i might look @?
    > > > >
    > > > > print sysctllock (or just sysctllock.sx_xholder if you don't have a
    > > > > serial console set up.)
    > > >
    > > > (kgdb) print sysctllock
    > > > $1 = {sx_object = {lo_class = 0xc070dacc, lo_name = 0xc06ce43d "sysctl lock",
    > > > lo_type = 0xc06ce43d "sysctl lock", lo_flags = 3866624, lo_list = {
    > > > tqe_next = 0xc074f9e0, tqe_prev = 0xc0747ab0}, lo_witness = 0xc0751410},
    > > > sx_lock = 0xc0748e80, sx_cnt = -1, sx_shrd_cv = {
    > > > cv_description = 0xc06ce43d "sysctl lock", cv_waiters = 0},
    > > > sx_shrd_wcnt = 0, sx_excl_cv = {cv_description = 0xc06ce43d "sysctl lock",
    > > > cv_waiters = 9}, sx_excl_wcnt = 9, sx_xholder = 0xc8ee2150}
    > >
    > > Hmm. How about the value of sysctllock.sx_xholder->td_proc? Then, if possible,
    > > switch to that process (with gdb's proc command) and try to get a backtrace.
    > > (I admit to not having used this feature recently; I'm not completely sure
    > > that it still works. You may need to pass it a thread pointer instead.)
    >
    >
    > (kgdb) p sysctllock.sx_xholder->td_proc
    > $1 = (struct proc *) 0xc8eddc08
    > (kgdb) proc 0xc8eddc08
    > (kgdb) bt
    > #0 0xc0550340 in sched_switch (td=0xc8ee2150)
    > at /usr/src/sys/kern/sched_4bsd.c:666
    > #1 0xc0545dfe in mi_switch (flags=1945947512)
    > at /usr/src/sys/kern/kern_synch.c:359
    > #2 0xc055d382 in sleepq_switch (wchan=0x0)
    > at /usr/src/sys/kern/subr_sleepqueue.c:374
    > #3 0xc055d53f in sleepq_wait (wchan=0xe15dbc28)
    > at /usr/src/sys/kern/subr_sleepqueue.c:478
    > #4 0xc0545ac6 in msleep (ident=0xe15dbc28, mtx=0xc0774a00, priority=76,
    > wmesg=0xc06d4ad5 "biord", timo=0) at /usr/src/sys/kern/kern_synch.c:250
    > #5 0xc058193f in bwait (bp=0xe15dbc28, pri=76 'L', wchan=0xc06d4ad5 "biord")
    > at /usr/src/sys/kern/vfs_bio.c:3766
    > #6 0xc0580525 in bufwait (bp=0xe15dbc28)
    > at /usr/src/sys/kern/vfs_bio.c:3048
    > #7 0xc057c9be in breadn (vp=0xc937ba28, blkno=-18688012, size=16384,
    > rablkno=0x0, rabsize=0x0, cnt=0, cred=0x0, bpp=0x0)
    > at /usr/src/sys/kern/vfs_bio.c:749
    > #8 0xc057c724 in bread (vp=0xc937ba28, blkno=-18688012, size=16384, cred=0x0,
    > bpp=0xf835e9d8) at /usr/src/sys/kern/vfs_bio.c:684
    > #9 0xc061ab93 in ffs_balloc_ufs2 (vp=0xc937ba28, startoffset=0, size=16384,
    > cred=0xc53d5180, flags=131072, bpp=0xf835eadc)
    > at /usr/src/sys/ufs/ffs/ffs_balloc.c:702
    > #10 0xc0621191 in ffs_snapremove (vp=0xc937ba28)
    > at /usr/src/sys/ufs/ffs/ffs_snapshot.c:1463
    > #11 0xc0626a70 in softdep_releasefile (ip=0xc9309460)
    > at /usr/src/sys/ufs/ffs/ffs_softdep.c:3266
    > #12 0xc063303d in ufs_inactive (ap=0x0)
    > at /usr/src/sys/ufs/ufs/ufs_inode.c:88
    > #13 0xc063a21f in ufs_vnoperate (ap=0x0)
    > at /usr/src/sys/ufs/ufs/ufs_vnops.c:2819
    > #14 0xc058c60e in vput (vp=0xc937ba28) at vnode_if.h:953
    > #15 0xc0618992 in sysctl_ffs_fsck (oidp=0x0, arg1=0xf835ec90, arg2=0, req=0x0)
    > at /usr/src/sys/ufs/ffs/ffs_alloc.c:2292
    > #16 0xc0547553 in sysctl_root (oidp=0x0, arg1=0xf835ec90, arg2=0,
    > req=0xf835ec08) at /usr/src/sys/kern/kern_sysctl.c:1220
    > #17 0xc0547714 in userland_sysctl (td=0x0, name=0xf835ec84, namelen=3,
    > old=0xf835ec08, oldlenp=0x0, inkernel=0, new=0x8059f00, newlen=0,
    > retval=0xf835ec80) at /usr/src/sys/kern/kern_sysctl.c:1317
    > #18 0xc05475d5 in __sysctl (td=0xc8ee2150, uap=0xf835ed14)
    > at /usr/src/sys/kern/kern_sysctl.c:1254
    > #19 0xc06813a7 in syscall (frame=
    > {tf_fs = 47, tf_es = 47, tf_ds = -1078001617, tf_edi = 3, tf_esi = 0,
    > tf_ebp = -1077941560, tf_isp = -130683532, tf_ebx = 1746122828, tf_edx =
    > 134584952, tf_ecx = 0, tf_eax = 202, tf_trapno = 12, tf_err = 2, tf_eip =
    > 1745649783, tf_cs = 31, tf_eflags = 658, tf_esp = -1077941620, tf_ss = 47})
    > at /usr/src/sys/i386/i386/trap.c:1004
    > #20 0x680c8077 in ?? ()
    > Cannot access memory at address 0xbfbfeac8
    > (kgdb)

    I'm not sure where to go from here. A deadlock doesn't seem likely, but
    it's possible that background fsck could lock up the system for quite
    some time by using this sysctl. How long did you wait before dropping to
    ddb (approximately)?

    Tim
    _______________________________________________
    freebsd-current@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-current
    To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
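
The kgdb dump quoted in the message above can be read mechanically. The following is only a sketch, assuming the convention used by the sx(9) implementation of that era: sx_cnt is -1 when the lock is held exclusively, positive when it counts shared holders, and 0 when the lock is free. The struct sx_snapshot type and the helper names are hypothetical and exist only for this illustration; they are not kernel APIs. Applied to the values in the dump, the sketch reports an exclusive holder (thread 0xc8ee2150) with nine threads queued behind it, which would match 'top' being stuck in [sysctl lock].

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical user-space copy of the fields kgdb printed for sysctllock.
 * This is NOT the kernel's struct sx; only the members needed for the
 * interpretation are kept.
 */
struct sx_snapshot {
    int sx_cnt;                /* assumed: -1 exclusive, >0 shared holders, 0 free */
    int sx_shrd_wcnt;          /* threads waiting for a shared lock */
    int sx_excl_wcnt;          /* threads waiting for an exclusive lock */
    unsigned long sx_xholder;  /* thread pointer of the exclusive holder */
};

enum snap_state { SNAP_FREE, SNAP_SHARED, SNAP_EXCLUSIVE };

/* Classify the lock state from sx_cnt alone. */
static enum snap_state sx_classify(const struct sx_snapshot *s)
{
    if (s->sx_cnt < 0)
        return SNAP_EXCLUSIVE;
    if (s->sx_cnt > 0)
        return SNAP_SHARED;
    return SNAP_FREE;
}

/* Total number of threads sleeping on the lock, shared plus exclusive. */
static int sx_blocked_threads(const struct sx_snapshot *s)
{
    return s->sx_shrd_wcnt + s->sx_excl_wcnt;
}

/* Write a one-line, human-readable summary into buf; returns snprintf's result. */
static int sx_describe(const struct sx_snapshot *s, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    switch (sx_classify(s)) {
    case SNAP_EXCLUSIVE:
        return snprintf(buf, len, "exclusive, holder=0x%lx, %d waiting",
            s->sx_xholder, sx_blocked_threads(s));
    case SNAP_SHARED:
        return snprintf(buf, len, "shared by %d, %d waiting",
            s->sx_cnt, sx_blocked_threads(s));
    default:
        return snprintf(buf, len, "free, %d waiting", sx_blocked_threads(s));
    }
}
```

Filling a struct sx_snapshot with { -1, 0, 9, 0xc8ee2150UL } from the dump and passing it to sx_describe() gives a quick summary. Note that this only identifies who holds the lock; whether that holder is deadlocked or merely slow, as the backtrace through bread() and "biord" might suggest, still has to be judged from the backtrace itself.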

