Re: panic in deadlkres



2010/6/28 John Baldwin <jhb@xxxxxxxxxxx>:
On Friday 25 June 2010 4:52:22 pm pluknet wrote:
On 25 June 2010 13:50, Anton Yuzhaninov <citrin@xxxxxxxxx> wrote:
I got a panic on 9-CURRENT from Jun 25 2010.

Maybe this is a bug in the deadlock resolver.

panic: blockable sleep lock (sleep mutex) process lock @
/usr/src/sys/kern/kern_clock.c:203

db> show alllocks
Process 0 (kernel) thread 0xc4dcd270 (100047)
shared sx allproc (allproc) r = 0 (0xc0885ebc) locked @
/usr/src/sys/kern/kern_clock.c:193

db> show lock 0xc4dcd270
 class: spin mutex
 name: D
 flags: {SPIN, RECURSE}
 state: {OWNED}

(kgdb) bt
#0  doadump () at pcpu.h:248
#1  0xc05ae59f in boot (howto=260) at
/usr/src/sys/kern/kern_shutdown.c:416
#2  0xc05ae825 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:590
#3  0xc048ff45 in db_panic (addr=Could not find the frame base for
"db_panic".
) at /usr/src/sys/ddb/db_command.c:478
#4  0xc0490533 in db_command (last_cmdp=0xc086ef1c, cmd_table=0x0,
dopager=1) at /usr/src/sys/ddb/db_command.c:445
#5  0xc0490662 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
#6  0xc04923ef in db_trap (type=3, code=0) at
/usr/src/sys/ddb/db_main.c:229
#7  0xc05dade6 in kdb_trap (type=3, code=0, tf=0xc4b31bd0) at
/usr/src/sys/kern/subr_kdb.c:535
#8  0xc078696b in trap (frame=0xc4b31bd0) at
/usr/src/sys/i386/i386/trap.c:692
#9  0xc076ca0b in calltrap () at /usr/src/sys/i386/i386/exception.s:165
#10 0xc05daf30 in kdb_enter (why=0xc07ea02d "panic", msg=0xc07ea02d
"panic") at cpufunc.h:71
#11 0xc05ae806 in panic (fmt=0xc07efd94 "blockable sleep lock (%s) %s @
%s:%d") at /usr/src/sys/kern/kern_shutdown.c:573
#12 0xc05ee30b in witness_checkorder (lock=0xc5148088, flags=9,
file=0xc07e3b20 "/usr/src/sys/kern/kern_clock.c", line=203, interlock=0x0)
   at /usr/src/sys/kern/subr_witness.c:1067
#13 0xc05a093c in _mtx_lock_flags (m=0xc5148088, opts=0, file=0xc07e3b20
"/usr/src/sys/kern/kern_clock.c", line=203)
   at /usr/src/sys/kern/kern_mutex.c:200
#14 0xc05706a9 in deadlkres () at /usr/src/sys/kern/kern_clock.c:203
#15 0xc0588721 in fork_exit (callout=0xc05705ea <deadlkres>, arg=0x0,
frame=0xc4b31d38) at /usr/src/sys/kern/kern_fork.c:843
#16 0xc076ca80 in fork_trampoline () at
/usr/src/sys/i386/i386/exception.s:270

Hi!

[throw in ideas (just ignore them if they're dumb, thinking badly atm).]

AFAIK, that indicates that some thread already holds
a spin mutex and then tries to acquire a sleep mutex.

It looks like kern/kern_clock.c v1.213 (SVN rev 206482)
has a regression in its handling of ticks wrap-up:
it doesn't release the thread lock before continuing, does it?

This looks like a correct analysis to me.

From subr_witness.c:
1062:                 * Since spin locks include a critical section, this check
1063:                 * implicitly enforces a lock order of all sleep locks before
1064:                 * all spin locks.
1065:                 */
1066:                if (td->td_critnest != 0 && !kdb_active)
1067:                        panic("blockable sleep lock (%s) %s @ %s:%d",
1068:                            class->lc_name, lock->lo_name, file, line);
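
To make the rule behind that check concrete, here is a minimal userspace sketch, assuming a single thread and nothing else from the kernel. The names model_td, model_spin_lock, model_spin_unlock and model_sleep_lock_allowed are hypothetical and exist only for illustration; only the condition itself is taken from the quoted lines.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the td_critnest field of a kernel thread. */
struct model_td {
	int critnest;	/* nesting depth of critical sections */
};

/* Taking a spin lock enters a critical section. */
void
model_spin_lock(struct model_td *td)
{
	td->critnest++;
}

/* Releasing a spin lock leaves the critical section again. */
void
model_spin_unlock(struct model_td *td)
{
	assert(td->critnest > 0);
	td->critnest--;
}

/*
 * Same condition as the quoted witness check: a sleep lock may only be
 * acquired while no critical section is active (unless the debugger is
 * active).  Returns false where the kernel would panic.
 */
bool
model_sleep_lock_allowed(const struct model_td *td, bool kdb_active)
{
	return (!(td->critnest != 0 && !kdb_active));
}
```

In this model, a thread that still holds a spin lock has a nonzero critnest, so any later attempt to take a sleep mutex is rejected, which matches the "blockable sleep lock" panic message.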

From kern_clock.c, v1.213 (in several places, while holding a thread lock):
+                                     /* Handle ticks wrap-up. */
+                                     if (ticks < td->td_blktick)
+                                             continue;

Shouldn't it be like this instead:
+                                     /* Handle ticks wrap-up. */
+                                     if (ticks < td->td_blktick) {
+                                             thread_unlock(td);
+                                             continue;
+                                     }

The precondition to reproduce it would be to lock a thread
in some deadlkres callout, hit the ticks wrap-up condition, and then try
to lock the process to which the thread belongs in the (n+m)'th deadlkres
callout, or in a different context.
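
The suspected leak can be modelled in userspace as follows. This is only a sketch under stated assumptions: scan_thread and model_scan are hypothetical names, the thread lock is reduced to a boolean flag, and the real deadlock check is elided. It shows only that an early "continue" on the wrap-up path leaves the lock held unless it is released first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-thread state the scan looks at. */
struct scan_thread {
	int	blktick;	/* tick at which the thread blocked */
	bool	locked;		/* true while the modelled thread lock is held */
};

/*
 * One scan pass over n threads.  With unlock_on_wrap == false the loop
 * takes the wrap-up path the way the quoted v1.213 lines do (continue
 * with the lock held); with true it unlocks first, as proposed above.
 * Returns the number of thread locks still held after the pass.
 */
size_t
model_scan(struct scan_thread *tds, size_t n, int ticks, bool unlock_on_wrap)
{
	size_t held = 0;

	for (size_t i = 0; i < n; i++) {
		struct scan_thread *td = &tds[i];

		td->locked = true;		/* models thread_lock(td) */
		if (ticks < td->blktick) {	/* ticks wrap-up */
			if (unlock_on_wrap)
				td->locked = false;	/* models thread_unlock(td) */
			continue;
		}
		/* The real deadlock check would go here. */
		td->locked = false;		/* models thread_unlock(td) */
	}
	for (size_t i = 0; i < n; i++)
		if (tds[i].locked)
			held++;
	return (held);
}
```

With one thread whose blktick is larger than ticks, the pass without the unlock finishes with one lock still held, and the pass with the unlock finishes with none.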

Thanks, that may be fixed in r209577.

Attilio


--
Peace can only be achieved by understanding - A. Einstein