Re: Recent changes to pseudofs causing panics -- leaking a vnode lock?

On Fri, Jan 9, 2009 at 2:49 PM, Joe Marcus Clarke <marcus@xxxxxxxxxxx> wrote:
On Thu, 2009-01-08 at 18:48 -0600, Richard Todd wrote:
Ever since updating to a kernel built after the recent changes to the
pseudofs code late last month, I've occasionally gotten the following
sort of panic:

System call readlink returning with the following locks held:
exclusive lockmgr pseudofs (pseudofs) r = 0 (0xffffff00ba581cc8) locked @ /usr/src/sys/fs/pseudofs/pseudofs_vncache.c:193
panic: witness_warn

The line in question is the one I marked by an arrow in this chunk of the
pfs_vncache_alloc code:
	if ((pn->pn_flags & PFS_PROCDEP) != 0)
		(*vpp)->v_vflag |= VV_PROCDEP;
	pvd->pvd_vnode = *vpp;
	vn_lock(*vpp, LK_EXCLUSIVE | LK_RETRY);   <==== this lock here
	error = insmntque(*vpp, mp);

So somehow a vnode is getting locked here and never unlocked.
I suspect the code in the retry2: loop later, simply because that's
the code that was added in the late December commits, but I'm not
clear on exactly how. I've tried littering the code with extra
printfs, but alas, I'm still not sure what's going on. I do have a
good coredump that I can get info out of, if someone can suggest what
would be useful to dump. Anyway, here's the patch for the debugging
printfs I added, and the console messages those printfs produced
during the most recent panic/coredump. The console messages do seem to
indicate some sort of race condition, as they show two or more
processes hitting the pseudofs code and my debugging print statements
simultaneously (alas, making the console log rather a confused mess).
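
To make the suspected failure mode concrete, here is a minimal userspace
sketch of a lock leaked on an early-return path. It is not the pseudofs
code: struct fake_vnode, fake_lock(), fake_unlock(), locks_still_held()
and both fake_syscall_*() functions are hypothetical names invented for
illustration, and the counter only loosely mimics the held-locks check
that WITNESS performs when a system call returns.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vnode; NOT the kernel's struct vnode. */
struct fake_vnode {
	int lock_depth;		/* > 0 while "locked" */
};

/* Crude count of locks held, loosely like what WITNESS tracks. */
static int locks_held;

static void
fake_lock(struct fake_vnode *vp)
{
	vp->lock_depth++;
	locks_held++;
}

static void
fake_unlock(struct fake_vnode *vp)
{
	assert(vp->lock_depth > 0);	/* unlocking an unlocked vnode is a bug */
	vp->lock_depth--;
	locks_held--;
}

/* What a witness-style check would look at on return to userland. */
static int
locks_still_held(void)
{
	return (locks_held);
}

/* Buggy variant: the early-return path forgets to drop the lock. */
static int
fake_syscall_leaky(struct fake_vnode *vp, bool bail_out)
{
	fake_lock(vp);
	if (bail_out)
		return (-1);	/* BUG: returns with vp still locked */
	fake_unlock(vp);
	return (0);
}

/* Fixed variant: every exit path goes through the unlock. */
static int
fake_syscall_fixed(struct fake_vnode *vp, bool bail_out)
{
	int error = 0;

	fake_lock(vp);
	if (bail_out)
		error = -1;
	fake_unlock(vp);
	return (error);
}
```

In this sketch the leaky variant only misbehaves when the early-return
path is taken, which would fit a panic that shows up only occasionally.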

I believe I have fixed this in HEAD. Kib gave his review and approval,
and the fix really should prevent this hang. Please report back if you
still see the problem.


When did you do this commit / what's the SVN revision #?
