Re: panic: No NCF_TS



On 1/22/12 7:05 PM, Kostik Belousov wrote:
On Mon, Jan 23, 2012 at 05:36:42AM +0400, Yuri Pankov wrote:
Seems to be reproducible here running r230467 as the NFS client and
r230135 as NFS server. NFSv4 not enabled.

# mount
[...]
sirius:/data/distfiles on /usr/ports/distfiles (nfs)

# /usr/bin/env /usr/bin/fetch -AFpr -S 4682084 -o /usr/ports/distfiles/sqlite-src-3071000.zip http://www.sqlite.org/sqlite-src-3071000.zip
/usr/ports/distfiles/sqlite-src-3071000.zip 100% of 4572 kB 379 kBps 00m00s
# rm /usr/ports/distfiles/sqlite-src-3071000.zip

immediately followed by:

panic: No NCF_TS
cpuid = 2
KDB: enter: panic
[ thread pid 1603 tid 100494 ]
Stopped at kdb_enter+0x3e: movq $0,kdb_why
db> bt
Tracing pid 1603 tid 100494 td 0xfffffe0089585460
kdb_enter() at kdb_enter+0x3e
panic() at panic+0x245
cache_lookup_times() at cache_lookup_times+0x6b5
nfs_lookup() at nfs_lookup+0x190
VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x8b
lookup() at lookup+0x7e9
namei() at namei+0x74c
kern_statat_vnhook() at kern_statat_vnhook+0x90
sys_lstat() at sys_lstat+0x30
amd64_syscall() at amd64_syscall+0x221
Xfast_syscall() at Xfast_syscall+0xfb
--- syscall (190, FreeBSD ELF64, sys_lstat), rip = 0x80093ff3c, rsp = 0x7fffffffd8d8, rbp = 0x7fffffffd979 ---


Yes, my bad. I wrote the r230441 with the assumption that filesystems
are consistent in their use of cache_enter_time(). And my net-booting
test box did not catched this, which is at least interesting.

Please try the change below. Actually, it should be enough to only apply
the changes for fs/nfsclient (or nfsclient/ if you use old nfs). If this
does not help, then please try the whole patch.

I think we should have the existing assertion. If cache_lookup_times()
is called with a timestamp requested, then returning a name cache entry
without a timestamp is just going to result in that name cache entry not
being used. Panic'ing in that case is correct.

With regard to the NFS changes below, all of these are bugs that didn't
really work right before. Specifically, adding a positive entry without
setting n_ctime previously would have just resulted in the name cache
entry being purged on the next lookup anyway. Keep in mind that the
timestamp for a give name cache entry in NFS needs to match an actual
timestamp returned by the NFS server as post-op attributes in some RPC.
Using the timestamp from vfs_timestamp() is completely bogus. Instead,
the timestamp for a negative entry needs to be the mtime of the parent
directory. If we don't have that timestamp handy, then we should just
not add a namecache entry at all. Also, the vap->va_ctime used below
for mknod() and create() is not a timestamp from the server, so it is
also suspect. I can look at this in more detail on Wednesday, but my
best guess is that nearly all (if not all) of these cache_enter() calls
will simply need to be removed.

Note that other filesystems like UFS don't bother creating name cache
entries for things like create or mknod. It's debatable if the NFS
client should even be creating any name cache entries outside of lookup
and when handling a READDIR+ reply.

Even if not needed for correctness, I think vfs_cache.c change could
be made useful by adding KASSERT to it.

commit 9771eb4739170d014fcebfbad07bfed4076c6d85
Author: Konstantin Belousov <kostik@xxxxxxxxxx>
Date: Mon Jan 23 04:36:08 2012 +0200

Make NCF_TS mismatch not fatal.
Consistently use cache_enter_time in nfs clients.

diff --git a/sys/fs/nfsclient/nfs_clvnops.c b/sys/fs/nfsclient/nfs_clvnops.c
index 2747191..2841647 100644
--- a/sys/fs/nfsclient/nfs_clvnops.c
+++ b/sys/fs/nfsclient/nfs_clvnops.c
@@ -1432,7 +1432,7 @@ nfs_mknodrpc(struct vnode *dvp, struct vnode **vpp, struct componentname *cnp,
}
if (!error) {
if ((cnp->cn_flags & MAKEENTRY))
- cache_enter(dvp, newvp, cnp);
+ cache_enter_time(dvp, newvp, cnp, &vattr.va_ctime);
*vpp = newvp;
} else if (NFS_ISV4(dvp)) {
error = nfscl_maperr(cnp->cn_thread, error, vap->va_uid,
@@ -1591,7 +1591,7 @@ again:
}
if (!error) {
if (cnp->cn_flags & MAKEENTRY)
- cache_enter(dvp, newvp, cnp);
+ cache_enter_time(dvp, newvp, cnp, &vap->va_ctime);
*ap->a_vpp = newvp;
} else if (NFS_ISV4(dvp)) {
error = nfscl_maperr(cnp->cn_thread, error, vap->va_uid,
@@ -1923,6 +1923,7 @@ nfs_link(struct vop_link_args *ap)
struct componentname *cnp = ap->a_cnp;
struct nfsnode *np, *tdnp;
struct nfsvattr nfsva, dnfsva;
+ struct timespec ts;
int error = 0, attrflag, dattrflag;

if (vp->v_mount != tdvp->v_mount) {
@@ -1966,8 +1967,10 @@ nfs_link(struct vop_link_args *ap)
* must care about lookup caching hit rate, so...
*/
if (VFSTONFS(vp->v_mount)->nm_negnametimeo != 0 &&
- (cnp->cn_flags & MAKEENTRY))
- cache_enter(tdvp, vp, cnp);
+ (cnp->cn_flags & MAKEENTRY)) {
+ vfs_timestamp(&ts);
+ cache_enter_time(tdvp, vp, cnp, &ts);
+ }
if (error && NFS_ISV4(vp))
error = nfscl_maperr(cnp->cn_thread, error, (uid_t)0,
(gid_t)0);
@@ -1987,6 +1990,7 @@ nfs_symlink(struct vop_symlink_args *ap)
struct nfsfh *nfhp;
struct nfsnode *np = NULL, *dnp;
struct vnode *newvp = NULL;
+ struct timespec ts;
int error = 0, attrflag, dattrflag, ret;

vap->va_type = VLNK;
@@ -2030,8 +2034,10 @@ nfs_symlink(struct vop_symlink_args *ap)
* must care about lookup caching hit rate, so...
*/
if (VFSTONFS(dvp->v_mount)->nm_negnametimeo != 0 &&
- (cnp->cn_flags & MAKEENTRY))
- cache_enter(dvp, newvp, cnp);
+ (cnp->cn_flags & MAKEENTRY)) {
+ vfs_timestamp(&ts);
+ cache_enter_time(dvp, newvp, cnp, &ts);
+ }
*ap->a_vpp = newvp;
}

@@ -2063,6 +2069,7 @@ nfs_mkdir(struct vop_mkdir_args *ap)
struct vattr vattr;
struct nfsfh *nfhp;
struct nfsvattr nfsva, dnfsva;
+ struct timespec ts;
int error = 0, attrflag, dattrflag, ret;

if ((error = VOP_GETATTR(dvp, &vattr, cnp->cn_cred)) != 0)
@@ -2116,8 +2123,10 @@ nfs_mkdir(struct vop_mkdir_args *ap)
* must care about lookup caching hit rate, so...
*/
if (VFSTONFS(dvp->v_mount)->nm_negnametimeo != 0 &&
- (cnp->cn_flags & MAKEENTRY))
- cache_enter(dvp, newvp, cnp);
+ (cnp->cn_flags & MAKEENTRY)) {
+ vfs_timestamp(&ts);
+ cache_enter_time(dvp, newvp, cnp, &ts);
+ }
*ap->a_vpp = newvp;
}
return (error);
diff --git a/sys/kern/vfs_cache.c b/sys/kern/vfs_cache.c
index c11f25f..647dcac 100644
--- a/sys/kern/vfs_cache.c
+++ b/sys/kern/vfs_cache.c
@@ -233,6 +233,24 @@ nc_get_name(struct namecache *ncp)
return (ncp_ts->nc_name);
}

+static void
+cache_out_ts(struct namecache *ncp, struct timespec *tsp, int *ticksp)
+{
+
+ if ((ncp->nc_flag & NCF_TS) == 0) {
+ if (tsp != NULL)
+ bzero(tsp, sizeof(*tsp));
+ if (ticksp != NULL)
+ *ticksp = 0;
+ return;
+ }
+
+ if (tsp != NULL)
+ *tsp = ((struct namecache_ts *)ncp)->nc_time;
+ if (ticksp != NULL)
+ *ticksp = ((struct namecache_ts *)ncp)->nc_ticks;
+}
+
static int doingcache = 1; /* 1 => enable the cache */
SYSCTL_INT(_debug, OID_AUTO, vfscache, CTLFLAG_RW, &doingcache, 0,
"VFS namecache enabled");
@@ -506,17 +524,7 @@ retry_wlocked:
dvp, cnp->cn_nameptr, *vpp);
SDT_PROBE(vfs, namecache, lookup, hit, dvp, "..",
*vpp, 0, 0);
- if (tsp != NULL) {
- KASSERT((ncp->nc_flag & NCF_TS) != 0,
- ("No NCF_TS"));
- *tsp = ((struct namecache_ts *)ncp)->nc_time;
- }
- if (ticksp != NULL) {
- KASSERT((ncp->nc_flag & NCF_TS) != 0,
- ("No NCF_TS"));
- *ticksp = ((struct namecache_ts *)ncp)->
- nc_ticks;
- }
+ cache_out_ts(ncp, tsp, ticksp);
goto success;
}
}
@@ -563,14 +571,7 @@ retry_wlocked:
dvp, cnp->cn_nameptr, *vpp, ncp);
SDT_PROBE(vfs, namecache, lookup, hit, dvp, nc_get_name(ncp),
*vpp, 0, 0);
- if (tsp != NULL) {
- KASSERT((ncp->nc_flag & NCF_TS) != 0, ("No NCF_TS"));
- *tsp = ((struct namecache_ts *)ncp)->nc_time;
- }
- if (ticksp != NULL) {
- KASSERT((ncp->nc_flag & NCF_TS) != 0, ("No NCF_TS"));
- *ticksp = ((struct namecache_ts *)ncp)->nc_ticks;
- }
+ cache_out_ts(ncp, tsp, ticksp);
goto success;
}

@@ -602,14 +603,7 @@ negative_success:
cnp->cn_flags |= ISWHITEOUT;
SDT_PROBE(vfs, namecache, lookup, hit_negative, dvp, nc_get_name(ncp),
0, 0, 0);
- if (tsp != NULL) {
- KASSERT((ncp->nc_flag & NCF_TS) != 0, ("No NCF_TS"));
- *tsp = ((struct namecache_ts *)ncp)->nc_time;
- }
- if (ticksp != NULL) {
- KASSERT((ncp->nc_flag & NCF_TS) != 0, ("No NCF_TS"));
- *ticksp = ((struct namecache_ts *)ncp)->nc_ticks;
- }
+ cache_out_ts(ncp, tsp, ticksp);
CACHE_WUNLOCK();
return (ENOENT);

@@ -797,8 +791,8 @@ cache_enter_time(dvp, vp, cnp, tsp)
n2->nc_nlen == cnp->cn_namelen &&
!bcmp(nc_get_name(n2), cnp->cn_nameptr, n2->nc_nlen)) {
if (tsp != NULL) {
- KASSERT((n2->nc_flag & NCF_TS) != 0,
- ("no NCF_TS"));
+ if ((n2->nc_flag & NCF_TS) == 0)
+ continue;
n3 = (struct namecache_ts *)n2;
n3->nc_time =
((struct namecache_ts *)ncp)->nc_time;
diff --git a/sys/nfsclient/nfs_vnops.c b/sys/nfsclient/nfs_vnops.c
index a39b29b..2fc4d2c 100644
--- a/sys/nfsclient/nfs_vnops.c
+++ b/sys/nfsclient/nfs_vnops.c
@@ -1531,7 +1531,7 @@ nfsmout:
vput(newvp);
} else {
if (cnp->cn_flags & MAKEENTRY)
- cache_enter(dvp, newvp, cnp);
+ cache_enter_time(dvp, newvp, cnp, &vap->va_ctime);
*vpp = newvp;
}
mtx_lock(&(VTONFS(dvp))->n_mtx);
@@ -1671,7 +1671,7 @@ nfsmout:
}
if (!error) {
if (cnp->cn_flags & MAKEENTRY)
- cache_enter(dvp, newvp, cnp);
+ cache_enter_time(dvp, newvp, cnp, &vap->va_ctime);
*ap->a_vpp = newvp;
}
mtx_lock(&(VTONFS(dvp))->n_mtx);


--
John Baldwin
_______________________________________________
freebsd-current@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@xxxxxxxxxxx"



Relevant Pages

  • Re: panic: No NCF_TS
    ... the changes for fs/nfsclient. ... struct nfsvattr nfsva, dnfsva; ... * must care about lookup caching hit rate, ... "VFS namecache enabled"); ...
    (freebsd-current)
  • [GIT] NFS client updates for 2.6.22...
    ... NFS: Clean-up: use correct type when converting NFS blocks to local blocks ... NFS: Clean-up: Refactor IP address sanity checks in NFS client ... SUNRPC: Allow rpcbind requests to be interrupted by a signal. ... Replace vfsmount and dentry in nfs_open_context with struct path ...
    (Linux-Kernel)
  • Re: [PATCH 24/27] NFS: Use local caching [try #2]
    ... * - Release the private state associated with a page if undergoing complete ... struct address_space *mapping; ... * Definition of the auxiliary data attached to NFS inode storage objects. ... * - The netfs data from the cookie being used as the source is ...
    (Linux-Kernel)
  • Improving NFS write performance by a factor of 2.
    ... A problem with the current NFS server is that it does not cluster ... This resulted in NFS write gathering effectively ... static int nfs_commit_blks; ... struct nfsrvstats nfsrvstats; ...
    (freebsd-current)
  • Re: mount -p and NFS options
    ... doesn't have to be in fstab format; I just need to be able to see ... struct statfs), and it mostly works. ... The problem goes away if I change rsize/wsize from 32K down to 16K or the default 8K, at least with TCP mounts. ... I don't know why it was libc consistently getting misread and why other files seem to always work, or why mounting other NFS filesystems with a 32K blocksize work fine... ...
    (freebsd-stable)