Re: '/etc/nfs stop' stopping more than NFS
- From: Bela Lubkin <filbo@xxxxxxxxxx>
- Date: 29 Dec 2005 01:24:44 -0500
Roger Cornelius wrote:
> > 5. Mount a filesystem from a different Linux client; test.
>
> No other linux client is available.
>
> > 6. Mount a filesystem from some other sort of system (non-OSR5); test.
>
> I'll have a Mac OS X system next week, but no non-OSR5 systems until
> then.
>
> > 7. Mount a filesystem from another OSR5 system; test.
>
> Don't have one of these either.

These are unfortunate gaps in the testing, but I guess we have to work
with what you have. Besides, it's at best unlikely that they would have
shown any differences...

> > 8. Start two background processes, one of which definitely does not, and
> > one definitely does, have a file open on that filesystem:
> >
> > sleep 1000 < /etc/termcap &
> > sleep 1000 < /u/nfs/importfs/cmilinux/SOME-FILE-THAT-EXISTS &
> >
> > Now run the `fuser` command, saving its output in a file. Search that
> > file for the PIDs of those two processes. `fuser` will normally report
> > a PID twice if it has the target file open twice. Do you see that --
> > i.e. do you see one mention of the /etc/termcap process and two of the
> > other? Also use `lsof` to probe this. Ummm, I'm not sure if `lsof
> > cmilinux:/u/nfs/exportfs/cmilar-001` or `lsof /u/nfs/importfs/cmilinux`
> > is more likely to produce useful results -- check that here, where you
> > have a definite user of the filesystem.
>
> 'fuser cmilinux:/u/nfs/exportfs/cmilar-001' reported the pid of the
> process with the file open on the nfs filesystem once, but did not
> include the process with /etc/termcap open. 'lsof
> /u/nfs/importfs/cmilinux' correctly lists the process with the open
> file on the nfs filesystem. lsof does not accept the host:file syntax.

So it _is_ able to properly distinguish on some level.

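If you want to script that check, something like the following would do. It is only a sketch: `count_pid` is a name I made up, and it assumes the saved `fuser` output is whitespace-separated PIDs, each possibly followed by letter flags, with the `name:` prefix mixed in.

```shell
# count_pid FILE PID: print how many times PID appears as a token in FILE.
# Assumption: FILE holds fuser-style output, i.e. whitespace-separated
# PIDs, each optionally followed by letter flags (e.g. "1234c"). The
# "name:" token does not match any PID and so is never counted.
count_pid() {
    tr -s ' \t' '\n\n' < "$1" | sed 's/[a-zA-Z]*$//' | grep -c -x "$2"
}

# Intended use (not run here, paths are the ones from this thread):
#   sleep 1000 < /etc/termcap & pid1=$!
#   sleep 1000 < /u/nfs/importfs/cmilinux/SOME-FILE-THAT-EXISTS & pid2=$!
#   fuser cmilinux:/u/nfs/exportfs/cmilar-001 > /tmp/fuser.out 2>&1
#   count_pid /tmp/fuser.out "$pid1"
#   count_pid /tmp/fuser.out "$pid2"
```

The exact-match `grep -x` matters: without it, looking for PID 32 would also count 322.
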
BTW, you now have the bare bones of a workaround: fix `numountall` to
use `lsof` on the NFS mount point, instead of `fuser` on the host:dir.
You would still have the mysterious bad behavior in `fuser`, but I think
`numountall` is the only thing that cares if it works with NFS. (Might
be the only script in the system that uses `fuser` at all?)

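The core of that workaround might look like the sketch below. I have not looked at how `numountall` is actually structured, so this is not a patch for it. `pids_on_mount` and the `LSOF` override variable are made-up names, and it assumes your `lsof` build supports `-t` (terse, PIDs-only output).

```shell
# pids_on_mount MOUNTPOINT: print the unique PIDs that have files open
# under MOUNTPOINT, one per line, in numeric order.
# LSOF is a hypothetical override hook so the function can be tried
# against a stand-in command; it defaults to plain `lsof`.
# Assumption: the installed lsof accepts -t (PIDs only, no header).
pids_on_mount() {
    "${LSOF:-lsof}" -t "$1" 2>/dev/null | sort -un
}

# Intended use (not run here):
#   for pid in $(pids_on_mount /u/nfs/importfs/cmilinux); do
#       echo "would signal $pid"
#   done
```

Note that it takes the local mount point, not host:dir, since `lsof` does not accept the host:file syntax.
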
> There is an interesting correlation between the number of times fuser
> reports a pid and the number of streams/network files (lsof's
> terminology) reported for the pid by lsof -p. I can't include a sample
> of the results without google word-wrapping it, so the full list of
> pids and lsof output can be found here:
>
> http://tenzing.org/cusm/fuser-lsof.out
>
> The first line of each group (self-explanatorily) lists the pid and
> number of times fuser reported it. Subsequent lines are the output of
> lsof -p for that pid. One exception to the correlation seems to be
> pid 322, which fuser reported 4 times but for which lsof reports 6
> open streams files.
>
> The script that produced the list is here:
>
> http://tenzing.org/cusm/fuser-lsof.sh
>
> So can anything be made of the correlation? It looks like fuser is
> erroneously attributing any open streams/network files it finds.

Hmmm. The correlation is very strong (`inetd` 12 files, `snmpd` 9,
etc.). I would readily believe that the _first_ step of fuser's
procedure to determine whether a file is on NFS is to ask whether it is
STREAMS-based. There must be many steps past that, which are apparently
somehow getting short-circuited.

A couple of other things to try:

1. Relink the kernel, allowing it to "rebuild the kernel environment".
Boot from that -- any change? I'm thinking that somewhere along the way
it must be asking "is this device major/minor I dug out of the inode
table one that I believe relates to NFS?". If some device nodes are out
of sync with the kernel build, who knows what could happen.

2. What is the major number of /dev/nfsd? There have historically been
a few bugs associated with major numbers above 127. I thought they had
all been flushed out long ago, but you never know.

2a. If it _is_ 128 or higher, you can surgically change it. This would
be a fiddly procedure; let's not discuss it until we know whether it
applies.

3. You're on OSR507 + MP4, which is newer than I've used. I don't know
if MP4 replaced the `fuser` binary. You can find out, generically, by
doing (as root):

    # cd /opt/K/SCO/Unix
    # find . -name fuser -print

If this outputs more than one line, the pathnames will tell you which
patch(es) replaced it. Dredge up the various old binaries (just copy
them elsewhere with different names -- "/tmp/fuser.mp3" or whatever).
The patch installs will have made the old binaries not executable, perms
000. chmod them to 700 and test them. Any differences?

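For item 2, the major number can be pulled out of `ls -l` output mechanically. A sketch, assuming the usual "major, minor" pair where the size column would normally be; `major_of` is a made-up name, and the sample line in the comment is illustrative, not taken from a real system.

```shell
# major_of: read `ls -l` output for a device node on stdin and print the
# major number. Prints nothing for lines that are not block or character
# devices.
# Assumption: the listing shows "major, minor" in place of the file size,
# e.g. (illustrative only):
#   crw-rw-rw-   1 root     sys      128,  0 Dec 29  2005 /dev/nfsd
major_of() {
    sed -n 's/^[bc][^,]* \([0-9][0-9]*\),.*/\1/p'
}

# Intended use (not run here):
#   maj=$(ls -l /dev/nfsd | major_of)
#   [ "$maj" -gt 127 ] && echo "major $maj is above 127"
```
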
>Bela<