[HPADM] SUMMARY -- nfs hang due to statd or lockd?

From: Jeff Cleverley (jeff_cleverley_at_agilent.com)
Date: 08/26/05

    Date: Fri, 26 Aug 2005 10:52:25 -0600
    To: hpux-admin@DutchWorks.nl
    
    

    Greetings,

    I'd like to thank the following for their replies and suggestions:

    Marc Ahrendt
    Prashant Zanwar
    Jeff Lightner
    Kevin O'Donovan
    James J. Perry

    Unfortunately, none of the suggestions fixed the issue. The most
    promising came from Prashant, who suggested using clear_locks (man 1M)
    and then bouncing the lockd daemon. I tried this and bounced statd as
    well. While the processes were stopped, I removed the client file in
    /var/statmon/sm and then restarted the daemons. No luck.
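    For reference, the sequence described above can be sketched as a dry
    run. The Python snippet below only builds the list of commands and
    runs nothing. The function name lock_reset_plan, the /usr/sbin daemon
    paths, the statd-before-lockd start order, and the "kill" placeholder
    are assumptions for illustration, not details confirmed in this thread.

```python
import posixpath
import shlex


def lock_reset_plan(client, sm_dir="/var/statmon/sm"):
    """Build (but do not run) the commands for the lock reset described above.

    `client` is the client's hostname as it appears under sm_dir.
    Daemon paths and start order are assumptions; check them on your server.
    """
    host = shlex.quote(client)
    sm_entry = shlex.quote(posixpath.join(sm_dir, client))
    return [
        f"clear_locks {host}",                         # drop locks held for the client
        "kill <pid of rpc.lockd> <pid of rpc.statd>",  # placeholder: stop both daemons
        f"rm {sm_entry}",                              # remove the client's statmon entry
        "/usr/sbin/rpc.statd",                         # assumed path; start statd first
        "/usr/sbin/rpc.lockd",                         # assumed path; then start lockd
    ]


if __name__ == "__main__":
    # "node1" is a hypothetical client name.
    for step in lock_reset_plan("node1"):
        print(step)
```

    Printing the plan first makes it easy to review the steps on a test
    box before typing anything on the production server.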

    It looks like the server will need a reboot at some point. For the
    short term, the affected clients are compute servers, so we're going
    to rename them. This should work unless the locks somehow pick up
    other system information such as a software ID or MAC address.

    Thanks for all the help. The original post is listed below.

    Jeff

    Greetings,

    We have an NFS server that has at least 50 systems mounting file systems
    exported from it at any given time. We've found a few machines in the
    lab that cannot do an "ll" of a couple of mount points. The same mount
    points list properly from other machines as user root, so we know it's
    not a permissions problem.

    We have rebooted the client and even moved /etc/mnttab out of the way
    before a reboot, just to make sure the old file was gone and carried
    no corruption with it. The mount points still won't list. Their
    sub-directories will list if you know what they are.

    Because of this, we believe there is a caching issue on the server for a
    couple of these mount points. There are directories under
    /var/statmon/sm for the affected clients, along with all the unaffected
    clients.
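    A quick way to confirm which clients have entries under
    /var/statmon/sm is sketched below in Python. The helper name
    clients_with_sm_entries is hypothetical, and it assumes each entry is
    named exactly as the client hostname is given (short name versus
    fully qualified name may differ on a real server).

```python
import os


def clients_with_sm_entries(clients, sm_dir="/var/statmon/sm"):
    """Map each client name to True if an entry with that name exists in sm_dir."""
    entries = set(os.listdir(sm_dir))  # one entry per monitored client (assumed)
    return {client: client in entries for client in clients}


if __name__ == "__main__":
    # Hypothetical client names; replace them with the affected hosts.
    if os.path.isdir("/var/statmon/sm"):
        print(clients_with_sm_entries(["node1", "node2"]))
```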

    We were thinking of killing and then restarting the lockd and statd
    processes. We are concerned about what this may do to existing mounts
    for clients and to new mounts made during this time. We've tried this
    on some test boxes and it seems fine, but we have no way to generate
    the number of requests the server gets, and the test box has no
    corrupted cache that would tell us whether the restart actually works.

    Any information about what we need to do to clear this up? Rebooting
    the server will be a really unpopular decision.

    --
                 ---> Please post QUESTIONS and SUMMARIES only!! <---
            To subscribe/unsubscribe to this list, contact majordomo@dutchworks.nl
           Name: hpux-admin@dutchworks.nl     Owner: owner-hpux-admin@dutchworks.nl
     
     Archives:  ftp.dutchworks.nl:/pub/digests/hpux-admin       (FTP, browse only)
                http://www.dutchworks.nl/htbin/hpsysadmin   (Web, browse & search)
    
