Re: NFS server fail-over - how do you do it?

From: adp (dap99_at_i-55.com)
Date: 05/31/04

    To: "Matthew Seaman" <m.seaman@infracaninophile.co.uk>
    Date: Mon, 31 May 2004 12:31:24 -0500
    
    

    We can live with the chance that a file write might fail as long as we can
    switch over to another NFS server if the primary fails. So amd will help us
    avoid the "client hung" issue? I will have to take a look. That is the worst
    thing of all when it comes to a failed NFS server. You can't even remotely
    reboot the NFS client! Someone has to power reset the damn thing. That's
    bad.

    On Sun, May 30, 2004 at 02:43:37AM -0500, adp wrote:
    > I am running a FreeBSD 4.9-REL NFS server. Once every several hours our
    > main NFS server replicates everything to a backup FreeBSD NFS server. We
    > are okay with the gap in time between replications. What we aren't sure
    > about is how to automate the fail-over from the primary to the secondary
    > NFS server. This is for a web cluster. Each client mounts several
    > directories from the NFS server.
    >
    > Let's say that our primary NFS server dies and just goes away. What then?
    > Are you periodically doing a mount or a file look-up of a mounted
    > filesystem to check if your NFS server died? If so, are you just
    > unmounting and remounting everything using the backup NFS server?
    >
    > Just curious how this problem is being solved.

    If you're mounting those NFS partitions read/write, then there really
    isn't a good solution for this problem[1] -- you need your NFS server up
    and running 24x7.

    If you are NFS mounting those partitions read-only, then you can in
    principle construct a fail-over system between those servers. Some
    Unix OSes let you specify a list of servers in fstab(5) (e.g. Solaris)
    and clients will mount from one or another of them. Unfortunately you
    can't do that with standard NFS mounts under FreeBSD. You could try
    using VRRP -- see the net/freevrrpd port for example -- but I'm not
    sure how well that would work if the system failed over in the middle
    of an I/O transaction.
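
    As a rough illustration of the "list of servers" idea on a client
    that has no built-in support for it, a script could probe each
    candidate in order and report the first one that answers. This is
    only a sketch: it assumes the servers accept NFS over TCP on port
    2049, the host names are hypothetical, and the actual unmount and
    remount step is deliberately left out.

```python
import socket

NFS_PORT = 2049  # assumption: servers accept NFS over TCP on the usual port


def is_reachable(host, port=NFS_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def pick_server(candidates, port=NFS_PORT, timeout=3.0):
    """Return the first candidate that answers, or None if none do."""
    for host in candidates:
        if is_reachable(host, port, timeout):
            return host
    return None


if __name__ == "__main__":
    # Hypothetical host names, listed primary first.
    servers = ["nfs-primary.example.com", "nfs-backup.example.com"]
    print(pick_server(servers) or "no NFS server reachable")
```

    A successful TCP connect only shows that something is listening on
    the port; it does not prove that the export is healthy, so treat the
    result as a hint rather than a guarantee.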

    In any case -- certainly if your NFS partitions are read/write, but
    also for read-only -- perhaps the best compromise is to use the
    automounter amd(8). This certainly does help with the 'nightmare
    filesystem' scenario, where loss of a server prevents the clients
    doing anything, even rebooting cleanly. You can create a limited and
    rudimentary form of failover by using role-based hostnames in your
    internal DNS -- e.g. nfsserv.example.com as a CNAME pointing at your
    main server -- and then modifying the DNS when you need the failover
    to occur. It's a bit clunky and needs manual intervention, but it
    beats having nothing at all.
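
    Since the DNS change has to be triggered by hand, it helps to know
    promptly when a mount has stopped responding. The original question
    mentioned periodically doing a file look-up on the mounted
    filesystem; the sketch below is one possible way to do that without
    hanging the checker itself. The function name and the timeout are
    illustrative choices, not anything amd(8) provides.

```python
import os
import threading


def mount_responds(path, timeout=5.0):
    """Return True if os.stat(path) completes successfully within timeout.

    The stat runs in a daemon thread, so a call stuck on a dead NFS
    server does not block the caller. The stuck thread is abandoned
    rather than waited on.
    """
    result = {"ok": False}

    def probe():
        try:
            os.stat(path)
            result["ok"] = True
        except OSError:
            pass  # a missing path or I/O error also counts as unhealthy

    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout)  # wait at most `timeout` seconds for the stat
    return result["ok"]
```

    Run from cron or a small loop, a False result could page someone to
    make the DNS change. Note that on a hard mount the abandoned thread
    may stay blocked in the kernel until the server comes back, so this
    detects the problem but does not clear it.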

     Cheers,

     Matthew

    [1] Well, I assume you haven't got the resources to set up a storage
    array with multiple servers accessing the same disk sets.

    --
    Dr Matthew J Seaman MA, D.Phil.                       26 The Paddocks
                                                          Savill Way
    PGP: http://www.infracaninophile.co.uk/pgpkey         Marlow
    Tel: +44 1628 476614                                  Bucks., SL7 1TH UK
