SUMMARY: Nobody can login our productive server

From: Marques, Virginia (virginia.marques_at_eds.com)
Date: 06/27/03

  • Next message: Andreas Hoeschler: "Booting from disk mirror with Solstice DiskSuite"
    To: sunmanagers@sunmanagers.org
    Date: Fri, 27 Jun 2003 09:23:32 +0100
    
    

    9 Out of office

    Thank you very much to everyone who took a minute to respond to my mail (in
    order of appearance):
    Steve Maher
    Bruntel, Mitchell (not very constructive ;-))
    Hichael Morton
    Carl Ma
    Richard Markham
    Davorin Bengez
    Mario Williams
    Kris Briscoe
    Darren Dunham
    David Markovitz
    Eugene Schmidt
    Brett Lymn

    ------------------------
    First of all:
    almost everyone says it is very important that root has sh as its shell, as
    Davorin explains:
    "you should have /sbin/sh for root's shell as it is statically linked - does
    not depend on any external library; this time not a big issue but other time
    around..."

    And by Eugene:
    "On another note, root "should actually" use /usr/sbin/sh - statically
    linked. Even if a library used by ksh gets clobbered, u are still down ;-("
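A minimal sketch of how to check which shell root has, assuming a standard passwd-format file with the login shell in the seventh colon-separated field. The function names root_login_shell and check_root_shell are illustrative, not standard tools, and /sbin/sh is used as the expected value because that is the path Davorin names.

```shell
# Print the login shell recorded for root in a passwd-format file.
# The file defaults to /etc/passwd; the function name is illustrative.
root_login_shell() {
    passwd_file=${1:-/etc/passwd}
    awk -F: '$1 == "root" { print $7 }' "$passwd_file"
}

# Report whether root's shell is /sbin/sh; return 1 if it is not.
check_root_shell() {
    shell=`root_login_shell "$1"`
    if [ "$shell" = "/sbin/sh" ]; then
        echo "ok: root uses /sbin/sh"
    else
        echo "warning: root uses ${shell:-<empty>}, not /sbin/sh"
        return 1
    fi
}
```

Calling check_root_shell with no argument inspects /etc/passwd; passing a path lets you check a copy, for example the passwd file of a root filesystem mounted under /a.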

    Along these lines, if you want or have to work in ksh, Brett Lymn suggests
    two ways:
    "1) have a shell function that execs a new shell of your choice..DO NOT
       automatically exec this shell on login as this will break your root
       login regardless of what you do - make it a manual process, make it
       as short as you like (e.g. k execs /usr/bin/ksh for you) but you
       must make it a manual action.

    2) create another account with uid 0 and a shell of /usr/bin/ksh and
       use that to do day to day root work and keep the real root login
       for emergencies."
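Brett's first suggestion could look roughly like the sketch below, placed in root's .profile while root's login shell stays sh. The variable ROOT_WORK_SHELL and the existence check are assumptions added for illustration; Brett only describes a short function such as k that execs /usr/bin/ksh when typed by hand.

```shell
# Hypothetical helper for root's .profile; never call it automatically.
# ROOT_WORK_SHELL is an illustrative variable, not a standard one.
ROOT_WORK_SHELL=${ROOT_WORK_SHELL:-/usr/bin/ksh}

k() {
    # Replace the current shell only if the target is really executable;
    # otherwise stay in the working login shell.
    if [ -x "$ROOT_WORK_SHELL" ]; then
        exec "$ROOT_WORK_SHELL"
    else
        echo "k: $ROOT_WORK_SHELL not executable, staying in current shell" >&2
        return 1
    fi
}
```

Because k is typed by hand after login, a missing or broken ksh leaves you in the login shell instead of locking root out.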

    ------------------------------

    And now the SUMMARY for my original question: "...is there a better way to
    do this in order to reduce the time the server is down? I mean to
    reduce the reboots (4 in my version)."
    Darren gave me two ways:

    "I wouldn't have unencapsulated it. There are a couple of ways to
    handle this..

    1) (no touch disks)
    Make the change on both disks, then boot to single user. Use VxVM
    commands to detach (det) one side of the root mirror, then reboot. The
    remaining mirror will have fsck run on it, and the other mirror will be
    reattached. Pretty quick, no big deal.

    2) (touch disks)
    While the machine is down, remove, power-off, or otherwise disconnect
    one disk from the root mirror. Mount the remaining and fix the link,
    then boot the machine. When it boots, it'll mark the plexes on the
    other half as stale. Replace the disk, and add the plexes back, and
    it'll remirror."

    The first way doesn't seem possible for me, because I can't change the
    mirrored disk: it has no partitions, just volumes. But the second
    way sounds very good. I have a test machine with VxVM and I'll test it. I
    like this solution; it is very original.

    Some of you also told me that Sun recommends using DiskSuite (SDS) to mirror
    the boot drives, not VxVM. But I think SDS has the same problem as VxVM:
    perhaps easier commands, but no reduction in time (I may be wrong; I don't
    know very much about SDS).

    A colleague suggested using sudo to give one user (my user) root permission
    for a useful command (maybe chmod). I know that's not very standard, but in
    this case (where I still had an open session with my user) I could have run
    chmod on the / partition for myself, recreated the links and then run chmod
    again to restore it for root.

    So, again thank you everybody, this list is great,
    Virginia

    > -----Original message-----
    > From: Marques, Virginia
    > Sent: Thursday, 26 June 2003 15:34
    > To: 'sunmanagers@sunmanagers.org'
    > Subject: Nobody can login our productive server
    >
    > This morning someone ran rm * on the / partition as root. After that,
    > no one could log in to the server; this person logged off before telling
    > us what he had done.
    > The problem was that no one had access to ksh, because a link in the /
    > partition had been removed:
    >
    > #pwd
    > /
    > #ls -l bin
    > lrwxrwxrwx 1 root root 9 Jun 26 09:35 bin -> ./usr/bin
    >
    > and every user in my /etc/passwd has the shell /bin/ksh (also root)
    >
    > We decided to ask the customer to shut down the server so we could boot
    > from cdrom, recreate the links and start up again. But the main problem is
    > that we have VxVM with encapsulated boot disk(s). So we had to:
    >
    > - Shutdown (Stop-A)
    > - Ok> Boot cdrom -sw
    > - mount / partition:
    > #mount -F ufs /dev/dsk/c0t0d0s0 /a
    > - recreate removed links
    > # cd /a
    > # ln -s ./usr/bin bin
    > # ln -s ./usr/lib lib
    > - Now comes the VxVM section: we had to modify the /etc/system and
    > /etc/vfstab files in order to tell VxVM not to start. Also, in directory
    > /etc/vx/reconfig.d/state.d we had to:
    >
    > # rm root-done
    > # touch install-db
    > # init 0
    > ok> boot disk
    >
    > - Next step: encapsulate the boot disk with vxinstall and leave the other
    > disks alone.
    > - After two more reboots the system was up with VxVM; then we had to mount
    > all the other partitions, and the customers began to work again.
    >
    > (Now we have to initialise the three other disks we had in rootdg and
    > recreate the boot disk mirror.)
    >
    >
    > That all took 50 minutes. My question is (and please excuse the long mail
    > and the bad English): is there a better way to do this in order to reduce
    > the time the server is down? I mean to reduce the reboots (4 in
    > this version).
    >
    > Kind regards,
    > Virginia
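The link-recreation step from the original message can be sketched as a small helper, assuming the damaged root filesystem is mounted under an alternate root such as /a. The function name restore_root_links is illustrative, and it only handles the two links named in the message (bin and lib).

```shell
# Recreate the bin and lib symlinks under an alternate root (default /a),
# but only where they are missing. The function name is illustrative.
restore_root_links() {
    altroot=${1:-/a}
    for name in bin lib; do
        # Skip anything already present as a symlink or a directory.
        if [ -h "$altroot/$name" ] || [ -d "$altroot/$name" ]; then
            echo "$altroot/$name already present, leaving it alone"
        else
            ( cd "$altroot" && ln -s "./usr/$name" "$name" ) || return 1
            echo "recreated $altroot/$name -> ./usr/$name"
        fi
    done
}
```

The links are created relative (./usr/bin, ./usr/lib), matching the ls -l output in the message, so they resolve correctly once the filesystem is mounted as / again.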
    _______________________________________________
    sunmanagers mailing list
    sunmanagers@sunmanagers.org
    http://www.sunmanagers.org/mailman/listinfo/sunmanagers

