SUMMARY: Solaris volume manager -- weird metasync

From: Anshuman Kanwar (anshuman_at_expertcity.com)
Date: 05/30/03

    To: "'sunmanagers@sunmanagers.org'" <sunmanagers@sunmanagers.org>
    Date: Fri, 30 May 2003 10:56:23 -0700
    
    

    Hi Managers,

    I was not able to get an answer from this list, so I opened a case with Sun.
    It seems they have at least 3 internal bugs open regarding Solaris Volume
    Manager in RAID 5 configurations. One of their suggestions actually fixed
    the problem:

            # mv /usr/lib/drv/preen_md.so.1 /usr/lib/drv/preen_md.so.1.old
            # reboot
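
A minimal sketch of the same workaround in reversible form, so the library can be put back once a fix from Sun is installed. The function names `disable_preen` and `restore_preen` and the `PREEN_LIB` variable are hypothetical and only for illustration; the path is the one given above, and a reboot is still needed after either step.

```shell
#!/bin/sh
# PREEN_LIB defaults to the path from the message; override it to try the
# functions safely on a scratch file first.
PREEN_LIB=${PREEN_LIB:-/usr/lib/drv/preen_md.so.1}

disable_preen() {
    # Move the library aside only if it exists and no backup is present yet.
    if [ -f "$PREEN_LIB" ] && [ ! -f "$PREEN_LIB.old" ]; then
        mv "$PREEN_LIB" "$PREEN_LIB.old"
    else
        echo "nothing to do: $PREEN_LIB missing or backup already exists" >&2
        return 1
    fi
}

restore_preen() {
    # Undo disable_preen by moving the backup into place again.
    if [ -f "$PREEN_LIB.old" ] && [ ! -f "$PREEN_LIB" ]; then
        mv "$PREEN_LIB.old" "$PREEN_LIB"
    else
        echo "nothing to restore: no backup, or $PREEN_LIB already present" >&2
        return 1
    fi
}
```

Both functions refuse to overwrite an existing file, so running either one twice leaves the system unchanged.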

    I had rebuilt the machine earlier just to check whether the error was
    reproducible. It was.

    Strangely enough, the bug appeared only after I created the RAID 5 volume.
    Simply mirroring the boot disk (and rebooting) worked fine. It seems to be a
    combination of fsck and the volume manager that caused the "wait: No child
    processes" error.

    Though it is resolved, I still do not understand why.

    Thanks,
    Anshuman Kanwar
    Unix SysAdmin
    Expertcity Inc.

    --
    (805) 690-5714   [off]   ansh@expertcity.com
    (805) 895-4231   [cel]   5385 Hollister Ave 
    (805) 690-6471   [fax]   Goleta, CA.  93111
    > -----Original Message-----
    > From: Anshuman Kanwar 
    > Sent: Monday, May 19, 2003 4:39 PM
    > To: 'sunmanagers@sunmanagers.org'
    > Subject: Solaris volume manager -- weird metasync
    > 
    > 
    > Hi Managers,
    > 
    > I set up disk mirroring on a 420R. It has 2 internal drives 
    > (c0t0 and c0t1) and is connected to 11 drives in an A5200 
    > (c1t0 through c1t10).
    > 
    > The internal disks are mirrored and the external disks are 
    > configured as a RAID 5 volume with one of the disks as a hot spare.
    > 
    > Everything seems to work correctly until I boot, at which point 
    > this happens:
    > 
    > Rebooting with command: boot                                          
    > Boot device: disk  File and args: 
    > SunOS Release 5.9 Version Generic_112233-04 64-bit
    > Copyright 1983-2002 Sun Microsystems, Inc.  All rights reserved.
    > Use is subject to license terms.
    > WARNING: forceload of misc/md_trans failed
    > WARNING: forceload of misc/md_sp failed
    > configuring IPv4 interfaces: hme0.
    > Hostname: e420-1.sjc
    > The system is coming up.  Please wait.
    > checking ufs filesystems
    > /dev/md/rdsk/d5: is clean.
    > wait: No child processes
    > 
    > WARNING - Unable to repair one or more filesystems.
    > Run fsck manually (fsck filesystem...).
    > Exit the shell when done to continue the boot process.
    > 
    > 
    > Type control-d to proceed with normal startup,
    > (or give root password for system maintenance): 
    > single-user privilege assigned to /dev/console.
    > Entering System Maintenance Mode
    > 
    > May 19 15:01:42 su: 'su root' succeeded for root on /dev/console
    > 
    > e420-1.sjc#metastat
    > d8: Mirror
    >     Submirror 0: d10
    >       State: Needs maintenance 
    >     Submirror 1: d20
    >       State: Needs maintenance 
    >     Pass: 1
    >     Read option: roundrobin (default)
    >     Write option: parallel (default)
    >     Size: 4096602 blocks
    > 
    > d10: Submirror of d8
    >     State: Needs maintenance 
    >     Invoke: metasync d8
    >     Size: 4096602 blocks
    >     Stripe 0:
    >         Device     Start Block  Dbase        State Reloc Hot Spare
    >         c0t0d0s0          0     No            Okay   Yes 
    > 
    > 
    > d20: Submirror of d8
    >     State: Needs maintenance 
    >     Invoke: metasync d8
    >     Size: 4096602 blocks
    >     Stripe 0:
    >         Device     Start Block  Dbase        State Reloc Hot Spare
    >         c0t1d0s0          0     No            Okay   Yes 
    > 
    > 
    > d4: Mirror
    >     Submirror 0: d14
    >       State: Needs maintenance 
    >     Submirror 1: d24
    >       State: Needs maintenance 
    >     Pass: 1
    >     Read option: roundrobin (default)
    >     Write option: parallel (default)
    >     Size: 4096602 blocks
    > 
    > d14: Submirror of d4
    >     State: Needs maintenance 
    >     Invoke: metasync d4
    >     Size: 4096602 blocks
    >     Stripe 0:
    >         Device     Start Block  Dbase        State Reloc Hot Spare
    >         c0t0d0s4          0     No            Okay   Yes 
    > 
    > 
    > d24: Submirror of d4
    >     State: Needs maintenance 
    >     Invoke: metasync d4
    >     Size: 4096602 blocks
    >     Stripe 0:
    >         Device     Start Block  Dbase        State Reloc Hot Spare
    >         c0t1d0s4          0     No            Okay   Yes 
    > 
    > 
    > d1: Mirror
    >     Submirror 0: d11
    >       State: Needs maintenance 
    >     Submirror 1: d21
    >       State: Needs maintenance 
    >     Pass: 1
    >     Read option: roundrobin (default)
    >     Write option: parallel (default)
    >     Size: 8193204 blocks
    > 
    > d11: Submirror of d1
    >     State: Needs maintenance 
    >     Invoke: metasync d1
    >     Size: 8193204 blocks
    >     Stripe 0:
    >         Device     Start Block  Dbase        State Reloc Hot Spare
    >         c0t0d0s1          0     No            Okay   Yes 
    > 
    > 
    > d21: Submirror of d1
    >     State: Needs maintenance 
    >     Invoke: metasync d1
    >     Size: 8193204 blocks
    >     Stripe 0:
    >         Device     Start Block  Dbase        State Reloc Hot Spare
    >         c0t1d0s1          0     No            Okay   Yes 
    > 
    > 
    > d5: Mirror
    >     Submirror 0: d15
    >       State: Needs maintenance 
    >     Submirror 1: d25
    >       State: Needs maintenance 
    >     Pass: 1
    >     Read option: roundrobin (default)
    >     Write option: parallel (default)
    >     Size: 54330534 blocks
    > 
    > d15: Submirror of d5
    >     State: Needs maintenance 
    >     Invoke: metasync d5
    >     Size: 54330534 blocks
    >     Stripe 0:
    >         Device     Start Block  Dbase        State Reloc Hot Spare
    >         c0t0d0s5          0     No            Okay   Yes 
    > 
    > 
    > d25: Submirror of d5
    >     State: Needs maintenance 
    >     Invoke: metasync d5
    >     Size: 54330534 blocks
    >     Stripe 0:
    >         Device     Start Block  Dbase        State Reloc Hot Spare
    >         c0t1d0s5          0     No            Okay   Yes 
    > 
    > 
    > d9: RAID
    >     State: Okay         
    >     Hot spare pool: hsp001
    >     Interlace: 32 blocks
    >     Size: 315892480 blocks
    > Original device:
    >     Size: 315893952 blocks
    >         Device     Start Block  Dbase        State Reloc  Hot Spare
    >         c1t0d0s0       5042        No         Okay   Yes 
    >         c1t1d0s0       5042        No         Okay   Yes 
    >         c1t2d0s0       5042        No         Okay   Yes 
    >         c1t3d0s0       5042        No         Okay   Yes 
    >         c1t4d0s0       5042        No         Okay   Yes 
    >         c1t5d0s0       5042        No         Okay   Yes 
    >         c1t6d0s0       5042        No         Okay   Yes 
    >         c1t7d0s0       5042        No         Okay   Yes 
    >         c1t8d0s0       5042        No         Okay   Yes 
    >         c1t9d0s0       5042        No         Okay   Yes 
    > 
    > hsp001: 1 hot spare
    >         Device      Status      Length          Reloc
    >         c1t10d0s0   Available    35104400 blocks        Yes
    > 
    > Device Relocation Information:
    > Device    Reloc Device ID
    > c0t1d0    Yes   id1,sd@SFUJITSU_MAJ3364M_SUN36G_01M41510____
    > c0t0d0    Yes   id1,sd@SSEAGATE_ST336704LSUN36G_3CD1PPV2000071306LU5
    > c1t10d0   Yes   id1,ssd@w20000020375b0eac
    > 
    > 
    > e420-1.sjc#metadb
    >         flags           first blk       block count
    >      a m  p  luo        16              8192            /dev/dsk/c0t0d0s7
    >      a    p  luo        16              8192            /dev/dsk/c0t1d0s7
    >      a    p  luo        8208            8192            /dev/dsk/c0t1d0s7
    > 
    > 
    > If I do this:
    > 
    > 
    > bash-2.05# metasync d1
    > bash-2.05# metasync d4
    > bash-2.05# metasync d5
    > bash-2.05# metasync d8
    > bash-2.05# exit
    > exit
    > resuming mountall
    > 
    > 
    > then the machine boots and mounts all the file systems 
    > correctly. I have tried creating metadbs on separate 
    > (unused) slices; the number and location of the replicas do not 
    > seem to make any difference to this behavior. 
    > 
    > We had this identical problem with a 280R, but had to reformat 
    > and reinstall without adequate investigation.
    > 
    > Any ideas what might be wrong? Is this a known issue?
    > 
    > Thanks,
    > Anshuman Kanwar   
    > Unix SysAdmin
    > Expertcity Inc.
    > --
    > (805) 690-5714   [off]   ansh@expertcity.com
    > (805) 895-4231   [cel]   5385 Hollister Ave 
    > (805) 690-6471   [fax]   Goleta, CA.  93111
    > 
    > 
    > 
    > 
    > --------prtdiag-------------
    > 
    > e420-1.sjc#prtdiag
    > System Configuration:  Sun Microsystems  sun4u Sun Enterprise 420R (4 X UltraSPARC-II 450MHz)
    > System clock frequency: 113 MHz
    > Memory size: 4096 Megabytes
    > 
    > ========================= CPUs =========================
    > 
    >                     Run   Ecache   CPU    CPU
    > Brd  CPU   Module   MHz     MB    Impl.   Mask
    > ---  ---  -------  -----  ------  ------  ----
    >  0     0     0      450     4.0   US-II    10.0
    >  0     1     1      450     4.0   US-II    10.0
    >  0     2     2      450     4.0   US-II    10.0
    >  0     3     3      450     4.0   US-II    10.0
    > 
    > 
    > ========================= IO Cards =========================
    > 
    >      Bus   Freq
    > Brd  Type  MHz   Slot        Name                          Model
    > ---  ----  ----  ----------  ----------------------------  --------------------
    >  0   PCI    33     On-Board  network-SUNW,hme
    >  0   PCI    33     On-Board  scsi-glm/disk (block)         Symbios,53C875
    >  0   PCI    33     On-Board  scsi-glm/disk (block)         Symbios,53C875
    >  0   PCI    33        PCI 2  SUNW,hme-pci108e,1001         SUNW,qsi-cheerio
    >  0   PCI    33     PCI 1 66  scsi-pci1077,2100.1077.1.4
    > 
    > No failures found in System
    > ===========================
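
The manual recovery in the quoted message (running metasync once for each mirror that metastat flags) can be sketched as a small filter. This is an illustration under assumptions, not a tested Solaris tool: `pending_metasync` is a hypothetical name, it relies only on the `Invoke: metasync dN` lines shown in the metastat output above, and on Solaris 9 the `awk` call may need to be `nawk`.

```shell
#!/bin/sh
# Read metastat-style text on standard input and print one "metasync dN"
# command per mirror, in the order the mirrors first appear.
pending_metasync() {
    awk '
        # Each submirror repeats the same Invoke line, so skip repeats.
        $1 == "Invoke:" && $2 == "metasync" {
            if (!($3 in seen)) {
                seen[$3] = 1
                print "metasync " $3
            }
        }
    '
}
```

Running `metastat | pending_metasync` would list the commands for review; piping that list to `sh` would run them, which is equivalent to typing the four metasync commands by hand.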
    _______________________________________________
    sunmanagers mailing list
    sunmanagers@sunmanagers.org
    http://www.sunmanagers.org/mailman/listinfo/sunmanagers
    
