Bad SDLT 320?

From: Chris Cameron (Chris.Cameron_at_NetThruPut.com)
Date: 08/26/04

    To: sunmanagers@sunmanagers.org
    Date: Thu, 26 Aug 2004 10:48:53 -0600
    
    

    Before I go to all the trouble of trying to get Sun to replace my tape
    drive, I wanted to tap some of the experience that's on this list to
    see if what I'm seeing would point to a bad tape drive.

    I have a V240 hooked up to a Sun SDLT 320 drive. Up until this week it was
    backing up ~60 GB of data using AMANDA (and had been doing so for 7
    months). That 60 GB is spread across 3 servers, 1 of which is local
    and the rest are remote.

    When AMANDA runs backups now, it consistently fails on a remote 25 GB
    partition (and only this partition). The errors that AMANDA reports
    (which, as I understand it, are just dump errors passed along) are:

      devl2 /dev/md/dsk/d8 lev 0 FAILED [out of tape]
      devl2 /dev/md/dsk/d8 lev 0 FAILED ["data write: Broken pipe"]
      devl2 /dev/md/dsk/d8 lev 0 FAILED [dump to tape failed]

    After this the tape drive is unresponsive to any commands, and the
    following errors show up in /var/adm/messages:

    Aug 26 09:10:54 prod2 scsi: [ID 107833 kern.warning]
    WARNING: /pci@1c,600000/scsi@2,1/st@5,0 (st12):
    Aug 26 09:10:54 prod2 SCSI transport failed: reason 'incomplete':
    retrying command
    Aug 26 09:10:56 prod2 scsi: [ID 107833 kern.warning]
    WARNING: /pci@1c,600000/scsi@2,1/st@5,0 (st12):
    Aug 26 09:10:56 prod2 SCSI transport failed: reason 'incomplete':
    retrying command
    Aug 26 09:10:57 prod2 scsi: [ID 107833 kern.warning]
    WARNING: /pci@1c,600000/scsi@2,1/st@5,0 (st12):
    Aug 26 09:10:57 prod2 SCSI transport failed: reason 'incomplete':
    giving up
    Aug 26 09:26:39 prod2 scsi: [ID 365881
    kern.info] /pci@1c,600000/scsi@2,1 (glm1):
    Aug 26 09:26:39 prod2 Cmd (0x1b37578) dump for Target 5 Lun 0:
    Aug 26 09:26:39 prod2 scsi: [ID 365881
    kern.info] /pci@1c,600000/scsi@2,1 (glm1):
    Aug 26 09:26:39 prod2 cdb=[ 0xa 0x0 0x0 0x80 0x0 0x0 ]
    Aug 26 09:26:39 prod2 scsi: [ID 365881
    kern.info] /pci@1c,600000/scsi@2,1 (glm1):
    Aug 26 09:26:39 prod2 pkt_flags=0x0 pkt_statistics=0x61 pkt_state=0x7
    Aug 26 09:26:39 prod2 scsi: [ID 365881
    kern.info] /pci@1c,600000/scsi@2,1 (glm1):
    Aug 26 09:26:39 prod2 pkt_scbp=0x0 cmd_flags=0x18e1
    Aug 26 09:26:39 prod2 scsi: [ID 107833 kern.warning]
    WARNING: /pci@1c,600000/scsi@2,1 (glm1):
    Aug 26 09:26:39 prod2 Disconnected command timeout for Target 5.0
    Aug 26 09:26:39 prod2 genunix: [ID 408822 kern.info] NOTICE: glm1: fault
    detected in device; service still available
    Aug 26 09:26:39 prod2 genunix: [ID 611667 kern.info] NOTICE: glm1:
    Disconnected command timeout for Target 5.0
    Aug 26 09:26:39 prod2 glm: [ID 160360 kern.warning] WARNING:
    ID[SUNWpd.glm.cmd_timeout.6016]
    Aug 26 09:26:39 prod2 scsi: [ID 107833 kern.warning]
    WARNING: /pci@1c,600000/scsi@2,1/st@5,0 (st12):
    Aug 26 09:26:39 prod2 SCSI transport failed: reason 'timeout': giving
    up
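
    As a reading aid for the glm "Cmd dump" above, here is a minimal sketch
    that decodes the logged CDB. It assumes the standard SCSI layout for a
    6-byte WRITE on a sequential-access device (opcode 0x0A, FIXED bit in
    byte 1, 24-bit transfer length in bytes 2-4). The function name
    decode_write6 is hypothetical and is not part of any Solaris tool.

```python
def decode_write6(cdb):
    """Decode a 6-byte SCSI WRITE(6) CDB as used by tape devices.

    Assumed layout (standard for sequential-access devices):
      byte 0    opcode (0x0A = WRITE(6))
      byte 1    bit 0 is the FIXED flag
      bytes 2-4 transfer length, big-endian
      byte 5    control byte
    """
    if len(cdb) != 6:
        raise ValueError("expected a 6-byte CDB")
    if cdb[0] != 0x0A:
        raise ValueError("not a WRITE(6) command")
    fixed = bool(cdb[1] & 0x01)
    length = (cdb[2] << 16) | (cdb[3] << 8) | cdb[4]
    return {
        "opcode": cdb[0],
        "fixed": fixed,              # False means variable-block mode
        "transfer_length": length,   # bytes when FIXED is 0, blocks when 1
        "control": cdb[5],
    }


# CDB bytes copied from the /var/adm/messages excerpt above.
logged = "0xa 0x0 0x0 0x80 0x0 0x0"
cdb = [int(tok, 16) for tok in logged.split()]
info = decode_write6(cdb)
print(info)
```

    Under those assumptions the command that timed out is a variable-block
    write of 0x008000 = 32768 bytes, which is consistent with a backup
    writing 32 KB records. The timeout therefore appears to hit an ordinary
    data write rather than a rewind or positioning command.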

    Power-cycling the tape drive makes it respond to mt again.

    If I try to do a dump manually from the local machine, I'll consistently
    get:

    </> # ufsdump -0f /dev/rmt/1n /dev/md/dsk/d6
      DUMP: Writing 32 Kilobyte records
      DUMP: Date of this level 0 dump: Thu Aug 26 09:05:39 2004
      DUMP: Date of last level 0 dump: the epoch
      DUMP: Dumping /dev/md/rdsk/d6 (prod2:/prod) to /dev/rmt/1n.
      DUMP: Mapping (Pass I) [regular files]
      DUMP: Mapping (Pass II) [directories]
      DUMP: Estimated 35790376 blocks (17475.77MB).
      DUMP: Dumping (Pass III) [directories]
      DUMP: Dumping (Pass IV) [regular files]
      DUMP: Write error 106032 feet into tape 1
      DUMP: NEEDS ATTENTION: Do you want to restart?: ("yes" or "no")
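
    For scale, the size estimate in the ufsdump output above can be checked
    with a short calculation. This sketch assumes the "blocks" ufsdump
    reports are 512-byte blocks and that its "MB" means 2^20 bytes; the
    160 GB figure is the SDLT 320's native (uncompressed) capacity. The
    variable names are illustrative only.

```python
BLOCK_BYTES = 512              # assumption: ufsdump estimates in 512-byte blocks
SDLT320_NATIVE_GB = 160        # native (uncompressed) capacity of an SDLT 320 tape

estimated_blocks = 35790376    # from "DUMP: Estimated 35790376 blocks"
estimated_bytes = estimated_blocks * BLOCK_BYTES

# ufsdump's "MB" is treated here as 2**20 bytes.
estimated_mb = estimated_bytes / (1024 * 1024)
estimated_gb = estimated_bytes / 1e9

print(f"{estimated_mb:.2f}MB")
print(f"about {estimated_gb:.1f} GB of {SDLT320_NATIVE_GB} GB native capacity")
```

    With those assumptions the result matches the 17475.77MB that ufsdump
    prints, and the whole dump is well under the native capacity of one
    tape, so reaching the real end of the media looks unlikely as the cause
    of the write error.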

    This happens on a number of tapes, so I doubt it's a tape error.

    Is this a clear case of a bad tape drive? Tonight I'll try on a second
    V240 with a different SCSI cable just to be sure. I have run a cleaning
    tape through it for good measure.

    Thanks,
    Chris
    _______________________________________________
    sunmanagers mailing list
    sunmanagers@sunmanagers.org
    http://www.sunmanagers.org/mailman/listinfo/sunmanagers

