Re: support quality (Re: dump | restore fails: unknown tape headertype 1853384566)



At 07:25 PM 3.25.2009 +1030, Daniel O'Connor wrote:
On Wednesday 25 March 2009 18:37:17 Bartosz Stec wrote:
Yes, dump is broken for you, deal with it. It is quite possible your FS
is corrupt, and/or your disk is damaged.

..and/or it is some other hardware problem, maybe you also should test
your memory with memtest or something similar? I'm using dump/restore
very frequently and I have never seen such a problem. Neither on -RELEASE,
-STABLE, nor -CURRENT.
So I think you should make sure that your problem is not
hardware/filesystem dependent before you point to dump/restore as the cause
of the problem. Peter Jeremy has already given you good hints to do that.

One other thing would be to make absolutely sure that your versions of dump &
restore are in sync; they are very machine/version dependent.

--

I've been watching this thread with some interest since we've had some
similar problems with dump/restore which we use every morning via cron
scripts on a number of servers to produce bootable clones as part of our
backup program. We have been doing this for years and, like most of you,
never saw a problem. We prefer dump/restore for backups.

However, last month upon upgrading those servers from FBSD-6.3px
(RELEASE) to 7.0px (RELEASE) we found that about one-half of the servers
had a similar problem as the original poster while the other half did not.
All of the servers (rackmounts) use the same (type) hardware. We spent many
hours trying to solve the problem with those that failed to dump/restore.
We also searched for others with the problem and found only a very few,
all without solutions to this issue. (Indeed, the only one was a reference
to efforts to restore an older OS version, which didn't apply here.)

And, indeed we tried everything suggested here to fix the problem without
success. Sometimes the problem was dump which would reach 99% and never
finish -- it would stick there and would overlap with another cron start
the next day, and the next day, and the next day. (The servers that did
work fooled us and we found out about this issue on the others when the
overlaps appeared and drew our attention). That's when our work to try and
solve the issues started and went on for days.
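
For anyone hit by the same overlapping cron starts, a lock around the job at least keeps a stuck dump from piling up behind itself day after day. The following is only a minimal sketch, not part of our actual script: the lock path and the function names are made up for illustration, and it relies on nothing more than mkdir being atomic.

```shell
#!/bin/sh
# Hypothetical lock location; not taken from the original cron script.
LOCKDIR="${LOCKDIR:-/var/run/clone-backup.lock}"

# mkdir either creates the directory or fails, atomically, so only one
# run at a time can get past this point.
acquire_lock() {
    if mkdir "$LOCKDIR" 2>/dev/null; then
        # Record our PID so a stuck run can be identified later.
        echo "$$" > "$LOCKDIR/pid"
        return 0
    fi
    echo "previous run still holds $LOCKDIR (pid $(cat "$LOCKDIR/pid" 2>/dev/null))" >&2
    return 1
}

# Remove the PID file and the lock directory when the job is done.
release_lock() {
    rm -f "$LOCKDIR/pid"
    rmdir "$LOCKDIR" 2>/dev/null
}
```

In the cron script this would be used as `acquire_lock || exit 1` followed by `trap release_lock EXIT` before the dump/restore line. The message on stderr ends up in the cron mail, so a stuck dump is noticed on the first overlap rather than days later. FreeBSD's lockf(1) can probably do the same job from the crontab entry itself.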

Our script, which has always worked, contained this (after scrapping the old
FS and making a fresh one):
/sbin/dump -D /root/dumpdates -0auL -f - / | /sbin/restore -rf -

Indeed, the first thing we did was to remove the pipe and try to restore
from a file. However, because the dumps would not go past the 99%, there was
no file to restore from! There were some exceptions when the dump would
complete, but this was not reliable. When these reached the restore stage,
restore would go crazy with errors.
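
For reference, the staged variant (dump to a file, check it, then restore) looks roughly like the sketch below. It is a simplified illustration, not our production script: the function name, the DUMP/RESTORE override variables and the return codes are assumptions added here. One reason to stage it this way is that with a pipe the shell only reports the exit status of the last command (restore), so a failing dump can go unnoticed.

```shell
#!/bin/sh
# Commands are overridable so the flow can be dry-run with stubs;
# the defaults are the same binaries the piped command uses.
DUMP="${DUMP:-/sbin/dump}"
RESTORE="${RESTORE:-/sbin/restore}"

# staged_clone FILESYSTEM DUMPFILE
# Meant to be run from inside the freshly made target filesystem.
staged_clone() {
    fs="$1"
    file="$2"

    # Step 1: dump to a file with the same flags as the piped version.
    "$DUMP" -D /root/dumpdates -0auL -f "$file" "$fs" \
        || { echo "dump failed" >&2; return 1; }

    # Step 2: list the dump's contents to confirm restore can read it.
    "$RESTORE" -tf "$file" >/dev/null \
        || { echo "dump file unreadable" >&2; return 2; }

    # Step 3: only now rebuild the clone from the file.
    "$RESTORE" -rf "$file" \
        || { echo "restore failed" >&2; return 3; }
}
```

The distinct return codes show which stage broke. This does nothing for a dump that hangs at 99%, since step 1 never returns in that case.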

SOLUTION
The "clones" are a very important part of our backup program. Since the
dump side of the problems simply stuck and provided no error message at all
and the errors from any restores were not useful, our only solution was to
revert back to FBSD-6.3 on those servers with this issue and dump/restore
went back to working again. We left those that were working on FBSD-7.0-R
and they continue to work okay.

We could only conclude that the problem was perhaps something with
hardware, perhaps the way memory was handled in 7.0, but that is only a
guess.

Once again, every suggestion on this thread was tried during our long
efforts to fix the issue. Perhaps there is yet another suggestion? In the
meantime, we've decided to wait for 7.2R (7.1 did not fix the problems
either).

/Jack

(^_^)
Happy trails,
Jack L. Stone

System Admin
Sage-american
_______________________________________________
freebsd-stable@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@xxxxxxxxxxx"


