Re: BACKUP/IGNORE=INTERLOCK (was: Re: OpenVms Backup)
Date: Tue, 22 Jul 2003 02:07:56 GMT
On Mon, 21 Jul 2003 21:37:20 GMT, firstname.lastname@example.org (Hoff Hoffman) wrote:
>In article <EeB0mFx4f4af@malvm7.mala.bc.ca.>, email@example.com (Malcolm Dunnett) writes:
>:In article <KyWSa.521$5X.firstname.lastname@example.org>,
>: hammond@email@example.com (Charlie Hammond) writes:
>:> In a production environment, a backup strategy that uses /IGNORE=INTERLOCK
>:> is bogus. Period.
>: That seems a bit extreme.
> All recent OpenVMS System Management Utilities manuals clearly flag the
> possibility of file data inconsistencies with BACKUP/IGNORE=INTERLOCK.
> Check the manual, and specifically the documentation around the keyword.
> The interlocks that are being ignored were implemented for a reason --
> there are folks that assume they know what BACKUP/IGNORE=INTERLOCK does,
> and don't understand the implementation nor fully realize the risks, and
> particularly don't consider the meaning of the interlocks. (Some platforms
> avoid this situation by not bothering with file-level interlocks. OpenVMS
> chooses the cautious course, and blocks these "corruptible" file-access
>:What about the case of an Oracle database
>:where one wants to do a "hot backup". Of course one needs to follow
>:all Oracle's instructions about using ARCHIVELOG mode and setting the
>:tablespaces in BACKUP mode, but given all that it's still necessary to
>:use /IGN=INTERLOCK to actually copy the "hot" tablespaces.
> Please check with Oracle -- if Oracle is willing to assure you that
> there is nothing in cache and everything is out on disk on this, then
> the /IGNORE=INTERLOCK might well produce the desired results. (While
> I'd not generally expect a consistent file to be locked, there could
> well be application-specific reason(s) to keep a lock on a file that
> is consistent. I don't know that.)
> Oracle Rdb certainly knows how to do this with the RMU tool, and
> entirely avoids the requirements for /IGNORE=INTERLOCK. Once Rdb
> tosses the data out, the resulting RMU/BACK archive files can then
> be run through BACKUP.
>: What about the case where it's files such as PAGEFILE.SYS that
>:are open, but they're marked "NOBACKUP" anyway - there's no danger in
>:saving the information needed to re-create the file, but it will fail if
>:/IGN=INTERLOCK isn't used ( or at least that seems to be the case
>:based on this thread ).
> For specific files, the contents are not required and thus the
> interlocks are not relevant -- more often than not, you are correct
> and these files can also be marked /NOBACKUP.
>: In many other cases there may be a potential for some "lost" data, but
>:it may be inconsequential ( YMMV ). For example, I don't really care if
>:I'm missing a few entries from OPERATOR.LOG or ERRLOG.SYS when I do an on-line
>:image backup of a system disk. I expect that the queue database may not copy
>:correctly ( so one may need to recreate entries manually ). I know there's
>:an (extremely remote) chance that SYSUAF.DAT could end up with a corrupted
>:bucket ( bucket check failure ). Since I do an image backup of my system disk
>:every day and keep several weeks of history I should be able to restore an
>:earlier version of SYSUAF.DAT in that case ( there's virtually no chance they'll
>:all be corrupted ).
> Remove the blade-guards at your risk. SYSUAF, RIGHTSLIST, the queue
> files, and any open files can potentially have cached data.
> The requirement for standalone BACKUP or the bootable environment
> when the operating system system disk is involved has been in place
> for as long as I can remember -- and for just the reasons we are
> discussing here.
>: I understand why the safe thing to say is that "/IGN=INTERLOCK is bogus",
>:but it would be more useful if HP accepted that most of us who run VMS systems
>:run them in a 24x7 environment, and shutting down every day to back up
>:disks isn't an option. What we really need is to fully understand the risks
>:of IGN=INTERLOCK and design backup procedures to mitigate them ( even if
>:they can't be completely eliminated ). A full explanation of how IGN=INTERLOCK
>:works and how it interacts with things like file caching would be a lot more
>:helpful than a simple warning to not use it.
> Interlocks are put in place to keep the visible file data consistent,
> and to provide a mechanism to indicate and prevent access collisions.
> When interlocks are ignored, the OpenVMS BACKUP utility may or may not
> flag these collisions, and may or may not copy consistent data.
> The recommendation is to incorporate the interlocks and the on-line BACKUP
> support processing at least partially into the application -- applications
> that operate continuously have constraints that traditional applications
> tend not to have, including file version number maintenance and, obviously,
> BACKUP support. There are documentation updates queued in this area, and
> yes, having more details on how to interlock access would be useful -- a
> set of updates with details on /IGNORE=INTERLOCK are queued for existing
> OpenVMS manuals, and will appear as these manuals are opened and updated
> for releases. (That said, there is extensive documentation on file-level
> operations and locking available within the OpenVMS documentation set, and
> there are products and approaches which can provide on-line operations.
> Rdb, RTR, existing RMS operations, etc.)
> AFAIK, there is no way to use /IGNORE=INTERLOCK with complete safety save
> on data you don't care about -- and then, why bother with the BACKUP?
> The application must be involved within its own BACKUP (or BACKUP-like)
> operations, or the application must be (able to be) quiesced, or the
> application must be shut down, or you must be using something akin to
> Rdb or distributed capabilities. One of the obvious options (within an
> application) is to use the BACKUP callable API, either directly or with
> metadata or specific extensions. Or to provide a way to flush the
> cached data out pending a BACKUP operation, possibly via a nudge or
> an application-specific BACKUP checkpoint operation, etc.
> Yes, BACKUP/IGNORE=INTERLOCK usually works, and the data on the disk
> is usually consistent. The cost of ensuring complete consistency
> within the archived application data might well be prohibitive, or
> it might not be economical. Or the data involved might well be
> recoverable. Etc.
> AFAIK, any operating system that provides on-line BACKUP involves
> the application either directly or through the use of a database
> or similar API, and OpenVMS certainly follows this model. The
> APIs and tools are available, as are database and like options.
> ---------------------------- #include <rtfaq.h> -----------------------------
> For additional information, please see the OpenVMS FAQ -- www.hp.com/go/openvms/faq
> --------------------------- pure personal opinion ---------------------------
> Hoff (Stephen) Hoffman OpenVMS Engineering hoff[at]hp.com
But in all of this there is one inherent contradiction.
Typically System Admins choose one of a couple of options when backing
up their System Disks.
or Drop a Member of a shadow set and back that up.
Neither option is fully supported; standalone / offline backup is the
generally recognised supported backup methodology for the System Disk.
Alas, that kind of flies in the face of 24 x 7, hence the
contradiction. VMS is high availability, but if you want to back up
the system disk you have to take it down! Sorry but you can't.
There are other options but they're not obvious and I was alerted to
many by the paper John Gillings did in the VMS Technical Journal.
So, I use a merry mixture.
Drop a member of a shadow set.
Use Convert/share to take copies of my queue and authorisation files
(they're not on the System Disk anyway) and secure those with Backup.
And hope it doesn't come to that!
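As a rough DCL sketch of that mixture (not a supported procedure): every
device name, volume label and directory here ($1$DGA102:, DSA0:, MKA500:,
SYSDISK, DISK$COMMON:, DISK$SAFE:) is a made-up placeholder, and the queue
files would need the same CONVERT/SHARE treatment as the authorisation files.

```dcl
$! Sketch only: device names, labels and directories below are hypothetical.
$! 1. Drop one member out of the system disk shadow set (DSA0: assumed).
$ DISMOUNT $1$DGA102:
$! 2. Mount the former member privately and read-only, then image it to tape.
$ MOUNT/NOWRITE $1$DGA102: SYSDISK
$ MOUNT/FOREIGN MKA500:
$ BACKUP/IMAGE/VERIFY $1$DGA102: MKA500:SYSDISK.BCK/SAVE_SET
$ DISMOUNT $1$DGA102:
$! 3. Return the member to the shadow set; a shadow copy brings it up to date.
$ MOUNT/SYSTEM DSA0: /SHADOW=($1$DGA102:) SYSDISK
$! 4. Copy the open, shared authorisation files record by record.
$ CONVERT/SHARE DISK$COMMON:[SYSFILES]SYSUAF.DAT DISK$SAFE:[COPIES]SYSUAF.DAT
$ CONVERT/SHARE DISK$COMMON:[SYSFILES]RIGHTSLIST.DAT DISK$SAFE:[COPIES]RIGHTSLIST.DAT
$! 5. Secure the copies; they are closed files, so no /IGNORE=INTERLOCK is needed.
$ BACKUP DISK$SAFE:[COPIES]*.DAT MKA500:AUTHFILES.BCK/SAVE_SET
$ DISMOUNT MKA500:
```

The dropped member only holds what had reached the disk at the moment of the
dismount, so anything still cached by an application is missing from it,
which is the same caveat discussed above for /IGNORE=INTERLOCK.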
Now compared to some I'm a newcomer, only been involved with VMS since
4.4, but I've never had to recover a System Disk.
Re-built plenty but never had need of a restore. I guess that comes
down to the robustness of the OS, HBVS and the File System.