AIX, JFS2, Snapshots and power loss





I have recently been tasked with improving the backup strategy of a
legacy server supporting 150 users via a terminal-based interface.
Currently the server has a single backup, taken at 2am, and the server
needs to be unused while it runs. This is due to the nature of the
application suite and languages involved: each data file is discrete
and there is no enforced referential integrity between data files, but
a record can be spread across multiple data files. The application
suite writes to each data file in sequence, so one file can be updated
while another is not, creating an inconsistent record.

As such, we stand to lose a significant amount of work if the server
were to fail in some fashion at the end of the working day, but prior
to the backup being taken in the early morning.

As the server is running AIX 5.x, I have decided to implement JFS2
snapshots on the file systems that require backup, which means I can
reduce the 'time off system' in the early morning backup to just that
required to actually take the backup. This will be our 'guaranteed
backup'.

However, I also wish to try and mitigate the risk of a full day's data
loss via the taking of two 'non-guaranteed backups' during the day,
without removing users from the system.

The justification here is that, if we were to encounter a full power
loss on the server, significant portions of the data files would get
corrupted. This occurred a month ago (the UPS blew the protected
circuit, taking down the server - one of those things that is never
supposed to happen). The act of taking a snapshot, however, will not
result in corrupted data files within the snapshot, just the potential
for inconsistent records among those being worked on at the time. Or,
in other words, controllable, manageable corruption levels that can be
checked for if everyone understands that they exist in the first place.
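
To illustrate the kind of check I mean, here is a minimal sketch in
Python. It assumes, purely for illustration, that each data file can be
reduced to the set of record IDs it contains; the file names and the
find_partial_records function are hypothetical and are not part of the
application suite.

```python
# Illustration only: the real data files have their own formats. Here each
# file is modelled as the set of record IDs it currently holds.

def find_partial_records(files):
    """Return {record_id: [files missing it]} for every record that is
    present in at least one file but absent from at least one other."""
    all_ids = set()
    for ids in files.values():
        all_ids.update(ids)

    partial = {}
    for record_id in sorted(all_ids):
        missing = sorted(name for name, ids in files.items()
                         if record_id not in ids)
        if missing:
            partial[record_id] = missing
    return partial


# Hypothetical snapshot contents: record 1003 reached the first file, but
# the snapshot was taken before the application wrote the other two.
snapshot_contents = {
    "customers.dat": {1001, 1002, 1003},
    "orders.dat": {1001, 1002},
    "invoices.dat": {1001, 1002},
}

for record_id, missing in find_partial_records(snapshot_contents).items():
    print(f"record {record_id} incomplete, missing from: {', '.join(missing)}")
```

Any record flagged this way would be one that was mid-update when the
snapshot was taken, and could be re-keyed or discarded after a restore.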

So, the question I need to ask is:

How well do JFS2 snapshots handle a complete power loss? In the
incident we had last month, we lost approx 60% of our data through
corruption, but how would a snapshot of that partition have fared?
Would it also have suffered corruption, or would it have been OK?

For example, I have /mydata and I snapshot it to /mysnapshot at 6pm.
At 7pm we encounter the 'worst case scenario' and /mydata is left
significantly corrupt. Will the snapshot also be corrupt? How do AIX
and JFS2 handle this in the background? Will the snapshot be usable?

I hasten to add that there are also tape and remote file copy backups
being taken during the 2am window, so we are not relying on snapshots
as the actual backup, just as a means of improving it. The extra
snapshots during the day are a nicety rather than anything we would be
reliant on.


Cheers
Richard Price