Re: 306GB drives!

From: David McKenzie (david.mckenzie_at_paradigm-shift.biz)
Date: 08/23/03


Date: Sat, 23 Aug 2003 18:49:27 +1000

Nice to see your contributions again, Keith.

I particularly enjoy your well-reasoned and scientific contributions.

I do wonder, however, about three-site clusters. In the abstract, I wonder
at what stage complexity replaces reliability. As an early player in this
area I always had problems with a quorum node in a DT (disaster-tolerant)
setup, as distinct from a DR (disaster-recovery) node.

The issue then was that the voting node was a small box that did not have
the storage replicated. If you really can set up and afford the three-way
replication, you then have to look very carefully at the true independence
of the links between the sites.

As always, and I believe we have had this discussion, ultimately one has to
consider the procedures that guarantee no single point of failure. In all
cases one has to review how much complexity a setup such as three nodes
introduces, compared with the gain in reliability.

Personally my problems with two sites revolved around the peripheral issues
rather than the concept.

Possibly you would like to discuss keeping three sites up to the mark as
opposed to two.

I would start with what is gained, compared with the increase in complexity
of procedures.

However, YMMV.

"Keith Parris" <keithparris_NOSPAM@yahoo.com> wrote in message
news:cf15391e.0308221032.7fa596d6@posting.google.com...
> Michael Austin <maustin@no-more-spam.firstdbasource.com> wrote in message
> news:<Gxc1b.1132$Ct.799318088@newssvr30.news.prodigy.com>...
> > Keith Parris wrote:
> > > This strategy works fine until the controller [pair] fails, or the
> > > datacenter in which the EVA is located is destroyed by a disaster.
> >
> > unless your redundant controller is in another building up to
> > 30-50 kilometers away in another city along with an entirely redundant
> > cluster.. So in essence, you have 4 controllers (2 controller pairs).
> > The question is really "how much $$$ is your data worth?".
> >
> > EVAs have the ability to replicate themselves to an entirely different
> > controller over a wide-area SAN using dark fibre.
>
> As do HSG, XP, etc. controllers.
>
> Controller-based data replication is certainly an appropriate solution
> for disaster tolerance in many cases. Often it's the ONLY option for
> any sort of disaster-tolerance for platforms with poor cluster
> support. And it's very popular for disaster recovery, where Recovery
> Time Objectives are less-stringent.
>
> But more commonly in the OpenVMS Cluster world, HBVS tends to be
> preferred in disaster-tolerant clusters, for the following reasons:
> 1) With controller-based replication, failover between sites is not
> automatic, and is difficult to automate (at best, one typically sees
> pre-written scripts for failover that get initiated manually), so it's
> only appropriate if your Recovery Time Objective is loose enough to
> allow the time for this failover to take place. HBVS allows faster,
> and fully-automated, failover, without requiring application downtime.
> 2) With controller-based replication, data is typically accessible
> from only one site at a time (at best, read-only access is available
> at the remote site). All I/Os (both reads and writes) to the
> mirrorset from systems at the remote site must be done remotely, and
> suffer the inter-site latency. With HBVS, reads can be directed to
> the shadowset member disks at the same site; only writes to the remote
> shadowset members must go remotely.
> 3) Inter-site links are expensive, particularly at 30-50 km distances.
> With a VMS cluster, you need an SCS interconnect between sites. To use
> controller-based replication, you need an additional interconnect that
> can carry Fibre Channel traffic (or conversion boxes to allow FC
> traffic to share the bridged interconnect used for SCS). Using HBVS,
> you have the choice of MSCP-serving remote I/Os over the same SCS
> interconnect, and/or using access to disks through a FC-capable
> interconnect.
> 4) VMS Clusters have a Quorum scheme to prevent uncoordinated access
> to the multiple copies of the data that mirrorset members at different
> sites represent. With controller-based data replication, you must
> basically use humans to replace the Quorum Scheme and the HBVS
> Generation Number algorithms, keeping track of (through failure
> events) which copy of the data is the most up-to-date and only
> allowing user access and updates to that copy at any given point in
> time.
>
> There are disaster-tolerant VMS clusters in place which actually have
> 3 datacenters (plus a quorum node in a 4th site), where each of the 3
> sites has a valid copy of the data, and the cluster can continue
> operating despite loss of any 2 of those 3 datacenters without data
> loss or downtime, thanks to HBVS.
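
To put some numbers on the quorum question, here is a minimal Python sketch
of the vote arithmetic. It uses the standard OpenVMS rule that quorum is
(EXPECTED_VOTES + 2) / 2, rounded down. The site names and the vote
assignments (2 votes per site, 1 for the quorum node) are hypothetical and
chosen only for illustration; the sketch also ignores any adjustment of
quorum after nodes leave the cluster, so it covers only simultaneous losses
against a fixed quorum value.

```python
def quorum(expected_votes: int) -> int:
    """OpenVMS-style quorum: (EXPECTED_VOTES + 2) / 2, rounded down."""
    return (expected_votes + 2) // 2


def survives(votes_by_site: dict, lost_sites: set) -> bool:
    """True if the remaining votes still meet the statically computed quorum.

    Assumption: quorum is derived from the full vote total and is NOT
    lowered after the loss.
    """
    expected = sum(votes_by_site.values())
    remaining = sum(v for site, v in votes_by_site.items()
                    if site not in lost_sites)
    return remaining >= quorum(expected)


# Hypothetical vote assignments, for illustration only.
two_site = {"A": 2, "B": 2, "Q": 1}            # two sites + quorum node
three_site = {"A": 2, "B": 2, "C": 2, "Q": 1}  # three sites + quorum node

if __name__ == "__main__":
    for name, cfg in (("two-site", two_site), ("three-site", three_site)):
        total = sum(cfg.values())
        print(name, "expected votes:", total, "quorum:", quorum(total))
    print("two-site, lose A:", survives(two_site, {"A"}))
    print("two-site, lose A and Q:", survives(two_site, {"A", "Q"}))
    print("three-site, lose A:", survives(three_site, {"A"}))
    print("three-site, lose A and Q:", survives(three_site, {"A", "Q"}))
    print("three-site, lose A and B:", survives(three_site, {"A", "B"}))
```

With these assumed votes, the three-site layout rides out the loss of one
site plus the quorum node, which the two-site layout does not. Losing two
full sites at the same instant still falls below the static quorum, so
surviving that case depends on quorum being adjusted between failures, which
is exactly the kind of procedure whose complexity is being questioned above.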


