Re: 10-year cluster uptime anniversary



On May 31, 5:56 pm, JF Mezei <jfmezei.spam...@xxxxxxxxxxxxx> wrote:
Ed Wilts wrote:
Its primary purpose is to run a commercial publishing application,
Datalogic Pager.

Is this in a university environment? Could you describe in what way
this would be considered "mission critical" enough to justify 5 nodes in
2 datacentres running a legacy OS instead of just some linux box in a
rack somewhere?

No, this is not an educational environment - it's commercial.

Have you had to fight pressure to replace it, or is it regarded as a key
asset by your employer?

It's a key asset. The business unit has attempted to replace it but
has been unable to find anything that can do the job. We have a lot
of custom apps wrapped around Pager. We certainly have a bunch of
Linux systems, some of which I helped build many years ago before I
transitioned into our storage team, and some which help run other key
parts of our infrastructure. Some of the output from VMS goes through
Linux systems for further processing. We also have other OSes, but
none with the uptime of my cluster :-). I have interactive users on 3
continents, all on the same cluster.

One thing Linux systems suck at is being nice generic batch engines, and
that's a key VMS cluster feature. We crank out a *lot* of batch jobs,
nicely balanced across the cluster.

Have you been lucky in being able to use 5 spare nodes to provide this
service and give high uptime without being asked for it, or was this
planned/required from the start?

Apparently this application ran on PDPs before I got here. It was a
mixed VAX/Alpha environment when I arrived (7 nodes in one data center
I think), and it's all Alpha now. My original employment contract
demanded a pre-set (and confidential) high level of uptime. It's not
luck - this cluster didn't have the same level of availability before
I got here. VMS alone doesn't provide HA - it needs proper
administration too. I used to help run a cluster that had (and may
still have) the record for the largest number of simultaneous interactive
users - over 3,300 (and 6 nodes!).

We frequently work with tight deadlines where downtime has significant
implications for both us and our customers. An old sign on my door
said "OpenVMS - when downtime is not an option".

.../Ed


