Re: OpenVMS Hardware Troubleshooting? (was: Re: walking into a war zone

From: Hoff Hoffman (hoff_at_hp.nospam)
Date: 01/24/04


Date: Sat, 24 Jan 2004 00:57:46 GMT

In article <e6c6a64b.0401231240.7ebaf9a@posting.google.com>, emailforwes@earthlink.net (Wes Emerson) writes:
:Okay, let me paint the picture for you, I am a seasoned Solaris, Linux
:and BSD admin with a few years and emergencies under my belt.

  Then you clearly know the value of picking relevant subject lines for
  your postings (to allow you to get the quickest answer, and to avoid
  the unfortunate chance that your postings are interpreted as spam),
  and describing what you want to do not only in terms of commands or
  examples from other operating systems but also in more generic terms
  for those less than familiar with environments you know well.

  You will also obviously be aware that concepts often map from one
  operating system to another, but the implementations and command
  syntax involved can differ widely -- or wildly.

  I would expect that any UNIX administrator would expect me to become
  at least somewhat familiar with basic UNIX system administration and
  commands and UNIX documentation, for instance, before entertaining
  my questions on dmesg or sysconfig, obviously.

:Now some far flung part of the company wants me to troubleshoot an
:unknown Dec Vax box somewhere up the road.

  (Without reading too far into your message, y'all don't appear to be
  thrilled about this -- I might well be misinterpreting, obviously. :-)

  Hardware? Software? Application? Etc. This is obviously a rather
  open-ended statement. Given some phrasing, I will assume a hardware
  problem -- but that is far from certain.

  VAX systems run a variety of operating systems -- since you are
  asking your question here in comp.os.vms, I will assume the target
  operating system is OpenVMS VAX -- but again, that's far from being
  a certainty.

:Now my question to the multitudes, are there any commands that can be
:used to check hardware such as cpu's, memory, and discs to look for
:hardware failures.

  Sorry. I'm just a person, and not a multitude. :-)

  As others have mentioned in previous replies, SHOW ERROR is the most
  typical command used for a quick peek at error activity, followed by
  one of the common error analysis tools.

  This command might or might not help, depending on the particular
  component and the severity of the error. (If the OpenVMS VAX system
  is not bootstrapping or the VAX system hardware is not powering up,
  for instance, the SHOW ERROR command is not going to be available. :-)

  Commands reviewing hardware errors won't help for software errors,
  either -- whether an operating system problem, a layered product
  error, or an application software problem.
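
  A minimal sketch of that quick check, assuming a privileged DCL
  session on a running OpenVMS VAX system. The /SINCE value is only an
  illustration, and the error analysis tool actually installed on the
  target box may differ:

```dcl
$ ! Assumption: logged in with enough privilege to read the error log.
$ ! Quick peek at the error counts for CPU, memory and devices:
$ SHOW ERROR
$ ! Follow-up with the error log report formatter, limited to
$ ! entries logged since yesterday (illustrative time window):
$ ANALYZE/ERROR_LOG/SINCE=YESTERDAY
```

  Non-zero counts from SHOW ERROR indicate which devices are worth a
  closer look in the error log report.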

:The Unix equivalents would be something like dmesg (for general
:status), the messages file (/var/log/messages or /var/adm/messages),
:if I'm really lucky prtdiag on solaris, and something that checks the
:hard drives (solaris iostat -En).

  You can log the system startup messages (dmesg apparently provides at
  least this) using a system parameter STARTUP_P2, and specifically by
  setting the system parameter to D. (OpenVMS conversational bootstrap
  discussions are in the OpenVMS FAQ -- help text for the various system
  parameters is available, too.) You can also simply watch the various
  startup messages scroll by -- on a slow console terminal set for smooth
  scroll, this is an obvious (albeit ugly) approach.
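
  A minimal sketch of setting that parameter from a running system,
  assuming a privileged account and an OpenVMS version whose STARTUP_P2
  accepts letter values. SYSGEN is used here only for illustration; the
  same SET command can be entered at the SYSBOOT> prompt during a
  conversational bootstrap:

```dcl
$ ! Assumption: privileged account; STARTUP_P2 accepts "D" on this version.
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE CURRENT
SYSGEN> SET STARTUP_P2 "D"
SYSGEN> WRITE CURRENT
SYSGEN> EXIT
```

  The setting takes effect at the next bootstrap, with the startup log
  typically written to SYS$SPECIFIC:[SYSEXE]STARTUP.LOG. Because WRITE
  CURRENT makes the change persistent, set STARTUP_P2 back to "" the
  same way once the log has been captured.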

  The prtdiag analog is probably the ANALYZE/SYSTEM utility's commands
  CLUE CONFIG and CLUE FRU. CLUE CONFIG shows you the hardware
  configuration. CLUE FRU requires DECevent or HP (Compaq) Analyze, a
  tool which is usually installed and is also available for download
  via the pointers in the OpenVMS FAQ.

  The root question is, however, what's wrong with the box -- if you're
  not familiar with the target hardware and software (and no offense is
  intended -- I'd be asking the same questions if sent out to troubleshoot
  a bad frobnitz on the walkabout platform, for instance) this is going
  to take some time.

  In addition to developing and determining some additional details around
  the problem or the failure, the OpenVMS VAX operating system version, the
  particular VAX model involved, and details on the hardware and/or software
  specific to the error will all be of interest. (These details can help
  localize the error and allow answers to be better tailored to your
  environment.)

  Have you also considered engaging a consultant, a third-party VAX hardware
  service organization, or HP Services? (Assuming bad hardware, of course.)

  Assuming the VAX is not in a literal war zone -- I prefer to wait until
  the projectiles have settled out before entering such areas -- I'm sure
  there are folks around that can assist you in resolving this quickly.

 ---------------------------- #include <rtfaq.h> -----------------------------
    For additional information, please see the OpenVMS FAQ -- www.hp.com/go/openvms/faq
 --------------------------- pure personal opinion ---------------------------
        Hoff (Stephen) Hoffman OpenVMS Engineering hoff[at]hp.com


