SUMMARY: Qlogic HBA failover loses disk



Hello,

Many thanks to those who replied, and apologies for not replying to some
queries. It seems that the SAN we use (from ipstor) requires a piece of
their own software to be installed on Solaris for failover to work.
(We weren't told this before.) So, as instructed, I have disabled mpxio
and installed their software. It has helped a bit, but we are still
having a problem with the failover. This has been passed to the ipstor
support people.

Replies were from:
francisco roque
Dean Ross-Smith
Lisa Kachold



Regards,

John.


Original message:

On Fri, 2009-07-17 at 22:36 +0100, John Horne wrote:
Hello,

We have two Sun T2000 (SPARC) servers running Solaris 10. Each server
has two Qlogic 2460 HBAs. Generally they work fine with our SAN.
However, when the SAN admins cause a failover (the HBAs use
multipathing), Solaris loses connection with the disks completely and
pretty much immediately. Trying to access the disks gives 'I/O error'
messages. We can unmount the disks, but cannot remount them (same error).
So we have to reboot the server.

We have looked at the QLA timers and they seem fine. Likewise we have
looked at Solaris timers such as 'fp_retry_count' and
'fp_offline_ticker'. However, the problem is that the connection is lost
in a very short time - around 5 or 6 seconds. The timers' default values
are all way above this, and so should be fine.
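
The comparison described above (timer settings versus the observed 5 or
6 second loss) can be sketched as a small script. This is only a rough
illustration, not something that was run on the servers: it assumes the
tunables are set in /etc/system using 'set module:variable = value'
lines, that both variables belong to the 'fp' module, and the values in
SAMPLE are hypothetical rather than taken from our systems.

```python
import re

# Matches /etc/system tunable lines of the form: set module:variable = value
SET_LINE = re.compile(r"^\s*set\s+(\w+):(\w+)\s*=\s*(\w+)\s*$")

# Hypothetical /etc/system excerpt; the values are illustrative only.
SAMPLE = """\
* Fibre channel port driver tunables (example values)
set fp:fp_offline_ticker = 90
set fp:fp_retry_count = 5
"""

def parse_tunables(text):
    """Return {(module, variable): int_value} for each 'set' line."""
    tunables = {}
    for line in text.splitlines():
        # In /etc/system a line starting with '*' is a comment.
        if line.lstrip().startswith("*"):
            continue
        match = SET_LINE.match(line)
        if match:
            module, variable, value = match.groups()
            # Base 0 accepts decimal as well as 0x-prefixed hex values.
            tunables[(module, variable)] = int(value, 0)
    return tunables

def timers_below(tunables, time_based, window_seconds):
    """List the time-based tunables whose value is at or below the window."""
    return [name for name in time_based
            if name in tunables and tunables[name] <= window_seconds]

if __name__ == "__main__":
    settings = parse_tunables(SAMPLE)
    # Only fp_offline_ticker is treated as a time in seconds here;
    # fp_retry_count is a count, so it is not compared against the window.
    suspects = timers_below(settings, [("fp", "fp_offline_ticker")], 6)
    print(suspects or "no timer is short enough to explain the loss")
```

With the sample values no timer falls inside the 6 second window, which
matches the reasoning above: the timers alone do not explain the loss.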

If I run the Qlogic SANsurfer software ('scli') while the disks are
'lost', then it detects both HBAs and reports them as being online.

Does anyone have any idea what is causing the loss of the disks in such
a short time? It seems to be more a Solaris problem than a QLA one.
Access to the SAN from Windows servers, also with Qlogic HBAs, works
fine when failing over, even though the timer values there are the same
as those set on the Solaris servers.



Thanks,

John.

--
---------------------------------------------------------------
John Horne, University of Plymouth, UK Tel: +44 (0)1752 587287
E-mail: John.Horne@xxxxxxxxxxxxxx Fax: +44 (0)1752 587001
_______________________________________________
sunmanagers mailing list
sunmanagers@xxxxxxxxxxxxxxx
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


