SUMMARY: sd_max_throttle settings in /etc/system and in jni configuration file

From: Miller, Anthony, A, Tech Dev, VF UK (Anthony.Miller_at_gb.vodafone.co.uk)
Date: 11/28/03

    Date: Fri, 28 Nov 2003 09:36:35 -0000
    To: "SUN Managers (E-mail)" <sunmanagers@sunmanagers.org>
    
    

    All...

    Apologies for the late delivery of this posting. I forgot I hadn't sent it
    out. The original posting is shown at the end of this mail for reference.

    Many thanks to the following, who replied. Others replied but didn't
    want to be named. Many thanks to you all:

    Johan Hartzenberg [mailto:jhartzen@csc.com]
    Kris Briscoe [krisb@tti-telecom.com]
    Gunnar Brasack [gbr@asv.de]

    There was some discussion as to whether the sd_max_throttle setting was EMC
    only. We are using XP512 (rebadged HDS arrays) so this may not be a relevant
    parameter for us.

    Others reported that they had had discussions with Sun about the
    sd_max_throttle parameter in the /etc/system file. The advice was to set
    it to <256 / number of LUNs>.
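
    As a rough illustration of that rule of thumb, the sketch below divides
    a queue budget of 256 by the LUN count. The function name, the choice to
    round down, and the floor of 1 are assumptions made for illustration;
    none of them come from Sun's advice.

```python
def per_lun_throttle(queue_budget: int, lun_count: int) -> int:
    """Split a shared command-queue budget evenly across LUNs, rounding down."""
    if lun_count <= 0:
        raise ValueError("lun_count must be positive")
    # Assumption: never go below 1, since a throttle of 0 would allow no I/O.
    return max(1, queue_budget // lun_count)


print(per_lun_throttle(256, 120))  # 2
print(per_lun_throttle(256, 240))  # 1
```

    With 120 LUNs per port this gives 2, which happens to match the
    sd_max_throttle=2 entry shown in the original posting.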

    There was some discussion around disabling DMP to see if performance
    declined/improved etc. If it improved, maybe investigate adding additional
    paths.

    Basically - we haven't resolved the problem and investigations continue. We
    tested values of 2 (/etc/system) & 64 (JNI file), 8&8, 4&4, 2&2, 20&20. The
    throughput never changed more than a few percent, but always in the downward
    direction. We have therefore concluded that this is not a significant cause
    of the problems and set the values back to their defaults of 2&64.

    Miraculously, our slow disk service times have now gone away but nobody knows
    why. Always worrying! The SAN team made a few tweaks to the infrastructure
    but wouldn't admit to changing anything significant.

    Also, HP have made some changes on the array to cater for other projects but
    say that this would not affect us.

    The box itself is now under low stress but the throughput has not increased.
    There are some Oracle tweaks our DBA is going to test out and we are going to
    investigate the network outside of the host itself.

    If I come up with anything devastating, I'll publish a further summary.

    thanks and best regards

    Tony Miller

    ======================Original Posting========================================
    All...

    I have a Solaris-8 F15K domain with 2 x JNI FCX2-6562 dual port adapters
    and V5.3 of the JNI driver.

    The JNI's are connected to brocade-3900 switches and are forced to 2Gb. An
    XP512 disk array is also connected to this dedicated disk SAN. We have 120
    LUNS (Open'E's) presented to each port of the first card (i.e., 240 LUNs
    in total). VxVm DMP is used and we can see these 2 x 120 LUNS down the
    alternate paths on the second card.

    The LUNs are under VxVm control, (V3.5) with VxFs V3.5 file systems on top.
    We have done some file system tuning via the /etc/vx/tunefstab and set the
    relevant mount points up with read_pref_io=64k,read_nstream=1.

    I won't go into much detail but we are experiencing some disk write
    throughput issues. This is under investigation but specifically the issue
    relates to a dramatic throughput drop off in a highly parallel workload
    environment.

    Shown below is the tail end of the JNI configuration file - for the last
    two LUNs only:

    jnic146x3-target0_lun118_hba="jnic146x3";
    jnic146x3-target0_lun118_throttle=64;
    jnic146x3-target0_lun119_hba="jnic146x3";
    jnic146x3-target0_lun119_throttle=64;

    Below is an extract of my /etc/system file:

    * Allow SCSI transfers up to 8MB on VXVM layers
    set vxio:vol_maxio=16384
          : :
          : :
    *
    * Required for XP512
    *
    set sd:sd_io_time=0x3c
    set sd:sd_max_throttle=2
    set maxphys=8388608
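
    For reference, the values in that extract can be converted to more
    readable units as sketched below. This assumes vol_maxio is counted in
    512-byte sectors (which agrees with the 8MB comment), maxphys is in
    bytes, and sd_io_time is in seconds; the variable names are purely
    illustrative.

```python
SECTOR_BYTES = 512  # assumption: vol_maxio is expressed in 512-byte sectors
MIB = 2 ** 20

vol_maxio_sectors = 16384  # set vxio:vol_maxio=16384
maxphys_bytes = 8388608    # set maxphys=8388608
sd_io_time = 0x3C          # set sd:sd_io_time=0x3c

vol_maxio_mib = vol_maxio_sectors * SECTOR_BYTES / MIB
maxphys_mib = maxphys_bytes / MIB

print(vol_maxio_mib, maxphys_mib, sd_io_time)  # 8.0 8.0 60
```

    Under those assumptions both transfer-size limits work out to 8MB and
    the sd I/O timeout is 60.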

    I have searched the archives and various other mailing lists. Several
    articles refer to the /etc/system file entries as being the 'preferred'
    ones with 'optimal performance'.

    I have a specific question relating to the lun throttle settings (of 64 in
    the JNI driver file) and its interaction with the sd_max_throttle setting
    of 2 in the /etc/system file. I checked the Solaris-8 kernel tunables guide
    but couldn't find anything useful with regard to this.

    Question 1 - do the LUN level throttle settings (in the JNI config file)
    override those in the /etc/system config file or vice versa? Or are they
    independent (i.e., the /etc/system file entries relate to the direct
    attach SCSI disks only)?

    Question 2 - Are these suitable throttle settings (for 120 LUNs per HBA
    port)? It has been suggested that values of 4-6 would be more appropriate
    (1024 for an XP512 divided by 120 LUNs per port = 8.5).
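
    To make the arithmetic behind Question 2 concrete, the sketch below
    compares the worst-case number of outstanding commands (throttle x LUNs)
    against the 1024 figure. It assumes 1024 is a per-port limit and that
    every LUN fills its throttle at once; the names are hypothetical, and
    which throttle is actually in effect is exactly what Question 1 asks.

```python
PORT_QUEUE_DEPTH = 1024  # assumption: the XP512 figure applies per port
LUNS_PER_PORT = 120


def worst_case_outstanding(throttle: int, lun_count: int) -> int:
    """Commands in flight if every LUN reaches its throttle simultaneously."""
    return throttle * lun_count


for throttle in (2, 4, 6, 8, 64):
    total = worst_case_outstanding(throttle, LUNS_PER_PORT)
    status = "over" if total > PORT_QUEUE_DEPTH else "within"
    print(f"throttle={throttle:>2}: up to {total:>4} commands, {status} budget")
```

    On these assumptions, throttles up to 8 stay within 1024 (8 x 120 = 960),
    whereas 64 per LUN could allow up to 7680 outstanding commands.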

    Sorry to pose such vague questions, but your advice is appreciated.

    Many thanks and best regards

    Tony Miller

    ===================End Of Original Posting========================================

    +----------------------------------------+
    | TONY MILLER
    | Team Leader : Technical Projects,
    | VODAFONE LTD,
    | Derby House,
    | Newbury Business Park,
    | Newbury, Berkshire.
    |
    | Phone +44 (0)1635-677687(local)
    | Mobile +44 (0)7766-028752
    | Email anthony.miller@vf.vodafone.co.uk
    | FAX +44 (0)1635-233517
    +-------------------------------------------

