Re: Stripe sizing and maxcontig




"Mark" <mark.round@xxxxxxxxx> wrote in message news:1154013691.709865.90720@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Hi all,

I've been reading up a lot recently on filesystem tuning, and stripe
sizing, mainly because we've just got a shiny new 3510FC array (dual
RAID controllers) and are looking at moving our databases over onto it
(Solaris 10 update 1 / AMD64 architecture, UFS filesystems).

First off - the person from the VAR who came round to demo it said that
a RAID 5 volume was perfectly OK, and we wouldn't incur that much of a
performance hit from choosing that over, say, RAID 1+0. Is this correct?
Our databases are much more read-intensive than write-intensive, but
still... Alarm bells are ringing at this advice even though the array
does have a hefty whack of cache on board.

Talk to your DBA. It is likely that different parts of the database
will suit different configurations. With newer firmware, you have
much more freedom to tailor stripe size and so on differently on
each logical drive. Check the release notes on sun.com.

If you need to update your firmware, do so before you start, of course.


Secondly - I could just use some clarification regarding my
methodologies - I would massively appreciate someone sanity checking my
thinking here :)

I have observed our current database's IO patterns using iostat over
the course of an hour. I discovered it was doing 2921.8 kr/s and 140.1
kw/s (197.4 r/s and 13.7 w/s). This gave me an average read size of
around 14 kB, and an average write size of roughly 10 kB.

I'm not sure this is meaningful. You need to cope with the peak
rate, not just the average, and measure what the database is doing
rather than just your current filesystem parameters: do three 14kB
reads really mean the database wanted 30kB, for instance?
And what is your database block size?

Speak to your DBA. Are your measurements valid for each table?
Can read-only tables be isolated? Are database log files random access?
(Mind you, does Oracle still recommend SAME and hope for the best?)
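
For reference, the averages quoted above are just throughput divided by
operation rate. A minimal sketch of that arithmetic in Python, using only
the hourly iostat figures from the post (the function name is made up for
illustration):

```python
def avg_transfer_kb(kb_per_sec: float, ops_per_sec: float) -> float:
    """Average transfer size in kB: throughput divided by operation rate."""
    if ops_per_sec <= 0:
        raise ValueError("ops_per_sec must be positive")
    return kb_per_sec / ops_per_sec


# Hourly averages quoted in the post (iostat kr/s, r/s, kw/s, w/s).
avg_read_kb = avg_transfer_kb(2921.8, 197.4)
avg_write_kb = avg_transfer_kb(140.1, 13.7)

print(f"average read  size: {avg_read_kb:.1f} kB")   # 14.8 kB
print(f"average write size: {avg_write_kb:.1f} kB")  # 10.2 kB
```

These are means over an hour, so they say nothing about the peak rate or
about how the individual request sizes are distributed, which is the
point being made above.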


My assumptions are: For random access, I'm assuming the best thing to
do is to set the stripe size of a RAID volume to the average transfer
size - so if I have, say, 4 disks in a volume they could all in theory
be handling different requests. If I'm doing mostly sequential
transfers, I'm better off setting the stripe size to (number of disks *
transfer size), so a single transfer is spread out amongst as many
disks as possible. Is this correct?


Strictly, if you are using raid-5 then the discs cannot all be handling
different requests, because each write will involve more than one
disc on account of the parity update.
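
To put a rough number on the parity cost, here is a back-of-envelope
sketch in Python. It assumes the textbook worst case with no help from
the controller cache: a small RAID 5 write costs four disk operations
(read old data, read old parity, write new data, write new parity) and a
RAID 1+0 write costs two (one per mirror side). The rates are the hourly
averages from the post; the names are made up for illustration, and what
the 3510 actually does with its cache will differ.

```python
# Back-end disk operations generated per host write, ignoring controller cache.
# RAID 5 small write: read old data, read old parity, write data, write parity.
# RAID 1+0 write: one write to each side of the mirror.
WRITE_PENALTY = {"raid5": 4, "raid10": 2}


def backend_iops(reads_per_sec: float, writes_per_sec: float, level: str) -> float:
    """Worst-case disk operations per second behind the controller."""
    return reads_per_sec + WRITE_PENALTY[level] * writes_per_sec


for level in ("raid5", "raid10"):
    print(level, round(backend_iops(197.4, 13.7, level), 1))
# raid5 252.2
# raid10 224.8
```

With a workload this read-heavy the two come out fairly close on average,
but that says nothing about write bursts at peak.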

Sequential transfers are generally favoured by wide stripes and
big blocks whereas for random i/o the reverse tends to be true.
Assuming random i/o of small size, it is probably best to set
the stripe size to a bit larger than your i/o ops (say, 16kB) so they
can be satisfied in one go.

There are some guidelines in the 3510 documentation on sun.com
as well as limitations.
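
As an illustration of why a stripe size slightly larger than the typical
request helps random i/o, the sketch below counts how many data discs a
single request spans. It assumes a plain round-robin layout and ignores
parity rotation, so it is not a model of the 3510's actual on-disc
layout; the function name and parameters are hypothetical.

```python
def disks_touched(offset_kb: int, size_kb: int,
                  stripe_unit_kb: int, data_disks: int) -> int:
    """Number of distinct data discs one request spans in a simple
    round-robin striped layout (parity placement ignored)."""
    first_unit = offset_kb // stripe_unit_kb
    last_unit = (offset_kb + size_kb - 1) // stripe_unit_kb
    units = last_unit - first_unit + 1
    return min(units, data_disks)


# A 14 kB request against 3 data discs:
print(disks_touched(0, 14, 16, 3))  # 1: aligned, fits in one 16 kB unit
print(disks_touched(8, 14, 16, 3))  # 2: misaligned, straddles two units
print(disks_touched(0, 14, 4, 3))   # 3: small units spread it over every disc
```

The second case shows that alignment matters as much as size: a request
that does not start on a stripe-unit boundary can still tie up two discs.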


If I've got that right, my next question relates to maxcontig. If I set
the stripe size of the RAID volume to 16 kB (closest to the 14 kB value
mentioned above), I know I then need to tune maxcontig accordingly. If
I lower this from the default setting of 7 to 6, this means the system
should bunch together transfers until they reach a size of 48 kB. If I
have 4 disks in my RAID 5 volume (3 available volumes), and a stripe
size of 16 kB so that the total stripe width is also 48 kB, does that
mean that every transfer will be spread evenly across all disks?


If you have three 4-disc raid-5 volumes then you have no online spares:
is this wise?

Are you sure the default maxcontig is 7 and not 128? Or is this
a sparc vs x86 thing? Or is my memory failing? It depends whether
you are doing sequential or random i/o. In any case, you may do
better to mount the filesystem with forcedirectio and not worry
about it.
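
For what it is worth, the 48 kB arithmetic in the question can be checked
with a few lines of Python. This sketch assumes an 8 kB UFS block size
(which is what the "6 gives 48 kB" figure implies) and one disc's worth
of parity per stripe; both are assumptions rather than facts about this
particular setup, and the function names are made up.

```python
def cluster_kb(maxcontig: int, fs_block_kb: int = 8) -> int:
    """Largest clustered transfer: maxcontig filesystem blocks."""
    return maxcontig * fs_block_kb


def stripe_width_kb(total_disks: int, stripe_unit_kb: int,
                    parity_disks: int = 1) -> int:
    """Data held in one full stripe, excluding parity."""
    return (total_disks - parity_disks) * stripe_unit_kb


print(cluster_kb(6))           # 48
print(stripe_width_kb(4, 16))  # 48
print(cluster_kb(7))           # 56, so it overruns a 48 kB stripe
```

Matching the two numbers only helps transfers that are actually clustered
(sequential i/o) and that start on a stripe boundary; it does not by
itself spread every transfer evenly across the discs.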

This isn't really going anywhere so how about a change of tack?

Option 1: (the cynic's option) your db is not doing much so the 3510
will more than adequately keep up with it using all the default settings.

Option 2: take the opportunity to experiment with different parameters
before you go live. Probably your db supplier publishes guidelines.
This will have the additional benefit of making you familiar with
the 3510 when a controller fails at 3 o'clock in the morning.

--
John.


