Re: Slow Filesystem I/O

From: Dave Froble (davef_at_tsoft-inc.com)
Date: 05/01/05


Date: Sun, 01 May 2005 04:38:07 -0400

Bill Todd wrote:
> Dave Froble wrote:
>
> ...
>
>> Well, Ok, the way I read it is that some of the hardware on top end
>> stuff is able to mask problems with writing large amounts of data
>> quickly. Get a bunch of cache and battery backup and you can mask
>> slower write performance, whether based on the filesystem design or
>> other. Only problem would be if you rammed the data at the storage
>> for a long period of time, say several days.
>>
>> What did I miss?
>
>
> 1. Stable write-back controller cache can *help* write performance
> somewhat, but (as Rob himself noted) it still has latency on the order
> of a millisecond, so *at best* it can improve upon the latency of a fast
> disk by a factor of less than 10. Write-back system cache, as I noted,
> can improve upon the latency of a fast disk by a factor of 100 or more -
> plus whatever additional gains you get from not having to write out a
> significant amount of data at all (the data that gets deleted while it's
> still in the cache), plus whatever the far greater write bandwidth to
> system cache gets you in terms of avoiding bottlenecks.

Ok, I see the point about system cache.

The latest caching that VMS has seems to do a decent job. I've seen
cases of significant overall improvement, which is really apparent when
you turn off the caching. :-)

> Rob's contention was not the entirely reasonable one that write-back
> controller cache can *help* mask *some* of the major
> default-write-performance handicap that VMS has compared with systems
> that use larger and better-streamed write-back buffering: it was
> (first) that it masked the problem "for the most part" and (later)
> flatly that "the 'problem' of a slow filesystem is a non-problem" and
> "Won't matter how slow the filesystem is, the hardware will be there to
> mask it". It's pretty incompetent to claim that spotting your
> competition a 10:1 write-latency advantage (and an even greater write
> bandwidth advantage) is a 'non-problem' simply because you've reduced
> that handicap from more like 100:1 if you didn't use the controller
> caching at all - but then I'm guessing that Rob doesn't have much
> experience with modern Unix performance (and possibly that you don't,
> either).

No, I don't have any Unix experience.

I'm guessing that it's very dependent upon the type of workload. As
I've said elsewhere, some people use computer storage as a data
destination, while others use it much more as a source of data.

> 2. Your 'long period of time' is nothing like 'several days'. Let's
> take RMS's pathetic 8 KB default buffer as an example. If you're
> stuffing sequential-file data at the disk as fast as RMS can (by
> default) destage it synchronously, you're placing that 8 KB on the disk
> each disk revolution (i.e., every 4 - 6 ms. with a 15K or 10K rpm
> FC/SCSI drive). That's a data rate of a measly 1.3 - 2 MB/sec.
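
As a rough check of the arithmetic quoted above, here is a minimal Python
sketch. It assumes one buffer is written synchronously per disk revolution
(the model described in the quote) and uses decimal units (1 MB = 1000 KB).
The function names are made up for illustration; nothing here is an RMS API.

```python
def sync_write_rate_mb_per_s(buffer_kb: float, rpm: float) -> float:
    """Throughput when exactly one buffer reaches the disk per revolution.

    Assumption: decimal units, so KB per millisecond equals MB per second.
    """
    revolution_ms = 60_000.0 / rpm   # 15K rpm -> 4 ms, 10K rpm -> 6 ms
    return buffer_kb / revolution_ms


def gb_per_hour(mb_per_s: float) -> float:
    """Convert a sustained MB/sec rate into GB/hour (decimal units)."""
    return mb_per_s * 3600.0 / 1000.0


if __name__ == "__main__":
    for rpm in (10_000, 15_000):
        rate = sync_write_rate_mb_per_s(8, rpm)
        print(f"{rpm} rpm: {rate:.2f} MB/sec, {gb_per_hour(rate):.1f} GB/hour")
```

With an 8 KB buffer this gives about 1.33 MB/sec at 10K rpm and 2 MB/sec at
15K rpm, in line with the quoted 1.3 - 2 MB/sec. The same function shows why
larger buffers matter under this model: the rate scales linearly with the
buffer size.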

I've been into tuning systems for 30 years, though there wasn't much
that you could do on RSTS. A serious system manager will not run with the
default out-of-the-box settings. It's always been a case of using
memory to ease the bottlenecks. For RMS, large and multiple buffers can
be the difference between getting work done, and just being a heater.

> Well, that measly 1.3 - 2 MB/sec amounts to a not-so measly 4.7 - 7.2
> GB/hour, and that's just the data rate that you want to *improve upon*
> by using the write-back controller cache. So if, for example, you're
> shooting just for the 10x improvement that Rob was boasting about (not
> the 100x improvement you can get by using a system cache) you're talking
> about gobbling up 40 - 60 GB of cache, per hour, per application
> (because since you're accepting data about 5x faster than you'll
> eventually be able to get rid of it to disk after taking queue
> optimizations into account, most of the data that you stuff into the
> cache during the hour will still be there at the hour's end).
>
> No, not all applications will stress the controller cache to this
> degree, so by all means decrease that amount by as much as an order of
> magnitude (as long as you just want to cater to *average* environments
> rather than to the most demanding ones, anyway): then on average you'll
> fill up only 4 - 6 GB of cache per hour, per application. Just how much
> (mirrored and battery-backed, of course) cache did you say you had, and
> just how much more expensive was it than unmirrored, volatile system
> memory which also, not so incidentally, could be put to better uses when
> they had priority?
>
> 3. Then comes the question of what happens when your long (or, as noted
> above, perhaps not-so-long) period of time is up, and your cache is
> full: what does your data rate drop back to then? Well, in this
> example, right back to 2.6 - 4 MB/sec (I'm assuming that queuing
> optimizations will be able to double the IOPS for the disk, though if
> the disk is being shared with other applications both this and the
> original 1.3 - 2 MB/sec figures should be reduced somewhat to account
> for additional seek overhead) - and that's where Spiralog came in,
> because (even as implemented, and I believe that improvements on that
> were possible) it should have been able to sustain 20 - 40 MB/sec under
> that kind of load (the disks we're talking about here are today's disks,
> not those from Spiralog's era), which is where my comment about happily
> buying 5x - 20x as much disk just to compensate for the deficiencies of
> VMS's default data-handling came in.
>
> The bottom line is that hardware cannot *wholly* compensate for
> poorly-designed data-handling, and just compensating *partially* is not
> cheap (large amounts of mirrored, battery-backed controller cache, many
> times as many disks - these are not down-in-the-noise expenses, even in
> a high-end system).
>
> Boasting that VMS doesn't perform as pitifully as it used to compared to
> Unix if you throw enough expensive hardware at it (which Unix doesn't
> require) is not exactly the best way to win converts, I'll suggest.

Agreed.
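
For what it's worth, the cache-fill figures quoted above can be reproduced
with a short Python sketch. The assumptions are the ones stated in the quote:
the controller cache accepts data at 10x the uncached rate, and queue
optimizations let the disk drain at 2x the uncached rate, so the cache grows
at the difference. The function names and the 4 GB cache size in the example
are hypothetical, chosen only for illustration.

```python
def cache_fill_gb_per_hour(base_mb_per_s: float,
                           accept_factor: float = 10.0,
                           drain_factor: float = 2.0) -> float:
    """Net rate at which a write-back cache fills, in GB/hour (decimal).

    base_mb_per_s is the uncached synchronous write rate; the two factors
    are the multipliers assumed in the quoted text.
    """
    net_mb_per_s = (accept_factor - drain_factor) * base_mb_per_s
    return net_mb_per_s * 3600.0 / 1000.0


def minutes_until_full(cache_gb: float, base_mb_per_s: float) -> float:
    """Time before a cache of cache_gb is full and writes fall back to
    the drain rate, under the same assumptions."""
    return cache_gb / cache_fill_gb_per_hour(base_mb_per_s) * 60.0


if __name__ == "__main__":
    for base in (1.3, 2.0):
        fill = cache_fill_gb_per_hour(base)
        print(f"base {base} MB/sec: cache fills at {fill:.1f} GB/hour, "
              f"a 4 GB cache lasts {minutes_until_full(4, base):.1f} minutes")
```

This gives roughly 37 - 58 GB/hour, which matches the quoted "40 - 60 GB of
cache, per hour" once rounded. Under these assumptions a cache measured in
single-digit gigabytes is full within minutes, not days.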

> You're right: I don't like Rob. While I don't consider him to be as
> much of an active sleaze as I've found Kerry and often Terry to be, they
> all vigorously defended the Alphacide while studiously ignoring (or
> actively attempting to spin away) mounting evidence that it was just
> what we suspected it was, and continue to do so to this day by
> uncritically touting Itanic (though I've heard that Terry may finally
> have seen something of the light in this area). Kerry has an obvious
> job-related motive for pimping for cHumPaq and my impression was that
> Terry hoped to acquire one, whereas Rob's motivation is less clear: for
> some reason he seems to develop organizational allegiances which cannot
> be weakened by any factual evidence - which I guess may be
> characteristic of faith-based rather than analytical minds but which
> still doesn't explain the strength of the allegiance in the first place.
>
> If Rob were simply an uncritical, enthusiastic supporter of the things
> that he (for whatever reason) decides are worthy of his support, I'd
> find him a lot more tolerable. It's his continual attempt to portray
> them as *relatively* far better than they are compared to the
> competition which causes me to slap him down hard, though without any of
> the distortions (intentional or sloppy - it's often hard to tell) which
> he himself so readily indulges in to support his (current) pet products
> (as we saw with Alpha, while his allegiance to organizations remains
> steadfast, his allegiance to particular products can turn on a dime).

The part that I really don't understand is the people who, just because
something may not be good for them, can totally ignore it or deny it.

Anybody with 1/10 of a brain could see, years ago, that if AMD were
reasonably successful with Hammer, Intel would be vastly affected
by that success, since Intel was moving in another direction, IA-64.

Hey, I watched VAX and Alpha get marginalized by IA-32, and the main
reason was volume, not capability. Many more people were replacing
typewriters and calculators than were doing the traditional computing of
the 1975-1985 timeframe.

The same applications were still on x86, and anything else just wasn't
going to compete in terms of quantity. If you cannot compete on
quantity, you just don't compete at all.

It wasn't ever a question of the capabilities of IA-64. It has always
been whether IA-64 could provide anything that x86 could not.

It's not Opteron that will directly affect itanic. It's Intel's chips
that are made to compete with Opteron that will affect itanic.

How these people can ignore the really huge event of Intel doing an
abrupt U-turn is just unbelievable.

How they can presume that it cannot happen again is beyond belief.

And I'm just a casual observer, not somebody that watches all this stuff
every day, all day long.

> I really don't give a damn whether people like the way I treat Rob or
> not: as with cHumPaq, I'll stop when, and only when, I see some real
> attempt to make amends for what are (going on 4 years now) continuing
> transgressions (not to suggest that Rob's are anything comparable in
> significance to cHumPaq's, but they both exude a similar stench). But
> I'm as careful with the facts when responding to Rob as I am in other
> situations - not that I give a damn about people who can't recognize
> that, either.
>
> - bill

-- 
David Froble                       Tel: 724-529-0450
Dave Froble Enterprises, Inc.      Fax: 724-529-0596
DFE Ultralights, Inc.              E-Mail: davef@tsoft-inc.com
170 Grimplin Road
Vanderbilt, PA  15486

