Theory of Unix I/O
- From: mutazilah@xxxxxxxxx
- Date: Fri, 7 Mar 2008 00:44:56 -0800 (PST)
At work we are experiencing severe problems with Oracle. It is taking
1.4 seconds to write less than 200k of data. Oracle is reporting reads
of 3 MB/sec and writes of 0.7 MB/sec. The reads are supposedly caused
because Oracle needs to write to a block on disk, and before it can
write (update) it, it needs to read the existing block. This is a
multi-row insert operation that is being done.
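As a rough sanity check on those figures, here is a small Python
sketch. It assumes the reported rates held for the whole 1.4 seconds,
that "200k" means 200 KB, and that 1 MB = 1000 KB. None of that is
confirmed by the Oracle report, and the names are only illustrative.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumptions (not confirmed): the reported rates held for the whole
# insert, "200k" means 200 KB, and 1 MB = 1000 KB.

ELAPSED_S = 1.4      # observed time for the insert
PAYLOAD_KB = 200.0   # upper bound on the data written by the insert
READ_MB_S = 3.0      # read rate reported by Oracle
WRITE_MB_S = 0.7     # write rate reported by Oracle


def effective_rate_kb_s(payload_kb, elapsed_s):
    """Payload throughput as the application sees it."""
    return payload_kb / elapsed_s


def implied_io_kb(rate_mb_s, elapsed_s):
    """Data moved if the reported rate held for the whole interval."""
    return rate_mb_s * 1000.0 * elapsed_s


payload_rate = effective_rate_kb_s(PAYLOAD_KB, ELAPSED_S)
read_kb = implied_io_kb(READ_MB_S, ELAPSED_S)
write_kb = implied_io_kb(WRITE_MB_S, ELAPSED_S)
amplification = (read_kb + write_kb) / PAYLOAD_KB

print(f"payload rate  : {payload_rate:.0f} KB/sec")
print(f"implied reads : {read_kb:.0f} KB")
print(f"implied writes: {write_kb:.0f} KB")
print(f"disk traffic per KB of payload: {amplification:.1f} KB")
```

Under those assumptions the payload itself moves at roughly 143 KB/sec,
while the disk traffic is many times larger than the data being
inserted.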
The DBA is going to do some tuning that will hopefully eliminate the
reads in the first place, but what I'd like to know is why the system
is so slow even allowing for the reads. I would have expected it to be
able to read 3 MB in less than half a second. My laptop can do that.
So I suspect that some other system is hammering the disks. The disks
themselves are reporting response or access times of 5 msec or less,
so I don't think that figure is useful. I'd like to know what tools
are available in Solaris (we're running a pretty old version, I think
8 or something) to determine the following:
1. From Oracle's perspective, how much of the 1.4 second response time
   for the insert was taken up doing reading?
2. From Unix's perspective, how much time was spent waiting for a free
   I/O channel (or whatever these SPARC machines use for I/O) to be
   made available?
3. From the I/O channel's perspective, how much time was spent waiting
   for the disk to be ready to accept a read request?
4. From the disk's perspective, how much time elapsed between the
   request coming in and the disk responding with the data?
Presumably there were multiple I/O requests to transfer data, so I
expect all the waiting time for all the individual requests to be
added up and then expressed as a percentage of the 1.4 second response
time.
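To make the kind of report I'm after concrete, here is a minimal
Python sketch of that calculation. The layer names and the per-request
timings are invented purely for illustration; they are not
measurements from our system, and I don't know yet which tool (if any)
would produce them.

```python
# Sketch of the breakdown asked for above: sum the waits recorded at
# each layer and express each sum as a share of the total response time.
# The layer names and sample timings are made up for illustration only.

TOTAL_RESPONSE_S = 1.4


def wait_breakdown(waits_by_layer, total_s):
    """Return {layer: (summed wait in seconds, percent of total_s)}."""
    if total_s <= 0:
        raise ValueError("total_s must be positive")
    result = {}
    for layer, waits in waits_by_layer.items():
        summed = sum(waits)
        result[layer] = (summed, 100.0 * summed / total_s)
    return result


# Hypothetical per-request waits in seconds, one entry per I/O request.
sample_waits = {
    "oracle_read": [0.010, 0.012, 0.008],
    "os_channel_queue": [0.002, 0.003, 0.001],
    "channel_disk_ready": [0.001, 0.001, 0.001],
    "disk_service": [0.005, 0.005, 0.004],
}

breakdown = wait_breakdown(sample_waits, TOTAL_RESPONSE_S)
for layer, (seconds, percent) in breakdown.items():
    print(f"{layer:20s} {seconds:.3f} s  {percent:5.1f}%")
```

Note that the layers are nested (time Oracle spends reading includes
whatever the lower layers spend waiting), so the percentages would not
be expected to add up to 100.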
These disks are connected to multiple boxes, and there is some
suspicion that another box is hitting the drives, so the process
slowing down the I/O may be elsewhere. What tool is required to track
that down?
If the process hammering the disks is in fact on the same box, what
tool will track that down? i.e. what's the equivalent of top (for CPU)
for I/O?
We have Best/1 (sp?) available, but the person who runs it claims that
it can't produce MB/sec for each process.
I am as much interested in knowing how disks work as anything else.
i.e. is there even such a concept as Unix having to queue for an I/O
channel, or does Unix always have access to the disk and it's just a
matter of Unix deciding when to send the request? And in fact, can
Unix send lots of requests to the disks simultaneously, or only one at
a time? Can we see a queue of requests for each disk somewhere?
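The reason the "one at a time or many" question matters can be
sketched with Little's law, which relates the average number of
requests outstanding to the request rate and the time each request
takes. This is only a Python sketch: the 5 msec figure is the one the
disks report, while the 1000 requests/sec rate is a made-up number and
the function names are mine.

```python
# Little's law: mean requests outstanding = arrival rate * mean time
# each request spends in the system (queueing plus service).


def mean_outstanding(requests_per_s, mean_response_s):
    """Average number of requests queued or in service."""
    return requests_per_s * mean_response_s


def max_serial_rate(mean_response_s):
    """Upper bound on requests/sec if only one request is in flight."""
    return 1.0 / mean_response_s


RESPONSE_S = 0.005      # the 5 msec figure the disks report
HYPOTHETICAL_RATE = 1000  # requests/sec, invented for illustration

serial_limit = max_serial_rate(RESPONSE_S)
outstanding = mean_outstanding(HYPOTHETICAL_RATE, RESPONSE_S)

print(f"one at a time: at most {serial_limit:.0f} requests/sec")
print(f"at {HYPOTHETICAL_RATE} requests/sec: {outstanding:.1f} outstanding on average")
```

So if requests really are issued strictly one at a time at 5 msec
each, the request rate is capped regardless of how fast the disk can
stream data, and anything above that cap implies several requests
outstanding at once.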
And what's that "I/O wait" that top displays? Which of those concepts
(if any) does it correspond to?
Thanks. Paul.