Re: newfs locks entire machine for 20seconds



In message <008201c86388$fd159010$b6db87d4@xxxxxxxxxxxxxxx>, "Steven Hartland" writes:

From: "Ivan Voras" <ivoras@xxxxxxxxxxx>
The machine is running with ULE on 7.0, as mentioned, using an Areca 1220
controller over 8 disks in RAID 6 + Hotspare.

I'd suggest you first try to reproduce the stall without ULE, while
keeping all other parameters exactly the same.

OK, I tried with an updated 7 world / kernel as of this afternoon, and with 4BSD
instead of ULE. No difference: the machine still locks up with no activity
for anywhere from 20 to 30 seconds.

Here's a snapshot from top in cpu and io modes, taken when the stall occurred:
[top]
last pid: 1102; load averages: 0.02, 0.08, 0.07 up 0+00:09:37 21:39:13
162 processes: 4 running, 145 sleeping, 13 waiting
CPU states: 0.0% user, 0.0% nice, 0.4% system, 0.0% interrupt, 99.6% idle
Mem: 60M Active, 19M Inact, 54M Wired, 56K Cache, 27M Buf, 3809M Free
Swap: 4096M Total, 4096M Free

PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
12 root 1 171 ki31 0K 16K RUN 0 8:59 97.90% idle: cpu0
11 root 1 171 ki31 0K 16K RUN 1 8:57 95.80% idle: cpu1
1102 root 1 -8 0 4752K 1256K physrd 1 0:01 19.64% newfs
4 root 1 -8 - 0K 16K - 0 0:00 0.10% g_down
1048 root 1 96 0 7656K 2544K CPU0 0 0:01 0.00% top
1054 root 1 96 0 7656K 2348K CPU1 1 0:01 0.00% top
863 root 1 96 0 131M 15768K select 0 0:00 0.00% httpd
1055 root 1 96 0 32928K 4656K select 0 0:00 0.00% sshd


last pid: 1102; load averages: 0.02, 0.08, 0.07 up 0+00:09:37 21:39:13
162 processes: 4 running, 145 sleeping, 13 waiting
CPU states: 0.0% user, 0.0% nice, 0.4% system, 0.0% interrupt, 99.6% idle
Mem: 60M Active, 19M Inact, 54M Wired, 56K Cache, 27M Buf, 3809M Free
Swap: 4096M Total, 4096M Free

PID USERNAME VCSW IVCSW READ WRITE FAULT TOTAL PERCENT COMMAND
12 root 9 154 0 0 0 0 0.00% idle: cpu0
11 root 28 5 0 0 0 0 0.00% idle: cpu1
1102 root 5 0 0 0 0 0 0.00% newfs
4 root 14 0 0 0 0 0 0.00% g_down
1048 root 1 0 0 0 0 0 0.00% top
1054 root 1 0 0 0 0 0 0.00% top
863 root 1 0 0 0 0 0 0.00% httpd
[/top]

What *exactly* do you mean by

machine still locks up with no activity for anywhere from 20 to 30 seconds.

Is there disk activity? (e.g. activity light(s) flashing if you have them)

Does top continue to update the screen during the 20-30 seconds?

I'm thinking that newfs has queued up a bunch of disk i/o, and other
disk i/o gets locked out, but activities that don't require any disk i/o
(like top, once it is up and running) could continue. Is that what is
happening?
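
One way to answer that question with numbers rather than by eye is to run a
small probe next to newfs. The following is only a sketch, assuming a Python 3
interpreter is available on the box; the names probe_once and run_probe, the
512-byte write and the 1-second threshold are illustrative choices, not
anything from this thread. It times a tiny create + write + fsync once per
interval. If the timestamps keep advancing every second but the fsync times
jump to many seconds, only disk I/O is being held up; if the timestamps
themselves show a 20-30 second gap, the probe could not get its I/O through or
could not run at all for that long.

```python
import os
import tempfile
import time


def probe_once(directory):
    """Create a scratch file, write 512 bytes, fsync it; return elapsed seconds."""
    start = time.monotonic()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, b"x" * 512)
        os.fsync(fd)  # force the write out to the device
        return time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(path)


def run_probe(directory, count=60, interval=1.0, threshold=1.0):
    """Run `count` probes, about one per `interval` seconds; return the latencies."""
    latencies = []
    for _ in range(count):
        tick = time.monotonic()
        latency = probe_once(directory)
        latencies.append(latency)
        # Flag probes slower than the (arbitrary) threshold.
        flag = "  <-- slow" if latency >= threshold else ""
        stamp = time.strftime("%H:%M:%S")
        print("%s  fsync took %.3fs%s" % (stamp, latency, flag), flush=True)
        # Sleep for whatever is left of this interval, if anything.
        time.sleep(max(0.0, interval - (time.monotonic() - tick)))
    return latencies
```

Calling run_probe("/some/dir") from a second terminal just before starting
newfs would print one line per second. The directory is a placeholder: point
it at a filesystem that is already mounted, not at the device being newfs'd.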