Re: "Load Balancing": How Busy are the servers?



Marc G. Fournier writes:

> For all the technology, I was kinda hoping for some 'scientific formula' :)

There are..

> Now, I really hate to ask, but how do you use vmstat to get a feel for how busy the disk subsystem is?

For me, reading "Absolute BSD" by Michael Lucas was very helpful, in particular Chapter 18, System performance.

The three vmstat columns I look at are "r" and "b" on the left, and "fault".

"r" shows how many processes are waiting for CPU, and "b" shows how many processes are blocked waiting for resources, which in practice usually means disk. The fault column(s) give an indication of how heavily your system is paging to swap.

Quick example:
r b w
2 5 0
1 5 0
2 4 0
2 5 0
3 4 0
1 5 0
1 5 0


That's from my home machine while I am doing some backups.
At this point the machine is more disk bound than CPU bound, with 4 to 5 processes waiting for disk access at any given time.


I am also falling behind on CPU, but not as badly.

On the far right of vmstat you also have the CPU stats. In my case, the vmstat output for the lines above showed 70% to 90% idle, which confirmed I was disk bound at that point.
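As a rough illustration of that reasoning, here is a minimal Python sketch that averages the "r" and "b" columns from the sample above and makes a crude guess at the bottleneck. The function names, the 50% idle threshold, and the idle value of 80 (picked from inside the 70% to 90% range mentioned above) are assumptions for illustration only, not a rule.

```python
# Sample rows copied from the vmstat excerpt quoted above.
SAMPLE = """\
r b w
2 5 0
1 5 0
2 4 0
2 5 0
3 4 0
1 5 0
1 5 0
"""

def average_columns(text):
    """Return {column_name: mean} for a whitespace-separated table."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    totals = [0] * len(header)
    for row in rows:
        for i, value in enumerate(row):
            totals[i] += int(value)
    return {name: total / len(rows) for name, total in zip(header, totals)}

def likely_bottleneck(avg_r, avg_b, idle_pct):
    """Crude guess; the idle threshold of 50 is an arbitrary assumption."""
    if avg_b > avg_r and idle_pct >= 50:
        return "disk"
    if avg_r > avg_b and idle_pct < 50:
        return "cpu"
    return "unclear"

averages = average_columns(SAMPLE)
print(averages)
# idle_pct=80 is a hypothetical value within the 70-90% range in the text.
print(likely_bottleneck(averages["r"], averages["b"], idle_pct=80))
```

On a real machine the thresholds would have to come from that machine's own history rather than from fixed numbers.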

The fault column shows you how actively you are using swap. For the lines above it was between roughly 30 and 200. If swapinfo shows a large amount of swap in use and vmstat shows a high number for fault, the machine is short on RAM for the load you have on it.

So far, in my experience, nothing hurts a machine as badly as hitting swap (given that you have adequate CPU and disks). Once you start to hit swap heavily you need to do something (if you can...), such as moving services to another machine or putting in more memory.

Instead of looking for fixed numbers, I think relative figures are more important: look at your machines at their lowest usage, then at their busiest, and at spikes. If a machine is already falling behind on "r" and "b" in vmstat during slow periods, then that machine is overloaded.

One quick way to start benchmarking your machines, until you can do something better, is to capture snapshots of vmstat every 15 to 30 minutes and take a look, perhaps even writing a short script to summarize them. A simple setup of that nature is on my list of things to do, just because it would be easy to set up and can provide very valuable information until you set up something more feature rich.
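A minimal Python sketch of such a snapshot script might look like the following. It assumes "r" and "b" are the first two columns of the vmstat output, as in the sample earlier, and it runs "vmstat 1 2" and keeps the second row because the first row vmstat prints reports averages since boot for most columns. All function names here are made up for illustration.

```python
import subprocess
import time

def parse_r_b(vmstat_output):
    """Return (r, b) from the last data row of vmstat output.

    Assumes "r" and "b" are the first two columns.
    """
    last_row = vmstat_output.strip().splitlines()[-1].split()
    return int(last_row[0]), int(last_row[1])

def take_snapshot():
    """Run "vmstat 1 2" and parse the second (one-second) sample."""
    result = subprocess.run(["vmstat", "1", "2"],
                            capture_output=True, text=True, check=True)
    return parse_r_b(result.stdout)

def collect(count, interval_seconds):
    """Take `count` snapshots, sleeping `interval_seconds` between them."""
    samples = []
    for i in range(count):
        samples.append(take_snapshot())
        if i < count - 1:
            time.sleep(interval_seconds)
    return samples

def summarize(samples):
    """samples: list of (r, b) tuples. Returns mean and peak for each."""
    rs = [r for r, _ in samples]
    bs = [b for _, b in samples]
    return {
        "r_mean": sum(rs) / len(rs), "r_peak": max(rs),
        "b_mean": sum(bs) / len(bs), "b_peak": max(bs),
    }

# Example (not run here): four snapshots, 15 minutes apart.
# print(summarize(collect(count=4, interval_seconds=15 * 60)))
```

In practice cron plus a log file would probably be a more robust way to take the snapshots than a long-running loop; the loop is only here to keep the sketch in one piece.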


"top" in the 5.X branch and up is also very useful. If you hit "m" it switches to showing disk I/O per process, so you can see which programs are doing the most I/O.


One thing to watch out for in top when using 'm' (hit 'o' to sort and then type 'total'): if you see only low numbers, you may still have lots of programs each doing a little I/O whose combined load is a problem for your disk subsystem, like having 200+ IMAP connections. Each single IMAP connection may not be doing more than a handful of transactions per second, but all of them combined can give a disk subsystem a pretty good workout.
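To make the "many small consumers" point concrete, here is a small Python sketch that adds up per-process I/O totals by command name. The rows are invented numbers, not real top output, and the command names are only examples.

```python
from collections import defaultdict

# Hypothetical (command, I/O total) rows, as one might read them off
# top's I/O display: 200 small IMAP processes plus two larger ones.
rows = [("imapd", 3)] * 200 + [("httpd", 40), ("mysqld", 120)]

def total_by_command(rows):
    """Sum the I/O totals per command, largest combined load first."""
    totals = defaultdict(int)
    for command, total in rows:
        totals[command] += total
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

for command, total in total_by_command(rows):
    print(f"{command:10s} {total}")
```

With these made-up figures no single imapd process stands out, yet the imapd processes together account for more I/O than everything else combined.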

The load averages from 'w' are also good figures for comparative tests. I started to work on a script (it needs more work) that dumps 'w' and 'vmstat'; next I have to work on parsing them and producing summaries. In particular one wants to know peak times, since that is the best time to determine whether the machine can handle its load, and, more importantly, spikes. If a machine is usually under 2 and it spikes at 5+, that machine is probably able to handle "normal" loads, but may not be able to handle spikes in traffic (e.g. a customer doing a mailing list run, or a site that just got press and has a larger number of people than usual going to its URL).
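As a sketch of the parsing step, the following Python pulls the one-minute load average out of the first line of 'w' output and flags values well above the usual level. The sample lines are invented, and treating anything at or above 2.5 times the median as a spike is an arbitrary assumption; the regular expression accepts both "load averages:" and "load average:".

```python
import re
import statistics

LOAD_RE = re.compile(r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)")

def one_minute_load(line):
    """Extract the 1-minute load average from a 'w' or 'uptime' header line."""
    match = LOAD_RE.search(line)
    if match is None:
        raise ValueError("no load averages found in: %r" % line)
    return float(match.group(1))

def find_spikes(loads, factor=2.5):
    """Return (baseline, spikes); baseline is the median of the samples."""
    baseline = statistics.median(loads)
    return baseline, [x for x in loads if x >= factor * baseline]

# Hypothetical header lines captured at different times of day.
lines = [
    " 9:00AM  up 12 days,  4:02, 3 users, load averages: 1.20, 1.10, 1.05",
    "11:00AM  up 12 days,  6:02, 5 users, load averages: 1.80, 1.60, 1.40",
    " 1:00PM  up 12 days,  8:02, 5 users, load averages: 1.50, 1.55, 1.50",
    " 3:00PM  up 12 days, 10:02, 6 users, load averages: 5.40, 3.90, 2.70",
    " 5:00PM  up 12 days, 12:02, 4 users, load averages: 1.60, 2.40, 2.60",
]

loads = [one_minute_load(line) for line in lines]
baseline, spikes = find_spikes(loads)
print("baseline:", baseline, "spikes:", spikes)
```

The median is used as the baseline so that a single spike does not drag the "normal" figure upward the way a mean would.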

I still think I have MUCH, MUCH to learn, but I would be glad to expand on anything mentioned above, or anything else. Ultimately each machine/company is unique enough that absolute numbers from other people (e.g. what is a good value for 'r' and 'b' to be around most of the time) may be less important than learning what the figures are for your own machines under "normal" operation.
_______________________________________________
freebsd-questions@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscribe@xxxxxxxxxxx"



