Re: amrd disk performance drop after running under high load



Hi.

Kris Kennaway wrote:
After some time running under high load, disk performance becomes extremely poor. During those periods 'systat -vm 1' shows something like
this:
This web service is similar to YouTube. This server is a video store. I
have around 200G of *.flv (flash video) files on the server.
I run lighttpd as a web server. Disk load is usually around 50%, network
output 100Mbit/s, 100 simultaneous connections. CPU is mostly idle.
This is very unlikely, because I have 5 other video storage servers with the same hardware and software configuration, and they work fine.
Clearly something is different about them, though. If you can characterize exactly what that is then it will help.
I can't see any difference except the installation date. I really did compare all the parameters and found nothing interesting.

At first glance one could say that the problem is in Dell's x850 series or amr(4), but we run this hardware on many other projects and it works well. Linux also works on it.

OK but there is no evidence in what you posted so far that amr is involved in any way. There is convincing evidence that it is the mbuf issue.
Why are you sure this is the mbuf issue? For example, if there is a real problem with amr or VM causing disk slowdown, then when it occurs the network subsystem will see a different load pattern. Instead of just quickly sending large amounts of data, the system will have to hold a large number of simultaneous connections waiting for data. Can this cause high mbuf contention?
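
To make the reasoning above concrete, here is a rough sketch using Little's law (average connections in flight = arrival rate x time each request spends in the system). The arrival rate and service times are made-up illustrative values, not measurements from these servers; they are only chosen so that the healthy case matches the roughly 100 simultaneous connections mentioned earlier.

```python
# Illustrative only: Little's law, L = lambda * W.
# ARRIVAL_RATE and both service times are assumed values, not measurements.

def concurrent_connections(arrival_rate_per_s: float, service_time_s: float) -> float:
    """Average number of connections in flight for a steady arrival rate."""
    return arrival_rate_per_s * service_time_s

ARRIVAL_RATE = 10.0     # assumed new requests per second
NORMAL_SERVICE = 10.0   # assumed seconds to serve one file with a healthy disk
SLOW_SERVICE = 100.0    # assumed seconds per file when the disk is 10x slower

normal = concurrent_connections(ARRIVAL_RATE, NORMAL_SERVICE)
slow = concurrent_connections(ARRIVAL_RATE, SLOW_SERVICE)

print(f"healthy disk: ~{normal:.0f} simultaneous connections")
print(f"slow disk:    ~{slow:.0f} simultaneous connections")
```

Under these assumptions, an unchanged request rate and a tenfold disk slowdown would mean about ten times as many open connections waiting for data. Whether that alone is enough to produce the mbuf contention is the open question.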


And a few hours ago I received feedback from Andrzej Tobola; he has the same problem on FreeBSD 7 with a Promise ATA software mirror:
Well, he didn't provide any evidence yet that it is the same problem, so let's not become confused by feelings :)
I think he is talking about 100% disk busy while processing ~5 transfers/sec.

So I can conclude that FreeBSD has a long-standing bug in VM that can be triggered when serving large amounts of static data (much bigger than memory size) at high rates. Possibly this only applies to large files like mp3 or video.
It is possible, we have further work to do to conclude this though.
I forgot to mention that I have pmc and kgmon profiling data for both good and bad periods. But I don't have enough knowledge to interpret it correctly, and I'm not sure whether it would help.

Also, I now run nginx instead of lighttpd on one of the problematic servers. It seems to work much better: sometimes there are peaks in disk load, but the disk does not become very slow and network output does not change. The difference with nginx is that it runs as multiple processes, while lighttpd by default has only one process. I have now configured lighttpd on another server to run with multiple workers. I'll see if it helps.

What else can I try?

With best regards,
Alexey Popov
_______________________________________________
freebsd-stable@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@xxxxxxxxxxx"


