Re: How to debug or explain slow gjournal writes?



On Thu, Mar 13, 2008 at 05:57:41PM -0400, Adam McDougall wrote:

I am evaluating gjournal for use on my servers. This one test system is
running 7.0-RELEASE at the moment on a Dell PE2650 with dual 2GHz Xeons,
ahc0: <Adaptec aic7899 Ultra160 SCSI adapter>, and some Seagate 36G 10k
<snip>
Additionally, when I have the journal on a gmirror (the data is too), the
journal write speed varies between 8 and 35MB/sec and fluctuates pretty
wildly from second to second. As soon as I deactivate one side of the mirror,
or get rid of the mirror under the journal, it is a consistent 35MB/sec.

You know, the more I probe, the more this seems to unravel into an issue I have
seen on at least 3 FreeBSD 7.x systems so far: SOMETIMES writes to disks have
inconsistent speed if the disks have been used in some sort of RAID. The key
words are "sometimes" and "have been". It seems like a disk is more likely to
misbehave, or just be consistently slow, if it used to be in a RAID such as a
gmirror. Sometimes I could deactivate a disk from a mirror and use it
independently (for a journal, say) but it would stay slow. Other times, with a
fresher setup, it would perform fine.

- On my brand new desktop, I set up a ZFS mirror and my write speed went into the
toilet. I could hear the writes getting interrupted every few seconds while the
disk heads seeked. I deactivated the mirror, and the writes would go to the first
disk at full speed again. And when I formatted the second disk with whatever
FS I wanted, it would give me full speed. But mirror them, and performance would
drop back down severalfold. Somehow I avoided this by switching my desktop
install to amd64 FreeBSD and setting up the mirror again with ZFS, but I think
that was chance.

- On a friend's home server, sometimes when I would set up a multi-disk RAID (using
different ZFS methods, or GEOM RAIDs), I would run gstat and watch the performance
bounce all around between the disks in the RAID. It would keep bouncing
between 8MB/sec and something more reasonable for his ATA disks, like 40MB/sec.
If I still had the opportunity to experiment with that machine, I would try to
reproduce it and look at it more closely with gstat -I 100000 or so. I bet the
disks were stalling out completely whenever the one-second average looked slow.
I found a combination of SATA controllers that gave fair and consistent
(but not impressive) performance and called it good, because it exceeded what
his 1Gbit network could likely push or pull to it as a file server.

- On this Dell 2650, I keep setting up different ZFS mirrors, raidz, gmirror, gstripe,
and sometimes the write speed is fine, sometimes it keeps bouncing around
or stalling out for brief periods. Not always every drive at the same
time. It's too loud in the server room to hear whether this one is seeking while
it's slow. I can never see anything obvious in top -S, gstat, or systat -vmstat
that might cause a spike. I know the SCSI bus on this one is capped at 160MB/sec,
so I don't expect to see 3-4 drives run at the same time and get full performance.
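
The suspicion above, that a slow one-second average may really be full-speed writes interleaved with complete stalls, can be illustrated with a small Python sketch. The sample values are hypothetical (they are not measurements from any of these machines), and the ten samples per second assume readings taken every 100ms, which is what gstat -I 100000 would give if the interval is in microseconds.

```python
def one_second_averages(samples_mb_s, per_second=10):
    """Average consecutive groups of fine-grained samples into 1 s buckets."""
    buckets = []
    for i in range(0, len(samples_mb_s), per_second):
        group = samples_mb_s[i:i + per_second]
        buckets.append(sum(group) / len(group))
    return buckets


def stalled_intervals(samples_mb_s, threshold_mb_s=1.0):
    """Return the indexes of samples at or below the stall threshold."""
    return [i for i, rate in enumerate(samples_mb_s) if rate <= threshold_mb_s]


# Hypothetical data: 35 MB/s for 0.3 s, then a complete stall for 0.7 s.
samples = [35.0] * 3 + [0.0] * 7

print(one_second_averages(samples))  # [10.5]
print(stalled_intervals(samples))    # [3, 4, 5, 6, 7, 8, 9]
```

A disk that looks like it is writing at a steady 10.5MB/sec on the one-second view is, in this made-up case, writing at full speed 30% of the time and doing nothing the rest of the time.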

But ALWAYS, whenever I used dd to write directly to a drive (even to multiple
drives at once), or put a filesystem on a single drive, write speed was fine.
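
For comparing the raw-write case against the RAID cases, a dd-like probe that times each block separately may show stalls that a single overall figure hides. This is a minimal Python sketch, not a replacement for dd: the function name, block size, and scratch file are all assumptions chosen for illustration, and the demo writes to a temporary file rather than a disk device.

```python
import os
import tempfile
import time


def timed_write(path, chunk_size=1 << 20, chunks=8):
    """Write `chunks` blocks of zeros to `path`, syncing after each one,
    and return the throughput of each block in MB/s."""
    block = b"\0" * chunk_size
    rates = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        for _ in range(chunks):
            start = time.perf_counter()
            written = os.write(fd, block)
            os.fsync(fd)  # flush so the timing includes the write reaching storage
            elapsed = max(time.perf_counter() - start, 1e-9)
            rates.append(written / elapsed / 1e6)
    finally:
        os.close(fd)
    return rates


# Demo against a scratch file. Pointing this at a raw device node
# would overwrite whatever is stored on that device.
with tempfile.TemporaryDirectory() as scratch:
    rates = timed_write(os.path.join(scratch, "probe.bin"), chunks=4)

print("min %.1f MB/s, max %.1f MB/s" % (min(rates), max(rates)))
```

A large gap between the minimum and maximum per-block rate on one configuration, and a small gap on another, would be the kind of difference described above.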



I'm looking for input on debugging, tuning, questions, bonehead errors, etc.,
because I would like to get the most out of this setup if possible and not
just settle for an inconsistent 16-22MB/sec. Thanks.

Geom name: gjournal 170802896
ID: 170802896
Providers:
1. Name: da2.journal
   Mediasize: 35346332672 (33G)
   Sectorsize: 512
   Mode: r1w1e1
Consumers:
1. Name: da2
   Mediasize: 36420075008 (34G)
   Sectorsize: 512
   Mode: r1w1e1
   Jend: 36420074496
   Jstart: 35346332672
   Role: Data,Journal
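
The journal size can be read off the listing above with a little arithmetic. The Python sketch below assumes that Jstart and Jend are byte offsets of the journal area on da2; the last figure additionally assumes the 35MB/sec rate quoted earlier in this message, so it is an estimate, not a measurement.

```python
# Values copied from the listing above (bytes).
consumer_mediasize = 36420075008
provider_mediasize = 35346332672
jstart = 35346332672
jend = 36420074496

journal_bytes = jend - jstart
trailing_bytes = consumer_mediasize - jend

print(journal_bytes, journal_bytes / 2**30)  # 1073741824 1.0
print(trailing_bytes)                        # 512
print(jstart == provider_mediasize)          # True

# Assumption: journal writes sustained at 35 MB/s (35e6 bytes per second).
seconds_to_fill = round(journal_bytes / 35e6, 1)
print(seconds_to_fill)
```

So the journal occupies exactly 1GiB at the end of da2, the data provider ends where the journal begins, and the one remaining 512-byte sector is presumably where GEOM keeps its metadata.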

_______________________________________________
freebsd-stable@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@xxxxxxxxxxx"



