Re: System deadlock when using mksnap_ffs



On Thu, Nov 13, 2008 at 12:26:42PM +0200, Kostik Belousov wrote:
On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote:
On Thu, Nov 13, 2008 at 12:41:02AM +0000, Tim Bishop wrote:
On Wed, Nov 12, 2008 at 09:47:35PM +0200, Kostik Belousov wrote:
On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote:
I've been playing around with snapshots lately but I've got a problem on
one of my servers running 7-STABLE amd64:

FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64

I run the mksnap_ffs command to take the snapshot and some time later
the system completely freezes up:

paladin# cd /u2/.snap/
paladin# mksnap_ffs /u2 test.1

It only happens on this one filesystem, though, which might be to do
with its size. It's not over the 2TB marker, but it's pretty close. It's
also backed by a hardware RAID system, although a smaller filesystem on
the same RAID has no issues.

Filesystem    1K-blocks       Used     Avail Capacity  Mounted on
/dev/da0s1a  2078881084  921821396 990749202    48%    /u2
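
As a rough sanity check on "pretty close to 2TB", the df figures above can be worked through by hand. This is only a sketch: the numbers are copied from the df line, and reading "2TB" as 2 TiB (2^41 bytes) is an assumption on my part.

```python
# Figures copied from the df output above (units: 1 KiB blocks).
total_kib = 2078881084
used_kib = 921821396
avail_kib = 990749202

KIB = 1024
TWO_TIB = 2 * 2**40  # assumption: the "2TB marker" means 2 TiB

# Filesystem size in bytes, and how close it sits to the 2 TiB mark.
size_bytes = total_kib * KIB
fraction_of_2tib = size_bytes / TWO_TIB

# Capacity taken as used / (used + avail), ignoring reserved space.
capacity = used_kib / (used_kib + avail_kib)

print(f"size: {size_bytes} bytes ({fraction_of_2tib:.1%} of 2 TiB)")
print(f"capacity: {capacity:.0%}")
```

With these figures the filesystem comes out at about 97% of 2 TiB, and used / (used + avail) reproduces the 48% capacity that df reports.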

To clarify "completely freezes up": unresponsive to all services over
the network, except ping. On the console I can switch between the ttys,
but none of them respond. The only way out is to hit the reset button.

You need to provide the information described in
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
and especially in
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

Ok, I've done that, and removed the patch that seemed to fix things.

The first thing I notice after doing this on the console is that I can
still ctrl+t the process:

load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k

But the top and ps I left running on other ttys have all stopped
responding.

Then in my book, the patch didn't fix anything. :-) The system is
still "deadlocking"; snapshot generation **should not** wedge the system
hard like this.
You systematically mix two completely different issues:
- first one is the _deadlock_ experienced by Tim;

Re-read what he wrote. Quote:

"Ok, I've done that, and removed the patch that seemed to fix things.

The first thing I notice after doing this on the console is that I can
still ctrl+t the process:

load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k

But the top and ps I left running on other ttys have all stopped
responding."

If he can press Control-T, it means SIGINFO can be sent to the
mksnap_ffs process, and the process responds with that information. So,
the system is not deadlocked -- meaning, I believe what he experiences
is what others experience (the system becomes completely unusable during
mksnap_ffs running, but DOES NOT hang or lock up, it just becomes so
god-awful slow that processes on the machine literally sit and spin for
minutes at a time).
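
For reference, the Ctrl-T status line quoted above can be pulled apart field by field. The sketch below is only illustrative: the field names are labels I have assumed from the usual layout of that line (load average, command, pid, wait channel in brackets, user and system CPU time, CPU percentage, resident size in kB), and parse_status is a hypothetical helper, not part of any FreeBSD tool.

```python
import re

# Pattern for a Ctrl-T status line such as:
#   load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k
# Group names are assumed labels, not official field names.
STATUS_RE = re.compile(
    r"load:\s+(?P<load>[\d.]+)\s+"
    r"cmd:\s+(?P<cmd>\S+)\s+(?P<pid>\d+)\s+"
    r"\[(?P<wchan>[^\]]*)\]\s+"
    r"(?P<user>[\d.]+)u\s+(?P<sys>[\d.]+)s\s+"
    r"(?P<cpu>\d+)%\s+(?P<rss_kb>\d+)k"
)


def parse_status(line):
    """Return the status-line fields as a dict, or None if the line does not match."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    return {
        "load": float(m.group("load")),
        "cmd": m.group("cmd"),
        "pid": int(m.group("pid")),
        "wchan": m.group("wchan"),
        "user": float(m.group("user")),
        "sys": float(m.group("sys")),
        "cpu": int(m.group("cpu")),
        "rss_kb": int(m.group("rss_kb")),
    }


status = parse_status(
    "load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k"
)
print(status)
```

For the line Tim posted, this gives a bracketed state of newbuf, 10.75 seconds of system time and 0.00 seconds of user time, i.e. whatever CPU time mksnap_ffs has used so far was spent inside the kernel.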

- second one is the slowdown during snapshot creation.
In fact, I may count a third, where dump itself hangs as a usermode process
but the kernel still operates normally.

The patch posted should fix or paper over the first issue for practical purposes.
The third issue is most likely fixed by the subr_sleepqueue race fix.

--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |

_______________________________________________
freebsd-stable@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@xxxxxxxxxxx"


