how to royally mess up a -stable system

From: Mikhail Teterin (mi+kde_at_aldan.algebra.com)
Date: 07/18/04

  • Next message: Tim Robbins: "Re: how to royally mess up a -stable system"
    To: stable@FreeBSD.org
    Date: Sun, 18 Jul 2004 03:03:30 -0400
    
    

    Have ImageMagick try to load a really big image file -- big enough to
    overflow your /var/tmp ...

    Here is the state of the box after the libMagick process was killed,
    as shown by `systat -pigs':

                        /0 /1 /2 /3 /4 /5 /6 /7 /8 /9 /10
         Load Average |||||||||||||||||||||||||||||||||||||||||||||||||| 10.5

                        /0 /10 /20 /30 /40 /50 /60 /70 /80 /90 /100
    root syncer XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    root pagedaemon XXXXX
                 <idle> X
    [...]

    The machine is almost entirely unresponsive. When tcsh echoes
    commands back at all, it cannot execute them for many minutes.
    Meanwhile, the kernel tries to log each and every error:

    [...]
    Jul 18 02:06:33 corbulon /kernel: vnode_pager_putpages: residual I/O 65536 at 18777
    Jul 18 02:06:33 corbulon /kernel: pid 7 (syncer), uid 0 on /var: file system full
    Jul 18 02:06:33 corbulon /kernel: vnode_pager_putpages: I/O error 28
    Jul 18 02:06:33 corbulon /kernel: vnode_pager_putpages: residual I/O 65536 at 18778
    Jul 18 02:06:34 corbulon /kernel: pid 7 (syncer), uid 0 on /var: file system full
    Jul 18 02:06:34 corbulon /kernel: vnode_pager_putpages: I/O error 28
    Jul 18 02:06:34 corbulon /kernel: vnode_pager_putpages: residual I/O 65536 at 18779
    Jul 18 02:06:34 corbulon /kernel: pid 7 (syncer), uid 0 on /var: file system full
    Jul 18 02:06:35 corbulon /kernel: vnode_pager_putpages: I/O error 28
    [...]
    Jul 18 02:09:43 corbulon /kernel: vnode_pager_putpages: residual I/O 40960 at 8179
    Jul 18 02:09:43 corbulon /kernel: pid 3 (pagedaemon), uid 0 on /var: file system full
    Jul 18 02:09:43 corbulon /kernel: vnode_pager_putpages: I/O error 28
    Jul 18 02:09:43 corbulon /kernel: vnode_pager_putpages: residual I/O 40960 at 8179
    Jul 18 02:09:43 corbulon /kernel: pid 3 (pagedaemon), uid 0 on /var: file system full
    Jul 18 02:09:43 corbulon /kernel: vnode_pager_putpages: I/O error 28
    Jul 18 02:09:43 corbulon /kernel: vnode_pager_putpages: residual I/O 40960 at 8179
    Jul 18 02:09:43 corbulon /kernel: pid 3 (pagedaemon), uid 0 on /var: file system full
    [...]

    which really chokes the box even though /var/log is on a different
    device from /var/tmp ...
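
For reference, the "I/O error 28" in the kernel messages above is an errno
value. On FreeBSD, as on most Unix-like systems, 28 is ENOSPC, which matches
the accompanying "file system full" lines. A quick way to confirm this on a
given machine, sketched in Python (the variable names are only illustrative):

```python
import errno
import os

code = 28  # value taken from the "I/O error 28" kernel messages

name = errno.errorcode[code]  # symbolic errno name on this platform
text = os.strerror(code)      # platform-specific description string

print(code, name, text)
```

The description string varies between systems, so only the symbolic name
should be relied on.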

    Yesterday my 4.8-stable machine had to be cold-rebooted after almost
    a year because of this. Existing processes (sshd, webmin) still
    responded some of the time, but could not launch any new processes,
    such as a shell (in the case of sshd) or even /sbin/reboot (in the
    case of webmin).

    Why is a fast-writing program (not run by root) able to hang a server?

    Perhaps these errors logged by the kernel could be made less specific
    and fit on one line, so that syslogd would at least be able to cope
    with them better?
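
The suggestion can be illustrated with a small user-space sketch. This is
not kernel code and not how FreeBSD or syslogd actually behave; it only shows
the idea of keying each error on something coarse (here a hypothetical
(file system, errno) pair) and emitting at most one line per key per
interval. The class name `CoalescingLogger`, the `emit` callback, and the
one-second interval are all assumptions made for the example.

```python
import time


class CoalescingLogger:
    """Emit the first message for a key at once, then at most one
    summary line per key per interval (hypothetical helper)."""

    def __init__(self, emit, interval=1.0, clock=time.monotonic):
        self.emit = emit          # callable that receives each output line
        self.interval = interval  # minimum seconds between lines per key
        self.clock = clock        # injectable clock, useful for testing
        self.state = {}           # key -> [time of last emit, suppressed count]

    def log(self, key, message):
        now = self.clock()
        entry = self.state.get(key)
        if entry is None:
            # First time this key is seen: log it immediately.
            self.state[key] = [now, 0]
            self.emit(message)
            return
        last, suppressed = entry
        if now - last < self.interval:
            # Too soon after the last line for this key: just count it.
            entry[1] += 1
            return
        # Interval has passed: emit one line, noting what was dropped.
        if suppressed:
            message = f"{message} ({suppressed} similar messages suppressed)"
        self.emit(message)
        self.state[key] = [now, 0]


if __name__ == "__main__":
    out = []
    logger = CoalescingLogger(out.append)
    for _ in range(1000):
        logger.log(("/var", 28), "/var: file system full")
    print(len(out), "line(s) emitted for 1000 errors")
```

The trade-off is the one named above: the per-block detail (the "residual
I/O ... at ..." offsets) is lost, in exchange for a bounded message rate.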

            -mi

    _______________________________________________
    freebsd-stable@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-stable
    To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

