hang with raid, postgresql

From: Don Bowman (don_at_sandvine.com)
Date: 05/30/04

    To: "'current@freebsd.org'" <current@freebsd.org>
    Date: Sun, 30 May 2004 15:52:04 -0400
    
    
    

    I have a system with 2x 2.8GHz Xeon (P4), Intel E7501 chipset,
    4GB of RAM, aac [Adaptec 2200S] RAID with 4 SCSI
    disks. I have also tried asr (Adaptec 2015).
    I have tried two different motherboards.
    The only application the machine runs is postgresql,
    with about 30 databases and about 250GB of data.

    I'm finding the machine locks up solid once a day
    or so (sometimes more, sometimes less, no pattern
    of time of day). I know it's not a hardware issue; it
    is reliable with FreeBSD 4.7. I've run through memory
    test, disk test, etc.

    There appears to be a correlation between
    disk activity (postgresql vacuum) and the lockup,
    but I can't be sure.
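
One way to check that correlation is a small heartbeat logger: append a timestamped line to a file at a fixed interval, fsync each line, and after the next lockup compare the last recorded timestamp against the vacuum schedule. The following is a minimal sketch under stated assumptions, not something taken from the original setup: the function names, the log path and the one-second interval are all hypothetical. The logger also generates a little disk activity of its own, which may matter if disk load is part of the trigger.

```python
import os
import time


def write_heartbeat(path, note="", now=None):
    """Append one timestamped line and force it to stable storage."""
    stamp = time.time() if now is None else now
    line = f"{stamp:.3f} {note}\n"
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, line.encode("utf-8"))
        # fsync so the line survives a hard lockup shortly afterwards.
        os.fsync(fd)
    finally:
        os.close(fd)
    return stamp


def last_heartbeat(path):
    """Return (timestamp, note) from the last complete line, or None."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            lines = fh.read().splitlines(keepends=True)
    except FileNotFoundError:
        return None
    # Ignore a trailing partial line that was cut off mid-write.
    complete = [ln for ln in lines if ln.endswith("\n")]
    if not complete:
        return None
    stamp, _, note = complete[-1].rstrip("\n").partition(" ")
    return float(stamp), note


def run(path, interval=1.0, count=None):
    """Write heartbeats forever, or `count` times if count is given."""
    written = 0
    while count is None or written < count:
        write_heartbeat(path, "alive")
        written += 1
        time.sleep(interval)
```

Calling `run("/var/tmp/heartbeat.log")` (a hypothetical path) before the next vacuum and `last_heartbeat("/var/tmp/heartbeat.log")` after the reboot gives the time of the hang to within one interval. The same `write_heartbeat` call with a note such as "vacuum begin" could be placed in whatever script starts the vacuum.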

    I've just reproduced it with a cvsup from head today
    [2004-05-30 12:00 EDT], so it's still present.
    I've got a serial console, and the break to
    debugger (which works under normal circumstances).

    In the lockup case, I cannot drop into ddb, and
    no output appears anywhere. I have enabled
    the following options, but still no effect: no
    messages come out (other than erroneous LOR
    issues).

    options ALT_BREAK_TO_DEBUGGER
    options DDB
    options INVARIANTS
    options INVARIANT_SUPPORT
    options WITNESS
    options WITNESS_SKIPSPIN
    options MUTEX_DEBUG
    options DIAGNOSTIC

    I've tried both with and without ACPI. It
    does not have PAE configured in.

    The fact that I can't drop into the debugger
    using the CR ~ ^B sequence when it's locked up
    implies that it's no longer servicing the serial
    interrupt.

    Does anyone have any suggestions? postgresql
    makes use of disk, sysv semaphores, shared memory,
    etc.

    I don't have sound, vga, X, ... any of the
    'complicated' things; it's just a server.
    There is no ATA.

    I tried setting kern.smp.active to 0, but
    it still locked up.

    I'm looking for any suggestions. I have
    attached the config file from it if anyone
    has any comments on that.

    --don

    
    
    

    _______________________________________________
    freebsd-current@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-current
    To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"


