panic: integer divide fault on 6.1



Hi all,

Chris contacted me because he ran into the same problem.
Since I never followed up on my original report, he asked whether it
had been solved. It was: Søren fixed it, and the fix has been committed to
CURRENT:
http://lists.freebsd.org/pipermail/cvs-all/2006-September/188353.html

I'm hoping this gets MFC'ed in time for 6.2.

My conversation with Søren is below:

---------- Forwarded message ----------
From: Joao Barros <joao.barros@xxxxxxxxx>
Date: Sep 14, 2006 8:04 PM
Subject: Re: Fwd: panic: integer divide fault on 6.1
To: Søren Schmidt <sos@xxxxxxxxxxx>


On 9/13/06, Joao Barros <joao.barros@xxxxxxxxx> wrote:
On 9/13/06, Søren Schmidt <sos@xxxxxxxxxxx> wrote:
> Joao Barros wrote:
> > ---------- Forwarded message ----------
> > From: Sam Leffler <sam@xxxxxxxxx>
> > Date: Sep 10, 2006 5:16 PM
> > Subject: Re: panic: integer divide fault on 6.1
> > To: Joao Barros <joao.barros@xxxxxxxxx>
> > Cc: freebsd-current@xxxxxxxxxxx, freebsd-stable@xxxxxxxxxxx, Kris
> > Kennaway <kris@xxxxxxxxxxxxxx>
> >
> >
> > Joao Barros wrote:
> >> On 9/9/06, Kris Kennaway <kris@xxxxxxxxxxxxxx> wrote:
> >>> On Sat, Sep 09, 2006 at 09:02:35PM +0100, Joao Barros wrote:
> >>> > On 9/9/06, Max Laier <max@xxxxxxxxxxxxxx> wrote:
> >>> > >
> >>> > >Can you try to get a dump, trace, or at least figure out which function
> >>> > >the IP is referring to?
> >>> > >
> >>> >
> >>> > Well, the problem only occurs when I boot from the disk and the
> >>> > installed kernel doesn't have debug support.
> >>> > Does 'set dumpdev=' work from the boot loader? I tried some
> >>> > combinations with no success.
> >>>
> >>> No.
> >>>
> >>> > I can try and install a 6-STABLE snapshot if there's no way of getting
> >>> > the info needed.
> >>>
> >>> You can either try to install a new kernel with DDB support, or follow
> >>> the "instruction pointer" method in the developers handbook chapter on
> >>> kernel debugging.
> >>
> >> I copied a CURRENT kernel from a 200608 snapshot and the problem also
> >> occurs thus I'm adding current@.
> >> My current laptop doesn't have a serial port so I'm copying this by
> >> hand:
> >>
> >> Fatal trap 18: integer divide fault while in kernel mode
> >> cpuid = 0; apic id = 00
> >> instruction pointer = 0x20:0xc08a1fb7
> >> stack pointer = 0x28:0xc0c20b14
> >> frame pointer = 0x28:0xc0c20b9c
> >> code segment = base 0x0, limit 0xfffff, type 0x1b
> >> = DPL 0, pres 1, def32 1, gran 1
> >> processor eflags = interrupt enabled, resume, IOPL = 0
> >> current process = 0 (swapper)
> >> [thread pid 0 tid 0 ]
> >> Stopped at __qdivrem+0x3b: divl %ecx,%eax
> >>
> >> db> bt
> >> Tracing pid 0 tid0 td 0xc0a0c818
> >> __qdivrem(37fdfa0,0,0,0,0,...) at __qdivrem+0x3b
> >> __udivdi3(37fdfa0,0,0,0) at __udivdi3+0x16
> >> ata_raid_promise_read_meta(c37a5000,c09f4a80,1,8086,c37a5000,...) at
> >> ata_raid_promise_read_meta+0x9b
> >> ata_raid_read_metadata(c37a5000,c37a5000,c0c20c70,c06b58a4,c37a5000,...)
> >> at ata_raid_metadata+0x2be
> >> ata_raid_subdisk_attach(c37a5000) at ata_raid_subdisk_attach+0x33
> >> device_attach(c37a5000,c37a5180,c37a5000,c36885c0,0,...) at
> >> device_attach+0x58
> >> device_probe_and_attach(c37a5200,c37a5200,c08ec9a9,0,c37a5180,...) at
> >> bus_generic_attach+0x16
> >> ad_attach(c37a5200) at ad_attach+0x2c8
> >> device_attach(c37a5200,c095f2d0,c37a5200,0,c368d800,...) at
> >> device_attach+0x58
> >> device_probe_and_attach(c37a5200) at device_probe_and_atach+0xe0
> >> bus_generic_attach(c3659080,c3659080,ffffffff,0,c37a5200,...) at
> >> bus_generic_attach+0x16
> >> ata_identify(c3659080) at ata_identify+0x1c8
> >> ata_boot_attach(0xc0a11d80,0,c09212e7,47,...) at ata_boot_attach+0x3e
> >> run_interrupt_drive_config_hooks(0,c1ec00,c1e000,0,c0451065,...) at
> >> run_interrupt_drive_config_hooks+0x43
> >> mi_startup() at mi_startup+0x96
> >> begin() at begin+0x2c
> >>
> >> This board has a Promise SATA raid controller and it is disabled in
> >> the BIOS. I even tried disabling it through a jumper but it still
> >> stops.
> >>
> >
> > In sys/dev/ata/ata-raid.h the PROMISE_LBA macro does an unchecked
> > calculation that apparently can divide by zero. Soren would likely
> > understand the root cause of this problem but until then you can patch
> > the driver to workaround the problem.
> >
> > Sam
> >
> >
> > Hi Søren,
> >
> > I don't know if you bumped into this thread but this should definitely
> > be fixed.
> > Do you want me to open a PR?
> >
> Hmm, the problem seems to be that the geometry that's gotten from the
> disk has (all) zeros in it, which we cannot handle.
> It's most likely because your BIOS put invalid or no current geometry
> info in the disk's parameters page.
>
> If you are up to a little debugging, you could try this patch:
>
> diff -u -r1.189.2.4 ata-disk.c
> --- ata-disk.c 4 Apr 2006 16:07:42 -0000 1.189.2.4
> +++ ata-disk.c 13 Sep 2006 06:18:59 -0000
> @@ -97,7 +97,8 @@
>      }
>      device_set_ivars(dev, adp);
>
> -    if (atadev->param.atavalid & ATA_FLAG_54_58) {
> +    if ((atadev->param.atavalid & ATA_FLAG_54_58) &&
> +        atadev->param.current_heads && atadev->param.current_sectors) {
>          adp->heads = atadev->param.current_heads;
>          adp->sectors = atadev->param.current_sectors;
>          adp->total_secs = (u_int32_t)atadev->param.current_size_1 |
>
>
> -Søren
>

Sure, I'll test it when I get home later today and start building a
kernel on another machine that's currently off.
I'll give you feedback tomorrow.
Thanks,

--
Joao Barros


I'm happy to report that your patch did the trick. The machine happily
boots today's CURRENT+patch :)
Tell me if you need me to test anything else.
Thanks!

--
Joao Barros
_______________________________________________
freebsd-stable@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@xxxxxxxxxxx"
