Re: PAE broken on -current, likely broken on stable/9



Hi *,

[I could have renamed the subject "1001 fancy ways to crash FreeBSD",
but I'll refrain :)]

On Mon, Dec 5, 2011 at 5:15 PM, Arnaud Lacombe <lacombar@xxxxxxxxx> wrote:
Hi,

The kernel tree is utterly broken when PAE is enabled, it chokes
[non-exclusively] on the following:

After finally having been able to complete a build, the resulting
kernel miserably panics on:

real memory: 25769803776 (24576 MB)
panic: kmem_suballoc: bad status return of 3

This was with the default value of `vm.kmem_size' and
`vm.kmem_size_max'. I cannot find a good value for either of them.
With 2GB of RAM and 9.0RC2 (the release kernel), 700MB of kmem boots
fine. The same setup with 750MB of kmem chokes while bringing up
userland, on:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xbfc00000
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0d4baca
stack pointer = 0x28:0xc520f9dc
frame pointer = 0x28:0xc520fa14
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, IOPL = 0
current process = 1 (kernel)
trap number = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xc0a4b027 at kdb_backtrace+0x47
#1 0xc0a185f7 at panic+0x117
#2 0xc0d48a03 at trap_fatal+0x323
#3 0xc0d48abd at trap_pfault+0xad
#4 0xc0d49845 at trap+0x465
#5 0xc0d3279c at calltrap+0x6
#6 0xc09e57a0 at exec_map_first_page+0x430
#7 0xc09e61fc at kern_execve+0x58c
#8 0xc09e75bc at sys_execve+0x4c
#9 0xc09cb372 at start_init+0x292
#10 0xc09ea8d7 at fork_exit+0x97
#11 0xc0d32814 at fork_trampoline+0x8
Uptime: 1s
Automatic reboot in 15 seconds - press a key on the console to abort

With 12GB of RAM and 700MB of kmem, it chokes early on:

CPU: QEMU Virtual CPU version 0.14.50 (2660.71-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x633 Family = 6 Model = 3 Stepping = 3
Features=0x781abf9<FPU,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,PGE,CMOV,PAT,MMX,FXSR,SSE,SSE2>
Features2=0x80800001<SSE3,POPCNT,HV>
real memory = 12884901888 (12288 MB)
panic: kmem_suballoc: bad status return of 3
cpuid = 0
KDB: enter: panic
[ thread pid 0 tid 0 ]
Stopped at kdb_enter+0x3a: movl $0,kdb_why
db> bt
Tracing pid 0 tid 0 td 0xc068edb0
kdb_enter(c0603b0a,c0603b0a,c061fbb4,c08f6cbc,0,...) at kdb_enter+0x3a
panic(c061fbb4,3,0,0,c06c3a54,...) at panic+0x134
kmem_suballoc(c0ba6000,c06c3a54,c06c3a58,90f8000,1,...) at kmem_suballoc+0x85
vm_ksubmap_init(c06c3a4c,0,3,3000,0,...) at vm_ksubmap_init+0xbc
cpu_startup(0,8f0020,8f0020,8f0000,8fb000,...) at cpu_startup+0x27c
mi_startup() at mi_startup+0xac
begin() at begin+0x2c
db>

Reverting to the default values of `vm.kmem_size' and
`vm.kmem_size_max', a PAE-enabled -current kernel with 4GB (or 6GB)
of RAM triggers an infinite loop of:

CPU: QEMU Virtual CPU version 0.14.50 (2660.40-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x633 Family = 6 Model = 3 Stepping = 3
Features=0x781abf9<FPU,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,PGE,CMOV,PAT,MMX,FXSR,SSE,SSE2>
Features2=0x80800001<SSE3,POPCNT,HV>
real memory = 6442450944 (6144 MB)
kernel trap 12 with interrupts disabled
kernel trap 12 with interrupts disabled
kernel trap 12 with interrupts disabled
kernel trap 12 with interrupts disabled
[...]
kernel trap 12 with interrupts disabled

At this point, even FreeBSD 7.1 is better, as it at least gets as far as:

Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.1-RELEASE-p13 #0: Mon Nov 21 17:23:05 UTC 2011
root@build:/freebsd/conf/PAE
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: QEMU Virtual CPU version 0.14.50 (2660.26-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x633 Stepping = 3
Features=0x781abf9<FPU,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,PGE,CMOV,PAT,MMX,FXSR,SSE,SSE2>
Features2=0x80800001<SSE3,POPCNT,<b31>>
real memory = 16642998272 (15872 MB)
avail memory = 15784312832 (15053 MB)

It hung there for a while; I'm not sure whether that is because the
system is running in a VM with disk-backed memory or because of
another issue. I killed qemu at this point. 6GB was "fine" too.

Coming back to -current, but now with `vm.kmem_size' and
`vm.kmem_size_max' set to 512M, a 12G system boots:

CPU: QEMU Virtual CPU version 0.14.50 (2660.39-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x633 Family = 6 Model = 3 Stepping = 3
Features=0x781abf9<FPU,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,PGE,CMOV,PAT,MMX,FXSR,SSE,SSE2>
Features2=0x80800001<SSE3,POPCNT,HV>
real memory = 12884901888 (12288 MB)
avail memory = 12621688832 (12036 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: <BOCHS BXPCAPIC>
ioapic0: Changing APIC ID to 1
ioapic0 <Version 1.1> irqs 0-23 on motherboard
[...]

up until right before multi-user, where it just reboots directly,
without printing any message:

ada0: Previously was known as ad0
pass1 at ata1 bus 0 scbus1 target 0 lun 0
pass1: <QEMU QEMU DVD-ROM 0.14> Removable CD-ROM SCSI-0 device
pass1: 16.700MB/s transfers (WDMA2, ATAPI 12bytes, PIO 65534bytes)
Timecounter "TSC" frequency 2660388588 Hz quality 800
/boot/kernel/kernel data=0xc3e4ec+0xbda74 syms=[0x4+0xaff70+0x4+0xf1cd8]
-
[FreeBSD boot loader menu; ASCII-art logo omitted]
Welcome to FreeBSD
1. Boot [ENTER]
2. [Esc]ape to loader prompt
3. Reboot

The same kernel, built with KDB_TRACE, INVARIANTS, WITNESS and
WITNESS_SKIPSPIN, does not reboot:

pass1 at ata1 bus 0 scbus1 target 0 lun 0
pass1: <QEMU QEMU DVD-ROM 0.14> Removable CD-ROM SCSI-0 device
pass1: 16.700MB/s transfers (WDMA2, ATAPI 12bytes, PIO 65534bytes)
Timecounter "TSC" frequency 2660386172 Hz quality 800
WARNING: WITNESS option enabled, expect reduced performance.
Swap zone entries reduced from 121574 to 24014.
Trying to mount root from ufs:/dev/ada0s1a [rw]...

but spins there, possibly again because of the disk-backed memory.
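
For reference, a debug kernel with the options named above might be
described roughly as follows. This is only a sketch: the file name and
ident are hypothetical, and it assumes the stock i386 PAE config is
used as a base. KDB and INVARIANT_SUPPORT are listed because KDB_TRACE
and INVARIANTS depend on them.

```
# Hypothetical file: sys/i386/conf/PAE-DEBUG (name chosen for illustration)
include		PAE			# stock i386 PAE config, assumed as the base
ident		PAE-DEBUG		# hypothetical ident

options 	KDB			# kernel debugger framework, needed by KDB_TRACE
options 	KDB_TRACE		# print a stack trace on panic
options 	INVARIANTS		# extra run-time consistency checks
options 	INVARIANT_SUPPORT	# required by INVARIANTS
options 	WITNESS			# lock-order verification
options 	WITNESS_SKIPSPIN	# do not run WITNESS on spin mutexes
```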

4GB of RAM, with the same `vm.kmem_size' and `vm.kmem_size_max',
triggers the same `kernel trap 12 with interrupts disabled' as
previously described with the default value.

6GB of RAM self-reboots, even with the INVARIANTS/WITNESS kernel.

8GB and 10GB boot up until trying to mount root, then spin.

14GB fails as described originally:

CPU: QEMU Virtual CPU version 0.14.50 (2660.41-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x633 Family = 6 Model = 3 Stepping = 3
Features=0x781abf9<FPU,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,PGE,CMOV,PAT,MMX,FXSR,SSE,SSE2>
Features2=0x80800001<SSE3,POPCNT,HV>
real memory = 15032385536 (14336 MB)
panic: kmem_suballoc: bad status return of 3
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper(c060a019,59c,c0af6c5c,c03c6173,c0da605c,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c063e31e,0,c062df08,c0af6cbc,0,...) at kdb_backtrace+0x2a
panic(c062df08,3,0,0,c0af6d0c,...) at panic+0x117
kmem_suballoc(c0da6000,c0af6d0c,c0af6d08,10080c0,0,...) at kmem_suballoc+0x85
vm_ksubmap_init(c0830ccc,80000000,3,3800,0,...) at vm_ksubmap_init+0x17d
cpu_startup(0,af0020,af0020,af0000,afb000,...) at cpu_startup+0x27c
mi_startup() at mi_startup+0xac
begin() at begin+0x2c

- Arnaud
_______________________________________________
freebsd-current@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@xxxxxxxxxxx"


