Re: [poll / rfc] kdb_stop_cpus
On 4 Jun 2011, at 09:22, Andriy Gapon wrote:

> on 03/06/2011 20:57 Robert N. M. Watson said the following:
>
>> On 3 Jun 2011, at 16:13, Andriy Gapon wrote:
>>
>>> I wonder if anybody uses kdb_stop_cpus with a non-default value. If yes, I
>>> am very interested to learn about your use case for it.
>>
>> The issue that prompted the sysctl was non-NMI IPIs being used to enter the
>> debugger or reboot following a core hanging with interrupts disabled. With
>> the switch to NMI IPIs in some of those circumstances, life is better -- at
>> least, on hardware that supports non-maskable IPIs. I seem to recall sparc64
>> doesn't, however?
>
> Seems to be so, as Nathan has also pointed out for PPC.
> For this I also plan the following change:
>
> commit 458ebd9aca7e91fc6e0825c727c7220ab9f61016
>
> generic_stop_cpus: move timeout detection code from under DIAGNOSTIC
>
> ... and also increase it a bit.
> IMO it's better to detect and report the (rather serious) condition and
> allow a system to proceed somehow rather than be stuck in an endless
> loop.

Agreed on detecting and reporting. It would be good to confirm that it works in practice, however, and also that there are no false positives. I'm not sure what the best test scenarios are for that.

Robert
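
For readers following the thread, the idea of bounding the stop-CPUs wait and reporting a timeout can be sketched roughly as below. This is a user-space illustration only, not the code from the commit mentioned above: the function name wait_for_stopped, the bitmask representation, and the spin limit parameter are all assumptions made for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: spin until every CPU in 'wanted' has set its bit
 * in '*stopped', but give up after 'limit' iterations instead of looping
 * forever. Returns true if all CPUs acknowledged, false on timeout.
 */
static bool
wait_for_stopped(atomic_uint *stopped, unsigned wanted, unsigned long limit)
{
	unsigned long spins = 0;

	while ((atomic_load(stopped) & wanted) != wanted) {
		if (++spins >= limit) {
			/* Report which CPUs never acknowledged, then proceed. */
			fprintf(stderr,
			    "timeout stopping cpus, missing mask 0x%x\n",
			    wanted & ~atomic_load(stopped));
			return (false);
		}
	}
	return (true);
}
```

An iteration count like this depends on how fast the spinning CPU runs, so choosing a limit that avoids false positives is exactly the part that would need testing on real hardware.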