Re: kernel panic: page fault




On Mon, 3 Apr 2006, Kazuaki Oda wrote:

>> Also, are you running with INVARIANTS and/or WITNESS?
>
> Sorry, I compiled kernel without INVARIANTS and WITNESS.

No problem at all -- the debugging information you have sent me is enough to track down the source of the problem. It looks like we have an inconsistency in how we handle (especially in my new world order) the recycling of timewait state for an inpcb that is still present. I've committed a work-around which should prevent the panic you're seeing, but I need to investigate a bit more before I can commit a full solution.

For those interested, the problem is how to handle sockets with attached inpcbs that represent closed or time wait TCP connections. This can happen if shutdown() is called on a socket, kicking the TCP state engine into a close cycle, rather than a reset. In the current world order, the following sets of socket, pcb, and ppcb protocol state can occur:

fd -> socket <-> inpcb <-> tcpcb    Normal TCP socket in various states.
fd -> socket <-> inpcb <-> twtcp    Unclosed TCP socket in time wait.
fd -> socket <-> inpcb <-> NULL     Unclosed TCP socket after tw recycle.
      socket <-> inpcb <-> tcpcb    Socket closed, buffer still needed.
                 inpcb <-> twtcp    Socket closed, time wait.
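
As a rough illustration of the table above (not the kernel's actual code), the five combinations can be modelled as a small classifier. Every name here (toy_inpcb, toy_classify, the enum values) is hypothetical and stands in for the real structures; any combination not listed in the table is reported as invalid, which is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the FreeBSD definitions. */
enum toy_ppcb_kind { TOY_PPCB_NONE, TOY_PPCB_TCPCB, TOY_PPCB_TWTCP };

struct toy_inpcb {
    int has_socket;           /* socket <-> inpcb link still present */
    enum toy_ppcb_kind ppcb;  /* what the protocol pcb pointer refers to */
};

enum toy_state {
    TOY_NORMAL,           /* fd -> socket <-> inpcb <-> tcpcb */
    TOY_OPEN_TIMEWAIT,    /* fd -> socket <-> inpcb <-> twtcp */
    TOY_OPEN_RECYCLED,    /* fd -> socket <-> inpcb <-> NULL  */
    TOY_CLOSED_DRAINING,  /*       socket <-> inpcb <-> tcpcb */
    TOY_CLOSED_TIMEWAIT,  /*                  inpcb <-> twtcp */
    TOY_INVALID           /* not one of the five listed cases */
};

/* Map the presence of an fd, a socket, and a protocol pcb to a table row. */
enum toy_state
toy_classify(int has_fd, const struct toy_inpcb *inp)
{
    if (has_fd && inp->has_socket) {
        switch (inp->ppcb) {
        case TOY_PPCB_TCPCB: return TOY_NORMAL;
        case TOY_PPCB_TWTCP: return TOY_OPEN_TIMEWAIT;
        case TOY_PPCB_NONE:  return TOY_OPEN_RECYCLED;
        }
    }
    if (!has_fd && inp->has_socket && inp->ppcb == TOY_PPCB_TCPCB)
        return TOY_CLOSED_DRAINING;
    if (!has_fd && !inp->has_socket && inp->ppcb == TOY_PPCB_TWTCP)
        return TOY_CLOSED_TIMEWAIT;
    return TOY_INVALID;
}
```

TOY_OPEN_RECYCLED is the row where the protocol pcb pointer is NULL while the fd and socket are still attached.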

The problem was that the middle case (an unclosed socket whose inpcb is left with a NULL inp_ppcb after the twtcp is recycled) exists, but was not accounted for. There's another problem that is still present in the new socket/pcb model, in which the inpcb of an open socket with a closed TCP connection continues to reserve the address/port combination. This is related to the inpcb without twtcp case, where we recycle the twtcp, but can't recycle the inpcb immediately because there's still an fd reference to the socket, and hence a socket reference to the inpcb.

My current leaning is to do the following:

- Since we now keep inpcbs around for the lifetime of the socket, either
  teach the inpcb lookup routines to ignore inpcbs marked INP_DROPPED, or
  move those inpcbs to another list for open but dropped inpcbs. This will
  avoid the reservation hanging around.

- Either eliminate the inp_ppcb pointer NULL case by prohibiting recycling of
  the twtcp state of a socket that is still open, or support it more formally
  through the previous change. The trick is to prevent the twtcp/inpcb
  pairs from turning up and being used during input and allocation collision
  processing.
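
The first option (lookup routines that ignore dropped inpcbs) might look roughly like the sketch below. It is a toy model under stated assumptions: toy_inpcb, toy_lookup, the flat array, and the value given to TOY_INP_DROPPED are all hypothetical and only mirror the idea, not the real lookup code or flag values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value; it only mirrors the idea behind INP_DROPPED. */
#define TOY_INP_DROPPED 0x1u

/* Hypothetical, heavily reduced pcb: just a local port and flags. */
struct toy_inpcb {
    uint16_t lport;
    unsigned flags;
};

/*
 * Return the first entry bound to lport that has not been dropped,
 * or NULL if there is none. Dropped entries are skipped, so they no
 * longer hold the port as far as this lookup is concerned.
 */
const struct toy_inpcb *
toy_lookup(const struct toy_inpcb *tbl, size_t n, uint16_t lport)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].lport != lport)
            continue;
        if (tbl[i].flags & TOY_INP_DROPPED)
            continue;
        return &tbl[i];
    }
    return NULL;
}
```

With this shape, a port whose only inpcb is dropped looks free to the lookup, even though the open socket still references that inpcb.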

The summary is that we're not quite there yet in terms of how it all should be working, but that we should avoid the panic for now due to the workarounds I committed (which basically are changes to handle the inp_ppcb pointer being NULL for INP_TIMEWAIT sockets).
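
For illustration only, a guard of the kind described (tolerating a NULL inp_ppcb on a time wait inpcb) could be sketched as follows. This is not the committed change: the names, the flag value, and the choice to drop the segment when the pointer is NULL are all assumptions, since the message does not say what the workaround does in that case.

```c
#include <stddef.h>

/* Hypothetical flag value; it only mirrors the idea behind INP_TIMEWAIT. */
#define TOY_INP_TIMEWAIT 0x2u

/* Hypothetical, reduced pcb: flags plus the protocol pcb pointer. */
struct toy_inpcb {
    unsigned flags;
    void *ppcb;   /* tcpcb, twtcp, or NULL after time wait recycling */
};

enum toy_action {
    TOY_PROCESS_TCPCB,     /* hand off to normal TCP processing */
    TOY_PROCESS_TIMEWAIT,  /* hand off to time wait processing */
    TOY_DROP_SEGMENT       /* nothing to process against (assumed) */
};

/* Decide how to treat an inpcb without dereferencing a NULL ppcb. */
enum toy_action
toy_input_dispatch(const struct toy_inpcb *inp)
{
    if (inp->ppcb == NULL)
        return TOY_DROP_SEGMENT;  /* covers the recycled time wait case */
    if (inp->flags & TOY_INP_TIMEWAIT)
        return TOY_PROCESS_TIMEWAIT;
    return TOY_PROCESS_TCPCB;
}
```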

Thanks!

Robert N M Watson