nagios and freebsd threads issue : help please ...

From: Christophe Yayon (lists_at_nbux.com)
Date: 08/19/05

  • Next message: Daniel Eischen: "Re: nagios and freebsd threads issue : help please ..."
    Date: Fri, 19 Aug 2005 20:33:39 +0200
    To: freebsd-hackers@freebsd.org
    
    

    Hi all

    You may already know about the threading issues between FreeBSD and
    Nagios 2.0b (100% CPU use by a forked process, lost check results, and
    pauses of the main Nagios process under certain obscure conditions).

    Some Nagios developers say that the problem is in FreeBSD, and others
    say that the problem is in the way Nagios uses pthreads. Here is a
    summary of our discussions:

    -------
    The thread I started is here:

      http://marc.theaimsgroup.com/?t=111930118000001&r=1&w=2

      There are some very interesting replies, a few in particular note that
      Nagios may be breaking POSIX spec in how it spawns/destroys threads:

      http://marc.theaimsgroup.com/?l=freebsd-hackers&m=111944526323754&w=2
      http://marc.theaimsgroup.com/?l=freebsd-hackers&m=111945035012258&w=2

      Anyhow, I'm sure if Ethan were to post some more specific info to
      freebsd-hackers@fr... (it's an open list, no need to sub), this
      issue could get banged out pretty quickly.

      Shortly after this thread, I found another where the issue was brought up
      by another curious poster, and he was using 5.4, which uses a newer
      threading library:

      http://marc.theaimsgroup.com/?t=112119712600002&r=1&w=2

      This post again brings up the "fork without exec or exit" possibly not
      following spec:

      http://marc.theaimsgroup.com/?l=freebsd-hackers&m=112125883804481&w=2

      "I don't know what Nagios does just after fork(2), it would be worth
      checking. It appears that fork(2)ing without exec(2)ing or _exit(2)ing
      in a pthreaded program is not a "valid" behaviour, according to
      SUSv3 [1]. I don't want to avoid admitting there is a problem in
      FreeBSD threading library, I don't know how other OSes handle this,
      but Nagios folks should really avoid doing what is explicitly
      discouraged in SUSv3."
    --------
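
    To make the quoted advice concrete, here is a minimal sketch of the
    pattern it describes: in the child, nothing happens between fork() and
    exec except _exit() on failure. The helper name run_check() is
    hypothetical and is not taken from the Nagios source, and the sketch
    assumes argv[0] is an absolute path so that plain execv() can be used.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not Nagios code): run argv[0] in a child process
 * and return its exit status, or -1 on failure.
 * Assumption: argv[0] is an absolute path and argv is NULL-terminated.
 */
int run_check(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;                  /* fork() itself failed */

    if (pid == 0) {
        /* Child: go straight to exec, touching no locks or stdio. */
        execv(argv[0], argv);
        _exit(127);                 /* exec failed; skip atexit handlers */
    }

    /* Parent: wait for the child, retrying if a signal interrupts us. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    Called with { "/bin/sh", "-c", "exit 3", NULL }, this should return 3.
    Whether Nagios can be restructured this way is a separate question; the
    sketch only shows what "exec or _exit right after fork" looks like.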

    --------
    As the problem isn't in Nagios and no one seems to have an authoritative
      answer on what exactly is causing it, I'd say you would be better off
      switching to a GNU/Linux system, with at least Linux 2.4.29 and
      glibc-2.3 (a lot of work was put into thread safety in glibc-2.3).
    --------

    --------
      From
      http://www.opengroup.org/onlinepubs/009695399/functions/pthread_atfork.html

      "It is suggested that programs that use fork() call an exec function
      very soon afterwards in the child process, thus resetting all states. In
      the meantime, only a short list of async-signal-safe library routines
      are promised to be available."

      Note *suggested*. This is a recommendation to protect against a shoddy
      pthread-implementation. The thread specifications rule that only the
      thread calling fork() is duplicated, which initially leads to the
      recommendation (other threads holding locks aren't around to release
      them in the new execution context).

      That said, Nagios would most likely benefit greatly from a different
      means of checking things than fork()'ing twice and sending the results
      through several tiers of FIFOs. Several different methods have already
      been benchmarked. For server machines (or at least cans with a lot of
      memory and quite regularly multiple CPUs), the best way seems to be to
      create a new thread for each check to run. popen() causes a fork() and
      execve(), so that should be safe enough.

      What limits this imposes I don't know, but the NPTL library in use on
      most modern Linux systems today handles 10,000 threads without barfing,
      so the limit would probably be sysconf(_SC_OPEN_MAX), or ulimit -n,
      which POSIX only guarantees to be at least 20 (_POSIX_OPEN_MAX), although
      typical defaults are considerably higher. Note that half this value
      (give or take 5 or so for stdin and such) represents the number of
      checks that can run simultaneously at any given time. When one of them
      completes another can kick in.
    --------
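
    A minimal sketch of the thread-per-check idea described above, assuming
    each check is a shell command run through popen(). The names struct
    check, check_thread(), run_checks() and max_parallel_checks() are
    hypothetical, not Nagios code. The descriptor estimate queries the limit
    with sysconf(_SC_OPEN_MAX), the POSIX name for the per-process open file
    limit, and otherwise just follows the rule of thumb quoted above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-check record; field names are illustrative only. */
struct check {
    const char *command;   /* shell command line handed to popen() */
    char output[256];      /* first line of the command's output */
    int status;            /* exit status, or -1 on failure */
};

/* Thread body: popen() performs the fork() and the exec of the shell. */
static void *check_thread(void *arg)
{
    struct check *c = arg;
    char rest[256];

    FILE *fp = popen(c->command, "r");
    if (fp == NULL)
        return NULL;                       /* status stays -1 */
    if (fgets(c->output, sizeof c->output, fp) == NULL)
        c->output[0] = '\0';
    while (fgets(rest, sizeof rest, fp) != NULL)
        ;                                  /* drain remaining output */
    int rc = pclose(fp);
    if (rc != -1 && WIFEXITED(rc))
        c->status = WEXITSTATUS(rc);
    return NULL;
}

/* Run n checks, one thread each. Returns 0, or -1 if a thread could not start. */
int run_checks(struct check *checks, size_t n)
{
    assert(checks != NULL || n == 0);

    pthread_t *tids = malloc((n ? n : 1) * sizeof *tids);
    if (tids == NULL)
        return -1;

    for (size_t i = 0; i < n; i++) {       /* defaults for every record */
        checks[i].status = -1;
        checks[i].output[0] = '\0';
    }

    size_t started = 0;
    int result = 0;
    for (; started < n; started++) {
        if (pthread_create(&tids[started], NULL, check_thread,
                           &checks[started]) != 0) {
            result = -1;
            break;
        }
    }
    for (size_t i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    free(tids);
    return result;
}

/* Quoted rule of thumb: about half the descriptor limit, minus ~5 for stdin etc. */
long max_parallel_checks(void)
{
    long open_max = sysconf(_SC_OPEN_MAX);
    if (open_max < 0)
        return -1;                         /* limit unknown or indeterminate */
    long budget = (open_max - 5) / 2;
    return budget > 0 ? budget : 0;
}
```

    A real scheduler would cap the number of live threads at something like
    max_parallel_checks() instead of starting one thread per check at once;
    that bookkeeping is left out to keep the sketch short.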

    What do you think about this?
    Should we have a specific Nagios threads patch for FreeBSD?
    Is this a Nagios problem or a FreeBSD problem?
    Should we switch our Nagios systems to Linux (which would be
    psychologically very difficult for me...)?

    Thanks in advance for your help. I hope we will find a solution.

    _______________________________________________
    freebsd-hackers@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
    To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"

