Timeouts problem / Parallelize #9

@adrianlzt

Hi,

Schematically:
Icinga -----active check---> gearman ----> check_nrpe -c exec-passive ----> check_multi -f /etc/check_multi -----> send_multi --> gearman-perfdata

check_multi is triggered by an active check via nrpe.

command[exec_passive]=LC_ALL=C /usr/lib/nagios/plugins/check_multi -f /etc/check_multi -r 256 | send_multi --server=192.168.51.4 --encryption=yes --key=should_be_changed --host=m2m_client.com

check_multi then runs its child checks one by one.

If a check_tcp can't connect to its target, it waits 10s before failing.
send_multi then fails because it expects to get the data in less than 10s.

My first idea was to increase the send_multi timeout, but then the problem moved to check_nrpe, which also fails if it doesn't receive data within 10s.

I could also increase the check_nrpe timeout, but if a second check_tcp fails, I would need to set that timeout to more than 20s.

Another approach is to lower the check_tcp timeout, but that runs into the same problem.
If I set the check_tcp timeout to 4s and 3 checks fail, we are again over the 10s timeout.
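
To make the arithmetic explicit, here is a small Python sketch of the worst case for serial execution. The function name `serial_worst_case` is made up for illustration, and it assumes healthy checks return instantly, which is a simplification.

```python
def serial_worst_case(check_timeouts, failing):
    """Worst-case wall time when checks run one after another.

    check_timeouts: per-check timeout in seconds
    failing: indices of the checks that hang until their timeout
    Assumption: healthy checks return instantly.
    """
    return sum(check_timeouts[i] for i in failing)


# Two check_tcp calls with the 10s timeout described above, both failing
print(serial_worst_case([10, 10], [0, 1]))       # 20

# Three check_tcp calls with a 4s timeout, all failing
print(serial_worst_case([4, 4, 4], [0, 1, 2]))   # 12, still over 10s
```

Whatever per-check timeout is chosen, enough failing checks in a row will exceed any fixed limit upstream.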

The only solution I can think of is to parallelize/fork the execution of the checks. That way, the global timeout would be the timeout of the slowest check.
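
As a rough illustration of the idea (not how check_multi works internally, and not a patch for it), the following Python sketch starts all checks at once and waits for them together. The helper names `run_check` and `run_parallel` are hypothetical, and the slow checks are simulated with a sleeping Python process instead of a real check_tcp call.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_check(cmd, timeout):
    """Run one check command and return (exit_code, output).

    On timeout the child is killed and exit code 3 is returned
    (UNKNOWN in the Nagios plugin convention).
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        return proc.returncode, proc.stdout.strip()
    except subprocess.TimeoutExpired:
        return 3, "UNKNOWN - check timed out after %ss" % timeout


def run_parallel(cmds, timeout):
    """Start all checks at the same time; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(cmds))) as pool:
        return list(pool.map(lambda c: run_check(c, timeout), cmds))


if __name__ == "__main__":
    # Stand-in for a slow check: sleeps 1s, then prints OK.
    slow = [sys.executable, "-c", "import time; time.sleep(1); print('OK')"]
    start = time.monotonic()
    results = run_parallel([slow, slow, slow], timeout=4)
    elapsed = time.monotonic() - start
    print(results, "elapsed: %.1fs" % elapsed)
```

With this approach the total wall time is bounded by the per-check timeout plus process startup overhead, no matter how many checks fail, so the send_multi and check_nrpe timeouts only need to exceed that single value.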

Thanks!
