BUG/MEDIUM: checks: unblock signals in external checks

As discussed in issue #140, processes are forked with signals blocked
resulting in haproxy's kill being ignored. This happens when the command
takes more time to complete than the configured check timeout or interval.
Just calling "sleep 30" every second makes the problem obvious.

The fix simply consists in unblocking the signals in the child after the
fork. It needs to be backported to all stable branches containing external
checks and where signals are blocked on startup. It's unclear when it
started, but the following config exhibits the issue :

  global
    external-check

  listen www
    bind :8001
    timeout client 5s
    timeout server 5s
    timeout connect 5s
    option external-check
    external-check command "$PWD/sleep10.sh"
    server local 127.0.0.1:80 check inter 200

  $ cat sleep10.sh
  #!/bin/sh
  exec /bin/sleep 10

The "sleep" processes keep accumulating for 10 seconds and stabilize
around 25 when the bug is present. Just issuing "killall sleep" has no
effect on them, and stopping haproxy leaves these processes behind.
This commit is contained in:
Willy Tarreau 2019-07-01 07:51:29 +02:00
parent ad03288e6b
commit 2df8cad0fe

View File

@ -1997,6 +1997,7 @@ static int connect_proc_chk(struct task *t)
environ = check->envp;
extchk_setenv(check, EXTCHK_HAPROXY_SERVER_CURCONN, ultoa_r(s->cur_sess, buf, sizeof(buf)));
haproxy_unblock_signals();
execvp(px->check_command, check->argv);
ha_alert("Failed to exec process for external health check: %s. Aborting.\n",
strerror(errno));