Christopher reported a rare race condition involving 'healthcheckmail.vtc'
The regtest would randomly FAIL with this kind of error:
** S1 === expect ~ "[^:\\[ ]\\[${h1_pid}\\]: Health check for server b...
**** S1 EXPECT MATCH ~ "[^:\[ ]\[581669\]: Health check for server be1/srv1 failed.+check duration: [[:digit:]]+ms.+status: 0/1 DOWN."
** S1 === recv info
**** S1 syslog|<25>May 11 15:38:46 haproxy[581669]: Server be1/srv1 is DOWN. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
**** S1 syslog|<24>May 11 15:38:46 haproxy[581669]: backend be1 has no server available!
It turns out that this it due to the recent commit 7963fb5 ("REGTESTS: use
lua mailer script for mailers tests") in which we tell the regtest to use
the new lua mailers instead of the legacy mailers API.
However, in the lua mailers script, due to the event subscriptions being
performed from a lua task, it is possible that the subscription may be
delayed during startup. Indeed lua tasks relie on the scheduler which runs
tasks with no ordering guarantees. Thus early tasks, including server
checks which are used in the regtest are competing during startup.
As such, we may end up with some events that are generated right before
the lua mailers script starts subscribing to events (because the lua task
is scheduled but started yet), resulting in events loss from lua point of
view.
To fix this and to make lua mailers more reliable during startup, we now
perform the events subscription from an init function instead of an
asynchronous task. (The init function is called synchronously during
haproxy post_init, and exclusively runs before the scheduler starts)
This should be enough to prevent healthcheckmail.vtc from randomly failing
While this used to work fine with legacy mailers, IPV6 server support
for lua mailers script was overlooked so it is currently broken.
Indeed, within the lua script, server address was parsed as an IPV4
address to extract both ip and port and pass them to smtp_send_email()
function from Thierry FOURNIER.
From lua point of view: when fetching server address from
ProxyMailers.mailservers, server ip and port are not separated. Each
server address is represented using haproxy server address custom-format
(the one used to specify server addresses within haproxy config,
see 11. Address formats in haproxy configuration manual):
It is a string that contains both proto hint, ip and port.
(Such addresses are manipulated using str2sa_range() and sa2str()
in haproxy's code)
Parsing these custom-format addresses from lua to support multiple address
families is feasible since the format is properly documented in haproxy
configuration.
However, to keep things simple, and given that smtp_send_email() relies
on Socket.connect() function to set-up the tcp connection:
Socket.connect() already supports the full server address custom-format
when no explicit port argument is provided. Thus with minor code changes
we're able to pass the server string as it is.
With this, IPV6 smtp servers from mailers section are now automatically
supported when using lua mailers script.
Lua mailers scripts now leverages Queue class to implement a mailqueue.
Thanks to the mailqueue, emails are still sent asynchronously, but with
respect to their generation order (better ordering consistency).
That is, previous script limitation (see below) is no longer true:
"
Current known script limitation: if multiple events are generated
simultaneously it is possible that emails could be received out of
order since emails are sent asynchronously using smtp_send_email()
and there is no sending queue. Relying on the email "date" should
help to know which email was generated first..
"
For a given server, email-alerts will be sent in the same order as the
events were generated. However this does not apply to events between 2
distinct servers, since each server is tracked independently within
the lua script. (1 subscription per server, to make sure only relevant
servers x events are being tracked and prevent useless wakeups)
Legacy mailers implemented using tcpchecks may now be replaced using a
pure lua implementation.
Simply loading the file "mailers.lua" using lua-load directive is enough
to disable the legacy mailer implementation and make the lua script
subscribe to server events in order to generate messages and send email
alerts to configured mailservers according to the mailers configuration
specified by the user in the config file.
lua-load-per-thread directive is supported, the script will automatically
force itself on a single thread to prevent multiple mails from being sent
for the same event.
The email sending from lua in itself is handled with smtp_send_email()
function from Thierry FOURNIER.
(see: https://www.arpalert.org/how-to-send-email.html)
The function was slightly adapted to send HELO instead of EHLO when
beginning the SMTP handshake to make it more compatible with basic SMTP
stacks and to comply with the legacy SMTP handshake performed in
mailers.c.
Authentication is not yet handled by smtp_send_email(), but it may be
further improved to do so.
Note that existing lua libraries may also support sending emails (even
with authentication support maybe?), I did not do much researchs about
this, so I'm not aware of existing solutions at the time of writing this
script.
The only restriction is that when using an external library, the library
function calls must not be blocking, since haproxy relies on lua
executions to be yieldable and rescheduled.
As long as the script complies with this limitation, it may be customized
or improved in many ways, including templating, making calls to APIs
services.. (ie: triggering automation flows, sending SMS alerts...
you name it)
The purpose of this script is to provide a basic working replacement for
legacy mailers implementation based on tcpchecks, which is planned for
removal in the future. tcpcheck based mailers is kind of a hack which is
not suitable as a long term solution.
(hard to maintain and not customizable)
Note: Email content for email alerts sent using this script might slightly
differ from the legacy implementation (some optional info might be missing
such as server's check dynamic description, or included statistics such as
currently active servers may appear out of sync) due the email generation
now being performed asynchronously. However the output format complies with
the original one and essential informations are consistently reported.
Current known script limitation: if multiple events are generated
simultaneously it is possible that emails could be received out of
order since emails are sent asynchronously using smtp_send_email()
and there is no sending queue. Relying on the email "date" should
help to know which email was generated first..
For now: basic server event handling from lua, events are printed to
STDOUT.
The idea behind this is to provide simple and easy-to-use examples
that serve as a basic introduction for lua event handler feature.