I was careful to have it for sock_unix.c but missed it for sock_inet
which broke with commit 36722d227 ("MINOR: sock_inet: report the errno
string in binding errors") depending on the build options. No backport
is needed.
Just like with previous patch, let's report UNIX socket binding errors
in plain text. we can now see for example:
[ALERT] 260/083531 (13365) : Starting frontend f: cannot switch final and temporary UNIX sockets (Operation not permitted) [/tmp/root.sock]
[ALERT] 260/083640 (13375) : Starting frontend f: cannot change UNIX socket ownership (Operation not permitted) [/tmp/root.sock]
With the socket binding code cleanup it becomes easy to add more info to
error messages. One missing thing used to be the error string, which is
now added after the generic one, for example:
[ALERT] 260/082852 (12974) : Starting frontend f: cannot bind socket (Permission denied) [0.0.0.0:4]
[ALERT] 260/083053 (13292) : Starting frontend f: cannot bind socket (Address already in use) [0.0.0.0:4444]
[ALERT] 260/083104 (13298) : Starting frontend f: cannot bind socket (Cannot assign requested address) [1.1.1.1:4444]
We used to resort to a trick to detect whether the caller was a listener
or an outgoing socket in order never to present an AF_CUST_UDP* socket
to a log server nor a nameserver. This is no longer necessary, the socket
type alone will be enough.
We don't need to cheat with the sock_domain anymore, we now always have
the SOCK_DGRAM sock_type as a complementary selector. This patch restores
the sock_domain to AF_INET* in the udp* protocols and removes all traces
of the now unused AF_CUST_*.
By doing so we can remove the hard-coded mapping from AF_INET to AF_CUST_UDP
but we still need to keep the test on the listeners as long as these dummy
families remain present in the code.
The protocol array used to be only indexed by socket family, which is very
problematic with UDP (requiring an extra family) and with the forthcoming
QUIC (also requiring an extra family), especially since that binds them to
certain families, prevents them from supporting dgram UNIX sockets etc.
In order to address this, we now start to register the protocols with more
info, namely the socket type and the control type (either stream or dgram).
This is sufficient for the protocols we have to deal with, but could also
be extended further if multiple protocol variants were needed. But as is,
it still fits nicely in an array, which is convenient for lookups that are
instant.
This one will be needed to more accurately select a protocol. It may
differ from the socket type for QUIC, which uses dgram at the socket
layer and provides stream at the control layer. The upper level requests
a control layer only so we need this field.
Most callers of str2sa_range() need the protocol only to check that it
provides a ->connect() method. It used to be used to verify that it's a
stream protocol, but it might be a bit early to get rid of it. Let's keep
the test for now but move it to str2sa_range() when the new flag PA_O_CONNECT
is present. This way almost all call places could be cleaned from this.
There's a strange test in the server address parsing code that rechecks
the family from the socket which seems to be a duplicate of the previously
removed tests. It will have to be rechecked.
We'll need this so that it can return pointers to stacked protocol in
the future (for QUIC). In addition this removes a lot of tests for
protocol validity in the callers.
Some of them were checked further apart, or after a call to
str2listener() and they were simplified as well.
There's still a trick, we can fail to return a protocol in case the caller
accepts an fqdn for use later. This is what servers do and in this case it
is valid to return no protocol. A typical example is:
server foo localhost:1111
The function will need to use more than just a family, let's pass it
the selected protocol. The caller will then be able to do all the fancy
stuff required to pick the best protocol.
str2listener() was temporarily hacked to support datagram sockets for
the log-forward listeners. This has has an undesirable side effect that
"bind udp@1.2.3.4:5555" was silently accepted as TCP for a bind line.
We don't need this hack anymore since the only user (log-forward) now
relies on str2receiver(). Now such an address will properly be rejected.
Thanks to this we don't need to specify "udp@" as it's implicitly a
datagram type listener that is expected, so any AF_INET/AF_INET4 address
will work.
This is at least temporary, as the migration at once is way too difficuly.
For now it still creates listeners but only allows DGRAM sockets. This
aims at easing the split between listeners and receivers.
Now we only rely on dgram type associated with AF_INET/AF_INET6 to infer
UDP4/UDP6. We still keep the hint based on PA_O_SOCKET_FD to detect that
the caller is a listener though. It's still far from optimal but UDP
remains rooted into the protocols and needs to be taken out first.
For now only listeners can make use of AF_CUST_UDP and it requires hacks
in the DNS and logsrv code to remap it to AF_INET. Make str2sa_range()
smarter by detecting that it's called for a listener and only set these
protocol families for listeners. This way we can get rid of the hacks.
The parser now supports a socket type for the control layer and a possible
other one for the transport layer. Usually they are the same except for
protocols like QUIC which will provide a stream transport layer based on
a datagram control layer. The default types are preset based on the caller's
expectations, and may be refined using "stream+" and "dgram+" prefixes.
For now they were not added to the docuemntation because other changes
will probably happen around UDP as well. It is conceivable that "tcpv4@"
or "udpv6@" will appear later as aliases for "stream+ipv4" or "dgram+ipv6".
Just like for inherited sockets, we want to make sure that FDs that are
mentioned in "sockpair@" are actually usable. Right now this test is
performed by the callers, but not everywhere. Typically, the following
config will fail if fd #5 is not bound:
frontend
bind sockpair@5
But this one will pass if fd #6 is not bound:
backend
server s1 sockpair@6
Now both will return an error in such a case:
- 'bind' : cannot use file descriptor '5' : Bad file descriptor.
- 'server s1' : cannot use file descriptor '6' : Bad file descriptor.
As such the test in str2listener() is not needed anymore (and it was
wrong by the way, as it used to test for the socket by overwriting the
local address with a new address that's made of the FD encoded on 16
bits and happens to still be at the same place, but that strictly
depends on whatever the kernel wants to put there).
Since previous patch we know that a successfully bound fd@XXX socket
is returned as its own protocol family from str2sa_range() and not as
AF_CUST_EXISTING_FD anymore o we don't need to check for that case
in str2listener().
When str2sa_range() is invoked for a bind or log line, and it gets a file
descriptor number, it will immediately resolve the socket's address (when
it's a socket) so that the address family, address and port are correctly
set. This will later allow to resolve some transport protocols that are
attached to existing FDs. For raw FDs (e.g. logs) and for socket pairs,
the FD number is still returned in the address, because we need the
underlying address management to complete the bind/listen/connect/whatever
needed. One immediate benefit is that passing a bad FD will now result in
one of these errors:
'bind' : cannot use file descriptor '3' : Socket operation on non-socket.
'bind' : socket on file descriptor '3' is of the wrong type.
Note that as of now, we never return a listening socket with a family of
AF_CUST_EXISTING_FD. The only case where this family is seen is for a raw
FD (e.g. logs).
If a file descriptor was passed, we can optionally return it. This will
be useful for listening sockets which are both a pre-bound FD and a ready
socket.
These flags indicate whether the call is made to fill a bind or a server
line, or even just send/recv calls (like logs or dns). Some special cases
are made for outgoing FDs (e.g. pipes for logs) or socket FDs (e.g external
listeners), and there's a distinction between stream or dgram usage that's
expected to significantly help str2sa_range() proceed appropriately with
the input information. For now they are not used yet.
Now that str2sa_range() checks for appropriate port specification, we
don't need to implement adhoc test cases in every call place, if the
result is valid, the conditions are met otherwise the error message is
appropriately filled.
Now str2sa_range() will enforce the caller's port specification passed
using the PA_O_PORT_* flags, and will return an error on failure. For
optional ports, values 0-65535 will be enforced. For mandatory ports,
values 1-65535 are enforced. In case of ranges, it is also verified that
the upper bound is not lower than the lower bound, as this used to result
in empty listeners.
I couldn't find an easy way to test this using VTC since the purpose is
to trigger parse errors, so instead a test file is provided as
tests/ports.cfg with comments about what errors are expected for each
line.
These flags indicate what is expected regarding port specifications. Some
callers accept none, some need fixed ports, some have it mandatory, some
support ranges, and some take an offset. Each possibilty is reflected by
an option. For now they are not exploited, but the goal is to instrument
str2sa_range() to properly parse that.
We currently have an argument to require that the address is resolved
but we'll soon add more, so let's turn it into a bit field. The old
"resolve" boolean is now PA_O_RESOLVE.
The code is built to match prefixes at one place and to parse the address
as a second step, except for fd@ and sockpair@ where the test first passes
via AF_UNSPEC that is changed again. This is ugly and confusing, so let's
proceed like for the other ones.
At some places (log fd@XXX, bind fd@XXX) we support using an explicit
file descriptor number, that is placed into the sockaddr for later use.
The problem is that till now it was done with an AF_UNSPEC family, which
is also used for other situations like missing info or rings (for logs).
Let's create an "official" family AF_CUST_EXISTING_FD for this case so
that we are certain the FD can be found in the address when it is set.
This removes the following fields from struct protocol that are now
retrieved from the protocol family instead: .sock_family, .sock_addrlen,
.l3_addrlen, .addrcmp, .bind, .get_src, .get_dst.
This also removes the UDP-specific udp{,6}_get_{src,dst}() functions
which were referenced but not used yet. Their goal was only to remap
the original AF_INET* addresses to AF_CUST_UDP*.
Note that .sock_domain is still there as it's used as a selector for
the protocol struct to be used.
We now take care of retrieving sock_family, l3_addrlen, bind(),
addrcmp(), get_src() and get_dst() from the protocol family and
not just the protocol itself. There are very few places, this was
only seldom used. Interestingly in sock_inet.c used to rely on
->sock_family instead of ->sock_domain, and sock_unix.c used to
hard-code PF_UNIX instead of using ->sock_domain.
Also it appears obvious we have something wrong it the protocol
selection algorithm because sock_domain is the one set to the custom
protocols while it ought to be sock_family instead, which would avoid
having to hard-code some conversions for UDP namely.
We need to specially handle protocol families which regroup common
functions used for a given address family. These functions include
bind(), addrcmp(), get_src() and get_dst() for now. Some fields are
also added about the address family, socket domain (protocol family
passed to the socket() syscall), and address length.
These protocol families are referenced from the protocols but not yet
used.
All protocol's listeners now only take care of themselves and not of
the receiver anymore since that's already being done in proto_bind_all().
Now it finally becomes obvious that UDP doesn't need a listener, as the
only thing it does is to set the listener's state to LI_LISTEN!
Now protocol_bind_all() starts the receivers before their respective
listeners so that ultimately we won't need the listeners for non-
connected protocols.
We still have to resort to an ugly trick to set the I/O handler in
case of syslog over UDP because for now it's still not set in the
receiver, so we hard-code it.
Note that for now we don't have a sockpair.c file to host that unusual
family, so the new function was placed directly into proto_sockpair.c.
It's no big deal given that this family is currently not shared with
multiple protocols.
The function does almost nothing but setting up the receiver. This is
normal as the socket the FDs are passed onto are supposed to have been
already created somewhere else, and the only usable identifier for such
a socket pair is the receiving FD itself.
The function was assigned to sockpair's ->bind() and is not used yet.
This removes all the AF_UNIX-specific code from uxst_bind_listener()
and now simply relies on sock_unix_bind_listener() to do the same
job. As mentionned in previous commit, the only difference is that
now an unlikely failure on listen() will not result in a roll back
of the temporary socket names since they will have been renamed
during the bind() operation (as expected). But such failures do not
correspond to any normal case and mostly denote operating system
issues so there's no functionality loss here.
This function performs all the bind-related stuff for UNIX sockets that
was previously done in uxst_bind_listener(). There is a very tiny
difference however, which is that previously, in the unlikely event
where listen() would fail, it was still possible to roll back the binding
and rename the backup to the original socket. Now we have to rename it
before calling returning, hence it will be done before calling listen().
However, this doesn't cover any particular use case since listen() has no
reason to fail there (and the rollback is not done for inherited sockets),
that was just done that way as a generic error processing path.
The code is not used yet and is referenced in the uxst proto's ->bind().
This removes all the AF_INET-specific code from udp_bind_listener()
and now simply relies on sock_inet_bind_listener() to do the same
job. The function is now basically just a wrapper around
sock_inet_bind_receiver().
This removes all the AF_INET-specific code from tcp_bind_listener()
and now simply relies on sock_inet_bind_listener() to do the same
job. The function was now roughly cut in half and its error path
significantly simplified.
This function collects all the receiver-specific code from both
tcp_bind_listener() and udp_bind_listener() in order to provide a more
generic AF_INET/AF_INET6 socket binding function. For now the API is
not very elegant because some info are still missing from the receiver
while there's no ideal place to fill them except when calling ->listen()
at the protocol level. It looks like some polishing code is needed in
check_config_validity() or somewhere around this in order to finalize
the receivers' setup. The main issue is that listeners and receivers
are created *before* bind_conf options are parsed and that there's no
finishing step to resolve some of them.
The function currently sets up a receiver and subscribes it to the
poller. In an ideal world we wouldn't subscribe it but let the caller
do it after having finished to configure the L4 stuff. The problem is
that the caller would then need to perform an fd_insert() call and to
possibly set the exported flag on the FD while it's not its job. Maybe
an improvement could be to have a separate sock_start_receiver() call
in sock.c.
For now the function is not used but it will soon be. It's already
referenced as tcp and udp's ->bind().
This will be the function that must be used to bind the receiver. It
solely depends on the address family but for now it's simpler to have
it per protocol.
The new RX_O_FOREIGN, RX_O_V6ONLY and RX_O_V4V6 options are now set into
the rx_settings part during the parsing, so that we don't need to adjust
them in each and every listener anymore. We have to keep both v4v6 and
v6only due to the precedence from v6only over v4v6.
It's the receiver's FD that's inherited from the parent process, not
the listener's so the flag must move to the receiver so that appropriate
actions can be taken.
In order to split the receiver from the listener, we'll need to know that
a socket is already bound and ready to receive. We used to do that via
tha LI_O_ASSIGNED state but that's not sufficient anymore since the
receiver might not belong to a listener anymore. The new RX_F_BOUND flag
is used for this.