Commit Graph

3126 Commits

Author SHA1 Message Date
Christopher Faulet
8d8ac191a7 MINOR: proto_htx: Add functions htx_req_replace_stline and htx_res_set_status
It is more or less the same than legacy versions but adapted to be called from
HTX analyzers. In the legacy versions of these functions, we switch on the HTX
code when applicable.
2018-11-18 22:08:56 +01:00
Christopher Faulet
7233352fe4 MINOR: proto_htx: Add functions htx_transform_header and htx_transform_header_str
It is more or less the same than legacy versions but adapted to be called from
HTX analyzers.
2018-11-18 22:08:56 +01:00
Christopher Faulet
7ff1ceaa5e MINOR: http_htx: Add functions to retrieve a specific occurrence of a header
There are 2 functions. The first one considers any comma as a delimiter for
distinct values. The second one considers full-line headers.
2018-11-18 22:08:55 +01:00
Christopher Faulet
e010c80753 MINOR: http_htx: Add functions to replace part of the start-line 2018-11-18 22:08:54 +01:00
Christopher Faulet
0f226958b7 MINOR: proto_htx: Add some functions to handle HTX messages
More functions will come, but it is the minimum to switch HTX analyzers on the
HTX internal representation.
2018-11-18 22:08:54 +01:00
Christopher Faulet
47596d3787 MINOR: http_htx: Add functions to manipulate HTX messages in http_htx.c
This file will host all functions to manipulate HTTP messages using the HTX
representation. Functions in this file will be able to be called from anywhere
and are mainly related to the HTTP semantics.
2018-11-18 22:08:53 +01:00
Christopher Faulet
a3d2a16fad MEDIUM: htx: Add API to deal with the internal representation of HTTP messages
The internal representation of an HTTP message, called HTX, is a structured
representation, unlike the old one which is a raw representation of
messages. Idea is to have a version-agnostic representation of the HTTP
messages, which can be easily used by to handle HTTP/1, HTTP/2 and hopefully
QUIC messages, and communication from one of them to another.

In this patch, we add types to define the internal representation itself and the
main functions to manipulate them.
2018-11-18 22:08:53 +01:00
Christopher Faulet
f2824e6e10 MAJOR: mux-h1/proto_htx: Handle keep-alive connections in the mux
Now, the connection mode is detected in the mux and not in HTX analyzers
anymore. Keep-alive connections are now managed by the mux. A new stream is
created for each transaction. This removes the most important part of the
synchronization between channels and the HTTP transaction cleanup. These changes
only affect the HTX part (proto_htx.c). Legacy HTTP analyzers remain untouched
for now.

On the client-side, the mux is responsible to create new streams when a new
request starts. It is also responsible to parse and update the "Connection:"
header of the response. On the server-side, the mux is responsible to parse and
update the "Connection:" header of the request. Muxes on each side are
independent. For now, there is no connection pool on the server-side, so it
always close the server connection.
2018-11-18 22:02:42 +01:00
Christopher Faulet
e0768ebabc MEDIUM: proto_htx: Add HTX analyzers and use it when the mux H1 is used
For now, these analyzers are just copies of the legacy HTTP analyzers. But,
during the HTTP refactoring, it will be the main place where it will be
visible. And in legacy analyzers, the macro IS_HTX_STRM is used to know if the
HTX version should be called or not.

Note: the following commits were applied to proto_http.c after this patch
      was developed and need to be studied to see if an adaptation to htx
      is required :

  fd9b68c BUG/MINOR: only mark connections private if NTLM is detected
2018-11-18 21:45:50 +01:00
Christopher Faulet
1d5b85aba2 MINOR: http: Add macros to check if a stream uses the HTX representation
To prepare the refactoring of the code handling HTTP messages, these macros will
help to use HTX functions instead of legacy ones when the new HTX internal
representation is in use. To do so, for a given stream, we will check if its
frontend has the option PR_O2_USE_HTX. It is useless to test backend options
because it is not possible to mix the HTX representation and the legacy one
(i.e, having an HTX frontend and a legacy backend or vice versa).
2018-11-18 21:45:50 +01:00
Christopher Faulet
effc3750cc MINOR: conn_stream: Add a flag to notify the SI some data were received
The flag CS_FL_READ_PARTIAL can be set by the mux on the conn_stream to notify
the stream interface that some data were received. Is is used in si_cs_recv to
re-arm read timeout on the channel.
2018-11-18 21:45:49 +01:00
Christopher Faulet
27a3dc8fb2 MINOR: http: Call http_send_name_header with the stream instead of the txn
This is just a minor change to ease integrartion of the HTX.
2018-11-18 21:45:49 +01:00
Christopher Faulet
8277ca72b1 MINOR: http: Add standalone functions to parse a start-line or a header
These 2 functions are pretty naive. They only split a start-line into its 3
substrings or a header line into its name and value. Spaces before and after
each part are skipped. No CRLF at the end are expected.
2018-11-18 21:45:49 +01:00
Christopher Faulet
72d9125efb MINOR: conn_stream: Add a flag to notify the mux it must respect the reserve
By setting the flag CO_RFL_KEEP_RSV when calling mux->rcv_buf, the
stream-interface notifies the mux it must keep some space to preserve the
buffer's reserve. This flag is only useful for multiplexers handling structured
data, because in such case, the stream-interface cannot know the real amount of
free space in the channel's buffer.
2018-11-18 21:45:48 +01:00
Christopher Faulet
c6618d6835 MINOR: conn_stream: Add a flag to notify the mux it should flush its buffers
By setting the flag CO_RFL_BUF_FLUSH when calling mux->rcv_buf, the
stream-interface notifies the mux it should flush its buffers without reading
more data. This flag is set when the SI want to use the kernel TCP splicing to
forward data. Of course, the mux can respect it or not, depending on its
state. It's just an information.
2018-11-18 21:45:48 +01:00
Olivier Houchard
7c6f8b146d MAJOR: connections: Detach connections from streams.
Do not destroy the connection when we're about to destroy a stream. This
prevents us from doing keepalive on server connections when the client is
using HTTP/2, as a new stream is created for each request.
Instead, the session is now responsible for destroying connections.
When reusing connections, the attach() mux method is now used to create a new
conn_stream.
2018-11-18 21:45:45 +01:00
Olivier Houchard
131fd89d5a MINOR: sessions: Start to store the outgoing connection in sessions.
Introduce a new field in session, "srv_conn", and a linked list of sessions
in the connection. It will be used later when we'll switch connections
from being managed by the stream, to being managed by the session.
2018-11-18 21:44:56 +01:00
Olivier Houchard
060ed43361 MINOR: mux: Add a destroy() method.
Add a new method to muxes, destroy(), that is responsible for destroying
the mux and the associated connection, to be used for server connections.
2018-11-18 21:44:53 +01:00
Olivier Houchard
d540b36e8a MINOR: mux: Add a new "avail_streams" method.
Add a new method for mux, avail_streams, that returns the number of streams
still available for a mux.
For the mux_pt, it'll return 1 if the connection is in idle, or 0. For
the H2 mux, it'll return the max number of streams allowed, minus the number
of streams currently in use.
2018-11-18 21:44:06 +01:00
Willy Tarreau
db398435aa MINOR: stream-int: replace si_cant_put() with si_rx_room_{blk,rdy}()
Remaining calls to si_cant_put() were all for lack of room and were
turned to si_rx_room_blk(). A few places where SI_FL_RXBLK_ROOM was
cleared by hand were converted to si_rx_room_rdy().

The now unused si_cant_put() function was removed.
2018-11-18 21:41:50 +01:00
Willy Tarreau
b26a6f9708 MEDIUM: stream-int: make use of si_rx_chan_{rdy,blk} to control the stream-int from the channel
The channel can disable reading from the stream-interface using various
methods, such as :
  - CF_DONT_READ
  - !channel_may_recv()
  - and possibly others

Till now this was done by mangling SI_FL_RX_WAIT_EP which is not
appropriate at all since it's not the stream interface which decides
whether it wants to deliver data or not. Some places were also wrongly
relying on SI_FL_RXBLK_ROOM since it was the only other alternative,
but it's not suitable for CF_DONT_READ.

Let's use the SI_FL_RXBLK_CHAN flag for this instead. It will properly
prevent the stream interface from being woken up and reads from
subscribing to more receipt without being accidently removed. It is
automatically reset if CF_DONT_READ is not set in stream_int_notify().

The code is not trivial because it splits the logic between everything
related to buffer contents (channel_is_empty(), CF_WRITE_PARTIAL, etc)
and buffer policy (CF_DONT_READ). Also it now needs to decide timeouts
based on any blocking flag and not just SI_FL_RXBLK_ROOM anymore.

It looks like this patch has caused a minor performance degradation on
connection rate, which possibly deserves being investigated deeper as
the test conditions are uncertain (e.g. slightly more subscribe calls?).
2018-11-18 21:41:49 +01:00
Willy Tarreau
abb5d4202f MEDIUM: stream-int: use si_rx_shut_blk() to indicate the SI is closed
Till now we were using si_done_put() upon shutr, but these flags could
be reset upon next activity. Now let's switch to SI_FL_RXBLK_SHUT which
doesn't go away. It's also set in stream_int_update() in case a shutr
condition is detected.

The now unused si_done_put() was removed.
2018-11-18 21:41:49 +01:00
Willy Tarreau
7f494d0c5e MINOR: stream-int: make si_sync_recv() simply check ENDP before si_cs_recv()
Instead of checking complex conditions to call si_cs_recv() upon first
call, let's simply use si_rx_endp_ready() now that si_cs_recv() reports
it accurately, and add si_rx_blocked() to cover any blocking situation.
2018-11-18 21:41:48 +01:00
Willy Tarreau
8bb2ffb831 MINOR: stream-int: replace si_{want,stop}_put() with si_rx_endp_{more,done}()
Here it's only a 1-to-1 replacement.
2018-11-18 21:41:47 +01:00
Willy Tarreau
8be7cd7b92 MEDIUM: stream-int: use si_rx_buff_{rdy,blk} to report buffer readiness
The stream interface used to conflate a missing buffer and lack of
buffer space into SI_FL_WAIT_ROOM but this causes difficulties as
these cannot be checked at the same moment and are not resolved at
the same moment either. Now we instead mark the buffer as presumably
available using si_rx_buff_rdy() and mark it as unavailable+requested
using si_rx_buff_blk().

The call to si_alloc_buf() was moved after si_stop_put(). This makes
sure that the SI_FL_RX_WAIT_EP flag is cleared on allocation failure so
that the function is called again if the callee fails to do its work.
2018-11-18 21:41:47 +01:00
Willy Tarreau
32742fdf45 MINOR: stream-int: use si_rx_blocked()/si_tx_blocked() to check readiness
This way we don't limit ourselves to random flags only and the code
is more readable and safer for the long term.
2018-11-18 21:41:46 +01:00
Willy Tarreau
05b9b64afb MINOR: stream-int: replace SI_FL_WANT_PUT with !SI_FL_RX_WAIT_EP
The SI_FL_WANT_PUT flag is used in an awkward way, sometimes it's
set by the stream-interface to mean "I have something to deliver",
sometimes it's cleared by the channel to say "I don't want you to
send what you have", and it has to be set back once CF_DONT_READ
is cleared. This will have to be split between SI_FL_RX_WAIT_EP
and SI_FL_RXBLK_CHAN. This patch only replaces all uses of the
flag with its natural (but negated) replacement SI_FL_RX_WAIT_EP.
The code is expected to be strictly equivalent. The now unused flag
was completely removed.
2018-11-18 21:41:46 +01:00
Willy Tarreau
78dcacef5c MINOR: stream-int: add new functions si_{rx,tx}_{blocked,endp_ready}()
The first ones are used to figure if a direction is blocked on the
stream interface for anything but the end point. The second ones are
used to detect if the end point is ready to receive/transmit. They
should be used instead of directly fiddling with the existing bits.
2018-11-18 21:41:46 +01:00
Willy Tarreau
94f7907d65 MINOR: stream-int: introduce new SI_FL_RXBLK flags
The plan is to have the following flags to describe why a stream interface
doesn't produce data :

    - SI_FL_RXBLK_CHAN : the channel doesn't want it to receive
    - SI_FL_RXBLK_BUFF : waiting for a buffer allocation to complete
    - SI_FL_RXBLK_ROOM : more room is required in the channel to receive
    - SI_FL_RXBLK_SHUT : input now closed, nothing new will come
    - SI_FL_RX_WAIT_EP : waiting for the endpoint to produce more data

Applets like the CLI which consume complete commands at once and produce
large chunks of responses will for example be able to stop being woken up
by clearing SI_FL_WANT_GET and setting SI_FL_RXBLK_ROOM when the rx buffer
is full. Once called they will unblock WANT_GET. The flags were moved
together in readable form with the Rx bits using 2 hex digits and still
have some room to do a similar operation on the Tx path later, with the
WAIT_EP flag being represented alone on a digit.
2018-11-18 21:41:45 +01:00
Willy Tarreau
d0f5bbcd64 MINOR: stream-int: rename SI_FL_WAIT_ROOM to SI_FL_RXBLK_ROOM
This flag is not enough to describe all blocking situations, as can be
seen in each case we remove it. The muxes has taught us that using multiple
blocking flags in parallel will be much easier, so let's start to do this
now. This patch only renames this flags in order to make next changes more
readable.
2018-11-18 21:41:45 +01:00
Willy Tarreau
a44e576f62 MINOR: stream-int: expand the flags to 32-bit
We used to have enough of 16 bits, with 3 still available but it's
not possible to add the rx/tx blocking bits there. Let's extend the
format to 32 bits and slightly reorder the fields to maintain the
struct size to 64 bytes. Nothing else was changed.
2018-11-18 21:41:45 +01:00
Willy Tarreau
fafd3984b9 MINOR: mux: implement a get_first_cs() method
This method is used to retrieve the first known good conn_stream from
the mux. It will be used to find the other end of a connection when
dealing with the proxy protocol for example.
2018-11-18 21:29:20 +01:00
Willy Tarreau
ade6478a8c MINOR: stream: move the conn_stream specific calls to the stream-int
There are still some unwelcome synchronous calls to si_cs_recv() in
process_stream(). Let's have a new function si_sync_recv() to perform
a synchronous receive call on a stream interface regardless of the type
of its endpoint, and move these calls there. For now it only implements
conn_streams since it doesn't seem useful to support applets there. The
function implements an extra check for the stream interface to be in an
established state before attempting anything.
2018-11-17 19:53:45 +01:00
William Lallemand
c59f9884d7 MEDIUM: listeners: support unstoppable listener
An unstoppable listener is a listener which won't be stop during a soft
stop. The unstoppable_jobs variable is incremented and the listener
won't prevent the process to leave properly.

It is not a good idea to use this feature (the LI_O_NOSTOP flag) with a
listener that need to be bind again on another process during a soft
reload.
2018-11-16 17:05:40 +01:00
William Lallemand
a719926cf8 MEDIUM: jobs: support unstoppable jobs for soft stop
This patch allows a process to properly quit when some jobs are still
active, this feature is handled by the unstoppable_jobs variable, which
must be atomically incremented.

During each new iteration of run_poll_loop() the break condition of the
loop is now (jobs - unstoppable_jobs) == 0.

The unique usage of this at the moment is to handle the socketpair CLI
of a the worker during the stopping of the process.  During the soft
stop, we could mark the CLI listener as an unstoppable job and still
handle new connections till every other jobs are stopped.
2018-11-16 17:05:40 +01:00
Frdric Lcaille
9ca51aa288 MINOR: http: Implement "early-hint" http request rules.
This patch implements http_apply_early_hint_rule() function is responsible of
building HTTP 103 Early Hint responses each time a "early-hint" rule is matched.
2018-11-12 21:08:55 +01:00
Frdric Lcaille
0ebbcb663c MINOR: http: Make new "early-hint" http-request action really be parsed.
This patch adds a "early_hint" struct to "arg" union of "act_rule" struct
and parse "early-hint" http-request keyword with it using the same
code as for "(add|set)-header" parser.
2018-11-12 21:08:55 +01:00
Frdric Lcaille
a985e3875b MINOR: http: Add new "early-hint" http-request action.
This patch adds the new "early-hint" action to "http-request" rules parser.
This action should be parsed the same way as "(add|set)-header" actions.
2018-11-12 21:08:55 +01:00
Willy Tarreau
7520e4ff57 MINOR: namespaces: don't build namespace.c if disabled
When namespaces are disabled, support is still reported because the file
is built with almost nothing in it but built anyway. Instead of extending
the scope of the numerous ifdefs in this file, better avoid building it
when namespaces are diabled. In this case we define my_socketat() as an
inline function mapping directly to socket(). The struct netns_entry
still needs to be defined because it's used by various other functions
in the code.
2018-11-12 19:15:15 +01:00
Willy Tarreau
c1b0645dac MEDIUM: log: add a new "raw" format
This format is pretty similar to the previous "short" format except
that it also removes the severity level. Thus only the raw message is
sent. This is suitable for use in containers, where only the raw
information is expected and where the severity is supposed to come
from the file descriptor used.
2018-11-12 18:37:55 +01:00
Willy Tarreau
e8746a08b2 MEDIUM: log: support a new "short" format
This format is meant to be used with local file descriptors. It emits
messages only prefixed with a level, removing all the process name,
system name, date and so on. It is similar to the printk() format used
on Linux. It's suitable to be sent to a local logger compatible with
systemd's output format.

Note that the facility is still required but not used, hence it is
suggested to use "daemon" to remind that it's a local logger.
Example :

    log stdout format short daemon          # send everything to stdout
    log stderr format short daemon notice   # send important events to stderr
2018-11-12 18:37:55 +01:00
Willy Tarreau
13ef773722 MINOR: log: report the number of dropped logs in the stats
It's easy to detect when logs on some paths are lost as sendmsg() will
return EAGAIN. This is particularly true when sending to /dev/log, which
often doesn't support a big logging capacity. Let's keep track of these
and report the total number of dropped messages in "show info".
2018-11-12 18:37:55 +01:00
Willy Tarreau
d0d40ebf5e CLEANUP: stream-int: remove the now unused si->update() function
We exclusively use stream_int_update() now, the lower layers are not
called anymore so let's remove them, as well as si_update() which used
to be their wrapper.
2018-11-11 10:18:37 +01:00
Willy Tarreau
d14844a734 MINOR: stream-int: replace si_update() with si_update_both()
The function used to be called in turn for each side of the stream, but
since it's called exclusively from process_stream(), it prevents us from
making use of the knowledge we have of the operations in progress for
each side, resulting in having to go all the way through functions like
stream_int_notify() which are not appropriate there.

That patch creates a new function, si_update_both() which takes two
stream interfaces expected to belong to the same stream, and processes
their flags in a more suitable order, but for now doesn't change the
logic at all.

The next step will consist in trying to reinsert the rest of the socket
layer-specific update code to ultimately update the flags correctly at
the end of the operation.
2018-11-11 10:18:37 +01:00
Willy Tarreau
8fe516f08a MEDIUM: stream-int: make si_chk_rcv() check that SI_FL_WAIT_ROOM is cleared
After careful inspection, it now seems OK to call si_chk_rcv() only when
SI_FL_WAIT_ROOM is cleared and SI_FL_WANT_PUT is set, since all identified
call places have already taken care of this.
2018-11-11 10:18:37 +01:00
Willy Tarreau
abf531caa0 MEDIUM: stream-int: always call si_chk_rcv() when we make room in the buffer
Instead of clearing the SI_FL_WAIT_ROOM flag and losing the information
about the need from the producer to be woken up, we now call si_chk_rcv()
immediately. This is cheap to do and it could possibly be further improved
by only doing it when SI_FL_WAIT_ROOM was still set, though this will
require some extra auditing of the code paths.

The only remaining place where the flag was cleared without a call to
si_chk_rcv() is si_alloc_ibuf(), but since this one is called from a
receive path woken up from si_chk_rcv() or not having failed, the
clearing was not necessary anymore either.

And there was one place in stream_int_notify() where si_chk_rcv() was
called with SI_FL_WAIT_ROOM still explicitly set so this place was
adjusted in order to clear the flag prior to calling si_chk_rcv().

Now we don't have any situation where we randomly clear SI_FL_WAIT_ROOM
without trying to wake the other side up, nor where we call si_chk_rcv()
with the flag set, so this flag should accurately represent a failed
attempt at putting data into the buffer.
2018-11-11 10:18:37 +01:00
Willy Tarreau
1f9de21c38 MEDIUM: stream-int: make SI_FL_WANT_PUT reflect CF_DONT_READ
When CF_DONT_READ is set, till now we used to set SI_FL_WAIT_ROOM, which
is not appropriate since it would lose the subscribe status. Instead let's
clear SI_FL_WANT_PUT (just like applets do), and set the flag only when
CF_DONT_READ is cleared.

We have to do this in stream_int_update(), and in si_cs_io_cb() after
returning from si_cs_recv() since it would be a bit invasive to hack
this one for now. It must not be done in stream_int_notify() otherwise
it would re-enable blocked applets.

Last, when si_chk_rcv() is called, it immediately clears the flag before
calling ->chk_rcv() so that we are not tempted to uselessly loop on the
same call until the receive function is called. This is the same principle
as what is done with the applet scheduler.
2018-11-11 10:18:37 +01:00
Willy Tarreau
1bdb598a55 MINOR: stream-int: factor the SI_ST_EST state test into si_chk_rcv()
This test is made in each implementation of the function, better to
merge it.
2018-11-11 10:18:37 +01:00
Willy Tarreau
96aadd5c55 MEDIUM: stream-int: temporarily make si_chk_rcv() take care of SI_FL_WAIT_ROOM
This flag should already be cleared before calling the *chk_rcv() functions.
Before adapting all call places, let's first make sure si_chk_rcv() clears
it before calling them so that these functions do not have to check it again
and so that they do not adjust it. This function will only call the lower
layers if the SI_FL_WANT_PUT flag is present so that the endpoint can decide
not to be called (as done with applets).
2018-11-11 10:18:37 +01:00
Willy Tarreau
57f08bb63b MINOR: stream-int: make it clear that si_ops cannot be null
There was an ambiguity in which functions of the si_ops struct could be
null or not. only ->update doesn't exist in one of the si_ops (the
embedded one), all others are always defined. ->shutr and ->shutw were
never tested. However ->chk_rcv() and ->chk_snd() were tested, causing
confusion about the proper way to wake the other side up if undefined
(which never happens).

Let's update the comments to state these functions are mandatory and
remove the offending checks.
2018-11-11 10:18:37 +01:00
Willy Tarreau
af4f6f6d2f MINOR: stream-int: use si_cant_put() instead of setting SI_FL_WAIT_ROOM
We now do this on the si_cs_recv() path so that we always have
SI_FL_WANT_PUT properly set when there's a need to receive and
SI_FL_WAIT_ROOM upon failure.
2018-11-11 10:18:37 +01:00
Willy Tarreau
394970c297 MINOR: stream-int: add si_done_{get,put} to indicate that we won't do it anymore
This is useful on close or stream aborts as it saves us from having
to manipulate the (sometimes confusing) flags.
2018-11-11 10:18:37 +01:00
Willy Tarreau
0cd3bd628a MINOR: stream-int: rename si_applet_{want|stop|cant}_{get|put}
It doesn't make sense to limit this code to applets, as any stream
interface can use it. Let's rename it by simply dropping the "applet_"
part of the name. No other change was made except updating the comments.
2018-11-11 10:18:37 +01:00
Willy Tarreau
21028b5e7f MEDIUM: appctx: check for allocation attempts in buffer allocation callbacks
The buffer allocation callback appctx_res_wakeup() used to rely on old
tricks to detect if a buffer was already granted to an appctx, namely
by checking the task's state. Not only this test is not valid anymore,
but it's inaccurate.

Let's solely on SI_FL_WAIT_ROOM that is now set on allocation failure by
the functions trying to allocate a buffer. The buffer is now allocated on
the fly and the flag removed so that the consistency between the two
remains granted. The patch also fixes minor issues such as the function
being improperly declared inline(!) and the fact that using appctx_wakeup()
sets the wakeup reason to TASK_WOKEN_OTHER while we try to use TASK_WOKEN_RES
when waking up consecutive to a ressource allocation such as a buffer.
2018-11-11 10:18:37 +01:00
Willy Tarreau
b882dd88cc MEDIUM: stream: implement stream_buf_available()
This function replaces stream_res_available(), which is used as a callback
for the buffer allocator. It now carefully checks which stream interface
was blocked on a buffer allocation, tries to allocate the input buffer to
this stream interface, and wakes the task up once such a buffer was found.
It will automatically remove the SI_FL_WAIT_ROOM flag upon success since
the info this flag indicates becomes wrong as soon as the buffer is
allocated.

The code is still far from being perfect because if a call to si_cs_recv()
fails to allocate a buffer, we'll still end up passing via process_stream()
again, but this could be improved in the future by using finer-grained
wake-up notifications.
2018-11-11 10:18:37 +01:00
Willy Tarreau
2d372c2aa1 MINOR: stats: report the number of currently connected peers
The active peers output indicates both the number of established peers
connections and the number of peers connection attempts. The new counter
"ConnectedPeers" also indicates the number of currently connected peers.
This helps detect that some peers cannot be reached for example. It's
worth mentioning that this value changes over time because unused peers
are often disconnected and reconnected. Most of the time it should be
equal to ActivePeers.
2018-11-05 17:15:21 +01:00
Willy Tarreau
199ad24661 MINOR: stats: report the number of active peers in "show info"
Peers are the last type of activity which can maintain a job present, so
it's important to report that such an entity is still active to explain
why the job count may be higher than zero. Here by "ActivePeers" we report
peers sessions, which include both established connections and outgoing
connection attempts.
2018-11-05 17:15:21 +01:00
Willy Tarreau
00098ea034 MINOR: stats: report the number of active jobs and listeners in "show info"
When an haproxy process doesn't stop after a reload, it's because it
still has some active "jobs", which mainly are active sessions, listeners,
peers or other specific activities. Sometimes it's difficult to troubleshoot
the cause of these issues (which generally are the result of a bug) only
because some indicators are missing.

This patch add the number of listeners, the number of jobs, and the stopping
status to the output of "show info". This way it becomes a bit easier to try
to narrow down the cause of such an issue should it happen. A typical use
case is to connect to the CLI before reloading, then issuing the "show info"
command to see what happens. In the normal situation, stopping should equal
1, jobs should equal 1 (meaning only the CLI is still active) and listeners
should equal zero.

The patch is so trivial that it could make sense to backport it to 1.8 in
order to help with troubleshooting.
2018-11-05 17:15:21 +01:00
Willy Tarreau
4698adf68f MINOR: compat: automatically detect support for crypt_r()
glibc >= 2.2 and FreeBSD >= 12.0 support crypt_r(), let's detect this
and set a macro HA_HAVE_CRYPT_R for this.
2018-10-29 19:14:14 +01:00
Willy Tarreau
34d4b525a1 BUG/MEDIUM: auth/threads: use of crypt() is not thread-safe
It was reported here that authentication may fail when threads are
enabled :

    https://bugzilla.redhat.com/show_bug.cgi?id=1643941

While I couldn't reproduce the issue, it's obvious that there is a
problem with the use of the non-reentrant crypt() function there.
On Linux systems there's crypt_r() but not on the vast majority of
other ones. Thus a first approach consists in placing a lock around
this crypt() call. Another patch may relax it when crypt_r() is
available.

This fix must be backported to 1.8. Thanks to Ryan O'Hara for the
quick notification.
2018-10-29 18:06:02 +01:00
Willy Tarreau
ce487aab46 BUG/MEDIUM: tools: fix direction of my_ffsl()
Commit 27346b01a ("OPTIM: tools: optimize my_ffsl() for x86_64") optimized
my_ffsl() for intensive use cases in the scheduler, but as half of the times
I got it wrong so it counted bits the reverse way. It doesn't matter for the
scheduler nor fd cache but it broke cpu-map with threads which heavily relies
on proper ordering.

We should probably consider dropping support for gcc < 3.4 and switching
to builtins for these ones, though often they are as ambiguous.

No backport is needed.
2018-10-29 16:09:57 +01:00
Willy Tarreau
8e9f4531cb BUG/MINOR: memory: make the thread-local cache allocator set the debugging link
When building with DEBUG_MEMORY_POOLS, an element returned from the
cache would not have its pool link initialized unless it's allocated
using pool_alloc(). This is problematic for buffer allocators which
use pool_alloc_dirty(), as freeing this object will make the code
think it was allocated from another pool. This patch does two things :
  - make __pool_get_from_cache() set the link
  - remove the extra initialization from pool_alloc() since it's always
    done in either __pool_get_first() or __pool_refill_alloc()

This patch is marked MINOR since it only affects code explicitly built
for debugging. No backport is needed.
2018-10-28 20:12:31 +01:00
William Lallemand
90b1ca1ff5 MEDIUM: channel: reorder the channel analyzers for the cli
Reorder the channel analyzers so the CLI analyzers are defined before
the XFER_DATA ones.
2018-10-28 14:13:31 +01:00
William Lallemand
309dc9adec MEDIUM: mworker: stop the master proxy in the workers
The master proxy which handles the CLI should not be used or shown in
the stats of the workers. This proxy is now disabled after the fork.
2018-10-28 14:03:31 +01:00
William Lallemand
cf62f7e3cb MEDIUM: cli: implement 'mode cli' proxy analyzers
This patch implements analysers for parsing the CLI and extra features
for the master's CLI.

For each command (sent alone, or separated by ; or \n) the request
analyser will determine to which server it should send the request.

The 'mode cli' proxy is able to parse a prefix for each command which is
used to select the apropriate server. The prefix start by @ and is
followed by "master", the PID preceded by ! or the relative PID. (e.g.
@master, @1, @!1234). The servers are not round-robined anymore.

The command is sent with a SHUTW which force the server to close the
connection after sending its response. However the proxy allows a
keepalive connection on the client side and does not close.

The response analyser does not do much stuff, it only reinits the
connection when it received a close from the server, and forward the
response. It does not analyze the response data.
The only guarantee of the end of the response is the close of the
server, we can't rely on the double \n since it's not send by every
command.

This could be reimplemented later as a filter.
2018-10-28 14:03:06 +01:00
William Lallemand
291810d8f8 MEDIUM: mworker: find the server ptr using a CLI prefix
Add a struct server pointer in the mworker_proc struct so we can easily
use it as a target for the mworker proxy.

pcli_prefix_to_pid() is used to find the right PID of the worker
when using a prefix in the CLI. (@master, @#<relative pid> , @<pid>)

pcli_pid_to_server() is used to find the right target server for the
CLI proxy.
2018-10-28 13:51:39 +01:00
William Lallemand
14721be11f MEDIUM: cli: disable some keywords in the master
The master process does not need all the keywords of the cli, add 2
flags to chose which keyword to use.

It might be useful to activate some of them in a debug mode later...
2018-10-28 13:51:39 +01:00
William Lallemand
e736115d3a MEDIUM: mworker: create CLI listeners from argv[]
This patch introduces mworker_cli_proxy_new_listener() which allows the
creation of new listeners for the CLI proxy.

Using this function it is possible to create new listeners from the
program arguments with -Sa <unix_socket>. It is allowed to create
multiple listeners with several -Sa.
2018-10-28 13:51:39 +01:00
William Lallemand
8a02257d88 MEDIUM: mworker: proxy for the master CLI
This patch implements a listen proxy within the master. It uses the
sockpair of all the workers as servers.

In the current state of the code, the proxy is only doing round robin on
the CLI of the workers. A CLI mode will be needed to know to which CLI
send the requests.
2018-10-28 13:51:39 +01:00
William Lallemand
6e0db2fa99 MEDIUM: mworker: add proc_list in global.h
Add the process list in types/global.h so it could be accessed from
anywhere.
2018-10-28 13:51:39 +01:00
William Lallemand
313bfd18c1 MINOR: server: export new_server() function
The new_server() function will be useful to create a proxy for the
master-worker.
2018-10-28 13:51:38 +01:00
William Lallemand
7e1299bb3a REORG: mworker: move struct mworker_proc to global.h
Move the definition of the mworker_proc structure in types/global.h.
2018-10-28 13:51:38 +01:00
William Lallemand
ce83b4a5dd MEDIUM: mworker: each worker socketpair is a CLI listener
The init code of the mworker_proc structs has been moved before the
init of the listeners.

Each socketpair is now connected to a CLI within the workers, which
allows the master to access their CLI.

The inherited flag of the worker side socketpair is removed so the
socket can be closed in the master.
2018-10-28 13:51:38 +01:00
Willy Tarreau
85f890174a MEDIUM: stream-int: make si_update() synchronize flag changes before the I/O
With the new synchronous si_cs_send() at the end of process_stream(),
we're seeing re-appear the I/O layer specific part of the stream interface
which is supposed to deal with I/O event subscription. The only difference
is that now we subscribe to I/Os only after having attempted (and failed)
them.

This patch brings a cleanup in this by reintroducing stream_int_update_conn()
with the send code from process_stream(). However this alone would not be
enough because the flags which are cleared afterwards would result in the
loss of the possible events (write events only at the moment). So the flags
clearing and stream-int state updates are also performed inside si_update()
between the generic code and the I/O specific code. This definitely makes
sense as after this call we can simply check again for channel and SI flag
changes and decide to loop once again or not.
2018-10-28 13:47:00 +01:00
Willy Tarreau
0979916d3b MINOR: stream-int: add si_alloc_ibuf() to ease input buffer allocation
This will supersed channel_alloc_buffer() while relying on it. It will
automatically adjust SI_FL_WAIT_ROOM on the stream-int depending on
success or failure to allocate this buffer.

It's worth noting that it could make sense to also set SI_FL_WANT_PUT
each time we do this to further simplify the code at user places such
as applets, but it would possibly not be easy to clean this flag
everywhere an rx operation stops.
2018-10-28 13:47:00 +01:00
Willy Tarreau
ede3d884fc MEDIUM: channel: merge back flags CF_WRITE_PARTIAL and CF_WRITE_EVENT
The behaviour of the flag CF_WRITE_PARTIAL was modified by commit
95fad5ba4 ("BUG/MAJOR: stream-int: don't re-arm recv if send fails") due
to a situation where it could trigger an immediate wake up of the other
side, both acting in loops via the FD cache. This loss has caused the
need to introduce CF_WRITE_EVENT as commit c5a9d5bf, to replace it, but
both flags express more or less the same thing and this distinction
creates a lot of confusion and complexity in the code.

Since the FD cache now acts via tasklets, the issue worked around in the
first patch no longer exists, so it's more than time to kill this hack
and to restore CF_WRITE_PARTIAL's semantics (i.e.: there has been some
write activity since we last left process_stream).

This patch mostly reverts the two commits above. Only the part making
use of CF_WROTE_DATA instead of CF_WRITE_PARTIAL to detect the loss of
data upon connection setup was kept because it's more accurate and
better suited.
2018-10-26 08:32:57 +02:00
Ioannis Cherouvim
1ff7633dd7 CLEANUP: tools: fix misleading comment above function LIM2A
The function produces ASCII, but its comment was copied from U2H which
produces HTML.
2018-10-26 05:00:48 +02:00
Frdric Lcaille
b80bc273a3 MINOR: shctx: Change max. object size type to unsigned int.
This change is there to prevent implicit conversions when comparing
shctx maximum object sizes with other unsigned values.
2018-10-26 04:54:40 +02:00
Frdric Lcaille
b7838afe6f MINOR: shctx: Add a maximum object size parameter.
This patch adds a new parameter to shctx_init() function to be used to
limit the size of each shared object, -1 value meaning "no limit".
2018-10-24 04:39:44 +02:00
Frdric Lcaille
8df65ae5e2 MINOR: cache: Larger HTTP objects caching.
This patch makes the capable of storing HTTP objects larger than a buffer.
It makes usage of the "block by block shared object allocation" new shctx API.

A new pointer to struct shared_block has been added to the cache applet
context to memorize the next block to be used by the HTTP cache I/O handler
http_cache_io_handler() to emit the data. Another member, named "sent" memorize
the number of bytes already sent by this handler. So, to send an object from cache,
http_cache_io_handler() must be called until "sent" counter reaches the size
of this object.
2018-10-24 04:37:12 +02:00
Frdric Lcaille
0bec807e08 MINOR: shctx: Shared objects block by block allocation.
This patch makes shctx capable of storing objects in several parts,
each parts being made of several blocks. There is no more need to
walk through until reaching the end of a row to append new blocks.

A new pointer to a struct shared_block member, named last_reserved,
has been added to struct shared_block so that to memorize the last block which was
reserved by shctx_row_reserve_hot(). Same thing about "last_append" pointer which
is used to memorize the last block used by shctx_row_data_append() to store the data.
2018-10-24 04:35:53 +02:00
Willy Tarreau
68ad3a42f7 MINOR: proxy: add a new option "http-use-htx"
This option makes a proxy use only HTX-compatible muxes instead of the
HTTP-compatible ones for HTTP modes. It must be set on both ends, this
is checked at parsing time.
2018-10-23 10:22:36 +02:00
Christopher Faulet
55d6be7d83 MINOR: h1: Export some functions parsing the value of some HTTP headers
Functions parsing the value of "Connection:", "Transfer-encoding:" and
"Content-length:" headers are now exported to be used by the mux-h1.
2018-10-23 10:22:36 +02:00
Willy Tarreau
627505d36a MINOR: freq_ctr: add swrate_add_scaled() to work with large samples
Some samples representing time will cover more than one sample at once
if they are units of time per time. For this we'd need to have the
ability to loop over swrate_add() multiple times but that would be
inefficient. By developing the function elevated to power N, it's
visible that some coefficients quickly disappear and that those which
remain at the first order more or less compensate each other.

Thus a simplified version of this function was added to provide a single
value for a given number of samples. Tests with multiple values, window
sizes and sample sizes have shown that it is possible to make it remain
surprisingly accurate (typical error < 0.2% over various large window
and sample sizes, even samples representing up to 1/4 of the window).
2018-10-22 08:13:57 +02:00
Olivier Houchard
3f03ab5b15 MINOR: connection: Add a SUB_CALL_UNSUBSCRIBE event.
Add a SUB_CALL_UNSUBSCRIBE event, to let the caller know that the
unsubscribe method should be called before destroyin the object.
2018-10-21 06:00:04 +02:00
Olivier Houchard
53216e7db9 MEDIUM: connections: Don't directly mess with the polling from the upper layers.
Avoid using conn_xprt_want_send/recv, and totally nuke cs_want_send/recv,
from the upper layers. The polling is now directly handled by the connection
layer, it is activated on subscribe(), and unactivated once we got the event
and we woke the related task.
2018-10-21 05:58:40 +02:00
Olivier Houchard
1fddc9b7bb BUG/MEDIUM: connections: Remove subscription if going in idle mode.
Make sure we don't have any subscription when the connection is going in
idle mode, otherwise there's a race condition when the connection is
reused, if there are still old subscriptions, new ones won't be done.

No backport is needed.
2018-10-21 05:55:20 +02:00
Olivier Houchard
62975a7740 BUG/MEDIUM: pools: Fix the usage of mmap()) with DEBUG_UAF.
When mapping memory with mmap(), we should use a fd of -1, not 0. 0 may
work on linux, but it doesn't work on FreeBSD, and probably other OSes.

It would be nice to backport this to 1.8 to help debugging there.
2018-10-21 05:43:33 +02:00
Willy Tarreau
4e7cc3381b BUILD: compiler: rename __unreachable() to my_unreachable()
Olivier reported that on FreeBSD __unreachable is already defined
and causes build warnings. Let's rename it then.
2018-10-20 17:45:48 +02:00
Willy Tarreau
7a6ad88b02 BUILD: memory: fix free_list pointer declaration again for atomic CAS
Commit ac6c880 ("BUILD: memory: fix pointer declaration for atomic CAS")
attemtped to fix a build warning affecting the lock-free version of the
pool allocator. But the fix tried to hide the cause instead of addressing
it, thus clang still complains about (void **) not matching (void ***).

The real solution is to declare free_list (void **) and not to use a cast.
Now this builds fine with gcc/clang with and without threads.

No backport is needed.
2018-10-20 17:37:38 +02:00
Willy Tarreau
ed72d82827 MEDIUM: time: measure the time stolen by other threads
The purpose is to detect if threads or processes are competing for the
same CPU. This can happen when threads are incorrectly bound, or after a
reload if the previous process still has an important activity. With
threads this situation is problematic because a preempted thread holding
a lock will block other ones waiting for this lock to be released.

A first attempt consisted in measuring the cumulated lost time more
precisely but the system's scheduler is smart enough to try to limit the
thread preemption rate by mostly context switching during poll()'s blank
periods, so most of the time lost is not seen. In essence this is good
because it means a thread is not preempted with a lock held, and even
regarding the rendez-vous point it cannot prevent the other ones from
making progress. But still it happens tens to hundreds of times per
second that a thread might be preempted, so it's still possible to detect
that the situation is happening, thus it's interesting to measure and
report its frequency.

Each time we enter the poller, we check the CPU time spent working and
see if we've lost time doing something else. To limit false positives,
we're only interested in losses of 500 microseconds or more (i.e. half
a clock tick on a 1 kHz system). If so, it indicates that some time was
stolen by another thread or process. Note that we purposely store some
sub-millisecond counters so that under heavy traffic with a 1 kHz clock,
it's still possible to measure something without being subject to the
risk of rounding errors (i.e. if exactly 1 ms is stolen it's possible
that the time difference could often be slightly lower).

This counter of lost CPU time slots time is reported in "show activity"
in numbers of milliseconds of CPU lost per second, per 15s, and total
over the process' life. By definition, the per-second counter cannot
report values larger than 1000 per thread per second and the 15s one
will be limited to 15000/s in the worst case, but it's possible that
peak values exceed such thresholds after long pauses.
2018-10-19 08:51:59 +02:00
Willy Tarreau
5ceeb15002 MINOR: time: add now_mono_time() and now_cpu_time()
These two functions retrieve respectively the monotonic clock time and
the per-thread CPU time when available on the platform, or return zero.
These syscalls may require to link with -lrt on certain libc, which is
enabled in the Makefile with USE_RT=1 (default on Linux systems).
2018-10-18 16:39:48 +02:00
Willy Tarreau
ac6c8805be BUILD: memory: fix pointer declaration for atomic CAS
The calls to HA_ATOMIC_CAS() on the lockfree version of the pool allocator
were mistakenly done on (void*) for the old value instead of (void **).
While this has no impact on "recent" gcc, it does have one for gcc < 4.7
since the CAS was open coded and it's not possible to assign a temporary
variable of type "void".

No backport is needed, this only affects 1.9.
2018-10-18 16:12:28 +02:00
Willy Tarreau
7e9c4ae4de MINOR: poller: move time and date computation out of the pollers
By placing this code into time.h (tv_entering_poll() and tv_leaving_poll())
we can remove the logic from the pollers and prepare for extending this to
offer more accurate time measurements.
2018-10-17 19:59:43 +02:00
Willy Tarreau
f37ba94768 MINOR: fd: centralize poll timeout computation in compute_poll_timeout()
The 4 pollers all contain the same code used to compute the poll timeout.
This is pointless, let's centralize this into fd.h. This also gets rid of
the useless SCHEDULER_RESOLUTION macro which used to work arond a very old
linux 2.2 bug causing select() to wake up slightly before the timeout.
2018-10-17 19:59:43 +02:00
Willy Tarreau
e18db9e984 MEDIUM: pools: implement a thread-local cache for pool entries
Each thread now keeps the last ~512 kB of freed objects into a local
cache. There are some heuristics involved so that a specific pool cannot
use more than 1/8 of the total cache in number of objects. Tests have
shown that 512 kB is an optimal size on a 24-thread test running on a
dual-socket machine, resulting in an overall 7.5% performance increase
and a cache miss ratio reducing from 19.2 to 17.7%. Anyway it seems
pointless to keep more than an L2 cache, which probably explains why
sizes between 256 and 512 kB are optimal.

Cached objects appear in two lists, one per pool and one LRU to help
with fair eviction. Currently there is no way to check each thread's
cache state nor to flush it. This cache cannot be disabled and is
enabled as soon as the lockless pools are enabled (i.e.: threads are
enabled, no pool debugging is in use and the CPU supports a double word
CAS).
2018-10-16 13:46:08 +02:00
Willy Tarreau
146794dc4f MINOR: pools: split pool_free() in the lockfree variant
This separates the validity tests from the code committing the object
to the pool, in order to ease insertion of the thread-local cache.
2018-10-16 10:29:28 +02:00
Willy Tarreau
0a93b6413f MINOR: pools: allocate most memory pools from an array
For caching it will be convenient to have indexes associated with pools,
without having to dereference the pool itself. One solution could consist
in replacing all pool pointers with integers but this would limit the
number of allocatable pools. Instead here we allocate the 32 first pools
from a pre-allocated array whose base address is known so that it's trivial
to convert a pool to an index in this array. Pools that cannot fit there
will be allocated normally.
2018-10-16 10:29:26 +02:00
Bertrand Jacquin
d5e4de8e5f DOC: Fix a few typos
these are mostly spelling mistakes, some of them might be candidate for
backporting as well.
2018-10-15 19:38:15 +02:00
Willy Tarreau
8d8747abe0 OPTIM: tasks: group all tree roots per cache line
Currently we have per-thread arrays of trees and counts, but these
ones unfortunately share cache lines and are accessed very often. This
patch moves the task-specific stuff into a structure taking a multiple
of a cache line, and has one such per thread. Just doing this has
reduced the cache miss ratio from 19.2% to 18.7% and increased the
12-thread test performance by 3%.

It starts to become visible that we really need a process-wide per-thread
storage area that would cover more than just these parts of the tasks.
The code was arranged so that it's easy to move the pieces elsewhere if
needed.
2018-10-15 19:06:13 +02:00