Commit Graph

1670 Commits

Author SHA1 Message Date
Willy Tarreau
bd20a9dd4e BUG: tasks: fix bug introduced by latest scheduler cleanup
In commit 86eded6c6 ("CLEANUP: tasks: rename task_remove_from_tasklet_list()
to tasklet_remove_*") which consisted in removing the casts between tasks
and tasklet, I was a bit too fast to believe that we only saw tasklets in
this function since process_runnable_tasks() also uses it with tasks under
a cast. So removing the bookkeeping on task_list_size was not appropriate.
Bah, the joy of casts which hide the real thing...

This patch does two things at once to address this mess once for all:
  - it restores the decrement of task_list_size when it's a real task,
    but moves it to process_runnable_task() since it's the only place
    where it's allowed to call it with a task

  - it moves the increment there as well and renames
    task_insert_into_tasklet_list() to tasklet_insert_into_tasklet_list()
    of obvious consistency reasons.

This way the increment/decrement of task_list_size is made at the only
places where the cast is enforced, so it has less risks to be missed.
The comments on top of these functions were updated to reflect that they
are only supposed to be used with tasklets and that the caller is responsible
for keeping task_list_size up to date if it decides to enforce a task there.

Now we don't have to worry anymore about how these functions work outside
of the scheduler, which is better longterm-wise. Thanks to Christopher for
spotting this mistake.

No backport is needed.
2019-06-14 18:16:19 +02:00
Olivier Houchard
fe4abe62c7 BUG/MEDIUM: connections: Don't call shutdown() if we want to disable linger.
In conn_sock_shutw(), avoid calling shutdown() if linger_risk is set. Not
doing so will result in getting sockets in TIME_WAIT for some time.
This is particularly observable with health checks.

This should be backported to 1.9.
2019-06-14 15:33:41 +02:00
Willy Tarreau
86eded6c69 CLEANUP: tasks: rename task_remove_from_tasklet_list() to tasklet_remove_*
The function really only operates on tasklets, its arguments are always
tasklets cast as tasks to match the function's type, to be cast back to
a struct tasklet. Let's rename it to tasklet_remove_from_tasklet_list(),
take a struct tasklet, and get rid of the undesired task casts.
2019-06-14 14:57:03 +02:00
Willy Tarreau
3c39a7d889 CLEANUP: connection: rename the wait_event.task field to .tasklet
It's really confusing to call it a task because it's a tasklet and used
in places where tasks and tasklets are used together. Let's rename it
to tasklet to remove this confusion.
2019-06-14 14:42:29 +02:00
Christopher Faulet
36a7702b03 CLEANUP: channel: Remove channel_htx_fwd_payload() and channel_htx_fwd_all()
These functions are unused now. No backport needed.
2019-06-14 11:13:32 +02:00
Christopher Faulet
421e769783 BUG/MEDIUM: htx: Don't change position of the first block during HTX analysis
In the HTX structure, the field <first> is used to know where to (re)start the
analysis. It may differ from the message's head. It is especially important to
update it to handle 1xx messages, to be sure to restart the analysis on the next
message (another 1xx message or the final one). It is also updated when some
data are forwarded (the headers or part of the body). But this update is an
error and must never be done at the analysis level. It is a bug, because some
sample fetches may be used after the data forwarding (but before the first send
of course). At this stage, if the first block position does not point on the
start-line, most of HTTP sample fetches fail.

So now, when something is forwarding by HTX analyzers, the first block position
is not update anymore.

This issue was reported on Github. See #119. No backport needed.
2019-06-14 11:13:32 +02:00
Christopher Faulet
87ebe944d6 BUG/MINOR: channel/htx: Call channel_htx_full() from channel_full()
When channel_full() is called for an HTX stream, we fall back on the HTX
version. This function is called, among other, from tcp_inspect_request(). With
this patch, the inspect delay is respected again.

This patch must be backported to 1.9.
2019-06-14 11:13:32 +02:00
Willy Tarreau
3cec0f94f3 BUG/MINOR: task: prevent schedulable tasks from starving under high I/O activity
With both I/O and tasks in the same tasklet list, we now have a very
smooth and responsive scheduler, providing a good fairness between I/O
activities. With the lower layers relying on tasklet a lot (I/O wakeup,
subscribe, etc), there may often be a large number of totally autonomous
tasklets doing their business such as forwarding data between two muxes.

But the task scheduler historically refrained from picking tasks from the
priority-ordered run queue to put them into the tasklet list until this
later had less than max_runqueue_depth entries. This was to make sure that
low-latency, high-priority tasks would have an opportunity to be dequeued
before others even if they arrive late. But the counter used for this is
still the tasklet list size, which contains countless I/O events. This
causes an unfairness between unbounded I/Os and bounded tasks, resulting
for example in the CLI responding slower when forwarding 40 Gbps of HTTP
traffic spread over a thousand of connections.

A good solution consists in sticking to the initial intent of
max_runqueue_depth which is to limit the number of tasks in the list
(to maintain fairness between them) and not to limit the number of these
tasks among tasklets. It just turns out that the task_list_size initially
was this task counter and changed over time to be a tasklet list size.
Let's simply refrain from updating it for pure tasklets so that it takes
back its original role of counting real tasks as its name implies. With
this change the CLI becomes instantly responsive under load again.

This patch may possibly be backported to 1.9 though it requires some
careful checks.
2019-06-14 09:16:51 +02:00
Olivier Houchard
a0fdce3950 MINOR: fd: Don't use atomic operations when it's not needed.
In updt_fd_polling(), when updating fd_nbupdt, there's no need to use an
atomic operation, as it's a TLS variable.
2019-06-12 14:36:24 +02:00
Willy Tarreau
ad660e3f84 BUILD: stream-int: avoid a build warning in dev mode in si_state_bit()
The BUG_ON() test emits a warning about an always-true comparison regarding
<state> which cannot be lower than zero. Let's get rid of it.
2019-06-06 16:42:08 +02:00
Willy Tarreau
3b285d7fbd MINOR: stream-int: make si_sync_send() from the send code of si_update_both()
Just like we have a synchronous recv() function for the stream interface,
let's have a synchronous send function that we'll be able to call from
different places. For now this only moves the code, nothing more.
2019-06-06 16:36:19 +02:00
Willy Tarreau
236c4298b3 MINOR: stream-int: split si_update() into si_update_rx() and si_update_tx()
We should not update the two directions at once, in fact we should update
the Rx path after recv() and the Tx path after send(). Let's start by
splitting the update function in two for this.
2019-06-06 16:36:19 +02:00
Willy Tarreau
8c603ded39 MEDIUM: stream-int: make idle-conns switch to ST_RDY
The purpose of making idle-conns switch to SI_ST_CON was to make the
transition detectable and the operation retryable in case of connection
error. Now we have the RDY state for this which is much more suitable
since it indicates a validated connection on which we didn't necessarily
send anything yet. This will still lead to a transition to EST while not
requiring unnatural write polling nor connect timeouts.
2019-06-06 16:36:19 +02:00
Willy Tarreau
4f283fa604 MEDIUM: stream-int: introduce a new state SI_ST_RDY
The main reason for all the trouble we're facing with stream interface
error or timeout reports during the connection phase is that we currently
can't make the difference between a connection attempt and a validated
connection attempt. It is problematic because we tend to switch early
to SI_ST_EST but can't always do what we want in this state since it's
supposed to be set when we don't need to visit sess_establish() again.

This patch introduces a new state betwen SI_ST_CON and SI_ST_EST, which
is SI_ST_RDY. It indicates that we've verified that the connection is
ready. It's a transient state, like SI_ST_DIS, that cannot persist when
leaving process_stream(). For now it is not set, only verified in various
tests where SI_ST_CON was used or SI_ST_EST depending on the cases.

The stream-int state diagram was minimally updated to reflect the new
state, though it is largely obsolete and would need to be seriously
updated.
2019-06-06 16:36:19 +02:00
Willy Tarreau
7ab22adbf7 MEDIUM: stream-int: remove dangerous interval checks for stream-int states
The stream interface state checks involving ranges were replaced with
checks on a set of states, already revealing some issues. No issue was
fixed, all was replaced in a one-to-one mapping for easier control. Some
checks involving a strict difference were also replaced with fields to
be clearer. At this stage, the result must be strictly equivalent. A few
tests were also turned to their bit-field equivalent for better readability
or in preparation for upcoming changes.

The test performed in the SPOE filter was swapped so that the closed and
error states are evicted first and that the established vs conn state is
tested second.
2019-06-06 16:36:19 +02:00
Willy Tarreau
bedcd698b3 MINOR: stream-int: use bit fields to match multiple stream-int states at once
At some places we do check for ranges of stream-int states but those
are confusing as states ordering is not well known (e.g. it's not obvious
that CER is between CON and EST). Let's create a bit field from states so
that we can match multiple states at once instead. The new enum si_state_bit
contains SI_SB_* which are state bits instead of state values. The function
si_state_in() indicates if the state in argument is one of those represented
by the bit mask in second argument.
2019-06-06 16:36:19 +02:00
Olivier Houchard
03abf2d31e MEDIUM: connections: Remove CONN_FL_SOCK*
Now that the various handshakes come with their own XPRT, there's no
need for the CONN_FL_SOCK* flags, and the conn_sock_want|stop functions,
so garbage-collect them.
2019-06-05 18:03:38 +02:00
Olivier Houchard
fe50bfb82c MEDIUM: connections: Introduce a handshake pseudo-XPRT.
Add a new XPRT that is used when using non-SSL handshakes, such as proxy
protocol or Netscaler, instead of taking care of it in conn_fd_handler().
This XPRT is installed when any of those is used, and it removes itself once
the handshake is done.
This should allow us to remove the distinction between CO_FL_SOCK* and
CO_FL_XPRT*.
2019-06-05 18:03:38 +02:00
Olivier Houchard
000694cf96 MINOR: ssl: Make ssl_sock_handshake() static.
ssl_sock_handshake is now only used by the ssl code itself, there's no need
to export it anymore, so make it static.
2019-06-05 18:03:38 +02:00
Christopher Faulet
a4f9dd4a56 BUG/MINOR: channel/htx: Don't alter channel during forward for empty HTX message
In channel_htx_forward() and channel_htx_forward_forever(), if the HTX message
is empty, the underlying buffer may be really empty too. And we have no warranty
the caller will call htx_to_buf() later. And in practice, it is almost never
done. So the channel's buffer must not be altered. Otherwise, the buffer may be
considered as full (data == size) for an empty HTX message and no outgoing data.

This patch must be backported to 1.9.
2019-06-05 10:12:11 +02:00
Frdric Lcaille
8d78fa7def MINOR: peers: Make peers protocol support new "server_name" data type.
Make usage of the APIs implemented for dictionaries (dict.c) and their LRU caches (struct dcache)
so that to send/receive server names used for the server by name stickiness. These
names are sent over the network as follows:

 - in every case we send the encode length of the data (STD_T_DICT), then
 - if the server names is not present in the cache used upon transmission (struct dcache_tx)
   we cache it and we the ID of this TX cache entry followed the encode length of the
   server name, and finally the sever name itseft (non NULL terminated string).
 - if the server name is present, we repead these operations but we only send the TX cache
   entry ID.

Upon receipt, the couple of (cache IDs, server name) are stored the LRU cache used
only upon receipt (struct dcache_rx). As the peers protocol is symetrical, the fact
that the server name is present in the received data (resp. or not) denotes if
the entry is absent (resp. or not).
2019-06-05 08:42:33 +02:00
Frdric Lcaille
5ad57ea85f MINOR: stick-table: Add "server_name" new data type.
This simple patch only adds definitions to create a new stick-table
data type ID and a new standard type to store information in relation
wich dictionary entries (STD_T_DICT).
2019-06-05 08:33:35 +02:00
Frdric Lcaille
4a3fef834c MINOR: dict: Add dictionary new data structure.
This patch adds minimalistic definitions to implement dictionary new data structure
which is an ebtree of ebpt_node structs with strings as keys. Note that this has nothing
to see with real dictionary data structure (maps of keys in association with values).
2019-06-05 08:33:35 +02:00
Willy Tarreau
7bb39d7cd6 CLEANUP: connection: remove the now unused CS_FL_REOS flag
Let's remove it before it gets uesd again. It was mostly replaced with
CS_FL_EOI and by mux-specific states or flags.
2019-06-03 14:23:33 +02:00
Alexander Liu
2a54bb74cd MEDIUM: connection: Upstream SOCKS4 proxy support
Have "socks4" and "check-via-socks4" server keyword added.
Implement handshake with SOCKS4 proxy server for tcp stream connection.
See issue #82.

I have the "SOCKS: A protocol for TCP proxy across firewalls" doc found
at "https://www.openssh.com/txt/socks4.protocol". Please reference to it.

[wt: for now connecting to the SOCKS4 proxy over unix sockets is not
 supported, and mixing IPv4/IPv6 is discouraged; indeed, the control
 layer is unique for a connection and will be used both for connecting
 and for target address manipulation. As such it may for example report
 incorrect destination addresses in logs if the proxy is reached over
 IPv6]
2019-05-31 17:24:06 +02:00
Olivier Houchard
cfbb3e6560 MEDIUM: tasks: Get rid of active_tasks_mask.
Remove the active_tasks_mask variable, we can deduce if we've work to do
by other means, and it is costly to maintain. Instead, introduce a new
function, thread_has_tasks(), that returns non-zero if there's tasks
scheduled for the thread, zero otherwise.
2019-05-29 21:53:37 +02:00
Willy Tarreau
ef28dc11e3 MINOR: task: turn the WQ lock to an RW_LOCK
For now it's exclusively used as a write lock though, thus it remains
100% equivalent to the spinlock it replaces.
2019-05-28 19:15:44 +02:00
Christopher Faulet
dab5ab551d MINOR: channel/htx: Add functions to forward a part or all HTX payload
The functions channel_htx_fwd_payload() and channel_htx_fwd_all() should now be
used to forward, respectively, a part of the HTX payload or all of it. These
functions forward data and update the first block position.
2019-05-28 07:42:33 +02:00
Christopher Faulet
29f1758285 MEDIUM: htx: Store the first block position instead of the start-line one
We don't store the start-line position anymore in the HTX message. Instead we
store the first block position to analyze. For now, it is almost the same. But
once all changes will be made on this part, this position will have to be used
by HTX analyzers, and only in the analysis context, to know where the analyse
should start.

When new blocks are added in an HTX message, if the first block position is not
defined, it is set. When the block pointed by it is removed, it is set to the
block following it. -1 remains the value to unset the position. the first block
position is unset when the HTX message is empty. It may also be unset on a
non-empty message, meaning every blocks were already analyzed.

From HTX analyzers point of view, this position is always set during headers
analysis. When they are waiting for a request or a response, if it is unset, it
means the analysis should wait. But once the analysis is started, and as long as
headers are not forwarded, it points to the message start-line.

As mentionned, outside the HTX analysis, no code must rely on the first block
position. So multiplexers and applets must always use the head position to start
a loop on an HTX message.
2019-05-28 07:42:33 +02:00
Christopher Faulet
b2f4e83a28 MINOR: channel/htx: Add function to forward headers of an HTX message
The function channel_htx_fwd_headers() should now be used by HTX analyzers to
forward all headers of an HTX message, from the start-line to the corresponding
EOH. It takes care to update the star-line position.
2019-05-28 07:42:33 +02:00
Christopher Faulet
aad458587d MINOR: channel/htx: Call channel_htx_recv_max() from channel_recv_max()
When channel_recv_max() is called for an HTX stream, we fall back on the HTX
version. This function is called from si_cs_recv(). This will let us pass the
max amount of bytes to read to HTX multiplexers.
2019-05-28 07:42:12 +02:00
Christopher Faulet
297fbb45fe MINOR: htx: Replace the function http_find_stline() by http_get_stline()
Now, we only return the start-line. If not found, NULL is returned. No lookup is
performed and the HTX message is no more updated. It is now the caller
responsibility to update the position of the start-line to the right value. So
when it is not found, i.e sl_pos is set to -1, it means the last start-line has
been already processed and the next one has not been inserted yet.

It is mandatory to rely on this kind of warranty to store 1xx informational
responses and final reponse in the same HTX message.
2019-05-28 07:42:12 +02:00
Christopher Faulet
c8b246f108 MINOR: htx: Move the macro IS_HTX_STRM() in proto/stream.h
The macro IS_HTX_STRM() only relies on stream flags. So move it in
proto/stream.h.
2019-05-28 07:42:12 +02:00
Christopher Faulet
429b91d308 MINOR: htx: Remove the macro IS_HTX_SMP() and always use IS_HTX_STRM() instead
The macro IS_HTX_SMP() is only used at a place, in a context where the stream
always exists. So, we can remove it to use IS_HTX_STRM() instead.
2019-05-28 07:42:12 +02:00
Willy Tarreau
0d6c75d749 OPTIM: freq-ctr: don't take the date lock for most updates
It's amazing that the value was still incremented under the date lock,
let's first use an atomic increment for the counter and move it out of
the date lock to reduce contention. These are just counters, we don't
need to take locks if we're not rotating, atomic ops are enough. This
patch does this, and leaves the lock for when the period is over. It's
important to note that some values might be added just before or just
after a rotation but this is not a problem since we don't care if a
value is counted in the previous or next period when it's exactly on
the edge. Great care was taken to ensure that the current counter is
always atomically updated.

Other minor cleanups were performed, such as avoiding to reload the
value from memory after a CAS, or using &~1 instead of two shifts to
remove the lowest bit.
2019-05-25 20:31:53 +02:00
Willy Tarreau
ca2a3cc8d5 MINOR: connection: report the mux names in "haproxy -vv"
Since the mux names appear at a few places (dumps etc), let's list
them in front of supported mux protocols in "haproxy -vv".
2019-05-22 11:50:48 +02:00
Willy Tarreau
5484d58a17 MINOR: stream: introduce a stream_dump() function and use it in stream_dump_and_crash()
This function dumps a lot of information about a stream into the provided
buffer. It is now used by stream_dump_and_crash() and will be used by the
debugger as well.
2019-05-22 11:50:48 +02:00
Willy Tarreau
6ea63c301d CLEANUP: objtype: make obj_type() and obj_type_name() take consts
There is no reason for them to require a writable area.
2019-05-22 11:50:48 +02:00
Tim Duesterhus
9b7a976cd6 BUG/MINOR: mworker: Fix memory leak of mworker_proc members
The struct mworker_proc is not uniformly freed everywhere, sometimes leading
to leaks of the `id` string (and possibly the other strings).

Introduce a mworker_free_child function instead of duplicating the freeing
logic everywhere to prevent this kind of issues.

This leak was reported in issue #96.

It looks like the leaks have been introduced in commit 9a1ee7ac31,
which is specific to 2.0-dev. Backporting `mworker_free_child` might be
helpful to ease backporting other fixes, though.
2019-05-22 11:29:18 +02:00
Willy Tarreau
81036f2738 MINOR: time: move the cpu, mono, and idle time to thread_info
These ones are useful across all threads and would be better placed
in struct thread_info than thread-local. There are very few users.
2019-05-20 21:14:14 +02:00
Willy Tarreau
619a95f5ad MEDIUM: init/mworker: make the pipe register function a regular initcall
Now that we have the guarantee that init calls happen before any other
thread starts, we don't need anymore the workaround installed by commit
1605c7ae6 ("BUG/MEDIUM: threads/mworker: fix a race on startup") and we
can instead rely on a regular per-thread initcall for this function. It
will only be performed on worker thread #0, the other ones and the master
have nothing to do, just like in the original code that was only moved
to the function.
2019-05-20 11:26:12 +02:00
Willy Tarreau
29bf96d73d MINOR: task: always reset curr_task when freeing a task or tasklet
With the thread debugger it becomes visible that we can leave some
wandering pointers for a while in curr_task, which is inappropriate.
This patch addresses this by resetting curr_task to NULL before really
freeing the area. This way it becomes safe even regarding signals.
2019-05-17 17:16:20 +02:00
Willy Tarreau
aa1e1be88f MINOR: task: export global_task_mask
It will be used in debugging functions and must be exported.
2019-05-16 18:02:03 +02:00
Olivier Houchard
478281f55d BUG/MEDIUM: connections: Don't forget to set xprt_ctx to NULL on close.
In conn_xprt_close(), after calling xprt->close(), don't forget to set
conn->xprt_ctx to NULL, or we may attempt to reuse the now-free'd
conn->xprt_ctx if the connection failed and we're retrying it.
2019-05-13 19:11:38 +02:00
Willy Tarreau
c125cef6da CLEANUP: ssl: make inclusion of openssl headers safe
It's always a pain to have to stuff lots of #ifdef USE_OPENSSL around
ssl headers, it even results in some of them appearing in a random order
and multiple times just to benefit form an existing ifdef block. Let's
make these headers safe for inclusion when USE_OPENSSL is not defined,
they now perform the test themselves and do nothing if USE_OPENSSL is
not defined. This allows to remove no less than 8 such ifdef blocks
and make include blocks more readable.
2019-05-10 09:58:43 +02:00
Willy Tarreau
8d164dc568 CLEANUP: ssl: never include openssl/*.h outside of openssl-compat.h anymore
Since we're providing a compatibility layer for multiple OpenSSL
implementations and their derivatives, it is important that no C file
directly includes openssl headers but only passes via openssl-compat
instead. As a bonus this also gets rid of redundant complex rules for
inclusion of certain files (engines etc).
2019-05-10 09:36:42 +02:00
Willy Tarreau
5599456ee2 REORG: ssl: move openssl-compat from proto to common
This way we can include it much earlier to cover types/ as well.
2019-05-10 09:19:50 +02:00
Willy Tarreau
1d158ab12d BUILD: ssl: make libressl use its own version numbers
LibreSSL causes lots of build issues by pretending to be OpenSSL 2.0.0,
and it requires lots of care for each #if added to cover any specific
OpenSSL features.

This commit addresses the problem by making LibreSSL only advertise the
version it forked from (1.0.1g) and by starting to use tests based on
its real version to enable features instead of working by exclusion.
2019-05-09 14:25:47 +02:00
Willy Tarreau
9a1ab08160 CLEANUP: ssl-sock: use HA_OPENSSL_VERSION_NUMBER instead of OPENSSL_VERSION_NUMBER
Most tests on OPENSSL_VERSION_NUMBER have become complex and break all
the time because this number is fake for some derivatives like LibreSSL.
This patch creates a new macro, HA_OPENSSL_VERSION_NUMBER, which will
carry the real openssl version defining the compatibility level, and
this version will be adjusted depending on the variants.
2019-05-09 14:25:43 +02:00
Christopher Faulet
3b1d004d41 BUG/MEDIUM: spoe: Be sure the sample is found before setting its context
When a sample fetch is encoded, we use its context to set info about the
fragmentation. But if the sample is not found, the function sample_process()
returns NULL. So we me be sure the sample exists before setting its context.

This patch must be backported to 1.9 and 1.8.
2019-05-07 22:16:41 +02:00