Commit Graph

3569 Commits

Author SHA1 Message Date
Willy Tarreau
9a1f57351d MEDIUM: threads: add thread_sync_release() to synchronize steps
This function provides an alternate way to leave a critical section run
under thread_isolate(). Currently, a thread may remain in thread_release()
without having the time to notice that the rdv mask was released and taken
again by another thread entering thread_isolate() (often the same that just
released it). This is because threads wait in harmless mode in the loop,
which is compatible with the conditions to enter thread_isolate(). It's
not possible to make them wait with the harmless bit off or we cannot know
when the job is finished for the next thread to start in thread_isolate(),
and if we don't clear the rdv bit when going there, we create another
race on the start point of thread_isolate().

This new synchronous variant of thread_release() makes use of an extra
mask to indicate the threads that want to be synchronously released. In
this case, they will be marked harmless before releasing their sync bit,
and will wait for others to release their bit as well, guaranteeing that
thread_isolate() cannot be started by any of them before they all left
thread_sync_release(). This allows to construct synchronized blocks like
this :

     thread_isolate()
     /* optionally do something alone here */
     thread_sync_release()
     /* do something together here */
     thread_isolate()
     /* optionally do something alone here */
     thread_sync_release()

And so on. This is particularly useful during initialization where several
steps have to be respected and no thread must start a step before the
previous one is completed by other threads.

This one must not be placed after any call to thread_release() or it would
risk to block an earlier call to thread_isolate() which the current thread
managed to leave without waiting for others to complete, and end up here
with the thread's harmless bit cleared, blocking others. This might be
improved in the future.
2019-06-10 09:42:43 +02:00
Willy Tarreau
9faebe34cd MEDIUM: tools: improve time format error detection
As reported in GH issue #109 and in discourse issue
https://discourse.haproxy.org/t/haproxy-returns-408-or-504-error-when-timeout-client-value-is-every-25d
the time parser doesn't error on overflows nor underflows. This is a
recurring problem which additionally has the bad taste of taking a long
time before hitting the user.

This patch makes parse_time_err() return special error codes for overflows
and underflows, and adds the control in the call places to report suitable
errors depending on the requested unit. In practice, underflows are almost
never returned as the parsing function takes care of rounding values up,
so this might possibly happen on 64-bit overflows returning exactly zero
after rounding though. It is not really possible to cut the patch into
pieces as it changes the function's API, hence all callers.

Tests were run on about every relevant part (cookie maxlife/maxidle,
server inter, stats timeout, timeout*, cli's set timeout command,
tcp-request/response inspect-delay).
2019-06-07 19:32:02 +02:00
Frédéric Lécaille
b65717fa55 MINOR: peers: Optimization for dictionary cache lookup.
When we look up an dictionary entry in the cache used upon transmission
we store the last result in ->prev_lookup of struct dcache_tx so that
to compare it with the subsequent entries to look up and save performances.
2019-06-07 15:47:54 +02:00
Frédéric Lécaille
99de1d0479 MINOR: dict: Store the length of the dictionary entries.
When allocating new dictionary entries we store the length of the strings.
May be useful so that not to have to call strlen() too much often at runing
time.
2019-06-07 15:47:54 +02:00
Frédéric Lécaille
6c39198b57 MINOR peers: data structure simplifications for server names dictionary cache.
We store pointers to server names dictionary entries in a pre-allocated array of
ebpt_node's (->entries member of struct dcache_tx) to cache those sent to remote
peers. Consequently the ID used to identify the server name dictionary entry is
also used as index for this array. There is no need to implement a lookup by key
for this dictionary cache.
2019-06-07 15:47:54 +02:00
Willy Tarreau
1bfd6020ce MINOR: logs: use the new bitmap functions instead of fd_sets for encoding maps
The fd_sets we've been using in the log encoding functions are not portable
and were shown to break at least under Cygwin. This patch gets rid of them
in favor of the new bitmap functions. It was verified with the config below
that the log output was exactly the same before and after the change :

    defaults
        mode http
        option httplog
        log stdout local0
        timeout client 1s
        timeout server 1s
        timeout connect 1s

    frontend foo
        bind :8001
        capture request header chars len 255

    backend bar
        option httpchk "GET" "/" "HTTP/1.0\r\nchars: \x01\x02\x03\x04\x05\x06\x07\x08\x09\x0b\x0c\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff"
        server foo 127.0.0.1:8001 check
2019-06-07 11:13:24 +02:00
Willy Tarreau
7355b040d1 MINOR: tools: add new bitmap manipulation functions
We now have ha_bit_{set,clr,flip,test} to manipulate bitfields made
of arrays of longs. The goal is to get rid of the remaining non-portable
FD_{SET,CLR,ISSET} that still exist at a few places.
2019-06-07 10:44:49 +02:00
Willy Tarreau
ad660e3f84 BUILD: stream-int: avoid a build warning in dev mode in si_state_bit()
The BUG_ON() test emits a warning about an always-true comparison regarding
<state> which cannot be lower than zero. Let's get rid of it.
2019-06-06 16:42:08 +02:00
Willy Tarreau
3b285d7fbd MINOR: stream-int: make si_sync_send() from the send code of si_update_both()
Just like we have a synchronous recv() function for the stream interface,
let's have a synchronous send function that we'll be able to call from
different places. For now this only moves the code, nothing more.
2019-06-06 16:36:19 +02:00
Willy Tarreau
236c4298b3 MINOR: stream-int: split si_update() into si_update_rx() and si_update_tx()
We should not update the two directions at once, in fact we should update
the Rx path after recv() and the Tx path after send(). Let's start by
splitting the update function in two for this.
2019-06-06 16:36:19 +02:00
Willy Tarreau
8c603ded39 MEDIUM: stream-int: make idle-conns switch to ST_RDY
The purpose of making idle-conns switch to SI_ST_CON was to make the
transition detectable and the operation retryable in case of connection
error. Now we have the RDY state for this which is much more suitable
since it indicates a validated connection on which we didn't necessarily
send anything yet. This will still lead to a transition to EST while not
requiring unnatural write polling nor connect timeouts.
2019-06-06 16:36:19 +02:00
Willy Tarreau
4f283fa604 MEDIUM: stream-int: introduce a new state SI_ST_RDY
The main reason for all the trouble we're facing with stream interface
error or timeout reports during the connection phase is that we currently
can't make the difference between a connection attempt and a validated
connection attempt. It is problematic because we tend to switch early
to SI_ST_EST but can't always do what we want in this state since it's
supposed to be set when we don't need to visit sess_establish() again.

This patch introduces a new state betwen SI_ST_CON and SI_ST_EST, which
is SI_ST_RDY. It indicates that we've verified that the connection is
ready. It's a transient state, like SI_ST_DIS, that cannot persist when
leaving process_stream(). For now it is not set, only verified in various
tests where SI_ST_CON was used or SI_ST_EST depending on the cases.

The stream-int state diagram was minimally updated to reflect the new
state, though it is largely obsolete and would need to be seriously
updated.
2019-06-06 16:36:19 +02:00
Willy Tarreau
7ab22adbf7 MEDIUM: stream-int: remove dangerous interval checks for stream-int states
The stream interface state checks involving ranges were replaced with
checks on a set of states, already revealing some issues. No issue was
fixed, all was replaced in a one-to-one mapping for easier control. Some
checks involving a strict difference were also replaced with fields to
be clearer. At this stage, the result must be strictly equivalent. A few
tests were also turned to their bit-field equivalent for better readability
or in preparation for upcoming changes.

The test performed in the SPOE filter was swapped so that the closed and
error states are evicted first and that the established vs conn state is
tested second.
2019-06-06 16:36:19 +02:00
Willy Tarreau
bedcd698b3 MINOR: stream-int: use bit fields to match multiple stream-int states at once
At some places we do check for ranges of stream-int states but those
are confusing as states ordering is not well known (e.g. it's not obvious
that CER is between CON and EST). Let's create a bit field from states so
that we can match multiple states at once instead. The new enum si_state_bit
contains SI_SB_* which are state bits instead of state values. The function
si_state_in() indicates if the state in argument is one of those represented
by the bit mask in second argument.
2019-06-06 16:36:19 +02:00
Olivier Houchard
03abf2d31e MEDIUM: connections: Remove CONN_FL_SOCK*
Now that the various handshakes come with their own XPRT, there's no
need for the CONN_FL_SOCK* flags, and the conn_sock_want|stop functions,
so garbage-collect them.
2019-06-05 18:03:38 +02:00
Olivier Houchard
fe50bfb82c MEDIUM: connections: Introduce a handshake pseudo-XPRT.
Add a new XPRT that is used when using non-SSL handshakes, such as proxy
protocol or Netscaler, instead of taking care of it in conn_fd_handler().
This XPRT is installed when any of those is used, and it removes itself once
the handshake is done.
This should allow us to remove the distinction between CO_FL_SOCK* and
CO_FL_XPRT*.
2019-06-05 18:03:38 +02:00
Olivier Houchard
2e055483ff MINOR: connections: Add a new xprt method, add_xprt().
Add a new method to xprt_ops, add_xprt(), that changes the underlying
xprt to the one provided, and optionally provide the old one.
2019-06-05 18:03:38 +02:00
Olivier Houchard
5149b59851 MINOR: connections: Add a new xprt method, remove_xprt.
Add a new method to xprt_ops, remove_xprt. When called, if the provided
xprt_ctx is the same as the xprt's underlying xprt_ctx, it then uses the
new xprt provided, otherwise it calls the remove_xprt method of the next
xprt.
The goal is to be able to add a temporary xprt, that removes itself from
the chain when it did what it had to do. This will be used to implement
a pseudo-xprt for anything that just requires a handshake (such as the
proxy protocol).
2019-06-05 18:03:38 +02:00
Olivier Houchard
000694cf96 MINOR: ssl: Make ssl_sock_handshake() static.
ssl_sock_handshake is now only used by the ssl code itself, there's no need
to export it anymore, so make it static.
2019-06-05 18:03:38 +02:00
Olivier Houchard
ea8dd949e4 MEDIUM: ssl: Handle subscribe by itself.
As the SSL code may have different needs than the upper layer, ie it may want
to receive when the upper layer wants to right, instead of directly forwarding
the subscribe to the underlying xprt, handle it ourself. The SSL code will
know remember any subscribe call, and wake the tasklet when it is ready
for more I/O.
2019-06-05 18:03:38 +02:00
Christopher Faulet
54b5e214b0 MINOR: htx: Don't use end-of-data blocks anymore
This type of blocks is useless because transition between data and trailers is
obvious. And when there is no trailers, the end-of-message is still there to
know when data end for chunked messages.
2019-06-05 10:12:11 +02:00
Christopher Faulet
2d7c5395ed MEDIUM: htx: Add the parsing of trailers of chunked messages
HTTP trailers are now parsed in the same way headers are. It means trailers are
converted to K/V blocks followed by an end-of-trailer marker. For now, to make
things simple, the type for trailer blocks are not the same than for header
blocks. But the aim is to make no difference between headers and trailers by
using the same type. Probably for the end-of marker too.
2019-06-05 10:12:11 +02:00
Christopher Faulet
8f3c256f7e MEDIUM: cache/htx: Always store info about HTX blocks in the cache
It was only done for the headers (including the EOH marker). data were prefixed
by the info field of these blocks. The payload and the trailers of the messages
were stored in raw. The total size of headers and payload were kept in the
cached object state to help output formatting.

Now, info about each HTX block is store in the cache. Only data are allowed to
be splitted. Otherwise, all blocks of an HTX message are handled the same way,
both when storing a message in the cache and when delivering it from the
cache. This will help the cache implementation to be more robust to internal
changes in the HTX. Especially for the upcoming parsing of trailers. There is
also no more need to keep extra info in the cached object state.
2019-06-05 10:12:11 +02:00
Christopher Faulet
a4f9dd4a56 BUG/MINOR: channel/htx: Don't alter channel during forward for empty HTX message
In channel_htx_forward() and channel_htx_forward_forever(), if the HTX message
is empty, the underlying buffer may be really empty too. And we have no warranty
the caller will call htx_to_buf() later. And in practice, it is almost never
done. So the channel's buffer must not be altered. Otherwise, the buffer may be
considered as full (data == size) for an empty HTX message and no outgoing data.

This patch must be backported to 1.9.
2019-06-05 10:12:11 +02:00
Frédéric Lécaille
8d78fa7def MINOR: peers: Make peers protocol support new "server_name" data type.
Make usage of the APIs implemented for dictionaries (dict.c) and their LRU caches (struct dcache)
so that to send/receive server names used for the server by name stickiness. These
names are sent over the network as follows:

 - in every case we send the encode length of the data (STD_T_DICT), then
 - if the server names is not present in the cache used upon transmission (struct dcache_tx)
   we cache it and we the ID of this TX cache entry followed the encode length of the
   server name, and finally the sever name itseft (non NULL terminated string).
 - if the server name is present, we repead these operations but we only send the TX cache
   entry ID.

Upon receipt, the couple of (cache IDs, server name) are stored the LRU cache used
only upon receipt (struct dcache_rx). As the peers protocol is symetrical, the fact
that the server name is present in the received data (resp. or not) denotes if
the entry is absent (resp. or not).
2019-06-05 08:42:33 +02:00
Frédéric Lécaille
7da71293e4 MINOR: server: Add a dictionary for server names.
This patch only declares and defines a dictionary for the server
names (stored as ->id member field).
2019-06-05 08:33:35 +02:00
Frédéric Lécaille
84d6046a33 MINOR: proxy: Add a "server by name" tree to proxy.
Add a tree to proxy struct to lookup by name for servers attached
to this proxy and populated it at parsing time.
2019-06-05 08:33:35 +02:00
Frédéric Lécaille
5ad57ea85f MINOR: stick-table: Add "server_name" new data type.
This simple patch only adds definitions to create a new stick-table
data type ID and a new standard type to store information in relation
wich dictionary entries (STD_T_DICT).
2019-06-05 08:33:35 +02:00
Frédéric Lécaille
74167b25f7 MINOR: peers: Add a LRU cache implementation for dictionaries.
We want to send some stick-table data fields stored as strings in dictionaries
without consuming too much memory and CPU. To do so we implement with this patch
a cache for send/received dictionaries entries. These dictionary of strings entries are
stored in others real dictionary entries with an identifier as key (unsigned int)
and a pointer to the dictionary of strings entries as values.
2019-06-05 08:33:35 +02:00
Frédéric Lécaille
4a3fef834c MINOR: dict: Add dictionary new data structure.
This patch adds minimalistic definitions to implement dictionary new data structure
which is an ebtree of ebpt_node structs with strings as keys. Note that this has nothing
to see with real dictionary data structure (maps of keys in association with values).
2019-06-05 08:33:35 +02:00
Frédéric Lécaille
1673bbdf98 CLEANUP: peers: Remove tabs characters.
This patch only replaces very annoying tabulation characters by spaces
so that not to have to use again tabulations where they should not be used.
2019-06-05 08:33:34 +02:00
Willy Tarreau
7bb39d7cd6 CLEANUP: connection: remove the now unused CS_FL_REOS flag
Let's remove it before it gets uesd again. It was mostly replaced with
CS_FL_EOI and by mux-specific states or flags.
2019-06-03 14:23:33 +02:00
Willy Tarreau
7067b3a92e BUG/MINOR: deinit/threads: make hard-stop-after perform a clean exit
As reported in GH issue #99, when hard-stop-after triggers and threads
are in use, the chance that any thread releases the resources in use by
the other ones is non-null. Thus no thread should be allowed to deinit()
nor exit by itself.

Here we take a different approach. We simply use a 3rd possible value
for the "killed" variable so that all threads know they must break out
of the run-poll-loop and immediately stop.

This patch was tested by commenting the stream_shutdown() calls in
hard_stop() to increase the chances to see a stream use released
resources. With this fix applied, it never crashes anymore.

This fix should be backported to 1.9 and 1.8.
2019-06-02 11:30:07 +02:00
Alexander Liu
2a54bb74cd MEDIUM: connection: Upstream SOCKS4 proxy support
Have "socks4" and "check-via-socks4" server keyword added.
Implement handshake with SOCKS4 proxy server for tcp stream connection.
See issue #82.

I have the "SOCKS: A protocol for TCP proxy across firewalls" doc found
at "https://www.openssh.com/txt/socks4.protocol". Please reference to it.

[wt: for now connecting to the SOCKS4 proxy over unix sockets is not
 supported, and mixing IPv4/IPv6 is discouraged; indeed, the control
 layer is unique for a connection and will be used both for connecting
 and for target address manipulation. As such it may for example report
 incorrect destination addresses in logs if the proxy is reached over
 IPv6]
2019-05-31 17:24:06 +02:00
Olivier Houchard
cfbb3e6560 MEDIUM: tasks: Get rid of active_tasks_mask.
Remove the active_tasks_mask variable, we can deduce if we've work to do
by other means, and it is costly to maintain. Instead, introduce a new
function, thread_has_tasks(), that returns non-zero if there's tasks
scheduled for the thread, zero otherwise.
2019-05-29 21:53:37 +02:00
Olivier Houchard
250031e444 MEDIUM: sessions: Introduce session flags.
Add session flags, and add a new flag, SESS_FL_PREFER_LAST, to be set when
we use NTLM authentication, and we should reuse the last connection. This
should fix using NTLM with HTX. This totally replaces TX_PREFER_LAST.

This should be backported to 1.9.
2019-05-29 15:41:47 +02:00
Willy Tarreau
ef28dc11e3 MINOR: task: turn the WQ lock to an RW_LOCK
For now it's exclusively used as a write lock though, thus it remains
100% equivalent to the spinlock it replaces.
2019-05-28 19:15:44 +02:00
Willy Tarreau
186e96ece0 MEDIUM: buffers: relax the buffer lock a little bit
In lock profiles it's visible that there is a huge contention on the
buffer lock. The reason is that when offer_buffers() is called, it
systematically takes the lock before verifying if there is any
waiter. However doing so doesn't protect against races since a
waiter can happen just after we release the lock as well. Similarly
in h2 we take the lock every time an h2c is going to be released,
even without checking that the h2c belongs to a wait list. These
two have now been addressed by verifying non-emptiness of the list
prior to taking the lock.
2019-05-28 17:25:21 +02:00
Willy Tarreau
a8b2ce02b8 MINOR: activity: report the number of failed pool/buffer allocations
Haproxy is designed to be able to continue to run even under very low
memory conditions. However this can sometimes have a serious impact on
performance that it hard to diagnose. Let's report counters of failed
pool and buffer allocations per thread in show activity.
2019-05-28 17:25:21 +02:00
Willy Tarreau
2ae84e445d MEDIUM: poller: separate the wait time from the wake events
We have been abusing the do_poll()'s timeout for a while, making it zero
whenever there is some known activity. The problem this poses is that it
complicates activity diagnostic by incrementing the poll_exp field for
each known activity. It also requires extra computations that could be
avoided.

This change passes a "wake" argument to say that the poller must not
sleep. This simplifies the operations and allows one to differenciate
expirations from activity.
2019-05-28 17:25:21 +02:00
Willy Tarreau
0a7ef02074 MINOR: htx: make htx_add_data() return the transmitted byte count
In order to later allow htx_add_data() to transmit partial blocks and
avoid defragmenting the buffer, we'll need to return the number of bytes
consumed. This first modification makes the function do this and its
callers take this into account. At the moment the function still works
atomically so it returns either the block size or zero. However all
call places have been adapted to consider any value between zero and
the block size.
2019-05-28 14:48:59 +02:00
Willy Tarreau
d4908fa465 MINOR: htx: rename htx_append_blk_value() to htx_add_data_atonce()
This function is now dedicated to data blocks, and we'll soon need to
access it from outside in a rare few cases. Let's rename it and export
it.
2019-05-28 14:48:59 +02:00
Christopher Faulet
39744f792d MINOR: htx: Remove support of pseudo headers because it is unused
The code to handle pseudo headers is unused and with no real value. So remove
it.
2019-05-28 07:42:33 +02:00
Christopher Faulet
613346b60e MINOR: htx: remove the unused function htx_find_blk() 2019-05-28 07:42:33 +02:00
Christopher Faulet
dab5ab551d MINOR: channel/htx: Add functions to forward a part or all HTX payload
The functions channel_htx_fwd_payload() and channel_htx_fwd_all() should now be
used to forward, respectively, a part of the HTX payload or all of it. These
functions forward data and update the first block position.
2019-05-28 07:42:33 +02:00
Christopher Faulet
29f1758285 MEDIUM: htx: Store the first block position instead of the start-line one
We don't store the start-line position anymore in the HTX message. Instead we
store the first block position to analyze. For now, it is almost the same. But
once all changes will be made on this part, this position will have to be used
by HTX analyzers, and only in the analysis context, to know where the analyse
should start.

When new blocks are added in an HTX message, if the first block position is not
defined, it is set. When the block pointed by it is removed, it is set to the
block following it. -1 remains the value to unset the position. the first block
position is unset when the HTX message is empty. It may also be unset on a
non-empty message, meaning every blocks were already analyzed.

From HTX analyzers point of view, this position is always set during headers
analysis. When they are waiting for a request or a response, if it is unset, it
means the analysis should wait. But once the analysis is started, and as long as
headers are not forwarded, it points to the message start-line.

As mentionned, outside the HTX analysis, no code must rely on the first block
position. So multiplexers and applets must always use the head position to start
a loop on an HTX message.
2019-05-28 07:42:33 +02:00
Christopher Faulet
b2f4e83a28 MINOR: channel/htx: Add function to forward headers of an HTX message
The function channel_htx_fwd_headers() should now be used by HTX analyzers to
forward all headers of an HTX message, from the start-line to the corresponding
EOH. It takes care to update the star-line position.
2019-05-28 07:42:33 +02:00
Christopher Faulet
05c083ca8d MINOR: htx: Add a field to set the memory used by headers in the HTX start-line
The field hdrs_bytes has been added in the structure htx_sl. It should be used
to set how many bytes are help by all headers, from the start-line to the
corresponding EOH block. it must be set to -1 if it is unknown.
2019-05-28 07:42:12 +02:00
Christopher Faulet
9b04d22945 MINOR: connection: Remove the unused flag CO_RFL_KEEP_RSV 2019-05-28 07:42:12 +02:00
Christopher Faulet
2ae35045e2 MINOR: htx: Add function htx_get_max_blksz()
This functions should be used to get the maximum size for a block, not exceeding
the max amount of bytes passed in argument. Thus max may be set to -1 to have no
limit.
2019-05-28 07:42:12 +02:00