Commit Graph

2754 Commits

Author SHA1 Message Date
Thierry FOURNIER
9d5422a4b7 MINOR: task/notification: Is notifications registered ?
This function returns true is some notifications are registered.

This function is usefull for the following patch

   BUG/MEDIUM: lua/socket: Sheduling error on write: may dead-lock

It should be backported in 1.6, 1.7 and 1.8
2018-05-31 10:58:41 +02:00
Olivier Houchard
09eeb7684d BUG/MEDIUM: tasks: Don't forget to increase/decrease tasks_run_queue.
Don't forget to increase tasks_run_queue when we're adding a task to the
tasklet list, and to decrease it when we remove a task from a runqueue,
or its value won't be accurate, and could lead to tasks not being executed
when put in the global run queue.

1.9-dev only, no backport is needed.
2018-05-28 15:20:55 +02:00
Tim Duesterhus
3fd1973d37 MINOR: http: Log warning if (add|set)-header fails
This patch adds a warning if an http-(request|reponse) (add|set)-header
rewrite fails to change the respective header in a request or response.

This usually happens when tune.maxrewrite is not sufficient to hold all
the headers that should be added.
2018-05-28 14:53:59 +02:00
Olivier Houchard
673867c357 MAJOR: applets: Use tasks, instead of rolling our own scheduler.
There's no real reason to have a specific scheduler for applets anymore, so
nuke it and just use tasks. This comes with some benefits, the first one
being that applets cannot induce high latencies anymore since they share
nice values with other tasks. Later it will be possible to configure the
applets' nice value. The second benefit is that the applet scheduler was
not very thread-friendly, having a big lock around it in prevision of this
change. Thus applet-intensive workloads should now scale much better with
threads.

Some more improvement is possible now : some applets also use a task to
handle timers and timeouts. These ones could now be simplified to use only
one task.
2018-05-26 20:03:30 +02:00
Olivier Houchard
1599b80360 MINOR: tasks: Make the number of tasks to run at once configurable.
Instead of hardcoding 200, make the number of tasks to be run configurable
using tune.runqueue-depth. 200 is still the default.
2018-05-26 20:03:24 +02:00
Olivier Houchard
b0bdae7b88 MAJOR: tasks: Introduce tasklets.
Introduce tasklets, lightweight tasks. They have no notion of priority,
they are just run as soon as possible, and will probably be used for I/O
later.

For the moment they're used to replace the temporary thread-local list
that was used in the scheduler. The first part of the struct is common
with tasks so that tasks can be cast to tasklets and queued in this list.
Once a task is in the tasklet list, it has its leaf_p set to 0x1 so that
it cannot accidently be confused as not in the queue.

Pure tasklets are identifiable by their nice value of -32768 (which is
normally not possible).
2018-05-26 20:03:19 +02:00
Olivier Houchard
f6e6dc12cd MAJOR: tasks: Create a per-thread runqueue.
A lot of tasks are run on one thread only, so instead of having them all
in the global runqueue, create a per-thread runqueue which doesn't require
any locking, and add all tasks belonging to only one thread to the
corresponding runqueue.

The global runqueue is still used for non-local tasks, and is visited
by each thread when checking its own runqueue. The nice parameter is
thus used both in the global runqueue and in the local ones. The rare
tasks that are bound to multiple threads will have their nice value
used twice (once for the global queue, once for the thread-local one).
2018-05-26 19:27:29 +02:00
Olivier Houchard
9f6af33222 MINOR: tasks: Change the task API so that the callback takes 3 arguments.
In preparation for thread-specific runqueues, change the task API so that
the callback takes 3 arguments, the task itself, the context, and the state,
those were retrieved from the task before. This will allow these elements to
change atomically in the scheduler while the application uses the copied
value, and even to have NULL tasks later.
2018-05-26 19:23:57 +02:00
Willy Tarreau
0cd82e883e BUG/BUILD: threads: unbreak build without threads
A few users reported that building without threads was accidently broken
after commit 6b96f72 ("BUG/MEDIUM: pollers: Use a global list for fd
shared between threads.") due to all_threads_mask not being defined.
It's OK to set it to zero as other code parts do when threads are
enabled but only one thread is used.

This needs to be backported to 1.8.
2018-05-23 19:54:43 +02:00
Thierry Fournier
d5b073cf1f MINOR: lua: Improve error message
The function hlua_ctx_resume return less text message and more error
code. These error code allow the caller to return appropriate
message to the user.
2018-05-22 18:57:46 +02:00
Christopher Faulet
68db0235fd CLEANUP: spoe: Remove unused variables the agent structure
applets_act and applets_idle were used for debugging purpose. Now, these values
are part of the agent's counters.
2018-05-18 15:04:46 +02:00
Olivier Houchard
cb92f5cae4 MINOR: pollers: move polled_mask outside of struct fdtab.
The polled_mask is only used in the pollers, and removing it from the
struct fdtab makes it fit in one 64B cacheline again, on a 64bits machine,
so make it a separate array.
2018-05-06 06:27:34 +02:00
Olivier Houchard
6b96f7289c BUG/MEDIUM: pollers: Use a global list for fd shared between threads.
With the old model, any fd shared by multiple threads, such as listeners
or dns sockets, would only be updated on one threads, so that could lead
to missed event, or spurious wakeups.
To avoid this, add a global list for fd that are shared, using the same
implementation as the fd cache, and only remove entries from this list
when every thread as updated its poller.

[wt: this will need to be backported to 1.8 but differently so this patch
 must not be backported as-is]
2018-05-06 06:27:09 +02:00
Olivier Houchard
6a2cf8752c MINOR: fd: Make the lockless fd list work with multiple lists.
Modify fd_add_to_fd_list() and fd_rm_from_fd_list() so that they take an
offset in the fdtab to the list entry, instead of hardcoding the fd cache,
so we can use them with other lists.
2018-05-06 06:25:49 +02:00
Olivier Houchard
9b36cb4a41 BUG/MEDIUM: task: Don't free a task that is about to be run.
While running a task, we may try to delete and free a task that is about to
be run, because it's part of the local tasks list, or because rq_next points
to it.
So flag any task that is in the local tasks list to be deleted, instead of
run, by setting t->process to NULL, and re-make rq_next a global,
thread-local variable, that is modified if we attempt to delete that task.

Many thanks to PiBa-NL for reporting this and analysing the problem.

This should be backported to 1.8.
2018-05-04 20:11:04 +02:00
Willy Tarreau
760e81d356 MINOR: backend: implement random-based load balancing
For large farms where servers are regularly added or removed, picking
a random server from the pool can ensure faster load transitions than
when using round-robin and less traffic surges on the newly added
servers than when using leastconn.

This commit introduces "balance random". It internally uses a random as
the key to the consistent hashing mechanism, thus all features available
in consistent hashing such as weights and bounded load via hash-balance-
factor are usable. It is extremely convenient because one common concern
when using random is what happens when a server is hammered a bit too
much. Here that can trivially be avoided, like in the configuration below :

    backend bk0
        balance random
        hash-balance-factor 110
        server-template s 1-100 127.0.0.1:8000 check inter 1s

Note that while "balance random" internally relies on a hash algorithm,
it holds the same properties as round-robin and as such is compatible with
reusing an existing server connection with "option prefer-last-server".
2018-05-03 07:20:40 +02:00
Tim Duesterhus
e2b10bf491 MINOR: http: Add support for 421 Misdirected Request
This makes haproxy aware of HTTP 421 Misdirected Request, which
is defined in RFC 7540, section 9.1.2.
2018-04-28 07:03:39 +02:00
Aurlien Nephtali
abbf607105 MEDIUM: cli: Add payload support
In order to use arbitrary data in the CLI (multiple lines or group of words
that must be considered as a whole, for example), it is now possible to add a
payload to the commands. To do so, the first line needs to end with a special
pattern: <<\n. Everything that follows will be left untouched by the CLI parser
and will be passed to the commands parsers.

Per-command support will need to be added to take advantage of this
feature.

Signed-off-by: Aurlien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-26 14:19:33 +02:00
Willy Tarreau
174b06a572 MINOR: h2: detect presence of CONNECT and/or content-length
We'll need this in order to support uploading chunks. The h2 to h1
converter checks for the presence of the content-length header field
as well as the CONNECT method and returns these information to the
caller. The caller indicates whether or not a body is detected for
the message (presence of END_STREAM or not). No transfer-encoding
header is emitted yet.
2018-04-26 10:15:14 +02:00
Olivier Houchard
302f9ef055 BUG/MEDIUM: connection: Make sure we have a mux before calling detach().
In some cases, we call cs_destroy() very early, so early the connection
doesn't yet have a mux, so we can't call mux->detach(). In this case,
just destroy the associated connection.

This should be backported to 1.8.
2018-04-13 16:02:21 +02:00
Christopher Faulet
48aa13f286 BUG/MEDIUM: threads: Fix the max/min calculation because of name clashes
With gcc < 4.7, when HAProxy is built with threads, the macros
HA_ATOMIC_CAS/XCHG/STORE relies on the legacy __sync builtins. These macros
are slightly complicated than the versions relying on the '_atomic'
builtins. Internally, some local variables are defined, prefixed with '__' to
avoid name clashes with the caller.

On the other hand, the macros HA_ATOMIC_UPDATE_MIN/MAX call HA_ATOMIC_CAS. Some
local variables are also definied in these macros, following the same naming
rule as below. The problem is that '__new' variable is used in
HA_ATOMIC_MIN/_MAX and in HA_ATOMIC_CAS. Obviously, the behaviour is undefined
because '__new' in HA_ATOMIC_CAS is left uninitialized. Unfortunatly gcc fails
to detect this error.

To fix the problem, all internal variables to macros are now suffixed with name
of the macros to avoid clashes (for instance, '__new_cas' in HA_ATOMIC_CAS).

This patch must be backported in 1.8.
2018-04-10 11:07:56 +02:00
Christopher Faulet
caf2feca62 MINOR: spoe: Add counters to log info about SPOE agents
In addition to metrics about time spent in the SPOE, following counters have
been added:

  * applets : number of SPOE applets.
  * idles : number of idle applets.
  * nb_sending : number of streams waiting to send data.
  * nb_waiting : number of streams waiting for a ack.
  * nb_processed : number of events/groups processed by the SPOE (from the
                   stream point of view).
  * nb_errors : number of errors during the processing (from the stream point of
                view).

Log messages has been updated to report these counters. Following pattern has
been added at the end of the log message:

    ... <idles>/<applets> <nb_sending>/<nb_waiting> <nb_error>/<nb_processed>
2018-04-05 15:13:54 +02:00
Christopher Faulet
7250b8fb5c MINOR: spoe: Add loggers dedicated to the SPOE agent
Now it is possible to configure a logger in a spoe-agent section using a "log"
line, as for a proxy. "no log", "log global" and "log <address> ..." syntaxes
are supported.
2018-04-05 15:13:54 +02:00
Christopher Faulet
28ac099907 MINOR: log: Keep the ref when a log server is copied to avoid duplicate entries
With "log global" line, the global list of loggers are copied into the proxy's
struct. The list coming from the default section is also copied when a frontend
or a backend section is parsed. So it is possible to have duplicate entries in
the proxy's list. For instance, with this following config, all messages will be
logged twice:

    global
        log 127.0.0.1 local0 debug
        daemon

    defaults
        mode   http
        log    global
        option httplog

    frontend front-http
        log global
        bind *:8888
        default_backend back-http

    backend back-http
        server www 127.0.0.1:8000
2018-04-05 15:13:54 +02:00
Christopher Faulet
4b0b79dd56 MINOR: log: move 'log' keyword parsing in dedicated function
Now, the function parse_logsrv should be used to parse a "log" line. This
function will update the list of loggers passed in argument. It can release all
log servers when "no log" line was parsed (by the caller) or it can parse "log
global" or "log <address> ... " lines. It takes care of checking the caller
context (global or not) to prohibit "log global" usage in the global section.
2018-04-05 15:13:54 +02:00
Christopher Faulet
36bda1cd4a MINOR: spoe: Add options to store processing times in variables
"set-process-time" and "set-total-time" options have been added to store
processing times in the transaction scope, at each event and group processing,
the current one and the total one. So it is possible to get them.

TODO: documentation
2018-04-05 15:13:54 +02:00
Christopher Faulet
b2dd1e034c MINOR: spoe: Add metrics in to know time spent in the SPOE
Following metrics are added for each event or group of messages processed in the
SPOE:

  * processing time: the delay to process the event or the group. From the
                     stream point of view, it is the latency added by the SPOE
                     processing.
  * request time : It is the encoding time. It includes ACLs processing, if
                   any. For fragmented frames, it is the sum of all fragments.
  * queue time : the delay before the request gets out the sending queue. For
                 fragmented frames, it is the sum of all fragments.
  * waiting time: the delay before the reponse is received. No fragmentation
                  supported here.
  * response time: the delay to process the response. No fragmentation supported
                   here.
  * total time: (unused for now). It is the sum of all events or groups
                processed by the SPOE for a specific threads.

Log messages has been updated. Before, only errors was logged (status_code !=
0). Now every processing is logged, following this format:

  SPOE: [AGENT] <TYPE:NAME> sid=STREAM-ID st=STATUC-CODE reqT/qT/wT/resT/pT

where:

  AGENT              is the agent name
  TYPE               is EVENT of GROUP
  NAME               is the event or the group name
  STREAM-ID          is an integer, the unique id of the stream
  STATUS_CODE        is the processing's status code
  reqT/qT/wT/resT/pT are delays descrive above

For all these delays, -1 means the processing was interrupted before the end. So
-1 for the queue time means the request was never dequeued. For fragmented
frames it is harder to know when the interruption happened.

For now, messages are logged using the same logger than the backend of the
stream which initiated the request.
2018-04-05 15:13:53 +02:00
Olivier Houchard
8ef1a6b0d8 BUG/MINOR: fd: Don't clear the update_mask in fd_insert.
Clearing the update_mask bit in fd_insert may lead to duplicate insertion
of fd in fd_updt, that could lead to a write past the end of the array.
Instead, make sure the update_mask bit is cleared by the pollers no matter
what.

This should be backported to 1.8.
[wt: warning: 1.8 doesn't have the lockless fdcache changes and will
 require some careful changes in the pollers]
2018-04-03 19:38:15 +02:00
Willy Tarreau
b011d8f4c4 MINOR: mux: add a "show_fd" function to dump debugging information for "show fd"
This function will be called from the CLI's "show fd" command to append some
extra mux-specific information that only the mux handler can decode. This is
supposed to help collect various hints about what is happening when facing
certain anomalies.
2018-03-30 14:41:19 +02:00
Willy Tarreau
4037a3f904 MINOR: cli/threads: make "show fd" report thread_sync_io_handler instead of "unknown"
The output was confusing when the sync point's dummy handler was shown.

This patch should be backported to 1.8 to help with troubleshooting.
2018-03-28 18:06:47 +02:00
Emmanuel Hocdet
4952985b71 REORG: compact "struct server"
Move use_ssl (bool value) in "struct server" hole.
2018-03-21 05:04:01 +01:00
Emmanuel Hocdet
4399c75f6c MINOR: proxy-v2-options: add crc32c
This patch add option crc32c (PP2_TYPE_CRC32C) to proxy protocol v2.
It compute the checksum of proxy protocol v2 header as describe in
"doc/proxy-protocol.txt".
2018-03-21 05:04:01 +01:00
Emmanuel Hocdet
6afd898988 MINOR: hash: add new function hash_crc32c
This function will be used to perform CRC32c computations. This is
required to compute proxy protocol v2 CRC32C tlv (PP2_TYPE_CRC32C).
2018-03-21 05:04:01 +01:00
Willy Tarreau
26fb5d8449 BUG/MEDIUM: fd/threads: ensure the fdcache_mask always reflects the cache contents
Commit 4815c8c ("MAJOR: fd/threads: Make the fdcache mostly lockless.")
made the fd cache lockless, but after a few iterations, a subtle part was
lost, consisting in setting the bit on the fd_cache_mask immediately when
adding an event. Now it was done only when the cache started to process
events, but the problem it causes is that fd_cache_mask isn't reliable
anymore as an indicator of presence of events to be processed with no
delay outside of fd_process_cached_events(). This results in some spurious
delays when processing inter-thread wakeups between tasks. Just restoring
the flag when the event is added is enough to fix the problem.

Kudos to Christopher for spotting this one!

No backport is needed as this is only in the development version.
2018-03-20 19:14:24 +01:00
Christopher Faulet
5cd4bbd7ab BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management
The management of the servers and the proxies queues was not thread-safe at
all. First, the accesses to <strm>->pend_pos were not protected. So it was
possible to release it on a thread (for instance because the stream is released)
and to use it in same time on another one (because we redispatch pending
connections for a server). Then, the accesses to stream's information (flags and
target) from anywhere is forbidden. To be safe, The stream's state must always
be updated in the context of process_stream.

So to fix these issues, the queue module has been refactored. A lock has been
added in the pendconn structure. And now, when we try to dequeue a pending
connection, we start by unlinking it from the server/proxy queue and we wake up
the stream. Then, it is the stream reponsibility to really dequeue it (or
release it). This way, we are sure that only the stream can create and release
its <pend_pos> field.

However, be careful. This new implementation should be thread-safe
(hopefully...). But it is not optimal and in some situations, it could be really
slower in multi-threaded mode than in single-threaded one. The problem is that,
when we try to dequeue pending connections, we process it from the older one to
the newer one independently to the thread's affinity. So we need to wait the
other threads' wakeup to really process them. If threads are blocked in the
poller, this will add a significant latency. This problem happens when maxconn
values are very low.

This patch must be backported in 1.8.
2018-03-19 10:03:06 +01:00
Christopher Faulet
510c0d67ef BUG/MEDIUM: threads/unix: Fix a deadlock when a listener is temporarily disabled
When a listener is temporarily disabled, we start by locking it and then we call
.pause callback of the underlying protocol (tcp/unix). For TCP listeners, this
is not a problem. But listeners bound on an unix socket are in fact closed
instead. So .pause callback relies on unbind_listener function to do its job.

Unfortunatly, unbind_listener hold the listener's lock and then call an internal
function to unbind it. So, there is a deadlock here. This happens during a
reload. To fix the problemn, the function do_unbind_listener, which is lockless,
is now exported and is called when a listener bound on an unix socket is
temporarily disabled.

This patch must be backported in 1.8.
2018-03-16 11:19:07 +01:00
Willy Tarreau
c41b3e8dff DOC: buffers: clarify the purpose of the <from> pointer in offer_buffers()
This one is only used to compare pointers and NULL is permitted though
this is far from being clear.
2018-03-08 18:33:48 +01:00
Emmanuel Hocdet
253c3b7516 MINOR: connection: add proxy-v2-options authority
This patch add option PP2_TYPE_AUTHORITY to proxy protocol v2 when a TLS
connection was negotiated. In this case, authority corresponds to the sni.
2018-03-01 11:38:32 +01:00
Emmanuel Hocdet
fa8d0f1875 MINOR: connection: add proxy-v2-options ssl-cipher,cert-sig,cert-key
This patch implement proxy protocol v2 options related to crypto information:
ssl-cipher (PP2_SUBTYPE_SSL_CIPHER), cert-sig (PP2_SUBTYPE_SSL_SIG_ALG) and
cert-key (PP2_SUBTYPE_SSL_KEY_ALG).
2018-03-01 11:38:28 +01:00
Emmanuel Hocdet
283e004a85 MINOR: ssl: add ssl_sock_get_cert_sig function
ssl_sock_get_cert_sig can be used to report cert signature short name
to log and ppv2 (RSA-SHA256).
2018-03-01 11:34:08 +01:00
Emmanuel Hocdet
96b7834e98 MINOR: ssl: add ssl_sock_get_pkey_algo function
ssl_sock_get_pkey_algo can be used to report pkey algorithm to log
and ppv2 (RSA2048, EC256,...).
Extract pkey information is not free in ssl api (lock/alloc/free):
haproxy can use the pkey information computed in load_certificate.
Store and use this information in a SSL ex_data when available,
compute it if not (SSL multicert bundled and generated cert).
2018-03-01 11:34:05 +01:00
Emmanuel Hocdet
ddc090bc55 MINOR: ssl: extract full pkey info in load_certificate
Private key information is used in switchctx to implement native multicert
selection (ecdsa/rsa/anonymous). This patch extract and store full pkey
information: dsa type and pkey size in bits. This can be used for switchctx
or to report pkey informations in ppv2 and log.
2018-03-01 11:33:18 +01:00
Christopher Faulet
ca6ef50661 BUG/MEDIUM: buffer: Fix the wrapping case in bi_putblk
When the block of data need to be split to support the wrapping, the start of
the second block of data was wrong. We must be sure to skup data copied during
the first memcpy.

This patch must be backported to 1.8.
2018-02-27 15:45:03 +01:00
Christopher Faulet
b2b279464c BUG/MEDIUM: buffer: Fix the wrapping case in bo_putblk
When the block of data need to be split to support the wrapping, the start of
the second block of data was wrong. We must be sure to skip data copied during
the first memcpy.

This patch must be backported to 1.8, 1.7, 1.6 and 1.5.
2018-02-27 15:45:03 +01:00
Yves Lafon
95317289e9 MINOR: stats: display the number of threads in the statistics.
Add the nbthread global variable to the output, matching nbproc.

This may be backported to 1.8
2018-02-26 11:53:46 +01:00
Willy Tarreau
364d745106 MINOR: debug/pools: make DEBUG_UAF also detect underflows
Since we use padding before the allocated page, it's trivial to place
the allocated address there and see if it gets mangled once we release
it.

This may be backported to stable releases already using DEBUG_UAF.
2018-02-22 14:18:45 +01:00
Willy Tarreau
5a9cce4653 BUG/MINOR: debug/pools: properly handle out-of-memory when building with DEBUG_UAF
Commit 158fa75 ("MINOR: pools: implement DEBUG_UAF to detect use after free")
implemented pool use-after-free detection, but the mmap() return value isn't
properly checked, preventing the call to pool_alloc_area() from returning
NULL. So on out-of-memory a mangled pointer is returned, causing a crash on
the pool_alloc() site instead of forcing a GC. It doesn't affect regular
operations however, just complicates complex bug investigations.

This fix should be backported to 1.8 and to 1.7.
2018-02-22 14:18:45 +01:00
Willy Tarreau
f161d0f51e BUG/MINOR: pools/threads: don't ignore DEBUG_UAF on double-word CAS capable archs
Since commit cf975d4 ("MINOR: pools/threads: Implement lockless memory
pools."), we support lockless pools. However the parts dedicated to
detecting use-after-free are not present in this part, making DEBUG_UAF
useless in this situation.

The present patch sets a new define CONFIG_HAP_LOCKLESS_POOLS when such
a compatible architecture is detected, and when pool debugging is not
requested, then makes use of this everywhere in pools and buffers
functions. This way enabling DEBUG_UAF will automatically disable the
lockless version.

No backport is needed as this is purely 1.9-dev.
2018-02-22 14:18:45 +01:00
Tim Duesterhus
5e64286bab CLEANUP: standard: Fix typo in IPv6 mask example
IPv6 addresses with two double colons are invalid.

This typo was introduced in commit 471851713a.
2018-02-21 05:07:35 +01:00
Tim Duesterhus
05f6a43bd4 CLEANUP: pools: Remove unused end label in memory.h
This removes the end label from memory.h.

The labels are unused as of cf975d46bc
which is unreleased (and incidentally the first commit containing
those labels, thus they never have been used).
2018-02-20 08:30:13 +01:00