Commit Graph

39 Commits

Author SHA1 Message Date
Willy Tarreau
48b608026b MINOR: shctx: add a few BUG_ON() for consistency checks
The shctx code relies on sensitive conditions that are hard to infer
from the code itself, let's add some BUG_ON() to verify them. They
helped spot the previous bugs.
2021-11-19 19:25:13 +01:00
Willy Tarreau
cafe15c743 BUG/MINOR: shctx: do not look for available blocks when the first one is enough
In shctx_row_reserve_hot() we only leave if we've found the exact
requested size instead of at least as large, as is documented. This
results in extra lookups and free calls in the avail loop while it is
not needed, and participates to seeing a negative data_len early as
spotted in previous bugs.

It doesn't seem to have any other impact however, but it's better to
backport it to stable branches.
2021-11-19 19:25:13 +01:00
Willy Tarreau
b15e8a1c96 BUG/MEDIUM: shctx: leave the block allocator when enough blocks are found
In shctx_row_reserve_hot(), a missing break allows the avail loop to
loop for a while after having allocated the required blocks, possibly
leading to the point where it could trigger the watchdog after checking
up to 2 million blocks. In addition, the extra iteration may leave one
block assigned with size zero at the head of the avail list, and mark
it as being an isolated chain of 1 block. It's unclear whether this
could have had other consequences.

There is a non-negligible chance that it addreses bugs #1451 and #1284,
as the pattern observed in the loop looks exactly the same as the one
reported there in the crashes.

It's only marked medium because it is extremely hard to trigger. Here
the conditions were reproduced when starting 4k connections at once
requesting objects of random sizes between 0 and 20k to store them into
a small 1MB cache. However the watchdog will never trigger in such a case
so one needs to instrument the functions.

Thanks to Sohaib Ahmad and @g0uZ for providing useful traces.

This will need to be backported to all stable branches.
2021-11-19 19:25:13 +01:00
Willy Tarreau
6fd0450b47 CLEANUP: shctx: remove the different inter-process locking techniques
With a single process, we don't need to USE_PRIVATE_CACHE, USE_FUTEX
nor USE_PTHREAD_PSHARED anymore. Let's only keep the basic spinlock
to lock between threads.
2021-06-15 16:52:42 +02:00
Willy Tarreau
9e467af804 BUG/MEDIUM: shctx: use at least thread-based locking on USE_PRIVATE_CACHE
Since threads were introduced in 1.8, the USE_PRIVATE_CACHE mode of the
shctx was not updated to use locks. Originally it was meant to disable
sharing between processes, so it removes the lock/unlock instructions.
But with threads enabled, it's not possible to work like this anymore.

It's easy to see that once built with private cache and threads enabled,
sending violent SSL traffic to the the process instantly makes it die.
The HTTP cache is very likely affected as well.

This patch addresses this by falling back to our native spinlocks when
USE_PRIVATE_CACHE is used. In practice we could use them also for other
modes and remove all older implementations, but this patch aims at keeping
the changes very low and easy to backport. A new SHCTX_LOCK label was
added to help with debugging, but OTHER_LOCK might be usable as well
for backports.

An even lighter approach for backports may consist in always declaring
the lock (or reusing "waiters"), and calling pl_take_s() for the lock()
and pl_drop_s() for the unlock() operation. This could even be used in
all modes (process and threads), even when thread support is disabled.

Subsequent patches will further clean up this area.

This patch must be backported to all supported versions since 1.8.
2021-06-15 16:52:07 +02:00
Willy Tarreau
2b71810cb3 CLEANUP: lists/tree-wide: rename some list operations to avoid some confusion
The current "ADD" vs "ADDQ" is confusing because when thinking in terms
of appending at the end of a list, "ADD" naturally comes to mind, but
here it does the opposite, it inserts. Several times already it's been
incorrectly used where ADDQ was expected, the latest of which was a
fortunate accident explained in 6fa922562 ("CLEANUP: stream: explain
why we queue the stream at the head of the server list").

Let's use more explicit (but slightly longer) names now:

   LIST_ADD        ->       LIST_INSERT
   LIST_ADDQ       ->       LIST_APPEND
   LIST_ADDED      ->       LIST_INLIST
   LIST_DEL        ->       LIST_DELETE

The same is true for MT_LISTs, including their "TRY" variant.
LIST_DEL_INIT keeps its short name to encourage to use it instead of the
lazier LIST_DELETE which is often less safe.

The change is large (~674 non-comment entries) but is mechanical enough
to remain safe. No permutation was performed, so any out-of-tree code
can easily map older names to new ones.

The list doc was updated.
2021-04-21 09:20:17 +02:00
Willy Tarreau
f268ee8795 REORG: include: split global.h into haproxy/global{,-t}.h
global.h was one of the messiest files, it has accumulated tons of
implicit dependencies and declares many globals that make almost all
other file include it. It managed to silence a dependency loop between
server.h and proxy.h by being well placed to pre-define the required
structs, forcing struct proxy and struct server to be forward-declared
in a significant number of files.

It was split in to, one which is the global struct definition and the
few macros and flags, and the rest containing the functions prototypes.

The UNIX_MAX_PATH definition was moved to compat.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
334099c324 REORG: include: move shctx to haproxy/shctx{,-t}.h
Minor cleanups were applied, some includes were missing from the types
file and some were incorrect in a few C files (duplicated or not using
path).
2020-06-11 10:18:57 +02:00
Willy Tarreau
853b297c9b REORG: include: split mini-clist into haproxy/list and list-t.h
Half of the users of this include only need the type definitions and
not the manipulation macros nor the inline functions. Moves the various
types into mini-clist-t.h makes the files cleaner. The other one had all
its includes grouped at the top. A few files continued to reference it
without using it and were cleaned.

In addition it was about time that we'd rename that file, it's not
"mini" anymore and contains a bit more than just circular lists.
2020-06-11 10:18:56 +02:00
Willy Tarreau
8d2b777fe3 REORG: ebtree: move the include files from ebtree to include/import/
This is where other imported components are located. All files which
used to directly include ebtree were touched to update their include
path so that "import/" is now prefixed before the ebtree-related files.

The ebtree.h file was slightly adjusted to read compiler.h from the
common/ subdirectory (this is the only change).

A build issue was encountered when eb32sctree.h is loaded before
eb32tree.h because only the former checks for the latter before
defining type u32. This was addressed by adding the reverse ifdef
in eb32tree.h.

No further cleanup was done yet in order to keep changes minimal.
2020-06-11 09:31:11 +02:00
Willy Tarreau
a7ddab0c25 BUG/MEDIUM: shctx: make sure to keep all blocks aligned
The blocksize and the extra field are not necessarily aligned on a
machine word. This can result in crashing an align-sensitive machine
when initializing the shctx area. Let's round both sizes up to a pointer
size to make this safe everywhere.

This fixes issue #512. This should be backported as far as 1.8.
2020-02-21 13:45:58 +01:00
Joseph Herlant
3952643b35 CLEANUP: Fix typos in the shctx subsystem
Fixes typos in the code comments of the shctx subsystem.
2018-12-02 18:40:29 +01:00
Frdric Lcaille
b80bc273a3 MINOR: shctx: Change max. object size type to unsigned int.
This change is there to prevent implicit conversions when comparing
shctx maximum object sizes with other unsigned values.
2018-10-26 04:54:40 +02:00
Frdric Lcaille
b7838afe6f MINOR: shctx: Add a maximum object size parameter.
This patch adds a new parameter to shctx_init() function to be used to
limit the size of each shared object, -1 value meaning "no limit".
2018-10-24 04:39:44 +02:00
Frdric Lcaille
0bec807e08 MINOR: shctx: Shared objects block by block allocation.
This patch makes shctx capable of storing objects in several parts,
each parts being made of several blocks. There is no more need to
walk through until reaching the end of a row to append new blocks.

A new pointer to a struct shared_block member, named last_reserved,
has been added to struct shared_block so that to memorize the last block which was
reserved by shctx_row_reserve_hot(). Same thing about "last_append" pointer which
is used to memorize the last block used by shctx_row_data_append() to store the data.
2018-10-24 04:35:53 +02:00
Willy Tarreau
ddfbd83780 BUILD: shctx: do not depend on openssl anymore
The build breaks on a machine without openssl/crypto.h because shctx
still loads openssl-compat.h while it doesn't need it anymore since
the code was moved :

In file included from src/shctx.c:20:0:
include/proto/openssl-compat.h:3:28: fatal error: openssl/crypto.h: No such file or directory
 #include <openssl/crypto.h>

Just remove include openssl-compat from shctx.
2017-11-08 14:33:36 +01:00
William Lallemand
7217c46dfe MEDIUM: shctx: forbid shctx to read more than expected
Forbid shctx to read more than expected, it allows you to use a greater
value as a len with shctx_row_data_get(), the size of the destination
buffer for example.
2017-10-31 21:17:19 +01:00
William Lallemand
4f45bb9c46 MEDIUM: shctx: separate ssl and shctx
This patch reorganize the shctx API in a generic storage API, separating
the shared SSL session handling from its core.

The shctx API only handles the generic data part, it does not know what
kind of data you use with it.

A shared_context is a storage structure allocated in a shared memory,
allowing its usage in a multithread or a multiprocess context.

The structure use 2 linked list, one containing the available blocks,
and another for the hot locked blocks. At initialization the available
list is filled with <maxblocks> blocks of size <blocksize>. An <extra>
space is initialized outside the list in case you need some specific
storage.

+-----------------------+--------+--------+--------+--------+----
| struct shared_context | extra  | block1 | block2 | block3 | ...
+-----------------------+--------+--------+--------+--------+----
                                 <--------  maxblocks  --------->
                                            * blocksize

The API allows to store content on several linked blocks. For example,
if you allocated blocks of 16 bytes, and you want to store an object of
60 bytes, the object will be allocated in a row of 4 blocks.

The API was made for LRU usage, each time you get an object, it pushes
the object at the end of the list. When it needs more space, it discards

The functions name have been renamed in a more logical way, the part
regarding shctx have been prefixed by shctx_ and the functions for the
shared ssl session cache have been prefixed by sh_ssl_sess_.
2017-10-31 03:49:40 +01:00
William Lallemand
ed0b5ad1aa REORG: shctx: move ssl functions to ssl_sock.c
Move the ssl callback functions of the ssl shared session cache to
ssl_sock.c. The shctx functions still needs to be separated of the ssl
tree and data.
2017-10-31 03:48:39 +01:00
William Lallemand
3f85c9aec8 MEDIUM: shctx: allow the use of multiple shctx
Add an shctx argument which permits to create new independent shctx
area.
2017-10-31 03:44:11 +01:00
William Lallemand
24a7a75be6 REORG: shctx: move lock functions and struct
Move locks functions to proto/shctx.h, and structures to types/shctx.h
in order to simplify the split ssl/shctx.
2017-10-31 03:44:11 +01:00
William Lallemand
2a97966b08 CLEANUP: shctx: get ride of the shsess_packet{_hdr} structures
This patch removes remaining structures and fields which were never used
in the shctx code.
2017-10-31 03:44:11 +01:00
Dirkjan Bussink
1866d6d8f1 MEDIUM: ssl: Add support for OpenSSL 1.1.0
In the last release a lot of the structures have become opaque for an
end user. This means the code using these needs to be changed to use the
proper functions to interact with these structures instead of trying to
manipulate them directly.

This does not fix any deprecations yet that are part of 1.1.0, it only
ensures that it can be compiled against that version and is still
compatible with older ones.

[wt: openssl-0.9.8 doesn't build with it, there are conflicts on certain
     function prototypes which we declare as inline here and which are
     defined differently there. But openssl-0.9.8 is not supported anymore
     so probably it's OK to go without it for now and we'll see later if
     some users still need it. Emeric has reviewed this change and didn't
     spot anything obvious which requires special care. Let's try it for
     real now]
2016-11-08 20:54:41 +01:00
Vincent Bernat
3c2f2f207f CLEANUP: remove unneeded casts
In C89, "void *" is automatically promoted to any pointer type. Casting
the result of malloc/calloc to the type of the LHS variable is therefore
unneeded.

Most of this patch was built using this Coccinelle patch:

@@
type T;
@@

- (T *)
  (\(lua_touserdata\|malloc\|calloc\|SSL_get_app_data\|hlua_checkudata\|lua_newuserdata\)(...))

@@
type T;
T *x;
void *data;
@@

  x =
- (T *)
  data

@@
type T;
T *x;
T *data;
@@

  x =
- (T *)
  data

Unfortunately, either Coccinelle or I is too limited to detect situation
where a complex RHS expression is of type "void *" and therefore casting
is not needed. Those cases were manually examined and corrected.
2016-04-03 14:17:42 +02:00
Willy Tarreau
ce3f913e48 MINOR: stats: add counters for SSL cache lookups and misses
One important aspect of SSL performance tuning is the cache size,
but there's no metric to know whether it's large enough or not. This
commit introduces two counters, one for the cache lookups and another
one for cache misses. These counters are reported on "show info" on
the stats socket. This way, it suffices to see the cache misses
counter constantly grow to know that a larger cache could possibly
help.
2014-05-28 16:53:04 +02:00
Emeric Brun
cd1a526a90 MAJOR: ssl: Change default locks on ssl session cache.
Prevously pthread process shared lock were used by default,
if USE_SYSCALL_FUTEX is not specified.

This patch implements an OS independant kind of lock:
An active spinlock is usedf if USE_SYSCALL_FUTEX is not specified.

The old behavior is still available if USE_PTHREAD_PSHARED=1.
2014-05-08 22:46:32 +02:00
Emeric Brun
caa19cc867 BUG/MAJOR: ssl: Fallback to private session cache if current lock mode is not supported.
Process shared mutex seems not supported on some OSs (FreeBSD).

This patch checks errors on mutex lock init to fallback
on a private session cache (per process cache) in error cases.
2014-05-08 22:46:32 +02:00
Emeric Brun
f27af0dcc6 BUG/MEDIUM: shctx: makes the code independent on SSL runtime version.
struct SSL(ssl_st) defintion changed between openssl versions and must not be dereferenced.
2013-04-26 19:15:52 +02:00
Emeric Brun
22890a1225 MINOR: ssl: Setting global tune.ssl.cachesize value to 0 disables SSL session cache. 2012-12-28 14:48:13 +01:00
Emeric Brun
af9619da3e MEDIUM: ssl: manage shared cache by blocks for huge sessions.
Sessions using client certs are huge (more than 1 kB) and do not fit
in session cache, or require a huge cache.

In this new implementation sshcachesize set a number of available blocks
instead a number of available sessions.

Each block is large enough (128 bytes) to store a simple session (without
client certs).

Huge sessions will take multiple blocks depending on client certificate size.

Note: some unused code for session sync with remote peers was temporarily
      removed.
2012-12-04 10:56:56 +01:00
Emeric Brun
78617e51fd BUG/MINOR: ssl: One free session in cache remains unused. 2012-12-03 19:39:40 +01:00
Emeric Brun
786991e8b7 BUG/MEDIUM: ssl: Fix handshake failure on session resumption with client cert.
Openssl session_id_context was not set on cached sessions so handshake returns an error.
2012-11-26 18:43:21 +01:00
Willy Tarreau
338a4fc2a8 BUILD: ssl: fix shctx build on older compilers
gcc < 3 breaks on shctx because of the missing arg in the lock macros.
We don't need the arg at all, it's not used.
2012-10-18 19:03:00 +02:00
Emeric Brun
ce08baa36d BUG/MINOR: build: Fix failure with USE_OPENSSL=1 and USE_FUTEX=1 on archs i486 and i686. 2012-10-05 20:00:03 +02:00
Emeric Brun
9faf071acb MINOR: ssl: add build param USE_PRIVATE_CACHE to build cache without shared memory
It removes dependencies with futex or mutex but ssl performances decrease
using nbproc > 1 because switching process force session renegotiation.

This can be useful on small systems which never intend to run in multi-process
mode.
2012-10-02 08:34:38 +02:00
Emeric Brun
4b3091e54e MINOR: ssl: disable shared memory and locks on session cache if nbproc == 1
We don't needa to lock the memory when there is a single process. This can
make a difference on small systems where locking is much more expensive than
just a test.
2012-10-02 08:34:38 +02:00
Willy Tarreau
ee2e3a4027 BUILD: ssl: use MAP_ANON instead of MAP_ANONYMOUS
FreeBSD uses the former, Linux uses the latter but generally also
defines the former as an alias of the latter. Just checked on other
OSes and AIX defines both. So better use MAP_ANON which seems to be
more commonly defined.
2012-09-04 15:45:21 +02:00
Willy Tarreau
18b2059a75 BUILD: ssl: fix shctx build on RHEL with futex
On RHEL/CentOS, linux/futex.h uses an u32 type which is never declared
anywhere. Let's set it with a #define in order to fix the issue without
causing conflicts with possible typedefs on other platforms.
2012-09-04 12:26:26 +02:00
Emeric Brun
3e541d1c03 MEDIUM: ssl: add shared memory session cache implementation.
This SSL session cache was developped at Exceliance and is the same that
was proposed for stunnel and stud. It makes use of a shared memory area
between the processes so that sessions can be handled by any process. It
is only useful when haproxy runs with nbproc > 1, but it does not hurt
performance at all with nbproc = 1. The aim is to totally replace OpenSSL's
internal cache.

The cache is optimized for Linux >= 2.6 and specifically for x86 platforms.
On Linux/x86, it makes use of futexes for inter-process locking, with some
x86 assembly for the locked instructions. On other architectures, GCC
builtins are used instead, which are available starting from gcc 4.1.

On other operating systems, the locks fall back to pthread mutexes so
libpthread is automatically linked. It is not recommended since pthreads
are much slower than futexes. The lib is only linked if SSL is enabled.
2012-09-03 22:36:33 +02:00