haproxy

Commit Graph

Author	SHA1	Message	Date
Willy Tarreau	eb3d5f464d	MEDIUM: ring: use the topmost bit of the tail as a lock We're now locking the tail while looking for some room in the ring. In fact it's still locked while writing to it, but the goal definitely is to get rid of the lock ASAP. For this we reserve the topmost bit of the tail as a lock, which may have as a possible visible effect that buffers will be limited to 2GB instead of 4GB on 32-bit machines (though in practice, good luck for allocating more than 2GB contiguous on 32-bit), but in practice since the size is read with atol() and some operating systems limit it to LONG_MAX unless passing negative numbers, the limit is already there. For now the impact on x86_64 is significant (drop from 2.35 to 1.4M/s on 48 threads on EPYC 24 cores) but this situation is only temporary so that changes can be reviewable and bisectable. Other approaches were attempted, such as using XCHG instead, which is slightly faster on x86 with low thread counts (but causes more write contention), and forces readers to stall under heavy traffic because they can't access a valid value for the queue anymore. A CAS requires preloading the value and is less good on ARMv8.1. XADD could also be considered with 12-13 upper bits of the offset dedicated to locking, but that looks overkill.	2024-03-25 17:34:19 +00:00
Willy Tarreau	dd8ea5d928	MEDIUM: ring: align the head and tail fields in the ring_storage structure We really want to let the readers and writers act on different areas, so we want to have the tail and the head on separate cache lines, themselves separate from the rest of the ring. Doing so improves the performance from 2.15 to 2.35M msg/s at 48 threads on a 24-core EPYC. This increases the header space from 32 to 192 bytes when threads are enabled. But since we already have the header size available in the file, haring remains able to detect the aligned vs unaligned formats and call dump_v2a() when aligned is detected.	2024-03-25 17:34:19 +00:00
Willy Tarreau	bf3dead20c	MEDIUM: ring: remove the struct buffer from the ring The purpose is to store a head and a tail that are independent so that we can further improve the API to update them independently from each other. The struct was arranged like the original one so that as long as a ring has its head set to zero (i.e. no recycling) it will continue to work. The new format is already detectable thanks to the "rsvd" field which indicates the number of reserved bytes at the beginning. It's located where the buffer's area pointer previously was, so that older versions of haring can continue to open the ring in repair mode, and newer ones can use the fact that the upper bits of that variable are zero to guess that it's working with the new format instead of the old one. Also let's keep in mind that the layout will further change to place some alignment constraints. The haring tool is thus updated based on this and it detects that the rsvd field is smaller than a page and that the sum of it with the size equals the mapped size, in which case it uses the new dump_v2() function instead of dump_v1(). The new function also creates a buffer from the ring's area, size, head and tail and calls the generic one so that no other code had to be adapted.	2024-03-25 17:34:19 +00:00
Willy Tarreau	88e141b823	DEV: haring: automatically use the advertised ring header size Instead of emitting a warning, since we don't need the ring struct anymore, we can just read what we need, parse the buffer and use the advertised offset. Thus for now -f is simply ignored.	2024-03-09 11:23:52 +01:00
Willy Tarreau	77d7c35243	DEV: haring: split the code between ring and buffer By splitting the initialization and the parsing of the ring, we'll ease the support for multiple ring sizes and get rid of the annoyances of the optional lock.	2024-03-09 11:23:52 +01:00
Willy Tarreau	4dddbb63a0	DEV: haring: make haring not depend on the struct ring itself haring needs to be self-sufficient about the ring format so that it continues to build when the ring API changes. Let's import the struct ring definition and call it "ring_v1".	2024-03-09 11:23:52 +01:00
Willy Tarreau	cbbee15462	CLEANUP: ring: rename the ring lock "RING_LOCK" instead of "LOGSRV_LOCK" The ring lock was initially mostly used for the logs and used to inherit its name in lock stats. Now that it's exclusively used by rings, let's rename it accordingly.	2023-09-20 21:38:33 +02:00
Willy Tarreau	b83bf68ec0	DEV: haring: update readme to suggest using the same build options for haring It's not necessarily obvious so better suggest it there to use the same build options, and indicate the tradeoffs (e.g. depend on more libs).	2023-05-04 08:13:44 +02:00
Willy Tarreau	46e0ea33e2	DEV: haring: automatically disable DEBUG_STRICT Ideally haring should be compiled with the same options as haproxy so that ring headers have the same size (e.g. with/without locks, with/without lock debugging). But when enabling DEBUG_STRICT, BUG_ON() is enabled and breaks the build by making references to complain() and ha_backtrace_to_stderr(). Let's just disable DEBUG_STRICT before opening include files. This is sufficient to address the problem. This may be backported to older versions that include haring.	2023-05-04 08:09:02 +02:00
Willy Tarreau	d9c7188633	MEDIUM: ring: make the offset relative to the head/tail instead of absolute The ring's offset currently contains a perpetually growing cursor which is the number of bytes written from the start. It's used by readers to know where to (re)start reading from. It was made absolute because both the head and the tail can change during writes and we needed a fixed position to know where the reader was attached. But this is complicated, error-prone, and limits the ability to reduce the lock's coverage. In fact what is needed is to know where the reader is currently waiting, if at all. And this location is exactly where it stored its count, so the absolute position in the buffer (the seek offset from the first storage byte) does represent exactly this, as it doesn't move (we don't realign the buffer), and is stable regardless of how head/tail changes with writes. This patch modifies this so that the application code now uses this representation instead. The most noticeable change is the initialization, where we've kept ~0 as a marker to go to the end, and it's now set to the tail offset instead of trying to resolve the current write offset against the current ring's position. The offset was also used at the end of the consuming loop, to detect if a new write had happened between the lock being released and taken again, so as to wake the consumer(s) up again. For this we used to take a copy of the ring->ofs before unlocking and comparing with the new value read in the next lock. Since it's not possible to write past the current reader's location, there's no risk of complete rollover, so it's sufficient to check if the tail has changed. Note that the change also has an impact on the haring consumer which needs to adapt as well. But that's good in fact because it will rely on one less variable, and will use offsets relative to the buffer's head, and the change remains backward-compatible.	2023-02-24 09:26:30 +01:00
Willy Tarreau	e06ba90318	DEV: haring: add a new option "-r" to automatically repair broken files In case a file-backed ring was not properly synced before being dumped, the output can look bogus due to the head pointer not being perfectly up to date. In this case, passing "-r" will make haring automatically skip entries not starting with a zero, and resynchronize with the rest of the messages. This should be backported to 2.6.	2023-01-24 12:13:14 +01:00
Willy Tarreau	cc51c9a733	DEV: haring: support remapping LF in contents with CR VT Some traces may contain LF characters which are quite cumbersome to deal with using the common tools. Given that the utility still has access to the raw traces and knows where the delimiters are, let's offer the possibility to remap LF characters to a different sequence. Here we're using CR VT which will have the same visual appearance but will remain on the same line for grep etc. This behavior is enabled by the -l option. It's not enabled by default because it's 50% slower due to content processing.	2022-08-12 12:11:30 +02:00
Willy Tarreau	75014fcd4d	DEV: haring: add a simple utility to read file-backed rings With the ability to back a memory ring into an mmapped file, it makes sense to be able to dump these files. That's what this utility does. The entire ring is dumped to stdout. It's well suited to large dumps, it converts roughly 6 GB of logs per second. The utility is really meant for developers at the moment. It might evolve into a more general tool but at the moment it's still possible that it might need to be run under gdb to process certain crash dumps. Also at the moment it must not be used on a ring being actively written to or it will dump garbage. The code is made so that we can envision later to attach to a live ring and dump live contents, but this requires that the utility is built with the exact same options (threads etc), and that the file is opened read-write. For now these parts have been commented out, waiting for a reasonably balanced and non-intrusive solution to be found (e.g. signals must be intercepted so that the tool cannot leave the ring with a watcher present). If it is detected that the memory layout of the ring struct differs, a warning is emitted. At the end, if an error occurs, a warning is printed as well (this does happen when the process is not cleanly stopped, but it indicates the end was reached).	2022-08-12 11:48:32 +02:00
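
Commit eb3d5f464d above reserves the topmost bit of the ring's tail as a lock. The following is only a minimal sketch of that idea using C11 atomics, not the code from the commit: the names RING_TAIL_LOCK, ring_tail_lock() and ring_tail_unlock() are hypothetical, and the use of an atomic fetch-or to take the lock is an assumption (the message only says that XCHG, CAS and XADD were other approaches that were attempted or considered).

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: the topmost bit of a size_t, reserved as the lock.
 * The remaining bits hold the tail offset, which is why the usable size
 * is halved (2GB instead of 4GB when size_t is 32 bits).
 */
#define RING_TAIL_LOCK (~(SIZE_MAX >> 1))

/* Take the lock by setting the top bit of the tail (assumption: fetch-or).
 * Returns the tail offset that was observed when the lock was acquired.
 */
static size_t ring_tail_lock(_Atomic size_t *tail)
{
    size_t prev;

    for (;;) {
        prev = atomic_fetch_or_explicit(tail, RING_TAIL_LOCK,
                                        memory_order_acquire);
        if (!(prev & RING_TAIL_LOCK))
            return prev; /* the bit was clear before us: we own the lock */

        /* someone else holds the lock: wait until the bit is cleared */
        while (atomic_load_explicit(tail, memory_order_relaxed) &
               RING_TAIL_LOCK)
            ;
    }
}

/* Publish the new tail offset and release the lock in a single store. */
static void ring_tail_unlock(_Atomic size_t *tail, size_t new_tail)
{
    atomic_store_explicit(tail, new_tail & ~RING_TAIL_LOCK,
                          memory_order_release);
}
```

In this sketch a writer would call ring_tail_lock(), copy its message at the returned offset, then call ring_tail_unlock() with the advanced offset, so the lock release and the tail update happen together.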

13 Commits
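
Commit cc51c9a733 describes remapping LF characters found inside a trace to the sequence CR VT so that each message stays on a single line for tools like grep. As an illustration only, the transformation could look like the sketch below; remap_lf() is a hypothetical helper and is not taken from haring's source.

```c
#include <stddef.h>

/* Hypothetical helper: copy <len> bytes from <in> to <out>, replacing each
 * LF with the two-byte sequence CR VT. <out> must have room for 2*len bytes
 * (worst case: every input byte is an LF). Returns the number of bytes
 * written to <out>.
 */
static size_t remap_lf(const char *in, size_t len, char *out)
{
    size_t o = 0;

    for (size_t i = 0; i < len; i++) {
        if (in[i] == '\n') {
            out[o++] = '\r'; /* CR: back to the start of the line */
            out[o++] = '\v'; /* VT: move down one line */
        } else {
            out[o++] = in[i];
        }
    }
    return o;
}
```

Every byte has to be inspected and possibly expanded instead of being copied in bulk, which is consistent with the commit's remark that the option is slower and therefore not enabled by default.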