mirror of
http://git.haproxy.org/git/haproxy.git/
synced 2025-01-09 23:39:55 +00:00
4e65fc66f6
This is the second attempt at importing the updated mt_list code (commit
59459ea3). The previous one was attempted with commit c618ed5ff4 ("MAJOR:
import: update mt_list to support exponential back-off") but revealed
problems with QUIC connections and was reverted.
The problem that was faced was that elements deleted inside an iterator
were no longer reset, and that if they were to be recycled in this form,
they could appear as busy to the next user. This was trivially reproduced
with this:
  $ cat quic-repro.cfg
  global
          stats socket /tmp/sock1 level admin
          stats timeout 1h
          limited-quic

  frontend stats
          mode http
          bind quic4@:8443 ssl crt rsa+dh2048.pem alpn h3
          timeout client 5s
          stats uri /

  $ ./haproxy -db -f quic-repro.cfg &

  $ h2load -c 10 -n 100000 --npn h3 https://127.0.0.1:8443/
  => hang
This was purely an API issue caused by the simplified usage of the macros
for the iterator. The original version had two backups (one full element
and one pointer) that the user had to take care of, while the new one only
uses one that is transparent for the user. But during removal, the element
still has to be unlocked if it's going to be reused.
All of this sparked discussions with Fred and Aurélien regarding the still
unclear state of locking. It was found that the lock API does too much at
once and lacks granularity. The new version offers much more fine-grained
control, making it possible to selectively lock/unlock an element, a link,
the rest of the list, etc.
It was also found that plenty of places just want to free the current
element, or delete it to do anything with it, hence don't need to reset
its pointers (e.g. event_hdl). Finally it appeared obvious that the
root cause of the problem was the unclear usage of the list iterators
themselves because one does not necessarily expect the element to be
presented locked when not needed, which makes the unlock easy to overlook
during reviews.
The updated version of the list presents explicit lock status in the
macro name (_LOCKED or _UNLOCKED suffixes). When using the _LOCKED
suffix, the caller is expected to unlock the element if it intends to
reuse it. At least the status is advertised. The _UNLOCKED variant,
instead, always unlocks it before starting the loop block. This means
it's not necessary to think about unlocking it, though it's obviously
not usable with everything. A few _UNLOCKED were used at obvious places
(i.e. where the element is deleted and freed without any prior check).
Interestingly, the tests performed last year on QUIC forwarding, which
resulted in limited traffic for the original version and a higher bit
rate for the new one, couldn't be reproduced: since then the QUIC stack
has gained in efficiency, and the 100 Gbps barrier is now reached with
or without the mt_list update. However the unit tests definitely show a
huge difference, particularly on EPYC platforms where the EBO
(exponential back-off) provides tremendous CPU savings.
Overall, the following changes are visible from the application code:
  - mt_list_for_each_entry_safe() + 1 back elem + 1 back ptr
    => MT_LIST_FOR_EACH_ENTRY_LOCKED() or MT_LIST_FOR_EACH_ENTRY_UNLOCKED()
       + 1 back elem

  - MT_LIST_DELETE_SAFE() no longer needed in MT_LIST_FOR_EACH_ENTRY_UNLOCKED()
      => just manually set iterator to NULL however.
    For MT_LIST_FOR_EACH_ENTRY_LOCKED()
      => mt_list_unlock_self() (if element going to be reused) + NULL

  - MT_LIST_LOCK_ELT => mt_list_lock_full()
  - MT_LIST_UNLOCK_ELT => mt_list_unlock_full()

  - l = MT_LIST_APPEND_LOCKED(h, e); MT_LIST_UNLOCK_ELT();
    => l=mt_list_lock_prev(h); mt_list_lock_elem(e); mt_list_unlock_full(e, l)
669 lines
31 KiB
Plaintext
MT_LIST: multi-thread aware doubly-linked lists

Abstract
--------

mt_lists are a form of doubly-linked lists that support thread-safe standard
list operations such as insert / append / delete / pop, as well as a safe
iterator that supports deletion and concurrent use.

Principles
----------

The lists are designed to minimize contention in environments where elements
may be concurrently manipulated at different locations. The principle is to
act on the links between the elements instead of the elements themselves. This
is achieved by temporarily "cutting" these links, which effectively consists in
replacing the ends of the links with special pointers serving as a lock, called
MT_LIST_BUSY. An element is considered locked when both its next and prev
pointers are equal to this MT_LIST_BUSY pointer. A link is locked when both of
its ends are equal to this MT_LIST_BUSY pointer, i.e. the next pointer of the
element at the source of the link and the prev pointer of the element the link
points to. It's worth noting that a locked link by definition no longer exists
since neither end knows where it was pointing to, unless a backup of it was
made prior to locking it.

The next and prev pointers are replaced by the list manipulation functions
using atomic exchange. This means that the caller knows if the element it tries
to replace was already locked or if it owns it. In order to replace a link,
both ends of the link must be owned by the thread willing to replace it.
Similarly when adding or removing an element, both ends of the elements must be
owned by the thread trying to manipulate the element.

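The ownership test described above can be sketched with C11 atomics. This is
only an illustration: "struct node", NODE_BUSY, try_own_next() and
release_next() are hypothetical names, and the sentinel value is an assumption,
not the real MT_LIST_BUSY definition.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a list element; not the real struct mt_list. */
struct node {
    _Atomic(struct node *) next;
    _Atomic(struct node *) prev;
};

/* Sentinel playing the role of MT_LIST_BUSY (assumed value). */
#define NODE_BUSY ((struct node *)1)

/* Swap the sentinel into <n>'s next pointer and return the previous value.
 * The caller owns the pointer only if the returned value is not NODE_BUSY;
 * otherwise another thread already holds it.
 */
static struct node *try_own_next(struct node *n)
{
    return atomic_exchange(&n->next, NODE_BUSY);
}

/* Release a pointer owned via try_own_next() by restoring its backup. */
static void release_next(struct node *n, struct node *saved)
{
    assert(saved != NODE_BUSY);
    atomic_store(&n->next, saved);
}
```

The value returned by the exchange doubles as the backup mentioned earlier:
without it, a locked link could not be restored.
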
Appending or inserting elements comes in two flavors: the standard one which
considers that the element is already owned by the thread and ignores its
contents; this is the most common usage for a link that was just allocated or
extracted from a list. The second flavor doesn't trust the thread's ownership
of the element and tries to own it prior to adding the element; this may be
used when this element is a shared one that needs to be placed into a list.

Removing an element always consists in owning the two links surrounding it,
hence owning the 4 pointers.

Scanning the list consists in locking the element to (re)start from, locking
the link used to jump to the next element, then locking that element and
unlocking the previous one. All types of concurrency issues are supported
there, including elements disappearing while trying to lock them. It is
perfectly possible to have multiple threads scan the same list at the same
time, and it's usually efficient. However, if those threads face a single
contention point (e.g. pause on a locked element), they may then restart
working from the same point all at the same time and compete for the same links
and elements for each step, which will become less efficient. However, it does
work fine.

There's currently no support for shared locking (e.g. rwlocks); elements and
links are always exclusively locked. Since locks are attempted in a sequence,
this creates a nested lock pattern which could theoretically cause deadlocks
if adjacent elements were locked in parallel. This situation is handled using
a rollback mechanism: if any thread fails to lock any element or pointer, it
detects the conflict with another thread and entirely rolls back its operations
in order to let the other thread complete. This rollback is what aims at
guaranteeing forward progress. There is, however, a non-null risk that both
threads spend their time rolling back and trying again. This is covered using
exponential back-off that may grow to large enough values to let a thread lock
all the pointers it needs to complete an operation. Other mechanisms could be
implemented in the future such as rotating priorities or random lock numbers
to let both threads know which one must roll back and which one may continue.

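The rollback and back-off idea can be sketched on two plain lock words. This
is a simplified illustration, not the mt_list code: the two atomic_flag locks
stand in for two pointers that must both be owned, and the doubling policy,
the 1024 cap and all function names are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Busy-wait for <loops> iterations; volatile keeps the loop in place.
 * A real implementation would rather use a CPU relax hint.
 */
static void relax(unsigned int loops)
{
    for (volatile unsigned int i = 0; i < loops; i++)
        ;
}

/* Next delay: start at 1, then double, capped at <max>. */
static unsigned int next_backoff(unsigned int cur, unsigned int max)
{
    unsigned int next = cur ? cur * 2 : 1;

    assert(max > 0);
    return next > max ? max : next;
}

/* Try to own both locks. If the second one is busy, release the first one
 * (rollback) so that the competing thread can complete its operation.
 */
static bool try_lock_pair(atomic_flag *a, atomic_flag *b)
{
    if (atomic_flag_test_and_set(a))
        return false;
    if (atomic_flag_test_and_set(b)) {
        atomic_flag_clear(a);
        return false;
    }
    return true;
}

/* Retry until both locks are owned, waiting exponentially longer after each
 * failure. Returns the number of failed attempts.
 */
static unsigned int lock_pair_with_backoff(atomic_flag *a, atomic_flag *b)
{
    unsigned int delay = 0, failures = 0;

    while (!try_lock_pair(a, b)) {
        failures++;
        delay = next_backoff(delay, 1024);
        relax(delay);
    }
    return failures;
}
```
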
|
Due to certain operations applying to the type of an element (iterator, element
|
|
retrieval), some parts do require macros. In order to avoid keeping too
|
|
confusing an API, all operations are made accessible via macros. However, in
|
|
order to ease maintenance and improve error reporting when facing unexpected
|
|
arguments, all the code parts that were compatible have been implemented as
|
|
inlinable functions instead. And in order to help with performance profiling,
|
|
it is possible to prevent the compiler from inlining all the functions that
|
|
may loop. As a rule of thumb, operations which only exist as macros do modify
|
|
one or more of their arguments.
|
|
|
|
All exposed functions are called "mt_list_something()", all exposed macros are
|
|
called "MT_LIST_SOMETHING()", possibly mapping 1-to-1 to the equivalent
|
|
function, and the list element type is called "mt_list".
|
|
|
|
|
|
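To illustrate why element retrieval needs a macro, here is a minimal sketch of
a container lookup. LNODE_ELEM(), "struct lnode" and "struct job" are
hypothetical names used only for this example; the point is that the type and
member name are passed as tokens, which a function cannot receive.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified list node standing in for "struct mt_list" in this sketch. */
struct lnode {
    struct lnode *next, *prev;
};

/* Hypothetical macro: converts a pointer to an embedded list node into a
 * pointer to the structure of type <type> holding it as <member>, and
 * preserves NULL. Note that <ptr> is evaluated twice.
 */
#define LNODE_ELEM(ptr, type, member) \
    ((ptr) ? (type *)((char *)(ptr) - offsetof(type, member)) : (type *)NULL)

struct job {
    int id;
    struct lnode list;
};

/* Typed accessor built on top of the macro. */
static struct job *job_of(struct lnode *n)
{
    return LNODE_ELEM(n, struct job, list);
}
```
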
Operations
----------

mt_list_append(el1, el2)
    Adds el2 before el1, which means that if el1 is the list's head, el2 will
    effectively be appended to the end of the list.

  before:
                                                                   +---+
                                                                   |el2|
                                                                   +---+
                                                                     V
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>|el1|<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+     +---+
    #=>|el1|<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<===>|el2|<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+     +---+  #
    #=====================================================================#


mt_list_try_append(el1, el2)
    Tries to add el2 before el1, which means that if el1 is the list's head,
    el2 will effectively be appended to the end of the list. el2 will only be
    added if it's deleted (loops over itself). The operation will return zero if
    this is not the case (el2 is not empty anymore) or non-zero on success.

  before:
                                                                #=========#
                                                                #  +---+  #
                                                                #=>|el2|<=#
                                                                   +---+
                                                                     V
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>|el1|<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+     +---+
    #=>|el1|<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<===>|el2|<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+     +---+  #
    #=====================================================================#


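The difference between the two append flavors can be shown on a
single-threaded model of the links. This sketch deliberately omits the atomic
exchanges and locking performed by the real functions; "struct lnode" and the
model_*() names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Single-threaded model of the list links (no locking, no atomics). */
struct lnode {
    struct lnode *next, *prev;
};

/* A detached ("deleted") element loops over itself. */
static void lnode_init(struct lnode *n)
{
    n->next = n->prev = n;
}

static int lnode_is_detached(const struct lnode *n)
{
    return n->next == n && n->prev == n;
}

/* Model of append: link <el2> just before <el1>, ignoring el2's contents. */
static void model_append(struct lnode *el1, struct lnode *el2)
{
    el2->prev = el1->prev;
    el2->next = el1;
    el1->prev->next = el2;
    el1->prev = el2;
}

/* Model of try_append: only link <el2> if it is currently detached.
 * Returns non-zero on success, zero otherwise.
 */
static int model_try_append(struct lnode *el1, struct lnode *el2)
{
    if (!lnode_is_detached(el2))
        return 0;
    model_append(el1, el2);
    return 1;
}
```

With el1 being the list's head, a second model_try_append() of the same
element returns zero because the element no longer loops over itself.
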
mt_list_insert(el1, el2)
    Adds el2 after el1, which means that if el1 is the list's head, el2 will
    effectively be inserted at the beginning of the list.

  before:
                 +---+
                 |el2|
                 +---+
                   V
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>|el1|<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+     +---+
    #=>|el1|<===>|el2|<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+     +---+  #
    #=====================================================================#


mt_list_try_insert(el1, el2)
    Tries to add el2 after el1, which means that if el1 is the list's head,
    el2 will effectively be inserted at the beginning of the list. el2 will only
    be added if it's deleted (loops over itself). The operation will return zero
    if this is not the case (el2 is not empty anymore) or non-zero on success.

  before:
              #=========#
              #  +---+  #
              #=>|el2|<=#
                 +---+
                   V
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>|el1|<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+     +---+
    #=>|el1|<===>|el2|<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+     +---+  #
    #=====================================================================#


mt_list_delete(el1)
    Removes el1 from the list, and marks it as deleted, wherever it is. If
    the element was already not part of a list anymore, 0 is returned,
    otherwise non-zero is returned if the operation could be performed.

  before:
       +---+     +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |<===>|el1|<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+     +---+  #
    #=====================================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

       +---+
    #=>|el1|<=#
    #  +---+  #
    #=========#


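The return value convention of the delete operation can be modeled without
any threading. The sketch below is a single-threaded illustration with
hypothetical names; the real function additionally has to own the two
surrounding links before touching anything.

```c
#include <assert.h>
#include <stddef.h>

/* Single-threaded model of the list links (no locking, no atomics). */
struct lnode {
    struct lnode *next, *prev;
};

static void lnode_init(struct lnode *n)
{
    n->next = n->prev = n;
}

/* Helper used to build a list: link <el> just before <head>. */
static void model_append(struct lnode *head, struct lnode *el)
{
    el->prev = head->prev;
    el->next = head;
    head->prev->next = el;
    head->prev = el;
}

/* Model of delete: returns 0 if <el> was already detached, otherwise
 * reconnects its neighbors, marks it as deleted and returns 1.
 */
static int model_delete(struct lnode *el)
{
    if (el->next == el)
        return 0;
    el->prev->next = el->next;
    el->next->prev = el->prev;
    lnode_init(el);
    return 1;
}
```
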
mt_list_behead(l)
    Detaches a list of elements from its head with the aim of reusing them to
    do anything else. The head will be turned to an empty list, and the list
    will be partially looped: the first element's prev will point to the last
    one, and the last element's next will be NULL. The pointer to the first
    element is returned, or NULL if the list was empty. This is essentially
    used when recycling lists of unused elements, or to grab a lot of elements
    at once for local processing. It is safe to be run concurrently with the
    insert/append operations performed at the list's head, but not against
    modifications performed at any other place, such as a delete operation.

  before:
       +---+     +---+     +---+     +---+     +---+     +---+     +---+
    #=>| L |<===>| A |<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+     +---+  #
    #=====================================================================#

  after:
       +---+         +---+     +---+     +---+     +---+     +---+     +---+
    #=>| L |<=#   ,--| A |<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<-.
    #  +---+  #   |  +---+     +---+     +---+     +---+     +---+     +---+  |
    #=========#   `-----------------------------------------------------------'


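The shape of the detached chain (first element's prev pointing to the last
one, last element's next being NULL) is easier to see in a small model. This
is a single-threaded sketch with hypothetical names, not the real
implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Single-threaded model of the list links (no locking, no atomics). */
struct lnode {
    struct lnode *next, *prev;
};

static void lnode_init(struct lnode *n)
{
    n->next = n->prev = n;
}

/* Helper used to build a list: link <el> just before <head>. */
static void model_append(struct lnode *head, struct lnode *el)
{
    el->prev = head->prev;
    el->next = head;
    head->prev->next = el;
    head->prev = el;
}

/* Model of behead: empties <head> and returns the first element of the
 * detached chain, or NULL if the list was empty.
 */
static struct lnode *model_behead(struct lnode *head)
{
    struct lnode *first = head->next, *last = head->prev;

    if (first == head)
        return NULL;
    lnode_init(head);
    first->prev = last;   /* partial loop: first knows where the end is */
    last->next = NULL;    /* forward walks stop on NULL */
    return first;
}

/* Walk a detached chain forward and count its elements. */
static size_t chain_length(const struct lnode *first)
{
    size_t n = 0;

    for (; first; first = first->next)
        n++;
    return n;
}
```
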
mt_list_pop(l)
    Removes the list's first element, returns it deleted. If the list was empty,
    NULL is returned. When combined with mt_list_append() this can be used to
    implement MPMC queues for example. A macro MT_LIST_POP() is provided for a
    more convenient use; instead of returning the list element, it will return
    the structure holding the element, taking care of preserving the NULL.

  before:
       +---+     +---+     +---+     +---+     +---+     +---+     +---+
    #=>| L |<===>| A |<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+     +---+  #
    #=====================================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| L |<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

       +---+
    #=>| A |<=#
    #  +---+  #
    #=========#


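The queue usage mentioned above (append at the tail, pop from the head) can
be sketched as follows. This is a single-threaded model with hypothetical
names; job_pop() only mimics what a container-returning pop could look like
and is not the real MT_LIST_POP() definition.

```c
#include <assert.h>
#include <stddef.h>

/* Single-threaded model of the list links (no locking, no atomics). */
struct lnode {
    struct lnode *next, *prev;
};

struct job {
    int id;
    struct lnode list;
};

static void lnode_init(struct lnode *n)
{
    n->next = n->prev = n;
}

/* Producer side: link <el> just before <head>, i.e. at the tail. */
static void model_append(struct lnode *head, struct lnode *el)
{
    el->prev = head->prev;
    el->next = head;
    head->prev->next = el;
    head->prev = el;
}

/* Consumer side: detach and return the first element, or NULL if empty. */
static struct lnode *model_pop(struct lnode *head)
{
    struct lnode *first = head->next;

    if (first == head)
        return NULL;
    head->next = first->next;
    first->next->prev = head;
    lnode_init(first);    /* returned in the deleted state */
    return first;
}

/* Returns the job holding the popped element, preserving NULL. */
static struct job *job_pop(struct lnode *head)
{
    struct lnode *n = model_pop(head);

    return n ? (struct job *)((char *)n - offsetof(struct job, list)) : NULL;
}
```
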
_mt_list_lock_next(elt)
    Locks the link that starts at the next pointer of the designated element.
    The link is replaced by two locked pointers, and a pointer to the next
    element is returned. The link must then be unlocked using
    _mt_list_unlock_next() passing it this pointer, or mt_list_unlock_link().
    This function is not intended to be used by applications, and makes certain
    assumptions about the state of the list pertaining to its use in iterators.

  before:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>|elt|<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>|elt|x   x| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  Return
  value:  &B


_mt_list_unlock_next(elt, back)
    Unlocks the link that starts at the next pointer of the designated element
    and is supposed to end at <back>. This function is not intended to be used
    by applications, and makes certain assumptions about the state of the list
    pertaining to its use in iterators.

  before:       back
                  \
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>|elt|x   x| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>|elt|<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#


_mt_list_lock_prev(elt)
    Locks the link that starts at the prev pointer of the designated element.
    The link is replaced by two locked pointers, and a pointer to the prev
    element is returned. The link must then be unlocked using
    _mt_list_unlock_prev() passing it this pointer, or mt_list_unlock_link().
    This function is not intended to be used by applications, and makes certain
    assumptions about the state of the list pertaining to its use in iterators.

  before:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |<===>|elt|<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |x   x|elt|<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  Return
  value:  &A


_mt_list_unlock_prev(elt, back)
    Unlocks the link that starts at the prev pointer of the designated element
    and is supposed to end at <back>. This function is not intended to be used
    by applications, and makes certain assumptions about the state of the list
    pertaining to its use in iterators.

  before:   back
           /
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |x   x|elt|<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |<===>|elt|<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#


mt_list_lock_next(elt)
    Cuts the list after the specified element. The link is replaced by two
    locked pointers, and is returned as a list element. The list must then
    be unlocked using mt_list_unlock_link() or mt_list_unlock_full() applied
    to the returned list element.

  before:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>|elt|<===>| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>|elt|x   x| B |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  Return elt   B
  value:   <===>


mt_list_lock_prev(elt)
    Cuts the list before the specified element. The link is replaced by two
    locked pointers, and is returned as a list element. The list must then
    be unlocked using mt_list_unlock_link() or mt_list_unlock_full() applied
    to the returned list element.

  before:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |<===>|elt|<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |x   x|elt|<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  Return   A   elt
  value:   <===>


mt_list_lock_elem(elt)
    Locks the element only. Both of its pointers are replaced by two locked
    pointers, and the previous ones are returned as a list element. It's not
    possible to remove such an element from a list since neighbors are not
    locked. The sole purpose of this operation is to prevent another thread
    from visiting this element during an operation. The element must then be
    unlocked using mt_list_unlock_elem() applied to the returned element.

  before:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |<===>|elt|<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |=>  x|elt|x  <=| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  Return   A   C
  value:   <===>


mt_list_unlock_elem(elt, ends)
    Unlocks the element only by restoring its backed up contents from <ends>,
    as returned by a previous call to mt_list_lock_elem(elt). The ends of the
    links are not affected, only the element is touched. This is intended to
    terminate a critical section started by a call to mt_list_lock_elem(). It
    may also be used on a fully locked element processed by mt_list_lock_full()
    in which case it will leave the list still locked.

  before:
             A   C
      ends:  <===>

       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |=>  x|elt|x  <=| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |<===>|elt|<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  before:
             A   C
      ends:  <===>

       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |x   x|elt|x   x| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |x  <=|elt|=>  x| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#


mt_list_unlock_self(elt)
    Unlocks the element only by resetting it (i.e. making it loop over itself).
    This is useful in the locked variant of iterators when the element is to be
    removed from the list and first needs to be unlocked because it's shared
    with other operations (such as a concurrent attempt to delete it from a
    list), or simply in case it is to be recycled in a usable state. The ends
    of the links are not affected, only the element is touched. This is
    normally only used from within locked iterators, which perform a full lock
    (both links are locked).

  before:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |x   x|elt|x   x| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+        +---+     +---+     +---+     +---+     +---+
    #=>|elt|<=#  #=>| A |x   x| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+  #  #  +---+     +---+     +---+     +---+     +---+  #
    #=========#  #=================================================#


mt_list_lock_full(elt)
    Locks both the element and its surrounding links. The extremities of the
    previous links are returned as a single list element (which corresponds to
    the element's contents before locking). The list must then be unlocked
    using mt_list_unlock_full() to reconnect the element to the list and unlock
    both, or mt_list_unlock_link() to effectively remove the element.

  before:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |<===>|elt|<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |x   x|elt|x   x| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#

  Return     A             C
  value:     <=============>


mt_list_unlock_link(ends)
    Connects two ends in a list together, effectively unlocking the list if it
    was locked. It takes a list head which contains a pointer to the prev and
    next elements to connect together. It normally is a copy of a previous link
    returned by functions such as mt_list_lock_next(), mt_list_lock_prev(), or
    mt_list_lock_full(). If applied after mt_list_lock_full(), it will result
    in the list being reconnected without the element, which remains locked,
    effectively deleting it. Note that this is not meant to be used from within
    iterators, as the iterator will automatically and safely reconnect ends
    after each iteration.

  before:
             A   C
      Ends:  <===>

       +---+     +---+     +---+     +---+     +---+
    #=>| A |x   x| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+  #
    #=================================================#

  after:
       +---+     +---+     +---+     +---+     +---+
    #=>| A |<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+  #
    #=================================================#


mt_list_unlock_full(elt, ends)
    Connects the specified element to the elements pointed to by the specified
    <ends>, which is a backup copy of the previous list member of the element
    prior to locking it using mt_list_lock_full() or mt_list_lock_elem(). This
    is normally used to unlock an element and a list, but may also be used to
    manually insert an element into an opened list (which should still be
    locked). The element's list member is technically assigned a copy of <ends>
    and both sides point to the element. This must not be used inside an
    iterator as it would also unlock the list itself and make the loop visit
    nodes in an unknown state.

  before:
             +---+
      elt:  x|elt|x
             +---+
             A             C
      ends:  <=============>

       +---+               +---+     +---+     +---+     +---+
    #=>| A |x             x| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+               +---+     +---+     +---+     +---+  #
    #===========================================================#

  after:
       +---+     +---+     +---+     +---+     +---+     +---+
    #=>| A |<===>|elt|<===>| C |<===>| D |<===>| E |<===>| F |<=#
    #  +---+     +---+     +---+     +---+     +---+     +---+  #
    #===========================================================#


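The way the lock and unlock operations combine can be followed on a
single-threaded model. The sketch below is an illustration only: LNODE_BUSY
and the model_*() functions are hypothetical, and the real functions use
atomic exchanges, rollback and back-off to obtain the pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Single-threaded model of the list links (no atomics, no retries). */
struct lnode {
    struct lnode *next, *prev;
};

/* Sentinel playing the role of MT_LIST_BUSY (assumed value). */
#define LNODE_BUSY ((struct lnode *)1)

static void lnode_init(struct lnode *n)
{
    n->next = n->prev = n;
}

/* Helper used to build a list: link <el> just before <head>. */
static void model_append(struct lnode *head, struct lnode *el)
{
    el->prev = head->prev;
    el->next = head;
    head->prev->next = el;
    head->prev = el;
}

/* Model of a full lock: marks the element and the facing ends of both
 * neighbors busy, and returns the backup of the element's pointers.
 */
static struct lnode model_lock_full(struct lnode *el)
{
    struct lnode ends = { .next = el->next, .prev = el->prev };

    ends.prev->next = LNODE_BUSY;
    ends.next->prev = LNODE_BUSY;
    el->next = el->prev = LNODE_BUSY;
    return ends;
}

/* Model of unlock_link: reconnects both ends, leaving the element out. */
static void model_unlock_link(struct lnode ends)
{
    ends.prev->next = ends.next;
    ends.next->prev = ends.prev;
}

/* Model of unlock_full: the element receives a copy of <ends> and both
 * sides point to it again.
 */
static void model_unlock_full(struct lnode *el, struct lnode ends)
{
    *el = ends;
    ends.prev->next = el;
    ends.next->prev = el;
}

/* Model of unlock_self: resets a removed element so it can be reused. */
static void model_unlock_self(struct lnode *el)
{
    lnode_init(el);
}
```

A full lock followed by model_unlock_full() restores the original list, while
a full lock followed by model_unlock_link() and model_unlock_self() removes
the element and leaves it in a reusable state.
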
MT_LIST_FOR_EACH_ENTRY_LOCKED(item, list_head, member, back)
    Iterates <item> through a list of items of type "typeof(*item)" which are
    linked via a "struct mt_list" member named <member>. A pointer to the head
    of the list is passed in <list_head>. <back> is a temporary struct mt_list,
    used internally. It contains a copy of the contents of the current item's
    list member before locking it. This macro is implemented using two nested
    loops, each defined as a separate macro for easier inspection. The inner
    loop will run for each element in the list, and the outer loop will run
    only once to do some cleanup and unlocking when the end of the list is
    reached or the user breaks from the inner loop. It is safe to break from
    this macro as the cleanup will be performed anyway, but it is strictly
    forbidden to branch (goto or return) from the loop because skipping the
    cleanup will lead to undefined behavior. During the scan of the list, the
    item has both of its links locked, so concurrent operations on the list are
    safe. However the thread holding the list locked must be careful not to
    perform other locking operations. In order to remove the current element,
    setting <item> to NULL is sufficient to make the inner loop not try to
    re-attach it. It is recommended to reinitialize it though if it is expected
    to be reused, so as not to leave its pointers locked. The same applies if
    other threads are trying to concurrently operate on the element.

    From within the loop, the list looks like this:

      MT_LIST_FOR_EACH_ENTRY_LOCKED(item, lh, list, back) {
          //                    A             C
          //          back:     <=============>
          //                     item->list
          //    +---+     +---+     +-V-+     +---+     +---+     +---+
          // #=>|lh |<===>| A |x   x|   |x   x| C |<===>| D |<===>| E |<=#
          // #  +---+     +---+     +---+     +---+     +---+     +---+  #
          // #===========================================================#
      }

    This means that only the current item as well as its two neighbors are
    locked. It is thus possible to act on any other part of the list in
    parallel (other threads might have begun slightly earlier). However if
    a thread is too slow to proceed, other threads may quickly reach its
    position, and all of them will then wait on the same element, slowing
    down the progress.


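The reason why "break" is safe while "return" or "goto" is not can be shown
with a toy iterator built the same way, from two nested loops. This is a
minimal sketch: FOR_EACH_INDEX_WITH_CLEANUP() is a hypothetical macro that
iterates over integers, and its layout is only assumed to resemble the real
one in that the outer loop runs once and performs the cleanup.

```c
#include <assert.h>

/* Hypothetical iterator: the outer loop runs exactly once and its iteration
 * expression performs the cleanup (here, incrementing <cleanups>); the inner
 * loop visits indexes 0 to <n> - 1.
 */
#define FOR_EACH_INDEX_WITH_CLEANUP(i, n, cleanups)         \
    for (int _once = 1; _once; _once = 0, (cleanups)++)     \
        for ((i) = 0; (i) < (n); (i)++)

/* Scans indexes below <n> and stops at <stop_at>. Returns the index where
 * the scan stopped, or -1. <cleanups> counts how many times cleanup ran.
 */
static int scan_until(int stop_at, int n, int *cleanups)
{
    int i, stopped = -1;

    FOR_EACH_INDEX_WITH_CLEANUP(i, n, *cleanups) {
        if (i == stop_at) {
            stopped = i;
            break;   /* only leaves the inner loop: cleanup still runs */
        }
    }
    /* a "return" placed inside the loop body would have skipped cleanup */
    return stopped;
}
```
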
MT_LIST_FOR_EACH_ENTRY_UNLOCKED(item, list_head, member, back)
    Iterates <item> through a list of items of type "typeof(*item)" which are
    linked via a "struct mt_list" member named <member>. A pointer to the head
    of the list is passed in <list_head>. <back> is a temporary struct mt_list,
    used internally. It contains a copy of the contents of the current item's
    list member before resetting it. This macro is implemented using two nested
    loops, each defined as a separate macro for easier inspection. The inner
    loop will run for each element in the list, and the outer loop will run
    only once to do some cleanup and unlocking when the end of the list is
    reached or the user breaks from the inner loop. It is safe to break from
    this macro as the cleanup will be performed anyway, but it is strictly
    forbidden to branch (goto or return) from the loop because skipping the
    cleanup will lead to undefined behavior. During the scan of the list, the
    item has both of its neighbors locked, with both of its ends pointing to
    itself. Thus, concurrent walks on the list are safe, but not direct
    accesses to the element. In order to remove the current element, setting
    <item> to NULL is sufficient to make the inner loop not try to re-attach
    it. There is no need to reinitialize it since it is already done. If the
    element is left, it will be re-attached to the list. This version is meant
    as a more user-friendly method to walk over a list in which it is known by
    design that elements are not directly accessed (e.g. a pure MPMC queue).
    The typical pattern which corresponds to this case is when the first
    operation in the iterator's body is a call to unlock the iterator, which is
    then no longer needed (though harmless).

    From within the loop, the list looks like this:

      MT_LIST_FOR_EACH_ENTRY_UNLOCKED(item, lh, list, back) {
          //                      back:     A   C
          // item->list                     <===>
          //   +-V-+         +---+     +---+     +---+     +---+     +---+
          // #>|   |<#    #=>|lh |<===>| A |x   x| C |<===>| D |<===>| E |<=#
          // # +---+ #    #  +---+     +---+     +---+     +---+     +---+  #
          // #=======#    #=================================================#
      }

    This means that only the current item's neighbors are locked. It is thus
    possible to act on any other part of the list in parallel (other threads
    might have begun slightly earlier) but not on the element. However if a
    thread is too slow to proceed, other threads may quickly reach its
    position, and all of them will then wait on the same element, slowing down
    the progress.


Examples
--------

The example below collects up to 50 jobs from a shared list that are compatible
with the current thread, and moves them to a local list for later processing.
The same pointers are used for both lists and placed in an anonymous union.

   struct job {
      union {
         struct list list;
         struct mt_list mt_list;
      };
      unsigned long thread_mask; /* 1 bit per eligible thread */
      /* struct-specific stuff below */
      ...
   };

   extern struct mt_list global_job_queue;
   extern struct list local_job_queue;

   struct mt_list back;
   struct job *item;
   int budget = 50;

   /* collect up to 50 shared items */
   MT_LIST_FOR_EACH_ENTRY_LOCKED(item, &global_job_queue, mt_list, back) {
       if (!(item->thread_mask & current_thread_bit))
           continue;  /* job not eligible for this thread */
       /* linking into the local list overwrites the locked pointers */
       LIST_APPEND(&local_job_queue, &item->list);
       item = NULL;   /* tell the iterator not to re-attach it */
       if (!--budget)
           break;
   }

   /* process extracted items */
   list_for_each_entry(item, &local_job_queue, list) {
      ...
   }