MINOR: ring: it's not x86 but all non-ARMv8.1 which needs the read before OR

Archs relying on CAS benefit from a read prior to FETCH_OR, so it's
not just x86 that benefits from this. Let's change the condition to
exclude only __ARM_FEATURE_ATOMICS, which is the only target that is
faster without the preliminary read.
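
As a rough illustration of why the preliminary read helps (this is a minimal sketch with C11 atomics, not HAProxy's code: `try_lock_tail` and `TAIL_LOCK` are hypothetical names standing in for the ring's locking loop and `RING_TAIL_LOCK`), the pattern is the classic test-and-test-and-set: on CPUs that implement `fetch_or` as an LL/SC or CAS loop, a plain load first avoids a doomed read-modify-write attempt while the lock bit is held, whereas ARMv8.1-a's true atomic OR (LDSET) makes the extra load pure overhead.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* hypothetical top-bit lock flag, mirroring the role of RING_TAIL_LOCK */
#define TAIL_LOCK (1UL << (sizeof(unsigned long) * 8 - 1))

/* Try to grab the lock bit once; returns true if we took it. */
static bool try_lock_tail(_Atomic unsigned long *tail)
{
#if !defined(__ARM_FEATURE_ATOMICS)
	/* preliminary read: bail out cheaply if the bit is already set,
	 * instead of issuing a CAS/LL-SC that is guaranteed to be wasted.
	 */
	if (atomic_load_explicit(tail, memory_order_relaxed) & TAIL_LOCK)
		return false;
#endif
	/* single atomic OR; we own the lock iff the bit was previously clear */
	return !(atomic_fetch_or_explicit(tail, TAIL_LOCK,
	                                  memory_order_acquire) & TAIL_LOCK);
}
```

A caller would typically spin on `try_lock_tail()` with a cpu-relax hint between attempts, which is what the `__ha_cpu_relax_for_read()` calls in the diff below this message do.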
Willy Tarreau 2024-03-17 16:55:09 +01:00
parent e6fc167aec
commit 39df8c903d


@@ -280,10 +280,12 @@ ssize_t ring_write(struct ring *ring, size_t maxlen, const struct ist pfx[], siz
 			goto wait_for_flush;
 		__ha_cpu_relax_for_read();
 
-#if defined(__x86_64__)
-		/* x86 prefers a read first */
-		if ((tail_ofs = HA_ATOMIC_LOAD(tail_ptr)) & RING_TAIL_LOCK)
+#if !defined(__ARM_FEATURE_ATOMICS)
+		/* ARMv8.1-a has a true atomic OR and doesn't need the preliminary read */
+		if ((tail_ofs = HA_ATOMIC_LOAD(tail_ptr)) & RING_TAIL_LOCK) {
+			__ha_cpu_relax_for_read();
 			continue;
+		}
 #endif
 		/* OK the queue is locked, let's attempt to get the tail lock */
 		tail_ofs = HA_ATOMIC_FETCH_OR(tail_ptr, RING_TAIL_LOCK);