2020-08-25 02:23:08 +00:00
static inline uintptr_t __get_tp()
2012-07-11 08:22:13 +00:00
{
2020-08-25 02:23:08 +00:00
	register uintptr_t tp __asm__("$3");
fix excessively slow TLS performance on some mips models

commit 6d99ad91e869aab35a4d76d34c3c9eaf29482bad introduced this
regression as part of a larger change, based on an incorrect
assumption that rdhwr being part of the mips r2 ISA level meant that
the TLS register, known in the mips documentation as UserLocal, was
unconditionally present on chips providing this ISA level and would
not need trap-and-emulate. this turns out to be false.

based on research by Stanislav Kljuhhin and Abilio Marques, who
reported the problem as a performance regression on certain routers
using OpenWRT vs older uclibc-based versions, it turns out the mips
manuals document the UserLocal register as a feature that might or
might not be implemented or enabled, reflected by a cpu capability bit
in the CONFIG3 register, and that Linux checks for this and has to
explicitly enable it on models that have it.

thus, it's indeed possible that r2+ chips can lack the feature,
bringing us back to the situation where Linux only has a fast
trap-and-emulate path for the case where the destination register is
$3. so, always read the thread pointer through $3. this may incur a
gratuitous move to the desired final register on chips where it's not
needed, but it really doesn't matter.
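
As an illustration of why both branches of `__get_tp` end up targeting `$3`, the hand-encoded `.word 0x7c03e83b` can be split into its instruction fields. The sketch below assumes the standard MIPS32 SPECIAL3 field layout for `rdhwr` (opcode 0x1f, function 0x3b); the struct and function names are hypothetical and are not part of musl.

```c
#include <stdint.h>

/* Hypothetical helper type: the fields of a 32-bit MIPS instruction word. */
struct mips_fields {
	unsigned opcode; /* bits 31..26 */
	unsigned rs;     /* bits 25..21 */
	unsigned rt;     /* bits 20..16, destination GPR for rdhwr */
	unsigned rd;     /* bits 15..11, hardware register number for rdhwr */
	unsigned sa;     /* bits 10..6 */
	unsigned funct;  /* bits 5..0 */
};

/* Split an instruction word into its fields (illustration only). */
static struct mips_fields decode_fields(uint32_t w)
{
	struct mips_fields f;
	f.opcode = (w >> 26) & 0x3f;
	f.rs     = (w >> 21) & 0x1f;
	f.rt     = (w >> 16) & 0x1f;
	f.rd     = (w >> 11) & 0x1f;
	f.sa     = (w >> 6)  & 0x1f;
	f.funct  = w & 0x3f;
	return f;
}

/* Nonzero if w encodes "rdhwr $3, $29", assuming opcode 0x1f and
 * function 0x3b identify rdhwr. */
static int is_rdhwr_3_from_29(uint32_t w)
{
	struct mips_fields f = decode_fields(w);
	return f.opcode == 0x1f && f.rs == 0 && f.rt == 3
	    && f.rd == 29 && f.sa == 0 && f.funct == 0x3b;
}
```

Under that assumption, `is_rdhwr_3_from_29(0x7c03e83b)` is nonzero: the destination field is 3 and the hardware register field is 29, so the pre-r2 `.word` path is the same `rdhwr` with `$3` fixed as the destination.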
2021-08-12 22:07:44 +00:00
#if __mips_isa_rev < 2
2018-10-16 18:08:01 +00:00
	__asm__ (".word 0x7c03e83b" : "=r" (tp) );
2016-04-03 10:42:37 +00:00
#else
2018-10-16 18:08:01 +00:00
	__asm__ ("rdhwr %0, $29" : "=r" (tp) );
2012-09-07 16:18:14 +00:00
#endif
2020-08-25 02:23:08 +00:00
	return tp;
2012-07-11 08:22:13 +00:00
}

2012-10-15 22:51:53 +00:00
#define TLS_ABOVE_TP
2018-06-01 23:52:01 +00:00
#define GAP_ABOVE_TP 0
2012-10-15 22:51:53 +00:00
2020-08-25 02:04:52 +00:00
#define TP_OFFSET 0x7000
fix local-dynamic model TLS on mips and powerpc

the TLS ABI spec for mips, powerpc, and some other (presently
unsupported) RISC archs has the return value of __tls_get_addr offset
by +0x8000 and the result of DTPOFF relocations offset by -0x8000. I
had previously assumed this part of the ABI was actually just an
implementation detail, since the adjustments cancel out. however, when
the local dynamic model is used for accessing TLS that's known to be
in the same DSO, either of the following may happen:

1. the -0x8000 offset may already be applied to the argument structure
   passed to __tls_get_addr at ld time, without any opportunity for
   runtime relocations.

2. __tls_get_addr may be used with a zero offset argument to obtain a
   base address for the module's TLS, to which the caller then applies
   immediate offsets for individual objects accessed using the local
   dynamic model. since the immediate offsets have the -0x8000 adjustment
   applied to them, the base address they use needs to include the
   +0x8000 offset.

it would be possible, but more complex, to store the pointers in the
dtv[] array with the +0x8000 offset pre-applied, to avoid the runtime
cost of adding 0x8000 on each call to __tls_get_addr. this change
could be made later if measurements show that it would help.
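
The offset arithmetic described above can be checked with a small model. This is a sketch only, not musl's `__tls_get_addr`: the names `tls_index_sketch`, `tls_get_addr_sketch`, and `dtpoff_sketch` are hypothetical, the `dtv` array is a plain array of block base addresses, and all values are treated as unsigned integers so that the -0x8000 adjustment wraps and unwraps cleanly.

```c
#include <stdint.h>

#define DTP_OFFSET 0x8000

/* Hypothetical stand-in for the argument structure passed to
 * __tls_get_addr: a module index and an offset within that module. */
struct tls_index_sketch {
	uintptr_t module;
	uintptr_t offset;
};

/* Model of the ABI rule: the returned address is the module's TLS
 * block base plus the offset argument, plus 0x8000. */
static uintptr_t tls_get_addr_sketch(const uintptr_t *dtv,
                                     const struct tls_index_sketch *ti)
{
	return dtv[ti->module] + ti->offset + DTP_OFFSET;
}

/* Model of a DTPOFF relocation result: the symbol's offset within
 * its module's TLS block, minus 0x8000 (unsigned wraparound). */
static uintptr_t dtpoff_sketch(uintptr_t sym_off)
{
	return sym_off - DTP_OFFSET;
}

/* General case: the pre-biased offset is passed as the argument,
 * and the two adjustments cancel. */
static uintptr_t addr_via_argument(const uintptr_t *dtv,
                                   uintptr_t module, uintptr_t sym_off)
{
	struct tls_index_sketch ti = { module, dtpoff_sketch(sym_off) };
	return tls_get_addr_sketch(dtv, &ti);
}

/* Case 2 above: a zero offset argument yields a base that already
 * includes +0x8000; the caller then adds the -0x8000 biased offset. */
static uintptr_t addr_via_ld_base(const uintptr_t *dtv,
                                  uintptr_t module, uintptr_t sym_off)
{
	struct tls_index_sketch ti = { module, 0 };
	uintptr_t base = tls_get_addr_sketch(dtv, &ti);
	return base + dtpoff_sketch(sym_off);
}
```

With a block base of 0x10000 and a symbol at offset 0x24, both paths in this model arrive at 0x10024. If the model's `tls_get_addr_sketch` omitted the `+ DTP_OFFSET` term, both would come out 0x8000 too low, since the offsets already carry the -0x8000 adjustment.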
2015-06-25 22:22:00 +00:00
#define DTP_OFFSET 0x8000

2015-11-02 17:39:28 +00:00
#define MC_PC pc