MINOR: config: Add the threads support in cpu-map directive

Now, it is possible to bind CPU at the thread level instead of the process level
by defining a thread set in "cpu-map" directives. Thus, its format is now:

  cpu-map [auto:]<process-set>[/<thread-set>] <cpu-set>...

where <process-set> and <thread-set> must follow the format:

  all | odd | even | number[-[number]]
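
As an illustration only, a set in this format could be turned into a bit mask
roughly as follows. This is a simplified sketch, not the parser touched by this
commit: parse_set() is a hypothetical name, bit (n-1) of the mask stands for
number n, and error reporting is reduced to a return code.

```c
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define LONGBITS ((int)(sizeof(long) * CHAR_BIT))

/* Hypothetical helper: parse "all | odd | even | number[-[number]]" into
 * <mask>, where bit (n-1) represents number n.
 * Returns 0 on success, -1 on a malformed or out-of-range set.
 */
static int parse_set(const char *arg, unsigned long *mask)
{
	unsigned long low, high, i;
	char *end;

	if (strcmp(arg, "all") == 0)  { *mask = ~0UL;               return 0; }
	if (strcmp(arg, "odd") == 0)  { *mask = ~0UL / 3UL;         return 0; } /* 1, 3, 5, ... */
	if (strcmp(arg, "even") == 0) { *mask = (~0UL / 3UL) << 1;  return 0; } /* 2, 4, 6, ... */

	if (!isdigit((unsigned char)*arg))
		return -1;

	low = high = strtoul(arg, &end, 10);
	if (*end == '-') {
		end++;
		if (*end == '\0')
			high = LONGBITS;            /* open range: "N-" */
		else if (isdigit((unsigned char)*end))
			high = strtoul(end, &end, 10);
		else
			return -1;
	}
	if (*end != '\0')
		return -1;
	if (low < 1 || low > (unsigned long)LONGBITS || high > (unsigned long)LONGBITS)
		return -1;
	if (high < low) {                           /* accept "4-1" as "1-4" */
		unsigned long swap = low;
		low = high;
		high = swap;
	}

	*mask = 0;
	for (i = low; i <= high; i++)
		*mask |= 1UL << (i - 1);
	return 0;
}
```

With this sketch, "1-4" yields the mask 0xF and "1-" covers every number up to
the machine's word size.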

Combining a process range and a thread range with the "auto:" prefix is not
supported: only one of the two sets may be a range, the other one must be a
fixed number. Both may be ranges when there is no "auto:" prefix.
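
The "auto:" restrictions can be summarized by the sketch below. It mirrors the
two checks as they are written in cfg_parse_global() in this commit, but the
function name auto_binding_is_valid() and the popcount_ul() helper are
hypothetical, and a thread mask of 0 is assumed to mean "no thread set given".

```c
/* Hypothetical helper: number of bits set in <v>. */
static int popcount_ul(unsigned long v)
{
	int count = 0;

	while (v) {
		v &= v - 1;     /* clear the lowest set bit */
		count++;
	}
	return count;
}

/* Returns 1 if an "auto:" cpu-map line with these masks would be accepted,
 * 0 otherwise. <thread> == 0 is assumed to mean that no thread set was given.
 */
static int auto_binding_is_valid(unsigned long proc, unsigned long thread,
                                 unsigned long cpus)
{
	int nbproc   = popcount_ul(proc);
	int nbthread = popcount_ul(thread);
	int nbcpus   = popcount_ul(cpus);

	/* a process range AND a thread range cannot be auto-bound together */
	if (thread && nbproc != 1 && nbthread != 1)
		return 0;

	/* the process set or the thread set must be as large as the CPU set */
	if (nbproc != nbcpus && nbthread != nbcpus)
		return 0;

	return 1;
}
```

For instance, the masks for "auto:1-4 0-3" and "auto:1/1-4 0-3" pass both
checks, while those for "auto:1-4 0" and "auto:all/all 0" are rejected.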

Because it is possible to define one mapping for a process and another for a
thread of this process, threads will be bound to the intersection of their
mapping and the mapping of the process they are attached to. If the
intersection is empty, no specific binding will be set for the threads.
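
The intersection rule can be sketched as follows. The helper name
effective_thread_mask() is hypothetical; the logic follows the thread affinity
loop in main() as modified by this commit, where a process mask of 0 means that
the process has no mapping and a resulting mask of 0 means that the thread is
left unbound.

```c
/* Hypothetical helper: compute the CPU mask a thread is finally bound to.
 * <proc_mask> is the mapping of the thread's process (0 = no mapping),
 * <thread_mask> is the mapping of the thread itself (0 = no mapping).
 * A return value of 0 means "do not set any specific binding".
 */
static unsigned long effective_thread_mask(unsigned long proc_mask,
                                           unsigned long thread_mask)
{
	/* only restrict the thread when the process itself is mapped */
	if (proc_mask)
		thread_mask &= proc_mask;
	return thread_mask;
}
```

With a process mapped to CPUs 0-3 (0x0F) and a thread mapped to CPUs 0-1
(0x03), the thread stays on CPUs 0-1. If the process is mapped to CPUs 2-3
(0x0C) instead, the intersection is empty and the thread gets no binding.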
Christopher Faulet 2017-11-22 16:50:41 +01:00 committed by Willy Tarreau
parent 11da456e77
commit cb6a94510d
4 changed files with 116 additions and 42 deletions


@ -656,44 +656,64 @@ chroot <jail dir>
with superuser privileges. It is important to ensure that <jail_dir> is both
empty and unwritable to anyone.
cpu-map [auto:]<"all"|"odd"|"even"|process_num[-[process_num]]> <cpu-set>...
On Linux 2.6 and above, it is possible to bind a process to a specific CPU
set. This means that the process will never run on other CPUs. The "cpu-map"
directive specifies CPU sets for process sets. The first argument is the
process number to bind. This process must have a number between 1 and 32 or
64, depending on the machine's word size, and any process IDs above nbproc
are ignored. It is possible to specify a range with two such number delimited
by a dash ('-'). It also is possible to specify all processes at once using
cpu-map [auto:]<process-set>[/<thread-set>] <cpu-set>...
On Linux 2.6 and above, it is possible to bind a process or a thread to a
specific CPU set. This means that the process or the thread will never run on
other CPUs. The "cpu-map" directive specifies CPU sets for process or thread
sets. The first argument is a process set, optionally followed by a thread
set. These sets have the format
all | odd | even | number[-[number]]
<number> must be a number between 1 and 32 or 64, depending on the machine's
word size. Any process IDs above nbproc and any thread IDs above nbthread are
ignored. It is possible to specify a range with two such numbers delimited by
a dash ('-'). It also is possible to specify all processes at once using
"all", only odd numbers using "odd" or even numbers using "even", just like
with the "bind-process" directive. The second and forthcoming arguments are
CPU sets. Each CPU set is either a unique number between 0 and 31 or 63 or a
range with two such numbers delimited by a dash ('-'). Multiple CPU numbers
or ranges may be specified, and the processes will be allowed to bind to all
of them. Obviously, multiple "cpu-map" directives may be specified. Each
"cpu-map" directive will replace the previous ones when they overlap.
or ranges may be specified, and the processes or threads will be allowed to
bind to all of them. Obviously, multiple "cpu-map" directives may be
specified. Each "cpu-map" directive will replace the previous ones when they
overlap. A thread will be bound on the intersection of its mapping and the
one of the process on which it is attached. If the intersection is null, no
specific binding will be set for the thread.
Ranges can be partially defined. The higher bound can be omitted. In such
case, it is replaced by the corresponding maximum value, 32 or 64 depending
on the machine's word size.
Examples:
cpu-map 1- 0- # will be replaced by "cpu-map 1-64 0-63"
# or "cpu-map 1-32 0-31" depending on the machine's
# word size.
The prefix "auto:" can be added before the process set to let HAProxy
automatically bind a process to a CPU by incrementing process and CPU
sets. To be valid, both sets must have the same size. No matter the
declaration order of the CPU sets, it will be bound from the lower to the
higher bound.
automatically bind a process or a thread to a CPU by incrementing
process/thread and CPU sets. To be valid, both sets must have the same
size. No matter the declaration order of the CPU sets, it will be bound from
the lowest to the highest bound. Having a process and a thread range with the
"auto:" prefix is not supported. Only one range is supported, the other one
must be a fixed number.
Examples:
cpu-map 1-4 0-3 # bind processes 1 to 4 on the first 4 CPUs
cpu-map 1/all 0-3 # bind all threads of the first process on the
# first 4 CPUs
cpu-map 1- 0- # will be replaced by "cpu-map 1-64 0-63"
# or "cpu-map 1-32 0-31" depending on the machine's
# word size.
# all these lines bind the process 1 to the cpu 0, the process 2 to cpu 1
# and so on.
# and so on.
cpu-map auto:1-4 0-3
cpu-map auto:1-4 0-1 2-3
cpu-map auto:1-4 3 2 1 0
# all these lines bind the thread 1 to the cpu 0, the thread 2 to cpu 1
# and so on.
cpu-map auto:1/1-4 0-3
cpu-map auto:1/1-4 0-1 2-3
cpu-map auto:1/1-4 3 2 1 0
# bind each process to exactly one CPU using all/odd/even keyword
cpu-map auto:all 0-63
cpu-map auto:even 0-31
@ -703,6 +723,12 @@ cpu-map [auto:]<"all"|"odd"|"even"|process_num[-[process_num]]> <cpu-set>...
cpu-map auto:1-4 0 # invalid
cpu-map auto:1 0-3 # invalid
# invalid cpu-map because automatic binding is used with a process range
# and a thread range.
cpu-map auto:all/all 0 # invalid
cpu-map auto:all/1-4 0 # invalid
cpu-map auto:1-4/all 0 # invalid
crt-base <dir>
Assigns a default directory to fetch SSL certificates from when a relative
path is used with "crtfile" directives. Absolute locations specified after


@ -163,8 +163,10 @@ struct global {
} ux;
} unix_bind;
#ifdef USE_CPU_AFFINITY
unsigned long cpu_map[LONGBITS]; /* list of CPU masks for the 32/64 first processes */
__decl_hathreads(unsigned long thread_map[LONGBITS][LONGBITS]); /* list of CPU masks for the 32/64 first threads per process */
struct {
unsigned long proc[LONGBITS]; /* list of CPU masks for the 32/64 first processes */
unsigned long thread[LONGBITS][LONGBITS]; /* list of CPU masks for the 32/64 first threads per process */
} cpu_map;
#endif
struct proxy *stats_fe; /* the frontend holding the stats settings */
struct vars vars; /* list of variables for the process scope. */


@ -617,7 +617,7 @@ int parse_process_number(const char *arg, unsigned long *proc, int *autoinc, cha
unsigned int low, high;
if (!isdigit((int)*arg)) {
memprintf(err, "'%s' is not a valid PROC number.\n", arg);
memprintf(err, "'%s' is not a valid number.\n", arg);
return -1;
}
@ -632,8 +632,8 @@ int parse_process_number(const char *arg, unsigned long *proc, int *autoinc, cha
}
if (low < 1 || low > LONGBITS || high > LONGBITS) {
memprintf(err, "'%s' is not a valid PROC number/range."
" It supports PROC numbers from 1 to %d.\n",
memprintf(err, "'%s' is not a valid number/range."
" It supports numbers from 1 to %d.\n",
arg, LONGBITS);
return 1;
}
@ -1706,8 +1706,9 @@ int cfg_parse_global(const char *file, int linenum, char **args, int kwm)
else if (strcmp(args[0], "cpu-map") == 0) {
/* map a process list to a CPU set */
#ifdef USE_CPU_AFFINITY
unsigned long proc = 0, cpus;
int i, n, autoinc;
char *slash;
unsigned long proc = 0, thread = 0, cpus;
int i, j, n, autoinc;
if (!*args[1] || !*args[2]) {
Alert("parsing [%s:%d] : %s expects a process number "
@ -1718,32 +1719,76 @@ int cfg_parse_global(const char *file, int linenum, char **args, int kwm)
goto out;
}
if ((slash = strchr(args[1], '/')) != NULL)
*slash = 0;
if (parse_process_number(args[1], &proc, &autoinc, &errmsg)) {
Alert("parsing [%s:%d] : %s : %s\n", file, linenum, args[0], errmsg);
err_code |= ERR_ALERT | ERR_FATAL;
goto out;
}
if (slash) {
if (parse_process_number(slash+1, &thread, NULL, &errmsg)) {
Alert("parsing [%s:%d] : %s : %s\n", file, linenum, args[0], errmsg);
err_code |= ERR_ALERT | ERR_FATAL;
goto out;
}
*slash = '/';
if (autoinc && my_popcountl(proc) != 1 && my_popcountl(thread) != 1) {
Alert("parsing [%s:%d] : %s : '%s' : unable to automatically bind "
"a process range _AND_ a thread range\n",
file, linenum, args[0], args[1]);
err_code |= ERR_ALERT | ERR_FATAL;
goto out;
}
}
if (parse_cpu_set((const char **)args+2, &cpus, &errmsg)) {
Alert("parsing [%s:%d] : %s : %s\n", file, linenum, args[0], errmsg);
err_code |= ERR_ALERT | ERR_FATAL;
goto out;
}
if (autoinc && my_popcountl(proc) != my_popcountl(cpus)) {
Alert("parsing [%s:%d] : %s : PROC range and CPU sets must have the same size to be auto-assigned\n",
if (autoinc &&
my_popcountl(proc) != my_popcountl(cpus) &&
my_popcountl(thread) != my_popcountl(cpus)) {
Alert("parsing [%s:%d] : %s : PROC/THREAD range and CPU sets "
"must have the same size to be automatically bound\n",
file, linenum, args[0]);
err_code |= ERR_ALERT | ERR_FATAL;
goto out;
}
for (i = n = 0; i < LONGBITS; i++) {
if (proc & (1UL << i)) {
if (autoinc) {
/* No mapping for this process */
if (!(proc & (1UL << i)))
continue;
/* Mapping at the process level */
if (!thread) {
if (!autoinc)
global.cpu_map.proc[i] = cpus;
else {
n += my_ffsl(cpus >> n);
global.cpu_map[i] = (1UL << (n-1));
global.cpu_map.proc[i] = (1UL << (n-1));
}
continue;
}
/* Mapping at the thread level */
for (j = 0; j < LONGBITS; j++) {
/* No mapping for this thread */
if (!(thread & (1UL << j)))
continue;
if (!autoinc)
global.cpu_map.thread[i][j] = cpus;
else {
n += my_ffsl(cpus >> n);
global.cpu_map.thread[i][j] = (1UL << (n-1));
}
else
global.cpu_map[i] = cpus;
}
}
#else


@ -2726,12 +2726,12 @@ int main(int argc, char **argv)
#ifdef USE_CPU_AFFINITY
if (proc < global.nbproc && /* child */
proc < LONGBITS && /* only the first 32/64 processes may be pinned */
global.cpu_map[proc]) /* only do this if the process has a CPU map */
global.cpu_map.proc[proc]) /* only do this if the process has a CPU map */
#ifdef __FreeBSD__
{
cpuset_t cpuset;
int i;
unsigned long cpu_map = global.cpu_map[proc];
unsigned long cpu_map = global.cpu_map.proc[proc];
CPU_ZERO(&cpuset);
while ((i = ffsl(cpu_map)) > 0) {
@ -2741,7 +2741,7 @@ int main(int argc, char **argv)
ret = cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_PID, -1, sizeof(cpuset), &cpuset);
}
#else
sched_setaffinity(0, sizeof(unsigned long), (void *)&global.cpu_map[proc]);
sched_setaffinity(0, sizeof(unsigned long), (void *)&global.cpu_map.proc[proc]);
#endif
#endif
/* close the pidfile both in children and father */
@ -2895,13 +2895,14 @@ int main(int argc, char **argv)
#ifdef USE_CPU_AFFINITY
/* Now the CPU affinity for all threads */
for (i = 0; i < global.nbthread; i++) {
if (global.cpu_map[relative_pid-1])
global.thread_map[relative_pid-1][i] &= global.cpu_map[relative_pid-1];
if (global.cpu_map.proc[relative_pid-1])
global.cpu_map.thread[relative_pid-1][i] &= global.cpu_map.proc[relative_pid-1];
if (i < LONGBITS && /* only the first 32/64 threads may be pinned */
global.thread_map[relative_pid-1][i]) /* only do this if the thread has a THREAD map */
global.cpu_map.thread[relative_pid-1][i]) /* only do this if the thread has a THREAD map */
pthread_setaffinity_np(threads[i],
sizeof(unsigned long), (void *)&global.thread_map[relative_pid-1][i]);
sizeof(unsigned long),
(void *)&global.cpu_map.thread[relative_pid-1][i]);
}
#endif /* !USE_CPU_AFFINITY */
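
For readers following the "auto:" loop in cfg_parse_global(), the walk over the
CPU set can be sketched in isolation as below. This is a hedged illustration,
not code from the commit: auto_bind() and lowest_bit() are hypothetical names
(lowest_bit() stands in for my_ffsl()), and the extra bounds checks are added
here so that the sketch stays well defined when the two sets differ in size.

```c
#include <limits.h>

#define LONGBITS ((int)(sizeof(long) * CHAR_BIT))

/* Stand-in for my_ffsl(): 1-based position of the lowest set bit, 0 if none. */
static int lowest_bit(unsigned long v)
{
	int pos = 1;

	if (!v)
		return 0;
	while (!(v & 1UL)) {
		v >>= 1;
		pos++;
	}
	return pos;
}

/* Hypothetical helper: give the k-th entry of <ids> the k-th CPU of <cpus>.
 * out[i] receives a single-CPU mask for entry number i+1. Since <cpus> is a
 * bit mask, CPUs are always consumed from the lowest to the highest one,
 * whatever the order in which they were declared.
 * Returns 0 on success, -1 if <cpus> holds fewer CPUs than <ids> has entries.
 */
static int auto_bind(unsigned long ids, unsigned long cpus,
                     unsigned long out[LONGBITS])
{
	int i, n = 0, step;

	for (i = 0; i < LONGBITS; i++) {
		if (!(ids & (1UL << i)))
			continue;               /* no mapping for this entry */
		if (n >= LONGBITS)
			return -1;              /* CPU set exhausted */
		step = lowest_bit(cpus >> n);   /* next CPU above the last one used */
		if (!step)
			return -1;              /* CPU set exhausted */
		n += step;
		out[i] = 1UL << (n - 1);
	}
	return 0;
}
```

With ids covering 1-4 and cpus covering 0-3, entry 1 gets CPU 0, entry 2 gets
CPU 1, and so on, which matches the "auto:1-4 0-3" examples in the
documentation hunk.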