Commit Graph

427 Commits

Author SHA1 Message Date
Thomas Schoebel-Theuer
b9383da97c infra: remove unwanted rmdir() 2017-05-04 10:04:12 +02:00
Thomas Schoebel-Theuer
ac2c901943 infra: remove unwanted chmod() 2017-05-04 10:04:02 +02:00
Thomas Schoebel-Theuer
f654129e94 compat: disable aio when necessary 2017-05-04 09:16:17 +02:00
Thomas Schoebel-Theuer
0c714a8bfc infra: start dual compatibility with/out prepatch
Automatic detection whether the prepatch is applied or not.
2017-05-04 09:10:44 +02:00
Thomas Schoebel-Theuer
eaa6fc0efc infra: introduce wrapper layer for compatibility with multiple kernels
This is needed for adaptation of the out-of-tree MARS version to multiple
kernel versions.

It will be much simplified after upstream merging, and/or
removed/replaced by something better.
2017-05-04 09:09:19 +02:00
Thomas Schoebel-Theuer
79c7ffe9d4 infra: only allow compilation as a module 2017-05-04 06:14:02 +02:00
Thomas Schoebel-Theuer
d1988b3d7c copy: leave livelock when EOF position decreases 2017-04-04 08:03:09 +02:00
Thomas Schoebel-Theuer
85ca001f9f copy: remove obsolete variable 2017-04-04 07:45:46 +02:00
Thomas Schoebel-Theuer
84a9273080 main: fix detection of logfile sequence holes 2017-02-16 07:21:09 +01:00
Thomas Schoebel-Theuer
1f11a21f53 aio: decrease context table 2017-02-09 10:13:31 +01:00
Thomas Schoebel-Theuer
1b46726241 main: avoid flipping of syncstatus update 2017-02-09 10:13:21 +01:00
Thomas Schoebel-Theuer
d726df70f3 client: correct timeout error code 2017-01-25 09:30:52 +01:00
Thomas Schoebel-Theuer
f62a090575 copy: safeguard power_led_off 2017-01-25 09:30:52 +01:00
Thomas Schoebel-Theuer
d897f9060e infra: fix forced shutdown of bricks 2017-01-25 09:30:52 +01:00
Thomas Schoebel-Theuer
bb89cf0dbb infra: show brick creation timestamp in debuglogs 2017-01-25 09:30:52 +01:00
Thomas Schoebel-Theuer
7bdf6ed6c2 infra: show additional variable in debug log 2017-01-25 09:30:52 +01:00
Thomas Schoebel-Theuer
1080474ecc all: use new wrapper 2017-01-25 09:30:52 +01:00
Thomas Schoebel-Theuer
e370af69e1 infra: use new wrapper 2017-01-25 09:30:52 +01:00
Thomas Schoebel-Theuer
0c76f0f1fd infra: wrapper for generic_{dis,}connect with locking 2017-01-25 09:30:52 +01:00
Thomas Schoebel-Theuer
f0381455cb logger: increase position update frequency 2017-01-25 09:30:52 +01:00
Thomas Schoebel-Theuer
fec2264766 main: fix unintended reset of syncstatus 2017-01-25 09:30:52 +01:00
Thomas Schoebel-Theuer
300881a308 main: don't reset copy start_pos on network errors 2017-01-24 11:36:26 +01:00
Thomas Schoebel-Theuer
4e80236400 main: fix hang at rmmod 2017-01-24 11:36:26 +01:00
Thomas Schoebel-Theuer
b04db9a5ef main: fix NULL pointer deref
Regression from e969219fca
2016-10-27 11:49:12 +02:00
Thomas Schoebel-Theuer
cc87a72637 if: fix merge_bvec_fn() regression for old kernels 2016-10-23 12:21:04 +02:00
Thomas Schoebel-Theuer
b6ef899ded Revert "if: remove obsolete merge_bvec_fn()"
This reverts commit d96b6e3fbf.

Although newer kernels don't have this anymore, old kernels
need it.

Make it dependent on the kernel version.
2016-10-23 11:54:01 +02:00
Thomas Schoebel-Theuer
a92077dd5a infra: use static inline for cpu_clock() (kernel 4.7)
Avoid compiler warnings caused by minor upstream changes
(2c923e94cd9c6acff3b22f0ae29cfe65e2658b40)
2016-08-25 15:39:06 +02:00
Thomas Schoebel-Theuer
0972d2b20d infra: adapt to new crypto interface (kernel 4.6) 2016-08-25 15:39:06 +02:00
Thomas Schoebel-Theuer
d6e5b979ac aio: adapt to changes in get_unused_fd()
Only relevant for the out-of-tree version.

The AIO stuff needs to be re-implemented anyway.
2016-08-25 15:39:06 +02:00
Thomas Schoebel-Theuer
bab7ba6300 if: adapt to kernel 4.4 BLK_QC_T_NONE
see dece16353ef47d8d33f5302bc158072a9d65e26f
2016-08-25 07:16:40 +02:00
Thomas Schoebel-Theuer
d96b6e3fbf if: remove obsolete merge_bvec_fn() 2016-08-25 07:16:40 +02:00
Thomas Schoebel-Theuer
67977d7abf if: adapt bio_endio() to kernel 4.3 2016-08-25 07:16:39 +02:00
Thomas Schoebel-Theuer
500ddbc97f bio: adapt bio_endio() to kernel 4.3 2016-08-25 07:16:39 +02:00
Thomas Schoebel-Theuer
d04e8e23c4 if: adapt to renamed congestion handling (kernel 4.2) 2016-08-25 07:16:39 +02:00
Thomas Schoebel-Theuer
275cc2a195 if: adapt to missing bi_cnt (kernel 4.2) 2016-08-25 07:16:39 +02:00
Thomas Schoebel-Theuer
cf8ee66490 bio: adapt to missing BIO_EOPNOTSUPP (kernel 4.2) 2016-08-25 07:16:39 +02:00
Thomas Schoebel-Theuer
d2abf4d64f net: adapt to new sk_net_refcnt (kernel 4.2) 2016-08-25 07:16:39 +02:00
Thomas Schoebel-Theuer
5f6c2a25fe if: move and enable blk_cleanup_queue() 2016-08-25 07:16:39 +02:00
Thomas Schoebel-Theuer
7d4dce3e27 infra: compatibility to new filldir_t 2016-08-25 07:16:39 +02:00
Thomas Schoebel-Theuer
07887e1f74 net: compatibility to kernel 3.19 2016-08-25 07:16:39 +02:00
Thomas Schoebel-Theuer
2ea01ece5f proc: fix ctl_table conventions 2016-08-25 07:16:39 +02:00
Thomas Schoebel-Theuer
df7105dfe2 light: make lockdep happy 2016-08-25 07:16:39 +02:00
Thomas Schoebel-Theuer
3c244706a5 main: fix replay_code report in primary mode
After a primary --force, the error couldn't go away in case of
a defective logfile. Months later, sysadmins were needlessly alarmed
when looking at the primary.
2016-08-09 09:37:09 +02:00
Thomas Schoebel-Theuer
e969219fca main: safeguard versionlink appearance
In some rare cases (e.g. damaged /mars or crashed primaries),
the versionlink belonging to a logfile may be missing.

Don't insist on the existence of a versionlink if the logfile is
stemming from myself (automatic self-repair).
2016-08-09 09:37:09 +02:00
Thomas Schoebel-Theuer
634499d3d2 all: testing of hangs 2016-08-09 09:37:09 +02:00
Thomas Schoebel-Theuer
90653476f6 all: crash testing hardening infrastructure
This is important for even more hardening of MARS.
Simulate crashes at the "wrong moment", typically with
IO requests flying, or just before a symlink update.

Only for debugging. Never use for production.
2016-08-09 09:34:19 +02:00
Thomas Schoebel-Theuer
f89e0a7d96 marsadm: lowlevel IP address commands
This is absolutely necessary for coping with changes in network
setups.
2016-03-09 09:42:38 +01:00
Thomas Schoebel-Theuer
e7f41563f2 main: fix livelock at end of sync
Only observed on very fast hardware.
Leaving the loop may unnecessarily take a long time.
2016-03-08 11:37:41 +01:00
Thomas Schoebel-Theuer
04b2f2120e Kbuild: fix external 1&1 build process 2016-03-03 12:42:41 +01:00
Thomas Schoebel-Theuer
a5f8f3e464 main: rename mars_light.c to mars_main.c 2016-03-03 09:35:16 +01:00
Thomas Schoebel-Theuer
4d31d09534 all: remove CONFIG_MARS_BIGMODULE 2016-03-03 09:33:34 +01:00
Thomas Schoebel-Theuer
daa701edf1 light: s/light_class/main_class/g 2016-03-03 09:05:01 +01:00
Thomas Schoebel-Theuer
2990b9362e light: s/light_thread/main_thread/g 2016-03-03 09:04:04 +01:00
Thomas Schoebel-Theuer
42a8bfaa60 all: s/light_(worker|checker)/main_\1/g 2016-03-03 08:57:07 +01:00
Thomas Schoebel-Theuer
dd4748bb52 light: clarify code 2016-03-01 11:58:23 +01:00
Thomas Schoebel-Theuer
8fa728a0c9 light: fix annoying unnecessary error message 2016-03-01 11:58:23 +01:00
Thomas Schoebel-Theuer
8abcbf196d light: safeguard sync vs replay 2016-03-01 11:58:23 +01:00
Thomas Schoebel-Theuer
e70ac4df8c light: safeguard position update 2016-03-01 11:58:23 +01:00
Thomas Schoebel-Theuer
fafad9512a light: always update position symlinks at logger switchoff 2016-03-01 11:58:23 +01:00
Thomas Schoebel-Theuer
42c2dc98da light: fix typo in replay link comparison 2016-03-01 11:58:23 +01:00
Thomas Schoebel-Theuer
a312e3d93b light: fix memory leak
regression from f235b76900
2016-03-01 11:58:09 +01:00
Thomas Schoebel-Theuer
8bc1e80488 light: safeguard skipping of logfiles in disconnected state.
Found by code inspection, neither in practice nor by testing.

Should not occur in practice, because it could only occur after
marsadm pause-fetch, which is an exceptional state only to be entered
for maintenance or for emergency failover.

Skipping over an incorrect logfile at a secondary may produce an
unnecessary split brain.

Fix the potential problem by doing it only after "primary --force",
and by never creating a new logfile, always re-using existing
logfiles instead.
2016-02-10 06:44:00 +01:00
Thomas Schoebel-Theuer
f235b76900 light: fix potential deadlock on restart after inconsistent symlinks
This has been found by testing.

In extremely rare cases, such as after crashes at the "wrong moment"
or after defective /mars filesystems, the replay link could show a
different length than the corresponding versionlink.

The versionlink wouldn't be updated anymore when additionally the
logfile has the same length as the replay link.

The incorrect versionlink will then lead to a lock.

Fix the problem by using the _minimum_ of all length indicators.
For safety, or when in doubt, replay more data, which will in turn
update the versionlink again to its correct value.
2016-02-10 06:24:27 +01:00
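The "minimum of all length indicators" rule from the commit above can be sketched roughly as follows. This is only an illustration and not the MARS code: the function name `safe_replay_start` and its three parameters are hypothetical, and it assumes each indicator is a byte offset into the same logfile.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the MARS sources. */
static int64_t min_i64(int64_t a, int64_t b)
{
	return a < b ? a : b;
}

/*
 * Pick the position from which replay restarts when the three
 * indicators disagree. Taking the smallest value means that, when
 * in doubt, more data is replayed rather than less.
 */
int64_t safe_replay_start(int64_t replay_link_len,
			  int64_t version_link_len,
			  int64_t logfile_len)
{
	return min_i64(replay_link_len,
		       min_i64(version_link_len, logfile_len));
}
```

With a stale versionlink at 1024 while the replay link and the logfile both say 4096, this sketch restarts at 1024, so the region between the two positions is replayed again.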
Thomas Schoebel-Theuer
8e2de8288d light: fix missing versionlink upon slow or defective IO
Some primary appeared to have died, and was rebooted.
In the meantime, the old secondary was forcefully switched
to primary.

Afterwards, the old primary = new secondary got stuck because 2
versionlinks, which had been _produced_ by _himself_, were
missing, but they were present at the new primary = old secondary!

How could this happen?

All transaction logfiles were fully present and correct everywhere.

However, the old primary kern.log showed that a problem with the
RAID system must have existed. In addition, the RAID controller
errorlog also reported some problems which appeared to have healed.

Problem analysis shows the following possibility:

The transaction logger can continue to write data, even via
fsync(), while the _writeback_ of other parts of the /mars filesystem
(e.g. symlink updates) got stuck for a long time due to an IO problem.

Usually, slow or even missing symlink updates are no problem because
upon recovery after a reboot, everything is healed by transaction
replay (possibly replaying much more data than really necessary,
but this does not affect semantics, and it is even advantageous
when RAID disks might contain defective data).

There is one exception: after a logrotate, the corresponding new
versionlink should appear after a short time. Otherwise, the
above mentioned scenario could emerge.

We use sync_filesystem() to ensure that any versionlink update
to a _new_ versionlink is either guaranteed to become persistent,
or (in case of IO problems) the mars_light thread will hang, which
will be (hopefully) noticed soon by monitoring.
2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer
ea48664a14 light: disallow primary from rotating over damaged logfiles
Only a secondary is allowed to do this, because we assume that
logfile replay has the property of "anytime consistency"
only there.

When a primary cannot recover after a crash due to a defective
logfile, this is not true. The primary is simply lost in such a
(rare) case. Observed twice in almost 8 million
operating hours.

In such a case, hardware is truly defective, and you have only
the following options:

1) switchover to a secondary via "primary --force", OR

2) deconstruct the resource everywhere, run fsck or similar on
whatever replica seems to be the best version,
and reconstruct the resource from scratch, OR

3) restore your backup.
2016-01-21 08:09:47 +01:00
Thomas Schoebel-Theuer
acdb9d7a42 light: fix reset of replay-code
Reset was forgotten in secondary role. Do it always whenever
a logfile is actually rotated.
2016-01-20 14:48:43 +01:00
Thomas Schoebel-Theuer
496e57e1e1 logger: add new indicator for damaged logfiles 2016-01-15 17:10:58 +01:00
Thomas Schoebel-Theuer
d67336420d light: fix becoming primary when logfiles are damaged
When logfile replay aborts with an error, becoming primary would be
impossible.
Without this, repair would be only possible by complete destruction
of the resource.

A previous version of this patch introduced
/proc/sys/mars/allow_primary_when_damaged which would complicate
the sysadmin interface. People would be unsure what to do.
2016-01-13 14:12:02 +01:00
Thomas Schoebel-Theuer
3eedff125d infra: fix comparison
Under weird circumstances, when the new symlink content was just a
shortened version (prefix) of the old one, the symlink was not updated.
2016-01-02 10:18:33 +01:00
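A prefix-safe comparison of the kind this fix calls for might look like the sketch below. It is not the actual MARS code: the name `link_differs` and the counted-string parameters are assumptions made for illustration. The point is that the lengths are compared before the bytes, so a shortened new value is never mistaken for an unchanged one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: report whether a symlink needs rewriting.
 * Both values are passed with explicit lengths, so neither has to
 * be null-terminated.
 */
bool link_differs(const char *old_val, size_t old_len,
		  const char *new_val, size_t new_len)
{
	/* A different length always means different content. */
	if (old_len != new_len)
		return true;
	/* Same length: compare exactly that many bytes. */
	return memcmp(old_val, new_val, new_len) != 0;
}
```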
Thomas Schoebel-Theuer
d18c60f232 infra: fix potential fault
Very old idiotic bug.
Under some circumstances, a byte beyond the end of a non-null-terminated
string (such as produced by the VFS) might be read, potentially leading
to a page fault just one byte after a page border.
2016-01-02 10:18:33 +01:00
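One common way to avoid reading past a counted, non-null-terminated string is to copy exactly the known number of bytes and add the terminator afterwards. The userspace sketch below illustrates that idea only; `dup_counted` is a hypothetical name, and the real fix in MARS may be structured differently.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: turn a counted string of 'len' bytes into a
 * freshly allocated, null-terminated copy. Only src[0..len-1] is
 * read, so the byte after the source buffer is never touched.
 * Returns NULL on allocation failure; the caller frees the result.
 */
char *dup_counted(const char *src, size_t len)
{
	char *dst = malloc(len + 1);

	if (!dst)
		return NULL;
	memcpy(dst, src, len);
	dst[len] = '\0';
	return dst;
}
```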
Thomas Schoebel-Theuer
25d954051b logger: move ranking array from stack to brick instance
Don't allocate this on the stack, it might grow too big in future.
Reduces the risk of stack overflows (not observed until now, but
suspected).
2016-01-02 10:18:22 +01:00
Thomas Schoebel-Theuer
045d0e0356 logger: fix potential deadlock caused by incorrect accounting
Never observed in practice, found by testing with kernel upstream
versions.
2016-01-02 09:43:22 +01:00
Thomas Schoebel-Theuer
c1ee80f9f4 server: fix memory leak on writes
This was unnoticed for a long time because it simply did not occur
in ordinary MARS Light workloads.
2015-10-19 07:24:20 +02:00
Thomas Schoebel-Theuer
54d8433b21 light: fix spelling 2015-10-07 10:46:04 +02:00
Thomas Schoebel-Theuer
4d8dc3a619 logger: fix spelling 2015-10-07 10:45:51 +02:00
Thomas Schoebel-Theuer
af6ac736c5 if: fix wrong error code ENOSYS 2015-10-07 10:44:44 +02:00
Thomas Schoebel-Theuer
66d200dbf1 infra: fix wrong error code ENOSYS 2015-10-07 10:44:35 +02:00
Thomas Schoebel-Theuer
c6235c71d5 aio: fix race on shutdown 2015-07-15 10:38:49 +02:00
Thomas Schoebel-Theuer
550d02935e sio: fix race on shutdown 2015-07-15 10:38:49 +02:00
Thomas Schoebel-Theuer
91f458fe66 sio: convert to new mapfree infrastructure 2015-07-15 10:38:49 +02:00
Thomas Schoebel-Theuer
c39a2988b7 light: fix long-lasting switchoff at end of sync 2015-06-17 11:33:27 +02:00
Thomas Schoebel-Theuer
4ecd6937c7 light: don't try fetching from (none) 2015-06-17 11:33:27 +02:00
Thomas Schoebel-Theuer
6eb5cefc19 infra: clean buffer cache on opening block devices 2015-06-17 11:33:18 +02:00
Thomas Schoebel-Theuer
7cbb705882 logger: safeguard endio() calling conventions 2015-05-05 08:46:28 +02:00
Thomas Schoebel-Theuer
18f1ae84f3 logger: fix race on completion refcount 2015-05-05 08:46:28 +02:00
Thomas Schoebel-Theuer
876625d66a light: disallow modprobe when UUID is missing 2015-03-23 13:48:11 +01:00
Thomas Schoebel-Theuer
9cb5b54cdc infra: remove outdated code 2015-03-23 13:48:11 +01:00
Thomas Schoebel-Theuer
7d66938666 aio: fix portability to changed kernels / kthread implementation
In the long term, mars_aio will be replaced anyway because it
uses userspace concepts like ioctx.

Don't use the internal kthread_stop_nowait() anymore.
It is too cumbersome to catch up with upstream development.
2015-03-23 13:48:10 +01:00
Thomas Schoebel-Theuer
77714f374e aio: safeguard ioctx 2015-03-23 13:48:10 +01:00
Thomas Schoebel-Theuer
a12450d891 if: fix potential race on plugged requests 2015-03-23 13:48:10 +01:00
Thomas Schoebel-Theuer
7f565f77b6 light: prohibit communication with wrong UUID 2015-03-06 11:49:54 +01:00
Thomas Schoebel-Theuer
a1d7faa2fe infra: safeguard mapfree pointers 2015-02-27 11:32:57 +01:00
Thomas Schoebel-Theuer
7ced30b24c infra: report peak IO latencies 2015-02-27 11:32:57 +01:00
Thomas Schoebel-Theuer
c35065fe97 infra: report global IO hangs 2015-02-27 11:32:57 +01:00
Thomas Schoebel-Theuer
c1823bbfab light: report actually running buildtag 2015-02-27 11:32:56 +01:00
Thomas Schoebel-Theuer
736489eccd light: suppress irrelevant warning 2015-02-24 15:51:28 +01:00
Thomas Schoebel-Theuer
036953fa54 light: provisionally allow fetch during detach 2015-02-24 15:51:28 +01:00
Thomas Schoebel-Theuer
0453fbae9b light: fix race on rmmod 2015-02-24 15:51:27 +01:00
Thomas Schoebel-Theuer
f10e7358ad light: stop syncing upon logfile holes 2015-02-24 15:51:26 +01:00
Thomas Schoebel-Theuer
827b5b5192 light: fix syncpos indication of inconsistency 2015-02-24 12:08:41 +01:00