68 Commits

Author SHA1 Message Date
Thomas Schoebel-Theuer 1c3468985a infra: increase hash table 2020-09-01 19:35:10 +02:00
Thomas Schoebel-Theuer d60326ca42 infra: earlier stop searching in unordered list part 2020-09-01 19:35:10 +02:00
Thomas Schoebel-Theuer b9964cd6c6 infra: skip non-member dents and subtrees 2020-09-01 19:35:10 +02:00
Thomas Schoebel-Theuer bf682f1273 all: minimum link update frequency, default 10s
Otherwise sysadmins might wrongly conclude that something is
hanging, when it is just taking a long time.
2020-08-12 08:56:29 +02:00
Thomas Schoebel-Theuer bc8ff9048c main: new scalable alivelinks 2020-08-02 12:10:20 +02:00
Thomas Schoebel-Theuer d24c57e50a all: bump features version 2020-08-02 10:56:17 +02:00
Thomas Schoebel-Theuer 6d9ffefb84 infra: new helper mars_is_mountpoint() 2020-07-31 09:26:16 +02:00
Thomas Schoebel-Theuer 7467aa9939 infra: allow pushing links to peers 2020-07-24 22:42:46 +02:00
Thomas Schoebel-Theuer 8d9ac84b46 infra: extend cmds with 2 strings 2020-07-20 21:20:47 +02:00
Thomas Schoebel-Theuer 8946873739 infra: new trigger code conventions 2020-07-20 21:20:09 +02:00
Thomas Schoebel-Theuer 3afad273fd infra: also send prot level over dents 2020-07-08 22:14:03 +02:00
Thomas Schoebel-Theuer e938add256 main: compute worst features version in cluster 2020-04-13 10:54:19 +02:00
Thomas Schoebel-Theuer 692cb442c8 infra: separate feature version for strategy layer 2020-04-13 10:54:19 +02:00
Thomas Schoebel-Theuer 19d20567fd all: reduce brick list traversals 2020-04-13 10:52:38 +02:00
Thomas Schoebel-Theuer 343670b52d infra: remove superfluous parameter 2020-04-13 10:52:38 +02:00
Thomas Schoebel-Theuer 333760bc1a infra: simplify mars_kill_brick_when_possible() 2020-04-13 10:52:38 +02:00
Thomas Schoebel-Theuer 59c9cedeeb infra: prepare subtree creation 2020-04-13 10:52:38 +02:00
Thomas Schoebel-Theuer 5e97d05ecb infra: introduce and obey d_subtree 2020-04-13 10:52:38 +02:00
Thomas Schoebel-Theuer 52fe09c3ca infra: remove obsolete d_global 2020-04-13 10:52:38 +02:00
Thomas Schoebel-Theuer c9f7eebe24 infra: tune global hash 2020-04-13 09:55:19 +02:00
Thomas Schoebel-Theuer aed146691a infra: add constructor for mars_global 2020-04-13 09:55:19 +02:00
Thomas Schoebel-Theuer 96561ba0d3 main: userspace control for compat_deletions 2020-04-08 20:39:38 +02:00
Thomas Schoebel-Theuer 722d99487f all: remove unnecessary uid 2020-04-08 03:32:36 +02:00
Thomas Schoebel-Theuer 37348ba2c8 infra: allow ordered symlink creation 2020-04-08 03:32:34 +02:00
Thomas Schoebel-Theuer e4a83b9461 infra: introduce ordered_readlink() 2020-04-06 15:14:11 +02:00
Thomas Schoebel-Theuer 8097fe2971 infra: separate dent list retrieval for remote communication 2020-04-01 06:12:28 +02:00
Thomas Schoebel-Theuer 3ab97f26b5 infra: allow fetching full dent info from peers 2020-03-26 20:16:39 +01:00
Thomas Schoebel-Theuer 222f048937 all: adapt to new timespec64 type 2019-12-25 09:19:07 +01:00
Thomas Schoebel-Theuer 8b0d52e705 server: remove deprecated loadavg quirk 2019-12-25 09:19:06 +01:00
Thomas Schoebel-Theuer 900ed3cbd8 infra: speed up by dent hashing 2019-07-10 11:27:37 +02:00
Thomas Schoebel-Theuer ee08ab587e infra: introduce hash_table and hash_link 2019-07-10 11:27:37 +02:00
Thomas Schoebel-Theuer b1861be0a9 infra: add quick dent list for speedup 2019-07-10 11:27:37 +02:00
Thomas Schoebel-Theuer 930d33e338 infra: prepare dent quick_list speedup 2019-07-10 11:27:37 +02:00
Thomas Schoebel-Theuer ee1e1ab1bb EOL: fully merge branch 'mars0.1.y' into mars0.1a.y 2019-07-10 11:26:15 +02:00
Thomas Schoebel-Theuer c922bafa52 infra: additional global mem limit 2019-06-26 11:00:17 +02:00
Thomas Schoebel-Theuer 14d6e84fed infra: remove dead code 2019-06-26 10:57:27 +02:00
Thomas Schoebel-Theuer e393decd3c Merge branch 'mars0.1.y' into mars0.1a.y 2018-03-19 06:57:49 +01:00
Thomas Schoebel-Theuer dedaa5b55f infra: new timestamp ordering 2018-03-13 08:29:48 +01:00
Thomas Schoebel-Theuer 1022c21ac6 Merge branch 'mars0.1.y' into mars0.1a.y 2018-02-01 06:25:02 +01:00
Thomas Schoebel-Theuer 5818d254ce main: remote_trigger after deletions 2018-01-31 07:50:50 +01:00
Thomas Schoebel-Theuer a41c0f8f98 main: run some additional peer threads 2017-07-05 08:01:47 +02:00
Thomas Schoebel-Theuer c3f931f660 main: remove obsolete 1&1-specific sync feature 2017-02-20 15:29:28 +01:00
Thomas Schoebel-Theuer 0c76f0f1fd infra: wrapper for generic_{dis,}connect with locking 2017-01-25 09:30:52 +01:00
Thomas Schoebel-Theuer 42a8bfaa60 all: s/light_(worker|checker)/main_\1/g 2016-03-03 08:57:07 +01:00
Thomas Schoebel-Theuer 8e2de8288d light: fix missing versionlink upon slow or defective IO
A primary appeared to have died and was rebooted.
In the meantime, the old secondary was forcefully switched
to primary.

Afterwards, the old primary = new secondary got stuck because two
versionlinks, which had been _produced_ by _itself_, were
missing, although they were present at the new primary = old secondary!

How could this happen?

All transaction logfiles were fully present and correct everywhere.

However, the old primary's kern.log showed that a problem with the
RAID system must have existed. In addition, the RAID controller
errorlog also reported some problems which appeared to have healed.

Problem analysis shows the following possibility:

The transaction logger can continue to write data, even via
fsync(), while the _writeback_ of other parts of the /mars filesystem
(e.g. symlink updates) got stuck for a long time due to an IO problem.

Usually, slow or even missing symlink updates are no problem because
upon recovery after a reboot, everything is healed by transaction
replay (possibly replaying much more data than really necessary,
but this does not affect semantics, and it is even advantageous
when RAID disks might contain defective data).

There is one exception: after a logrotate, the corresponding new
versionlink should appear within a short time. Otherwise, the
above-mentioned scenario could emerge.

We use sync_filesystem() to ensure that any versionlink update
to a _new_ versionlink is either guaranteed to become persistent,
or (in case of IO problems) the mars_light thread will hang, which
will be (hopefully) noticed soon by monitoring.
2016-02-03 22:01:48 +01:00
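
The durability idea in the commit message above can be illustrated with a minimal userspace sketch. This is not the MARS kernel code: the commit calls the in-kernel sync_filesystem(), while this sketch assumes a Linux system and uses fsync() on the containing directory as a narrower userspace analogue. The function name publish_link_durably and the link name "version-5" are hypothetical and exist only for illustration.

```python
import os
import tempfile


def publish_link_durably(target: str, link_path: str) -> None:
    """Create or replace the symlink link_path -> target, then wait until
    the directory entry has been flushed to stable storage.

    Hypothetical helper for illustration; not part of MARS.
    """
    directory = os.path.dirname(os.path.abspath(link_path))
    tmp_path = link_path + ".tmp"

    # Build the link under a temporary name and rename it into place,
    # so that readers never observe a missing or half-updated link.
    if os.path.lexists(tmp_path):
        os.unlink(tmp_path)
    os.symlink(target, tmp_path)
    os.rename(tmp_path, link_path)

    # Flush the containing directory. The call returns only after the
    # kernel reports the flush as done; with a stuck IO path it blocks
    # (or raises OSError), which is the "hang visibly" behaviour the
    # commit message relies on for monitoring.
    dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        link = os.path.join(workdir, "version-5")
        publish_link_durably("first-value", link)
        publish_link_durably("second-value", link)
        print(os.readlink(link))
```

The sketch flushes only one directory, whereas sync_filesystem() writes back the whole filesystem. A closer but heavier userspace equivalent would be os.sync(), which flushes all mounted filesystems.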
Thomas Schoebel-Theuer aa09d7df30 all: clarify license GPLv2+ 2014-11-25 18:09:17 +01:00
Thomas Schoebel-Theuer 1439d30ffb all: port to newer kernels (up to 3.15) 2014-06-18 12:10:55 +02:00
Thomas Schoebel-Theuer 2f4696a9cc all: fix logfile size propagation 2014-03-31 06:59:09 +02:00
Thomas Schoebel-Theuer 6050b4157f infra: make string allocation fully dynamic 2014-03-26 11:43:05 +01:00
Thomas Schoebel-Theuer 2fc05b5373 light: allow limiting the sync parallelism 2014-03-19 17:49:40 +01:00