Asynchronous Block-Level Storage Replication
Thomas Schoebel-Theuer 6de8dc9639 trans_logger: remove broken queue depth limit
It was conceptually broken: large requests from userspace may be
split into many mrefs. Thus a limit on units of #mrefs is not
comparable to userspace limits.

Instead, the ordinary nr_requests limits on kernel device queues
should suffice to get to the intended effect.
2013-04-02 15:35:13 +02:00
docu doc: add presentation slides from LCA2013 2013-01-29 22:32:22 +01:00
pre-patches prepatch for openvzs rhel6 based sources 2013-01-23 20:06:49 +01:00
sy_old trans_logger: remove broken queue depth limit 2013-04-02 15:35:13 +02:00
testing add some small testscripts 2013-01-23 20:05:37 +01:00
userspace marsadm: Fixed check_splitbrain() to allow obsolete version link syntax 2013-03-11 14:39:17 +01:00
.gitattributes infra: add .gitignore 2013-01-08 15:53:47 +01:00
.gitignore infra: add .gitignore 2013-01-08 15:53:47 +01:00
AUTHORS all: prepare publication at github 2013-01-25 11:58:46 +01:00
brick_atomic.h infra: introduce tracing of atomics in mref 2013-01-23 20:06:52 +01:00
brick_checking.h infra: add CHECK_ASPECT() macro 2013-01-23 20:06:52 +01:00
brick_locks.h infra: rewrite brick_say to work with threads 2013-01-23 20:06:49 +01:00
brick_mem.c mem: add debugging of order0 operations 2013-01-23 20:07:02 +01:00
brick_mem.h mem: add debugging of order0 operations 2013-01-23 20:07:02 +01:00
brick_say.c brick_say: make debug messages runtime-selectable 2013-01-23 20:07:02 +01:00
brick_say.h brick_say: make debug messages runtime-selectable 2013-01-23 20:07:02 +01:00
brick.c infra: fix potential infinite brick_wait (robustness) 2013-01-23 20:06:55 +01:00
brick.h infra: simplify brick_yield() 2013-01-23 20:06:55 +01:00
ChangeLog all: prepare publication at github 2013-01-25 11:58:46 +01:00
COPYING all: prepare publication at github 2013-01-25 11:58:46 +01:00
INSTALL all: prepare publication at github 2013-01-25 11:58:46 +01:00
Kconfig mem: add debugging of order0 operations 2013-01-23 20:07:02 +01:00
lib_limiter.c lib_limiter: fix bad delay computation 2013-01-23 20:06:53 +01:00
lib_limiter.h infra: fix potential signedness problem with limiter 2013-01-23 20:06:52 +01:00
lib_log.c lib_log: safeguard seq_nr 2013-01-23 20:06:59 +01:00
lib_log.h lib_log: use standard chunk_size in log_read() 2013-01-23 20:06:59 +01:00
lib_mapfree.c infra: factor out mapfree infrastructure from aio 2013-01-23 20:07:02 +01:00
lib_mapfree.h infra: factor out mapfree infrastructure from aio 2013-01-23 20:07:02 +01:00
lib_pairing_heap.h import mars-99.tgz 2013-01-08 15:54:28 +01:00
lib_queue.h all: IO scheduling improvements, tuning 2013-01-23 20:06:49 +01:00
lib_rank.c lib_rank: fix bad ranking computation 2013-01-23 20:06:53 +01:00
lib_rank.h lib_rank: fix potential integer overflow 2013-01-23 20:06:51 +01:00
lib_timing.c infra: add lib_timing 2013-01-23 20:06:49 +01:00
lib_timing.h trans_logger: cease queue banning upon real progress 2013-01-23 20:06:57 +01:00
Makefile infra: factor out mapfree infrastructure from aio 2013-01-23 20:07:02 +01:00
mars_aio.c infra: factor out mapfree infrastructure from aio 2013-01-23 20:07:02 +01:00
mars_aio.h infra: factor out mapfree infrastructure from aio 2013-01-23 20:07:02 +01:00
mars_bio.c bio: use new mapfree infrastructure 2013-01-23 20:07:02 +01:00
mars_bio.h bio: use new mapfree infrastructure 2013-01-23 20:07:02 +01:00
mars_buf.c infra: introduce tracing of atomics in mref 2013-01-23 20:06:52 +01:00
mars_buf.h improve detection of memleaks 2013-01-20 23:23:49 +01:00
mars_check.c all: replace kthread by brick_thread wrapper 2013-01-23 20:06:50 +01:00
mars_check.h import mars-51.tgz 2013-01-08 15:54:04 +01:00
mars_client.c client: fix termination upon receiver error 2013-01-23 20:06:55 +01:00
mars_client.h client: fix termination upon receiver error 2013-01-23 20:06:55 +01:00
mars_copy.c copy: disallow write overlapping by default 2013-01-23 20:07:00 +01:00
mars_copy.h copy: disallow write overlapping by default 2013-01-23 20:07:00 +01:00
mars_dummy.c reanimate some unused old code, only for debugging 2013-01-20 23:23:44 +01:00
mars_dummy.h import mars-51.tgz 2013-01-08 15:54:04 +01:00
mars_generic.c infra: fix mm faking 2013-01-23 20:07:01 +01:00
mars_if.c if: fix kunmap() 2013-01-23 20:07:00 +01:00
mars_if.h if: add statistics on skip_sync 2013-01-23 20:06:58 +01:00
mars_net.c net: fix sock_release() leak 2013-01-23 20:07:00 +01:00
mars_net.h net: fix sock_release() leak 2013-01-23 20:07:00 +01:00
mars_server.c server: fix races, completely separate server bricks from main bricks 2013-01-23 20:07:02 +01:00
mars_server.h server: fix races, completely separate server bricks from main bricks 2013-01-23 20:07:02 +01:00
mars_sio.c all: use mapping_set_gfp_mask() everywhere 2013-01-23 20:07:01 +01:00
mars_sio.h statistics for sio 2013-01-20 23:24:09 +01:00
mars_trans_logger.c trans_logger: remove broken queue depth limit 2013-04-02 15:35:13 +02:00
mars_trans_logger.h trans_logger: remove broken queue depth limit 2013-04-02 15:35:13 +02:00
mars_usebuf.c infra: introduce tracing of atomics in mref 2013-01-23 20:06:52 +01:00
mars_usebuf.h improve detection of memleaks 2013-01-20 23:23:49 +01:00
mars.h infra: make killing of useless bricks selectable 2013-01-23 20:07:02 +01:00
meta.h import mars-118.tgz 2013-01-08 15:54:38 +01:00
NEWS all: prepare publication at github 2013-01-25 11:58:46 +01:00
README doc: improve README 2013-01-31 21:33:46 +01:00

GPLed software AS IS, sponsored by 1&1 Internet AG (www.1und1.de).

Contact: tst@1und1.de

--------------------------------

Abstract:

MARS Light is almost a drop-in replacement for DRBD
(that is, block-level storage replication).

In contrast to plain DRBD, it works _asynchronously_ and over
arbitrary distances. My regular testing runs between datacenters
in the US and Europe. MARS uses very different technology under the
hood, similar to transaction logging of database systems.

Reliability: application and replication are completely decoupled.
Networking problems (e.g. packet loss, bottlenecks) have no
impact on your application at the primary side.

Anytime consistency: on a secondary node, its version of the
block device is always consistent in itself, but may be outdated
(representing a former state of the primary side). Thanks to
incremental replication of the transaction logfiles, the lag is
usually only a few seconds, or even fractions of a second.
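
The idea behind anytime consistency can be illustrated with a toy
model. This is only a sketch under simplifying assumptions, not the
actual MARS code: all names (toy_device, toy_log, primary_write,
secondary_replay) and sizes are hypothetical, everything lives in
memory, and the primary applies each write immediately after logging
it. The point shown is that a secondary which replays log records
strictly in order always holds some former state of the primary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 4   /* bytes per block, tiny for illustration */
#define NR_BLOCKS  4   /* blocks per toy device */
#define LOG_CAP    16  /* maximum number of log records */

/* One logged write: the target block and its new contents. */
struct log_record {
	size_t block_nr;
	unsigned char data[BLOCK_SIZE];
};

/* A toy block device kept entirely in memory (hypothetical). */
struct toy_device {
	unsigned char blocks[NR_BLOCKS][BLOCK_SIZE];
};

/* An append-only log: records are only ever added at the end. */
struct toy_log {
	struct log_record rec[LOG_CAP];
	size_t len;
};

/* Primary side: append the write to the log, then apply it locally.
 * Returns 0 on success, -1 if block_nr is invalid or the log is full.
 */
int primary_write(struct toy_device *dev, struct toy_log *log,
		  size_t block_nr, const unsigned char *data)
{
	struct log_record *r;

	if (block_nr >= NR_BLOCKS || log->len >= LOG_CAP)
		return -1;
	r = &log->rec[log->len++];
	r->block_nr = block_nr;
	memcpy(r->data, data, BLOCK_SIZE);
	memcpy(dev->blocks[block_nr], data, BLOCK_SIZE);
	return 0;
}

/* Secondary side: replay records [*applied, upto) strictly in order.
 * After each call the device equals the primary's state as it was
 * right after record number upto-1 had been written.
 */
void secondary_replay(struct toy_device *dev, const struct toy_log *log,
		      size_t *applied, size_t upto)
{
	assert(upto <= log->len);
	while (*applied < upto) {
		const struct log_record *r = &log->rec[*applied];

		memcpy(dev->blocks[r->block_nr], r->data, BLOCK_SIZE);
		(*applied)++;
	}
}
```

In this model, a slow or interrupted network only delays how far
"upto" has advanced on the secondary. It never blocks primary_write(),
and it never leaves the secondary in a state the primary did not
pass through.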

Synchronous or near-synchronous operating modes are planned for
the future, but are expected to _reliably_ work only over short 
distances (less than 50km), due to fundamental properties
of the network.

WARNING! Current stage is BETA. Don't put production data on it!

Documentation: currently very rudimentary, some even in German.
This will be fixed soon.

Concepts:

There is a two-year-old concept paper in German which is so outdated
that I don't want to publish it. Please be patient until I write a
comprehensive paper at the concept level in English.

In the meantime, please look at my presentation about MARS at LCA2013
(see linux.conf.au or look into ./docu/).

History:

As you can see in the git log, it evolved from a very experimental
concept study, starting in the Summer of 2010.
At that time, I was working on it in my spare time.

In Summer 2011, an "official" internal 1&1 project started, which aimed
to deliver a proof of concept.
In February 2012, a pilot system was rolled out to an internal statistics
server, which collects statistics data from thousands of other servers,
and thus produces a very heavy random-access write load, formerly
replicated with DRBD (which led to performance problems due to massive
randomness). After switching to MARS, the performance was provably
better.
This server was selected because a potential loss of statistics data
would not be as critical as with other production data, but
nevertheless it operates on production data and loads.

After some minor teething problems were cured, this server has been
running until today (end of January 2013) without problems. Our
sysadmins even switched the primary side a few times without informing
me, so I could sleep better at night without knowing what they did ;)

In Summer 2012, the next "official" internal 1&1 project started. Its goal
is to reach enterprise grade, and therefore to roll out MARS Light on
~10 production servers, starting with less critical systems like ones
for test webspaces etc. This project will continue until Summer 2013.

Hopefully, there will be a follow-up project for mass rollout to some
thousands of servers.

In December 2012 (shortly before Christmas), I got official permission
from our CTO Henning Kettler to publish MARS under GPL on github.

Many thanks to him!

Before that point, I was bound by my employment contract, which keeps
internal software secret by default (unless there is explicit permission).

Now there is a chance to build up an open source
community for MARS, partially outside of 1&1.

Please contribute! I will be open.

I also try to respect the guidelines from Linus, but probably this
will need more work. Help is always welcome!