doc: section on reliability CentralStorage vs LocalSharding

Thomas Schoebel-Theuer 2018-05-26 23:22:59 +02:00 committed by Thomas Schoebel-Theuer
parent 4bdcba5ca5
commit a5b038c3b6
1 changed files with 650 additions and 0 deletions


@ -1224,6 +1224,656 @@ reference "sec:Distributed-vs-Local:"
.
\end_layout
\begin_layout Subsection
Reliability Differences CentralStorage vs Sharding
\begin_inset CommandInset label
LatexCommand label
name "subsec:Reliability-Differences-CentralStorage"
\end_inset
\end_layout
\begin_layout Standard
In this section, we look at
\emph on
fatal
\emph default
failures only, ignoring temporary failures.
A fatal failure of a storage is an incident which needs to be corrected
by
\series bold
restore from backup
\series default
.
\end_layout
\begin_layout Standard
By definition, even a
\emph on
highly redundant
\emph default
CentralStorage is
\emph on
nevertheless
\emph default
a SPOF = Single Point of Failure.
This also applies to fatal failures.
\end_layout
\begin_layout Standard
Some people incorrectly counter this by pointing to redundancy.
However, the problem is that
\emph on
any
\emph default
system, even a highly redundant one, can fail fatally.
There exists no perfect system on earth.
One of the biggest known sources of fatal failure is
\series bold
human error
\series default
.
\end_layout
\begin_layout Standard
In contrast, sharded storage (for example the LocalSharding model, see also
section
\begin_inset CommandInset ref
LatexCommand ref
reference "subsec:Variants-of-Sharding"
\end_inset
) has MPOF = Multiple Points Of Failure.
It is unlikely that many shards fail fatally at the same time, because
shards are
\emph on
independent
\emph default
\begin_inset Foot
status open
\begin_layout Plain Layout
When all shards reside in the same datacenter, power loss or any other event
affecting the whole datacenter constitutes a SPOF.
However, this applies to both the CentralStorage and to the LocalSharding
model.
In contrast to CentralStorage, LocalSharding can be more easily distributed
over multiple datacenters.
\end_layout
\end_inset
from each other by definition.
\end_layout
\begin_layout Standard
What is the difference from the viewpoint of customers of the services?
\end_layout
\begin_layout Standard
When a CentralStorage fails fatally, a
\emph on
huge
\emph default
number of customers will be affected for a
\emph on
long
\emph default
time (see the example German webhoster mentioned in section
\begin_inset CommandInset ref
LatexCommand ref
reference "subsec:Latencies-and-Throughput"
\end_inset
).
Reason: restoring from backup takes extremely long because huge amounts
of data have to be restored.
MTBF = Mean Time Between Failures is (hopefully) longer thanks to redundancy,
but MTTR = Mean Time To Repair is also very long.
\end_layout
\begin_layout Standard
With (Local)Sharding, the risk of
\emph on
some
\emph default
fatal incident
\emph on
somewhere
\emph default
in the sharding pool is higher, but the
\series bold
\emph on
size
\series default
\emph default
of such an incident is smaller in three dimensions at the same time:
\end_layout
\begin_layout Enumerate
Far
\series bold
fewer customers are affected
\series default
(typically only
\begin_inset Formula $1$
\end_inset
shard out of
\begin_inset Formula $n$
\end_inset
shards).
\end_layout
\begin_layout Enumerate
\series bold
MTTR
\series default
= Mean Time To Repair is typically much shorter because there is much less
data to be restored.
\end_layout
\begin_layout Enumerate
\series bold
Residual risk
\series default
and the resulting fatal damage from
\series bold
un-repairable problems
\series default
are thus lower.
\end_layout
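\begin_layout Standard
These three dimensions can be made concrete with a deliberately simplified
model.
It is only a sketch under assumptions which are not part of the argument
above: the pool consists of
\begin_inset Formula $n$
\end_inset
equally sized, statistically independent shards, each of which fails fatally
within a given observation period with probability
\begin_inset Formula $p$
\end_inset
.
The total customer count
\begin_inset Formula $C$
\end_inset
, the total data volume
\begin_inset Formula $D$
\end_inset
, and the restore throughput
\begin_inset Formula $B$
\end_inset
per storage system are hypothetical symbols introduced here for illustration
only, and repair time is assumed to be dominated by the restore from backup.
Under these assumptions, the probability of some incident somewhere in the
pool, and of all shards failing in the same period, would be
\begin_inset Formula 
\[
P_{\mathrm{some}}=1-(1-p)^{n}\approx np\quad\text{for }np\ll1,\qquad P_{\mathrm{all}}=p^{n},
\]
\end_inset
while the size of a single-shard incident would shrink with
\begin_inset Formula $n$
\end_inset
:
\begin_inset Formula 
\[
\text{customers affected}\approx\frac{C}{n},\qquad\mathrm{MTTR}_{\mathrm{shard}}\approx\frac{D/n}{B}=\frac{1}{n}\cdot\frac{D}{B}\approx\frac{\mathrm{MTTR}_{\mathrm{central}}}{n}.
\]
\end_inset
\end_layout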
\begin_layout Standard
What does this mean from the viewpoint of an investor in a big
\begin_inset Quotes eld
\end_inset
global player
\begin_inset Quotes erd
\end_inset
company?
\end_layout
\begin_layout Standard
Let us assume, as the vendors promise, that failures of CentralStorage
occur less frequently.
But
\emph on
when
\emph default
it happens on
\series bold
enterprise-critical mass data
\series default
, the stock exchange value of the affected company will be exposed to a
\series bold
hazard
\series default
.
This is not bearable from the viewpoint of an investor.
\end_layout
\begin_layout Standard
In contrast, the (Local)Sharding model is
\emph on
distributing
\emph default
the
\series bold
inevitable incidents
\series default
(because
\series bold
perfect systems do not exist
\series default
, and
\series bold
perfect humans do not exist
\series default
) over smaller groups of customers at a higher frequency, such that the
\series bold
total impact onto the business
\series default
becomes bearable.
\end_layout
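\begin_layout Standard
A rough expected-value estimate may illustrate this trade-off.
It is a sketch only, and all symbols and numbers are assumptions introduced
for illustration: let
\begin_inset Formula $p_{c}$
\end_inset
be the probability that the CentralStorage fails fatally within a given
period,
\begin_inset Formula $p$
\end_inset
the corresponding probability for each of
\begin_inset Formula $n$
\end_inset
equally sized shards,
\begin_inset Formula $C$
\end_inset
the total number of customers, and
\begin_inset Formula $T$
\end_inset
the MTTR of the CentralStorage.
Assuming further that the restore time is proportional to the data volume,
the MTTR of one shard is
\begin_inset Formula $T/n$
\end_inset
.
The expected customer downtime per period is then
\begin_inset Formula 
\[
E_{\mathrm{central}}=p_{c}\cdot C\cdot T,\qquad E_{\mathrm{shard}}=np\cdot\frac{C}{n}\cdot\frac{T}{n}=\frac{p\,C\,T}{n},\qquad\frac{E_{\mathrm{shard}}}{E_{\mathrm{central}}}=\frac{p}{n\,p_{c}}.
\]
\end_inset
In this model, sharding has the lower expected total impact whenever
\begin_inset Formula $p<n\,p_{c}$
\end_inset
.
For example, with the purely illustrative values
\begin_inset Formula $n=100$
\end_inset
and
\begin_inset Formula $p=10\,p_{c}$
\end_inset
, the ratio is
\begin_inset Formula $0.1$
\end_inset
, although each single shard is assumed to be ten times less reliable than
the CentralStorage.
\end_layout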
\begin_layout Standard
Risk analysis of enterprise-critical use cases is summarized in the following
table:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Tabular
<lyxtabular version="3" rows="8" columns="3">
<features tabularvalignment="middle">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top" width="0pt">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
CentralStorage
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
(Local)Sharding
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Probability of
\emph on
some
\emph default
fatal incident
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
lower
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
higher
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
# Customers affected
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
very high
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
very low
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
MTBF per storage
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
higher
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
lower
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
MTTR per storage
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
higher
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
lower
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Unrepairable residual risk
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
higher
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
lower
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Total impact
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
higher
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
lower
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Investor's risk
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
unbearable
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
stock exchange compatible
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\begin_layout Standard
\noindent
Summary: CentralStorage is something for
\end_layout
\begin_layout Itemize
\noindent
small to medium-sized companies which don't have the
\series bold
manpower
\series default
and the
\series bold
skills
\series default
for professionally building and operating a (Local)Sharding (or similar)
system for the enterprise-critical mass data their business relies
upon.
\end_layout
\begin_layout Itemize
\series bold
\emph on
monolithic
\emph default
enterprise applications
\series default
like classical SAP, which are bound to a specific vendor anyway, so that you
cannot select a different solution (so-called
\series bold
Vendor Lock-In
\series default
).
\end_layout
\begin_layout Itemize
when your application
\series bold
is neither shardable
\series default
by construction (cf.
section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Distributed-vs-Local:"
\end_inset
), or when doing so would require too much effort,
\series bold
nor going to BigCluster
\begin_inset Foot
status open
\begin_layout Plain Layout
Theoretically, BigCluster can be used to create one single huge remote LV
(or one single huge remote FS instance) out of a pool of storage machines.
Double-check, better triple-check that such a
\series bold
big
\emph on
logical
\emph default
SPOF
\series default
is
\emph on
really
\emph default
needed, and cannot be circumvented by any means.
In such a case, and only then, the current version of MARS cannot help (yet), because
its
\emph on
current
\emph default
\emph on
focus
\emph default
is on a big number of machines each having relatively small LVs.
At 1&1 ShaHoLin, the biggest LVs are 40 TiB at the moment and have been running
for years; bigger ones are certainly possible.
Only when current local RAID technology with external enclosures cannot
easily create a single LV at the petabyte scale is BigCluster probably
the better solution (cf.
section
\begin_inset CommandInset ref
LatexCommand vref
reference "sec:Reliability-Arguments-from"
\end_inset
).
\end_layout
\end_inset
\series default
(e.g.
Ceph / Swift / etc., see section
\begin_inset CommandInset ref
LatexCommand vref
reference "sec:Reliability-Arguments-from"
\end_inset
) is an option.
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
If you have an
\emph on
already sharded
\emph default
system, e.g.
in webhosting, don't convert it to a non-shardable one, and don't introduce
SPOFs needlessly.
You will introduce
\series bold
technical debt
\series default
which is likely to hurt you at some point in the future!
\end_layout
\begin_layout Standard
As a really big
\begin_inset Quotes eld
\end_inset
global player
\begin_inset Quotes erd
\end_inset
, or as a company being part of such a structure, you should be careful
when listening to
\begin_inset Quotes eld
\end_inset
marketing drones
\begin_inset Quotes erd
\end_inset
of proprietary CentralStorage vendors.
Always check your
\emph on
concrete
\emph default
use case.
Never believe overgeneralized claims which are valid only in some
specific context and do not really apply to your use case.
It could be about your
\emph on
life
\emph default
.
\end_layout
\begin_layout Subsection
Proprietary vs OpenSource
\begin_inset CommandInset label