DOC: update intro.txt for 2.2
A number of things have changed since last update, for example caching and fastcgi were not mentioned.
parent a4d9ee3d1c
commit ec8962cb5a
162
doc/intro.txt
@@ -289,16 +289,23 @@ HAProxy is :
 
   - a TCP proxy : it can accept a TCP connection from a listening socket,
     connect to a server and attach these sockets together allowing traffic to
-    flow in both directions;
+    flow in both directions; IPv4, IPv6 and even UNIX sockets are supported on
+    either side, so this can provide an easy way to translate addresses between
+    different families.
 
   - an HTTP reverse-proxy (called a "gateway" in HTTP terminology) : it presents
     itself as a server, receives HTTP requests over connections accepted on a
     listening TCP socket, and passes the requests from these connections to
-    servers using different connections.
+    servers using different connections. It may use any combination of HTTP/1.x
+    or HTTP/2 on any side and will even automatically detect the protocol
+    spoken on each side when ALPN is used over TLS.
 
   - an SSL terminator / initiator / offloader : SSL/TLS may be used on the
     connection coming from the client, on the connection going to the server,
-    or even on both connections.
+    or even on both connections. A lot of settings can be applied per name
+    (SNI), and may be updated at runtime without restarting. Such setups are
+    extremely scalable and deployments involving tens to hundreds of thousands
+    of certificates have been reported.
 
   - a TCP normalizer : since connections are locally terminated by the operating
     system, there is no relation between both sides, so abnormal traffic such as
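The roles extended by the hunk above (address family translation, ALPN-based protocol detection, per-SNI certificates) can be illustrated with a minimal configuration sketch. The section names, addresses, socket path and certificate directory are assumptions made for the example and do not come from the commit.

```haproxy
# Hypothetical names, paths and addresses throughout.
defaults
    timeout connect 5s
    timeout client  30s
    timeout server  30s

# TCP proxy translating between address families: accepts IPv6 and IPv4
# clients and forwards the traffic to a server listening on a UNIX socket.
listen tcp_in
    mode tcp
    bind :::5000 v4v6
    server local1 unix@/run/app/app.sock

# HTTP reverse-proxy terminating TLS: HTTP/2 or HTTP/1.1 is negotiated with
# the client via ALPN, and the certificate is selected per SNI among those
# loaded from the directory.
frontend https_in
    mode http
    bind :443 ssl crt /etc/haproxy/certs/ alpn h2,http/1.1
    default_backend web

backend web
    mode http
    server web1 192.0.2.10:8080
```

Here the client side and the server side of each proxy use different address families or protocols, which is the point made by the updated bullets.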
@@ -344,6 +351,23 @@ HAProxy is :
     compressed by the server, thus reducing the page load time for clients with
     poor connectivity or using high-latency, mobile networks.
 
+  - a caching proxy : it may cache responses in RAM so that subsequent requests
+    for the same object avoid the cost of another network transfer from the
+    server as long as the object remains present and valid. It will however not
+    store objects to any persistent storage. Please note that this caching
+    feature is designed to be maintenance free and focuses solely on saving
+    haproxy's precious resources and not the server's resources. Caches
+    designed to optimize servers require much more tuning and flexibility. If
+    you instead need such an advanced cache, please use Varnish Cache, which
+    integrates perfectly with haproxy, especially when SSL/TLS is needed on any
+    side.
+
+  - a FastCGI gateway : FastCGI can be seen as a different representation of
+    HTTP, and as such, HAProxy can directly load-balance a farm comprising any
+    combination of FastCGI application servers without having to insert
+    another level of gateway between them. This results in resource savings and
+    a reduction of maintenance costs.
+
 HAProxy is not :
 
   - an explicit HTTP proxy, i.e. the proxy that browsers use to reach the
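The two roles added by the hunk above, the RAM cache and the FastCGI gateway, might be combined as in the following fragment. It is only a sketch: the names "smallobjects", "php-fpm", "web" and "apps", the document root, the sizes and the addresses are all assumptions, and the usual "defaults" section with timeouts is omitted.

```haproxy
# Hypothetical names, sizes, paths and addresses throughout.
cache smallobjects
    total-max-size 64        # RAM dedicated to the cache, in megabytes
    max-object-size 10000    # largest cacheable object, in bytes
    max-age 5                # seconds during which an object stays valid

fcgi-app php-fpm
    docroot /var/www/html
    index index.php
    log-stderr global

frontend web
    mode http
    bind :80
    http-request cache-use smallobjects
    http-response cache-store smallobjects
    default_backend apps

backend apps
    mode http
    use-fcgi-app php-fpm
    # FastCGI application servers are addressed directly, with no extra gateway
    server app1 192.0.2.10:9000 proto fcgi
    server app2 192.0.2.11:9000 proto fcgi
```

Nothing is written to disk by the cache, which matches the statement that objects are never stored to persistent storage.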
@@ -351,20 +375,15 @@ HAProxy is not :
     such as Squid. However HAProxy can be installed in front of such a proxy to
     provide load balancing and high availability.
 
-  - a caching proxy : it will return the contents received from the server as-is
-    and will not interfere with any caching policy. There are excellent
-    open-source software for this task such as Varnish. HAProxy can be installed
-    in front of such a cache to provide SSL offloading, and scalability through
-    smart load balancing.
-
   - a data scrubber : it will not modify the body of requests nor responses.
 
-  - a web server : during startup, it isolates itself inside a chroot jail and
-    drops its privileges, so that it will not perform any single file-system
-    access once started. As such it cannot be turned into a web server. There
-    are excellent open-source software for this such as Apache or Nginx, and
-    HAProxy can be installed in front of them to provide load balancing and
-    high availability.
+  - a static web server : during startup, it isolates itself inside a chroot
+    jail and drops its privileges, so that it will not perform any single file-
+    system access once started. As such it cannot be turned into a static web
+    server (dynamic servers are supported through FastCGI however). There is
+    excellent open-source software for this such as Apache or Nginx, and
+    HAProxy can be easily installed in front of them to provide load balancing,
+    high availability and acceleration.
 
   - a packet-based load balancer : it will not see IP packets nor UDP datagrams,
     will not perform NAT or even less DSR. These are tasks for lower layers.
@@ -375,33 +394,42 @@ HAProxy is not :
 3.2. How HAProxy works
 ----------------------
 
-HAProxy is a single-threaded, event-driven, non-blocking engine combining a very
-fast I/O layer with a priority-based scheduler. As it is designed with a data
+HAProxy is an event-driven, non-blocking engine combining a very fast I/O layer
+with a priority-based, multi-threaded scheduler. As it is designed with a data
 forwarding goal in mind, its architecture is optimized to move data as fast as
-possible with the least possible operations. As such it implements a layered
-model offering bypass mechanisms at each level ensuring data doesn't reach
-higher levels unless needed. Most of the processing is performed in the kernel,
-and HAProxy does its best to help the kernel do the work as fast as possible by
-giving some hints or by avoiding certain operation when it guesses they could
-be grouped later. As a result, typical figures show 15% of the processing time
-spent in HAProxy versus 85% in the kernel in TCP or HTTP close mode, and about
-30% for HAProxy versus 70% for the kernel in HTTP keep-alive mode.
+possible with the fewest possible operations. It focuses on optimizing the CPU
+cache's efficiency by sticking connections to the same CPU as long as possible.
+As such it implements a layered model offering bypass mechanisms at each level
+ensuring data doesn't reach higher levels unless needed. Most of the processing
+is performed in the kernel, and HAProxy does its best to help the kernel do the
+work as fast as possible by giving some hints or by avoiding certain operations
+when it guesses they could be grouped later. As a result, typical figures show
+15% of the processing time spent in HAProxy versus 85% in the kernel in TCP or
+HTTP close mode, and about 30% for HAProxy versus 70% for the kernel in HTTP
+keep-alive mode.
 
 A single process can run many proxy instances; configurations as large as
-300000 distinct proxies in a single process were reported to run fine. Thus
-there is usually no need to start more than one process for all instances.
+300000 distinct proxies in a single process were reported to run fine. A single
+core, single CPU setup is far more than enough for more than 99% of users, and
+as such, users of containers and virtual machines are encouraged to use the
+absolute smallest images they can get to save on operational costs and simplify
+troubleshooting. However the machine HAProxy runs on must never ever swap, and
+its CPU must not be artificially throttled (sub-CPU allocation in hypervisors)
+nor be shared with compute-intensive processes which would induce a very high
+context-switch latency.
 
-It is possible to make HAProxy run over multiple processes, but it comes with
-a few limitations. In general it doesn't make sense in HTTP close or TCP modes
-because the kernel-side doesn't scale very well with some operations such as
-connect(). It scales pretty well for HTTP keep-alive mode but the performance
-that can be achieved out of a single process generally outperforms common needs
-by an order of magnitude. It does however make sense when used as an SSL
-offloader, and this feature is well supported in multi-process mode.
+Threading allows exploiting all available processing capacity by using one
+thread per CPU core. This is mostly useful for SSL or when data forwarding
+rates above 40 Gbps are needed. In such cases it is critically important to
+avoid communications between multiple physical CPUs, which can cause strong
+bottlenecks in the network stack and in HAProxy itself. While counter-intuitive
+to some, the first thing to do when facing some performance issues is often to
+reduce the number of CPUs HAProxy runs on.
 
 HAProxy only requires the haproxy executable and a configuration file to run.
 For logging it is highly recommended to have a properly configured syslog daemon
-and log rotations in place. The configuration files are parsed before starting,
+and log rotations in place. Logs may also be sent to stdout/stderr, which can be
+useful inside containers. The configuration files are parsed before starting,
 then HAProxy tries to bind all listening sockets, and refuses to start if
 anything fails. Past this point it cannot fail anymore. This means that there
 are no runtime failures and that if it accepts to start, it will work until it
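The threading advice in the hunk above (one thread per core, all on the same physical CPU) could translate into a "global" section such as the one below. The thread count and the CPU numbers are arbitrary, and the sketch assumes that CPUs 0 to 3 all belong to the same physical package, which has to be checked on the actual machine.

```haproxy
global
    # Example values only: 4 threads, assuming CPUs 0-3 sit on one socket.
    nbthread 4
    # Bind threads 1-4 of process 1 to CPUs 0-3, one thread per CPU.
    cpu-map auto:1/1-4 0-3
```

When troubleshooting performance, lowering "nbthread" and narrowing the CPU set is the kind of reduction the text recommends trying first.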
@@ -651,7 +679,7 @@ ensure the best global service continuity :
 HAProxy offers a fairly complete set of load balancing features, most of which
 are unfortunately not available in a number of other load balancing products :
 
-  - no less than 9 load balancing algorithms are supported, some of which apply
+  - no less than 10 load balancing algorithms are supported, some of which apply
     to input data to offer an infinite list of possibilities. The most common
     ones are round-robin (for short connections, pick each server in turn),
     leastconn (for long connections, pick the least recently used of the servers
@@ -947,10 +975,10 @@ for logging purposes, which explains why it's still called "log-format". These
 strings contain escape characters allowing to introduce various dynamic data
 including variables and sample fetch expressions into strings, and even to
 adjust the encoding while the result is being turned into a string (for example,
-adding quotes). This provides a powerful way to build header contents or to
-customize log lines. Additionally, in order to remain simple to build most
-common strings, about 50 special tags are provided as shortcuts for information
-commonly used in logs.
+adding quotes). This provides a powerful way to build header contents, to build
+response data or even response templates, or to customize log lines.
+Additionally, in order to remain simple to build most common strings, about 50
+special tags are provided as shortcuts for information commonly used in logs.
 
 
 3.3.13. Basic features : HTTP rewriting and redirection
@@ -994,6 +1022,9 @@ redirects, among which :
     a specific cookie, dropping the query string, appending a slash if missing,
     and so on;
 
+  - a powerful "return" directive allows customizing every part of a response
+    (status, headers, body) using dynamic contents or even template files.
+
   - all operations support ACL-based conditions;
 
 
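The "return" directive mentioned in the hunk above can be sketched as follows. The frontend name, port, path and header are made up for the example; the response body is a log-format string, so it may embed sample fetches such as the client's source address.

```haproxy
frontend web
    mode http
    bind :80
    # Hypothetical rule: answer /health directly, without contacting a server.
    http-request return status 200 content-type "text/plain" lf-string "OK, seen from %[src]\n" hdr X-Served-By haproxy if { path /health }
```

The trailing condition is an anonymous ACL, in line with the statement that all operations support ACL-based conditions.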
@@ -1088,7 +1119,10 @@ server for example.
 
 Each frontend and backend may use multiple independent log outputs, which eases
 multi-tenancy. Logs are preferably sent over UDP, maybe JSON-encoded, and are
-truncated after a configurable line length in order to guarantee delivery.
+truncated after a configurable line length in order to guarantee delivery. But
+it is also possible to send them to stdout/stderr or any file descriptor, as
+well as to a ring buffer that a client can subscribe to in order to retrieve
+them.
 
 
 3.3.16. Basic features : Statistics
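The log outputs listed in the hunk above might be declared as in this fragment. The ring name "buf1", its sizes, the syslog server address and the facilities are assumptions chosen for illustration.

```haproxy
# Hypothetical ring buffer holding log lines in memory.
ring buf1
    description "in-memory log buffer"
    format rfc3164
    maxlen 1200
    size 32764

frontend web
    mode http
    bind :80
    option httplog
    log stdout format raw local0 info     # handy inside containers
    log ring@buf1 local1                  # kept in the ring buffer
    log 192.0.2.50:514 len 1024 local2    # UDP syslog, lines truncated at 1024
```

A client connected to the CLI can then follow the ring with the "show events" command (for example "show events buf1 -w"), assuming a stats socket is configured.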
@@ -1106,6 +1140,9 @@ may import to draw graphs. The page may self-refresh to be used as a monitoring
 page on a large display. In administration mode, the page also allows to change
 server state to ease maintenance operations.
 
+A Prometheus exporter is also provided so that the statistics can be consumed
+in a different format depending on the deployment.
+
 
 3.4. Advanced features
 ----------------------
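A possible way to expose both the statistics page and the Prometheus exporter mentioned in the hunk above is sketched below. The frontend name, the port and the URIs are assumptions, and depending on how the binary was built, the exporter may first need to be compiled in.

```haproxy
frontend stats
    mode http
    bind :8404
    # Prometheus scrapes /metrics, humans browse /stats (hypothetical URIs).
    http-request use-service prometheus-exporter if { path /metrics }
    stats enable
    stats uri /stats
    stats refresh 10s
```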
@@ -1158,6 +1195,8 @@ entries from ACLs and maps, update TLS shared secrets, apply connection limits
 and rate limits on the fly to arbitrary frontends (useful in shared hosting
 environments), and disable a specific frontend to release a listening port
 (useful when daytime operations are forbidden and a fix is needed nonetheless).
+Updating certificates and their configuration on the fly is permitted, as well
+as enabling and consulting traces of every processing step of the traffic.
 
 For environments where SNMP is mandatory, at least two agents exist, one is
 provided with the HAProxy sources and relies on the Net-SNMP Perl module.
@@ -1233,6 +1272,10 @@ The common effects are spurious timeouts or application freezes. Thus if this
 behavior is detected on a system, it must be fixed, regardless of the fact that
 HAProxy protects itself against it.
 
+On Linux, a new starting process may communicate with the previous one to reuse
+its listening file descriptors so that the listening sockets are never
+interrupted during the process' replacement.
+
 
 3.4.3. Advanced features : Scripting
 ------------------------------------
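The listening socket handover described in the hunk above relies on the CLI socket exposing the listeners' file descriptors. A minimal sketch, assuming a socket path of /var/run/haproxy.sock:

```haproxy
global
    # Let a new process fetch the listening file descriptors from this one.
    stats socket /var/run/haproxy.sock mode 600 level admin expose-fd listeners
```

The new process is then pointed at that socket with the "-x" command-line option, typically together with "-sf" and the PIDs of the old process so that the latter finishes gracefully.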
@@ -1246,6 +1289,18 @@ authentication system for example. Please refer to the documentation in the file
 "doc/lua-api/index.rst" for more information on how to use Lua.
 
 
+3.4.4. Advanced features : Tracing
+----------------------------------
+
+At any moment an administrator may connect over the CLI and enable tracing in
+various internal subsystems. Various levels of detail are provided by default
+so that in practice anything from one line per request to 500 lines per
+request can be retrieved. Filters as well as an automatic capture on/off/pause
+mechanism are available so that it really is possible to wait for a certain
+event and watch it in detail. This is extremely convenient to diagnose protocol
+violations from faulty servers and clients, or denial of service attacks.
+
+
 3.5. Sizing
 -----------
 
@@ -1386,7 +1441,11 @@ discover it was already fixed. This process also ensures that regressions in a
 stable branch are extremely rare, so there is never any excuse for not upgrading
 to the latest version in your current branch.
 
-Branches are numbered with two digits delimited with a dot, such as "1.6". A
+Branches are numbered with two digits delimited with a dot, such as "1.6".
+Since 1.9, branches with an odd second digit are mostly focused on sensitive
+technical updates and more aimed at advanced users because they are likely to
+trigger more bugs than the other ones. They are maintained for about a year
+only and must not be deployed where they cannot be rolled back in emergency. A
 complete version includes one or two sub-version numbers indicating the level of
 fix. For example, version 1.5.14 is the 14th fix release in branch 1.5 after
 version 1.5.0 was issued. It contains 126 fixes for individual bugs, 24 updates
@@ -1405,6 +1464,11 @@ HAProxy is available from multiple sources, at different release rhythms :
     sources only, so whatever comes from there needs to be rebuilt and/or
     repackaged;
 
+  - GitHub : https://github.com/haproxy/haproxy/ : this is the mirror for the
+    development branch only, which provides integration with the issue tracker,
+    continuous integration and code coverage tools. This is exclusively for
+    contributors;
+
   - A number of operating systems such as Linux distributions and BSD ports.
     These systems generally provide long-term maintained versions which do not
     always contain all the fixes from the official ones, but which at least
@@ -1451,6 +1515,10 @@ branch, you need to proceed this way :
 
       HA-Proxy version 1.5.0-994126-357 2015/07/02
 
+    In addition, versions 2.1 and above will include a "Status" line indicating
+    whether the version is safe for production or not, and if so, till when, as
+    well as a link to the list of known bugs affecting this version.
+
   - for system-specific packages, you have to check with your vendor's package
     repository or update system to ensure that your system is still supported,
     and that fixes are still provided for your branch. For community versions
@@ -1531,7 +1599,15 @@ unless the traffic is low.
 When building large caching farms across multiple nodes, HAProxy can make use of
 consistent URL hashing to intelligently distribute the load to the caching nodes
 and avoid cache duplication, resulting in a total cache size which is the sum of
-all caching nodes.
+all caching nodes. In addition, caching of very small dumb objects for a short
+duration on HAProxy can sometimes save network round trips and reduce the CPU
+load on both the HAProxy and the Varnish nodes. This is only possible if no
+processing is done on these objects on Varnish (this is often referred to as
+the notion of "favicon cache", by which a sizeable percentage of useless
+downstream requests can sometimes be avoided). However do not enable HAProxy
+caching for a long time (more than a few seconds) in front of any other cache;
+that would significantly complicate troubleshooting without providing really
+significant savings.
 
 
 4.4. Alternatives
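The setup described in the hunk above, consistent URL hashing towards a cache farm with a very short-lived local cache in front of it, might look like the fragment below. The names, the sizes, the 2-second lifetime and the addresses are assumptions; Varnish is assumed to listen on port 6081.

```haproxy
# Hypothetical "favicon cache": tiny objects, kept for a couple of seconds.
cache favicon
    total-max-size 4         # megabytes of RAM
    max-object-size 4096     # bytes, so only very small objects qualify
    max-age 2                # seconds; keep it short in front of another cache

backend varnish_farm
    mode http
    balance uri              # pick the node from a hash of the URI
    hash-type consistent     # limit remapping when nodes come and go
    http-request cache-use favicon
    http-response cache-store favicon
    server varnish1 192.0.2.21:6081 check
    server varnish2 192.0.2.22:6081 check
    server varnish3 192.0.2.23:6081 check
```

Because a given URI always maps to the same node, each object is stored once across the farm, which is why the total cache size is the sum of all nodes.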