April 4, 2005

HTTP Load Balancing and Fail Over using DNS

This article describes how to set up two web servers and two load balancers cheaply and efficiently. Three scenarios are presented: the first with one Internet Service Provider (ISP), the second with two ISPs, and the last with two ISPs where one is preferred over the other.

In the web site business, we usually try to keep the best uptime possible. A great way of doing this is by having two identical web servers. In order to make two web servers look like one to the outside world, we need a load balancer. Having only one load balancer, however, introduces a new single point of failure, which is exactly what we were trying to remove by adding a second web server. That's why we'll be using two load balancers. Other articles describe ways of doing this with CARP; I describe it using DNS. Both solutions have strengths and weaknesses:

System: DNS

Strengths:
  • Stateless
  • DNS is well established and tested
  • The load balancers work in parallel

Weaknesses:
  • Not really what DNS was designed for
  • Will introduce a delay (caused by a DNS timeout) when a load balancer is down
  • We need to use a small Time To Live (TTL) value for the web server's hostname, so that will increase the DNS traffic.
  • In the worst case scenario, a client will not be able to reach the web server for about 30 seconds.

System: CARP

Strengths:
  • Designed for this exact purpose.
  • Designed by the OpenBSD guys
  • In the worst case scenario, a client will not be able to reach the web server for a few seconds (existing connections will be cut).

Weaknesses:
  • Introduces states in the system
  • Another daemon is running on each server
  • Only one load balancer can work at a time (the other is a hot backup)
  • Relatively young technology/implementation

UPDATE (2005-04-22): After writing this article, I had the chance to read this analysis that says "that, while a majority of clients and local DNS servers honor DNS TTLs, a significant fraction does not (up to 47% of clients and local DNS servers collectively, and 14% of local DNS servers in our measurements). Moreover, those that violate TTLs do so by a large amount, in excess of 2 hours." That pretty much puts the last nail in the DNS approach's coffin. However, as the analysis points out, if you have a limited number of clients, you can make sure that they configure their DNS correctly so that they respect your TTL values.

One ISP

 ISP1 -----<+>----------------------------------------
         Router      |       |       |       |
                     |       |       |       |
                   [LB1]   [LB2]   [WWW1]--[WWW2]

LB1 and LB2 are small servers (likely a Soekris), each running two daemons: a DNS server (tinydns) and an HTTP load balancer (pen).

WWW1 and WWW2 are two web servers, with a dedicated link in between for real-time synchronisation.

The nameservers on the registrar's list and in each DNS zone described below are: lb1.example.tld and lb2.example.tld.

LB1

This is our first load balancer/DNS server. It has a complete zone for example.tld, with a special resource record (RR) that makes www.example.tld point to itself.

DNS RR: www.example.tld. 30 IN CNAME lb1.example.tld.

tinydns syntax: Cwww.example.tld:lb1.example.tld:30

Pen syntax: pen -j /var/empty -u nobody -h lb1:80 www1:80 www2:80

LB2

This is a mirror of LB1: it makes www.example.tld point to itself.

DNS RR: www.example.tld. 30 IN CNAME lb2.example.tld.

tinydns syntax: Cwww.example.tld:lb2.example.tld:30

Pen syntax: pen -j /var/empty -u nobody -h lb2:80 www1:80 www2:80
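The way these two zones work together can be sketched with a toy model in Python. This is only an illustration and not part of the setup itself: the `resolve` function and the `up` dictionary are hypothetical, and real resolvers differ in how they order name servers and how long they wait before retrying. The point it shows is that whichever name server answers hands out its own name, so a load balancer that is down never advertises itself.

```python
import random

# Each load balancer's DNS server answers with a CNAME pointing at itself.
ZONES = {
    "lb1.example.tld": "lb1.example.tld",
    "lb2.example.tld": "lb2.example.tld",
}

def resolve(up, rng):
    """Return (cname_target, timeouts) for a lookup of www.example.tld.

    `up` maps each name server to True/False. The resolver tries the name
    servers in random order; each dead one costs one DNS timeout.
    """
    servers = list(ZONES)
    rng.shuffle(servers)
    timeouts = 0
    for ns in servers:
        if up[ns]:
            return ZONES[ns], timeouts
        timeouts += 1
    return None, timeouts

rng = random.Random(0)

# Both load balancers up: lookups are spread over both of them.
both_up = {"lb1.example.tld": True, "lb2.example.tld": True}
answers = [resolve(both_up, rng)[0] for _ in range(1000)]
print(answers.count("lb1.example.tld"), answers.count("lb2.example.tld"))

# LB1 down: every lookup ends at LB2, sometimes after one timeout.
lb1_down = {"lb1.example.tld": False, "lb2.example.tld": True}
results = [resolve(lb1_down, rng) for _ in range(1000)]
print({target for target, _ in results})
print(max(timeouts for _, timeouts in results))
```

The timeout counted in the second case is the delay listed among the weaknesses of the DNS approach: a client whose resolver first asks the dead load balancer has to wait before it is sent to the live one.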

In this scenario, the only single point of failure is the network. To avoid this problem, we must use another ISP, as shown in the example below.

Two ISPs without preference between ISPs

 ISP1 -----<+>-------------------------------------------
         Router      |       |       |       |
                     |       |       |       |
                   [LB1]   [LB2]  [WWW1]--[WWW2]
                     |       |       |       |
         Router      |       |       |       |
 ISP2 -----<+>-------------------------------------------

LB1 and LB2 each run four daemons: two DNS servers (tinydns) and two load balancers (pen), that is, one DNS server and one load balancer per interface.

WWW1 and WWW2 are two web servers, with a dedicated link in between for real-time synchronisation. They have three interfaces, one for each ISP and one for the synchronisation.

The nameservers on the registrar's list and in each zone below are: lb1-isp1.example.tld, lb2-isp1.example.tld, lb1-isp2.example.tld, and lb2-isp2.example.tld.

LB1-ISP1 (ISP1 network interface on LB1)

DNS RR: www.example.tld. 30 IN CNAME lb1-isp1.example.tld.

tinydns syntax: Cwww.example.tld:lb1-isp1.example.tld:30

Pen syntax: pen -j /var/empty -u nobody -h lb1-isp1:80 www1-isp1:80 www2-isp1:80

LB1-ISP2 (ISP2 network interface on LB1)

DNS RR: www.example.tld. 30 IN CNAME lb1-isp2.example.tld.

tinydns syntax: Cwww.example.tld:lb1-isp2.example.tld:30

Pen syntax: pen -j /var/empty -u nobody -h lb1-isp2:80 www1-isp2:80 www2-isp2:80

LB2-ISP1 (ISP1 network interface on LB2)

DNS RR: www.example.tld. 30 IN CNAME lb2-isp1.example.tld.

tinydns syntax: Cwww.example.tld:lb2-isp1.example.tld:30

Pen syntax: pen -j /var/empty -u nobody -h lb2-isp1:80 www1-isp1:80 www2-isp1:80

LB2-ISP2 (ISP2 network interface on LB2)

DNS RR: www.example.tld. 30 IN CNAME lb2-isp2.example.tld.

tinydns syntax: Cwww.example.tld:lb2-isp2.example.tld:30

Pen syntax: pen -j /var/empty -u nobody -h lb2-isp2:80 www1-isp2:80 www2-isp2:80

This scenario has the great advantage of being completely stateless (conceptually) and of having no single point of failure, except human error, maybe...
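Since the four per-interface configurations follow a single pattern, they could be generated rather than typed by hand. The following is a minimal Python sketch under that assumption; the `lb_config` helper and its parameters are hypothetical names introduced here, and the output simply reuses the tinydns and pen syntax shown above. Each pen instance listens on the name of its own interface and forwards to the web servers' addresses on the same ISP.

```python
def lb_config(lb, isp, backends=("www1", "www2"), domain="example.tld", ttl=30):
    """Build the tinydns-data line and pen command for one LB interface."""
    iface = f"{lb}-{isp}"  # e.g. "lb2-isp1"
    # tinydns-data CNAME line: www points at this very interface.
    tinydns = f"Cwww.{domain}:{iface}.{domain}:{ttl}"
    # pen listens on this interface and balances over the same-ISP backends.
    servers = " ".join(f"{backend}-{isp}:80" for backend in backends)
    pen = f"pen -j /var/empty -u nobody -h {iface}:80 {servers}"
    return tinydns, pen

for lb in ("lb1", "lb2"):
    for isp in ("isp1", "isp2"):
        tinydns, pen = lb_config(lb, isp)
        print(f"# {lb}-{isp}")
        print(tinydns)
        print(pen)
```

Generating the lines this way also guards against copy-and-paste slips between the four nearly identical blocks.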

Two ISPs with preference between ISPs

   ISP1 -----<+>-------------------------------------------
 Preferred  Router      |       |       |       |
                        |       |       |       |
                      [LB1]   [LB2]  [WWW1]--[WWW2]
                        |       |       |       |
            Router      |       |       |       |
   ISP2 -----<+>-------------------------------------------

In some cases, one ISP is cheaper than the other; if ISP1 is the cheaper one, you'd rather use it as much as possible. The setup is very similar to the scenario without preference, except for the DNS servers on the ISP2 side, so I'll only show the configuration for those servers.

You need a way of figuring out whether ISP1 is working. For this, we must introduce the concept of states, so this scenario is not "stateless" anymore.

If ISP1 is working (state ISP1UP), then we should only use it; otherwise we should use ISP2 (state ISP1DOWN). An easy way of doing this would be to ping an external server every 30 seconds from the ISP1 interface of each DNS server (LB1 and LB2). This is actually the hardest step of the whole process, because ISP1 might be up only for some people, and it would be pretty hard to figure out when to consider it down or up.
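One possible way to make the up/down decision less jumpy is to flip the state only after several consecutive probes disagree with it. The Python sketch below illustrates that idea only; the `Isp1Monitor` class and the threshold of three probes are assumptions made for this example, not something the setup prescribes. The probe itself (a ping sent from the ISP1 interface every 30 seconds) is left out, because the ping option for choosing the source interface differs between systems.

```python
class Isp1Monitor:
    """Track ISP1UP / ISP1DOWN from a stream of probe results.

    The state only flips after `threshold` consecutive results that
    disagree with it, so a single lost ping does not cause a switch.
    """

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.isp1_up = True  # assume ISP1 works at start-up
        self._streak = 0     # consecutive results contradicting the state

    def record(self, probe_ok):
        """Feed one probe result (True/False); return the current state."""
        if probe_ok == self.isp1_up:
            self._streak = 0
        else:
            self._streak += 1
            if self._streak >= self.threshold:
                self.isp1_up = probe_ok
                self._streak = 0
        return self.isp1_up

monitor = Isp1Monitor(threshold=3)
probes = [True, False, True, False, False, False, True, True, True]
states = [monitor.record(ok) for ok in probes]
print(states)
```

This does nothing about the harder problem that ISP1 may be reachable for some clients and not for others; it only keeps a single dropped ping from moving all traffic to ISP2.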

LB1-ISP2 (ISP2 network interface on LB1)

DNS pseudo RR:
if (ISP1UP) then
    www.example.tld. 30 IN CNAME lb1-isp1.example.tld.
else
    www.example.tld. 30 IN CNAME lb1-isp2.example.tld.
end if

LB2-ISP2 (ISP2 network interface on LB2)

DNS pseudo RR:
if (ISP1UP) then
    www.example.tld. 30 IN CNAME lb2-isp1.example.tld.
else
    www.example.tld. 30 IN CNAME lb2-isp2.example.tld.
end if
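Since tinydns serves static data, the conditional above has to be turned into a concrete record each time the state changes. As a rough Python sketch (the `www_cname_line` helper is a hypothetical name, and how the state is obtained is left open), the line to publish on the ISP2 interface could be chosen like this:

```python
def www_cname_line(lb, isp1_up, domain="example.tld", ttl=30):
    """tinydns-data line served on the ISP2 interface of load balancer `lb`."""
    # Point at the ISP1 interface while ISP1 is up, else at the ISP2 one.
    isp = "isp1" if isp1_up else "isp2"
    return f"Cwww.{domain}:{lb}-{isp}.{domain}:{ttl}"

print(www_cname_line("lb1", isp1_up=True))
print(www_cname_line("lb1", isp1_up=False))
print(www_cname_line("lb2", isp1_up=False))
```

After the chosen line is written into the tinydns data file, the file has to be recompiled with tinydns-data before the DNS server starts handing out the new answer.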

This way, when ISP1 is running, ISP2 will only be transporting about half of the DNS queries, which should not amount to a lot of bandwidth compared to the HTTP traffic.

Posted by gfk on April 4, 2005 7:03 PM

Comments

The article is excellent. In my case, I'm trying to do multihoming without using BGP while still using CARP with two Soekris 4801s. It's really not easy. For example: ISP1: PPPoE connection; ISP2: PPPoE connection. I think that, for now, I'm better off giving up on it.

Posted by: Frédéric Lebel on July 5, 2005 12:20 AM
