An Intro To Load Balancing
By Jay Fougere
Article Date: 2009-11-03
Load balancing is a general term that covers several technologies. The common theme in all of them is distributing computing load across more than one server. Clustering can be an effective form of load balancing; however, I consider it distinct enough to be beyond the scope of this article.
Let us start by looking at the reasons for load balancing. Often the load of a particular application is simply too much for a single server. In other cases a service is considered too important to allow for downtime, and load balancing is used for high availability. Frequently it is used for a combination of both reasons.
There are many methods of load balancing, and the right one depends on what the service in question requires. As an example, a busy website will often be served from several servers. If no persistence is required (i.e. no SSL, no sessions, etc.), setting up round robin DNS entries can be a simple yet effective solution.
Round robin DNS simply cycles through the IP addresses listed for a domain record. For instance, suppose I had the following entries in DNS:
www.example.com. IN A 192.168.1.100
www.example.com. IN A 192.168.1.101
www.example.com. IN A 192.168.1.102
www.example.com. IN A 192.168.1.103
The first request for www.example.com would get the server with IP address 192.168.1.100, the next request would get the server with IP address 192.168.1.101, and so on, wrapping back to the first address after the last one has been handed out.
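To make the rotation concrete, here is a minimal Python sketch that hands out the four example addresses in turn. It only models the idea of cycling through the list. A real DNS server typically returns the whole record set in a rotated order, and caching resolvers affect how evenly clients are actually spread, so treat this as an illustration rather than a description of any particular DNS server. The function names are my own.

```python
from itertools import cycle

# Addresses taken from the example DNS records above.
ADDRESSES = [
    "192.168.1.100",
    "192.168.1.101",
    "192.168.1.102",
    "192.168.1.103",
]


def round_robin(addresses):
    """Return an iterator that yields the addresses in order, forever."""
    return cycle(addresses)


def first_answers(addresses, count):
    """Return the address handed to each of the first `count` requests."""
    rotation = round_robin(addresses)
    return [next(rotation) for _ in range(count)]


if __name__ == "__main__":
    for number, address in enumerate(first_answers(ADDRESSES, 6), start=1):
        print(f"request {number} -> {address}")
```

Note that nothing here remembers which client got which address, which is exactly why this approach falls short once sessions are involved.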
While extremely simple and effective, round robin DNS is not a solution where a website requires some sort of persistence. Suppose you are running a large internet forum that has unique member logins for all users of the site. With round robin, a member may log in on one server, but when that member browses the site with the session just created and reaches one of the other servers, that server will have no record of the member's session and will force them to log in again. The site will appear broken, and for all intents and purposes it is broken!
This is where OSI layer 4 load balancing with persistence can be an effective solution. Generally, this configuration uses a dedicated load balancing machine with a single public IP address for the site and, on another network interface, a private IP address that acts as the gateway for the actual web servers, which reside on a private network block. Since this type of load balancing works by using a form of NAT (Network Address Translation), the load balancer can keep track of which server answered which client's requests and keep sending that client to the same server. This allows things such as sessions to function as expected.
Layer 4 switching/load balancing can be used to solve several problems. First of all, the web servers, being on a private subnet, are isolated from many public attacks. Secondly, servers can be added to and removed from the pool of available servers without affecting the uptime of the website. To make this setup truly redundant, two or more load balancers are usually configured, with one acting as a master that is monitored by the others. If the master load balancer fails, a secondary load balancer assumes the function of the master.
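The bookkeeping behind persistence can be sketched in a few lines of Python. This is a toy model under my own assumptions, not how any particular product works: it keys persistence on the client IP address alone, the backend addresses (10.0.0.x) and client addresses (203.0.113.x) are made up for illustration, and a real load balancer would also expire old entries and rewrite packets.

```python
class StickyBalancer:
    """Toy model of a load balancer with source-IP persistence."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.next_index = 0      # position of the next backend in the rotation
        self.assignments = {}    # client IP -> backend IP

    def backend_for(self, client_ip):
        """Return the backend for a client, assigning one on first contact."""
        if client_ip not in self.assignments:
            backend = self.backends[self.next_index % len(self.backends)]
            self.next_index += 1
            self.assignments[client_ip] = backend
        return self.assignments[client_ip]


if __name__ == "__main__":
    # Hypothetical private addresses for the web servers.
    balancer = StickyBalancer(["10.0.0.11", "10.0.0.12"])
    for client in ["203.0.113.5", "203.0.113.9", "203.0.113.5"]:
        print(f"{client} -> {balancer.backend_for(client)}")
```

New clients are still spread across the pool in rotation, but a returning client always lands on the server that holds its session.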
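The master/secondary arrangement boils down to "the highest-priority load balancer that is still sending heartbeats is in charge". Below is a rough Python sketch of that decision. The class, the priority numbers and the three-second timeout are all assumptions of mine for illustration; real failover tools are considerably more involved, and this is not meant to mirror any of them.

```python
class BalancerNode:
    """A load balancer that reports in with periodic heartbeats."""

    def __init__(self, name, priority):
        self.name = name
        self.priority = priority      # higher number = preferred master
        self.last_heartbeat = 0.0     # time of the most recent heartbeat

    def heartbeat(self, now):
        """Record that this node was seen alive at time `now` (seconds)."""
        self.last_heartbeat = now

    def is_alive(self, now, timeout):
        """A node counts as alive if it was heard from within `timeout`."""
        return now - self.last_heartbeat <= timeout


def elect_master(nodes, now, timeout=3.0):
    """Return the highest-priority node with a recent heartbeat, or None."""
    alive = [node for node in nodes if node.is_alive(now, timeout)]
    if not alive:
        return None
    return max(alive, key=lambda node: node.priority)


if __name__ == "__main__":
    lb1 = BalancerNode("lb1", priority=100)
    lb2 = BalancerNode("lb2", priority=50)
    lb1.heartbeat(10.0)
    lb2.heartbeat(10.0)
    print(elect_master([lb1, lb2], now=11.0).name)   # both alive
    lb2.heartbeat(14.0)                              # lb1 has gone silent
    print(elect_master([lb1, lb2], now=15.0).name)
```

Time is passed in explicitly so the example is easy to follow; a real monitor would read the clock itself.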
The last example of load balancing I want to discuss is known as Layer 7 switching or content switching. The basic idea here is that I have a single very large website. Each section of this site requires its own server or servers, yet I don't want to confuse my customers by serving different sections of the site from different subdomains. Let us consider my huge site, www.example.com. I have a forum section, a CMS section, a newsfeed section and images to serve up for all of these different sections.
I could use subdomains such as forums.example.com and images.example.com; however, as I mentioned, I would like to keep the same domain name for all sections of the site. You can set up a Layer 7/content switch to handle this as follows:
www.example.com/images --> 192.168.1.100
www.example.com/forum --> 192.168.1.101
www.example.com/cms --> 192.168.1.102
www.example.com/news --> 192.168.1.103
Of course each of those IP addresses could also represent another load balancer, thus achieving high availability and performance.
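The mapping above is essentially a lookup on the start of the request path. Here is a small Python sketch of that lookup. It assumes the path has already been separated from the host name and any query string, and the function name and the choice to return None for unmatched paths are mine; an actual content switch can match on much more than the path.

```python
# Path prefixes and backend addresses from the example mapping above.
ROUTES = {
    "/images": "192.168.1.100",
    "/forum": "192.168.1.101",
    "/cms": "192.168.1.102",
    "/news": "192.168.1.103",
}


def pick_backend(path, routes=ROUTES, default=None):
    """Return the backend whose prefix matches the request path.

    A prefix matches the exact path or anything beneath it, so "/forum"
    matches "/forum" and "/forum/thread/42" but not "/forums".
    """
    for prefix, backend in routes.items():
        if path == prefix or path.startswith(prefix + "/"):
            return backend
    return default


if __name__ == "__main__":
    for path in ["/images/logo.png", "/forum/thread/42", "/about"]:
        print(f"{path} -> {pick_backend(path)}")
```

Matching on whole path segments avoids accidentally sending something like /newsletter to the news servers.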
Load balancers come in all shapes and sizes, depending on features and capacity. There are many dedicated load balancing machines, such as those available from Barracuda (http://www.barracudanetworks.com/ns/products/) and F5 Networks (http://www.f5.com/products/big-ip/).
There are also several software products that will work with commodity x86 hardware. I have found that HAProxy (http://www.haproxy.org/) can be an inexpensive, robust solution that runs on Linux, *BSD and Solaris. HAProxy is a high availability load balancer with Layer 7/content switching features. It is open source software released under the GPL, so your only expense is the time you invest in learning to use it. Also, if you are using redundant load balancers, you will need some sort of "heartbeat" program. For Linux you can use keepalived (http://www.keepalived.org/) or Linux-HA's Heartbeat (http://www.linux-ha.org/Heartbeat).
For more information, please check out: http://en.wikipedia.org/wiki/Load_balancing_%28computing%29 and http://www.oreillynet.com/pub/a/oreilly/networking/news/bourke_1100.html
I hope you enjoyed this article -- Thank you for reading!
About the Author: Jay Fougere is the IT manager for the iEntry network. He also writes occasional articles. If you have any IT questions, please direct them to Jay@ientry.com.