Load Balancing Web Applications
This article offers an overview of several approaches to load balancing on Web application server clusters. A cluster is a group of servers running a Web application simultaneously, appearing to the world as if it were a single server. To balance server load, the system distributes requests to different nodes within the server cluster, with the goal of optimizing system performance. This results in higher availability and scalability — necessities in an enterprise, Web-based application.
High availability can be defined as redundancy. If one server cannot handle a request, can other servers in the cluster handle it? In a highly available system, if a single Web server fails, then another server takes over, as transparently as possible, to process the request.
Scalability is an application’s ability to support a growing number of users. If it takes an application 10 milliseconds (ms) to respond to one request, how long does it take to respond to 10,000 concurrent requests? Infinite scalability would allow it to respond to those in 10 ms; in the real world, it’s somewhere between 10 ms and a logjam. Scalability is a measure of a range of factors, including the number of simultaneous users a cluster can support and the time it takes to process a request.
Of the many methods available to balance a server load, the main two are:
- DNS round robin and
- Hardware load balancers.
DNS Round Robin
As most readers of ONJava probably know, the Domain Name System (DNS) database maps host names to their IP addresses.
When you enter a URL in a browser (say, www.loadbalancedsite.com ), the browser sends a request to the DNS asking it to return the IP address of the site. This is called the DNS lookup. After the Web browser gets the IP address for that site, it contacts the site using the IP address, and displays the page for you.
The DNS server generally contains a single IP address mapped to a particular site name. In our fictional example, our site www.loadbalancedsite.com maps to the IP address 220.127.116.11.
To balance server loads using DNS, the DNS server maintains several different IP addresses for a site name. The multiple IP addresses represent the machines in the cluster, all of which map to the same single logical site name. Using our example, www.loadbalancedsite.com could be hosted on three machines in a cluster with the following IP addresses:
In this case, the DNS server contains the following mappings:
When the first request arrives at the DNS server, it returns the IP address 18.104.22.168, the first machine. On the second request, it returns the second IP address: 22.214.171.124. And so on. On the fourth request, the first IP address is returned again.
With this round robin scheme, requests to the site are distributed evenly among all of the machines in the cluster. Note that because clients connect directly to each machine’s IP address, the DNS round robin method of load balancing exposes every node in the cluster to the net.
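The rotation described above can be sketched in a few lines of Python. This is only an illustration, not how any particular DNS server is implemented: the RoundRobinDns class and its lookup method are made up for this example, and the addresses are placeholders from the 192.0.2.0/24 documentation range rather than the article's own.

```python
from itertools import cycle


class RoundRobinDns:
    """Toy name server that rotates through the addresses for each host name."""

    def __init__(self, records):
        # records: host name -> list of IP addresses (one per cluster node)
        self._rotations = {name: cycle(addrs) for name, addrs in records.items()}

    def lookup(self, name):
        # Each lookup returns the next address in the rotation.
        return next(self._rotations[name])


# Placeholder addresses (assumption), one per machine in a three-node cluster.
dns = RoundRobinDns({
    "www.loadbalancedsite.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"],
})

answers = [dns.lookup("www.loadbalancedsite.com") for _ in range(4)]
print(answers)  # the fourth answer wraps around to the first machine
```

A production DNS server usually returns the whole address list in rotated order rather than a single address, but the effect on successive clients is similar.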
Advantages of DNS Round Robin
The main advantages of DNS round robin are that it’s cheap and easy:
Inexpensive and easy to set up. The system administrator only needs to make a few changes in the DNS server to support round robin, and many of the newer DNS servers already include support. It doesn’t require any code change to the Web application; in fact, Web applications aren’t aware of the load-balancing scheme in front of them.
Simplicity. It does not require any networking experts to set up or debug the system in case a problem arises.
Disadvantages of DNS Round Robin
Two main disadvantages of this software-based method of load balancing are that it offers no real support for server affinity and doesn’t support high availability.
No support for server affinity. Server affinity is a load-balancing system’s ability to route a user’s requests either to a specific server or to any server, depending on whether session information is maintained on the server or at an underlying database level.
Without server affinity, DNS round robin relies on one of three methods devised to maintain session control or user identity across requests coming in over HTTP, which is a stateless protocol.
When a user makes a first request, the Web server returns a text-based token uniquely identifying that user. Subsequent requests include this token using cookies, URL rewriting, or hidden fields, allowing the server to appear to maintain a session between client and server. When a user establishes a session with one server, all subsequent requests usually go to the same server.
The problem is that this works only because the browser caches that server’s IP address. Once the cache expires, the browser makes another request to the DNS server for the IP address associated with the domain name. If the DNS server returns a different IP address, that of another server in the cluster, the session information is lost.
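The lost-session scenario can be made concrete with a small sketch. It assumes each server keeps session state only in its own memory; the WebServer class, its handle method, and the shopping-cart payload are hypothetical names chosen for illustration.

```python
import uuid


class WebServer:
    """Toy server that keeps session state in its own memory only."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # token -> per-user session data

    def handle(self, token=None):
        # A missing or unknown token means this server has no session for the user.
        if token not in self.sessions:
            token = uuid.uuid4().hex
            self.sessions[token] = {"cart": []}
            return token, "new session"
        return token, "existing session"


server_a, server_b = WebServer("a"), WebServer("b")

token, first = server_a.handle()           # first request creates the session
_, second = server_a.handle(token)         # same server recognizes the token
new_token, third = server_b.handle(token)  # DNS now points at another node

print(first, second, third, sep=" | ")
```

The third request carries a valid token, but server b has never seen it, so the user silently starts over with an empty session.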
No support for high availability. Consider a cluster of n nodes. If a node goes down, then every nth request to the DNS server directs you to the dead node. An advanced router solves this problem by checking nodes at regular intervals, detecting failed nodes and removing them from the list, so no requests go to them. However, the problem still exists if the node is up but the Web application running on the node goes down.
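As a rough illustration of both halves of that argument, the sketch below counts how many round robin answers point at a dead node, first without and then with a health check that prunes the rotation. The function names, node names, and the is_up probe are assumptions made for this example.

```python
from itertools import cycle, islice


def failed_share(nodes, alive, requests):
    """Fraction of round robin answers that point at a node that is down."""
    answers = islice(cycle(nodes), requests)
    return sum(1 for node in answers if node not in alive) / requests


def healthy_only(nodes, is_up):
    """Health check: keep only the nodes whose probe succeeds."""
    return [node for node in nodes if is_up(node)]


nodes = ["node1", "node2", "node3", "node4"]
alive = {"node1", "node2", "node4"}  # node3 has crashed

before = failed_share(nodes, alive, 1000)             # plain round robin
rotation = healthy_only(nodes, lambda n: n in alive)  # prune failed nodes
after = failed_share(rotation, alive, 1000)

print(before, after)
```

With four nodes and one down, a quarter of the answers are wasted until the dead node is pruned. A probe that only checks whether the machine responds would still miss the case where the node is up but the Web application on it is not.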
Changes to the cluster take time to propagate through the rest of the Internet. One reason is that many large organizations (ISPs, corporations, agencies) cache their DNS requests to reduce network traffic and request time. When a user within one of these organizations makes a DNS request, it’s checked against the cache’s list of DNS names mapped to IP addresses. If the cache has an entry, it returns the IP address to the user. If an entry is not found in its local cache, the ISP sends this DNS request to the DNS server and caches the response.
When a cached entry expires, the ISP updates its local database by contacting other DNS servers. When your list of servers changes, it can take a while for the cached entries on other organizations’ networks to expire and be refreshed with the updated list of servers. During that period, a client can still attempt to hit the downed server node, if that client’s ISP still has an entry pointing to it. In such a case, some users of that ISP can’t access your site on their first attempt, even if your cluster has redundant servers up and running.
This is a bigger problem when removing a node than when adding one. When you drop a node, a user may be trying to hit a nonexistent server. When you add one, that server may just be under-utilized until its IP address propagates to all the DNS servers.
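The propagation delay can be sketched with a toy caching resolver. Everything here is an assumption for illustration: the CachingResolver class, the one-hour lifetime for cached entries, the explicit now argument standing in for a clock, and the placeholder addresses from the 192.0.2.0/24 documentation range.

```python
class CachingResolver:
    """Toy ISP resolver that reuses a cached answer until it expires."""

    def __init__(self, authoritative, ttl):
        self.authoritative = authoritative  # name -> current address
        self.ttl = ttl                      # seconds an answer may be reused
        self.cache = {}                     # name -> (address, expiry time)

    def resolve(self, name, now):
        cached = self.cache.get(name)
        if cached and now < cached[1]:
            return cached[0]                # still fresh: no upstream query
        address = self.authoritative[name]  # expired or missing: ask upstream
        self.cache[name] = (address, now + self.ttl)
        return address


site = "www.loadbalancedsite.com"
zone = {site: "192.0.2.1"}
isp = CachingResolver(zone, ttl=3600)

t0 = isp.resolve(site, now=0)     # cache miss: fetched from the DNS server
zone[site] = "192.0.2.2"          # the first node is removed from the cluster
t1 = isp.resolve(site, now=1800)  # cache hit: still the removed node
t2 = isp.resolve(site, now=3600)  # entry expired: updated address

print(t0, t1, t2)
```

Between the change and the expiry, every user behind this resolver is still sent to the removed node, which is the window described above.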
Although this method tries to balance the number of users on each server, it doesn’t necessarily balance the server load. Some users could demand a higher load of activity during their session than users on another server, and this methodology cannot guard against that inequity.
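A small worked example shows how equal user counts can hide unequal load. The per-user cost figures are invented for illustration (think of them as requests per minute), and assign_round_robin is a hypothetical helper, not part of any load balancer.

```python
from itertools import cycle


def assign_round_robin(user_costs, servers):
    """Assign users to servers in rotation; return user count and load per server."""
    users = {s: 0 for s in servers}
    load = {s: 0 for s in servers}
    for cost, server in zip(user_costs, cycle(servers)):
        users[server] += 1
        load[server] += cost
    return users, load


# Invented per-user activity levels: two heavy users, four light ones.
costs = [50, 1, 1, 40, 2, 1]
users, load = assign_round_robin(costs, ["a", "b", "c"])

print(users)
print(load)
```

Each server ends up with two users, yet server a carries almost all of the work because both heavy users happened to land on it.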