nginx-balancer-algorithm

Algorithm

Nginx now supports the following load-balancing disciplines:

  • Round-robin and weighted round-robin
  • Least-connected and weighted least-connected
  • Source-IP hash and weighted source-IP hash
  • Generic hash
  • Consistent hash

To help you choose an algorithm for your website, we’ll consider the pros and cons of each method and narrow the range of choices.

  • Running Tests to Compare Methods

    Whichever subset of load‑balancing methods you consider, we encourage you to test them to see which works best for your traffic. “Best” usually means shortest time to deliver responses to clients, but you might have different criteria. Testing is most straightforward if all servers have the same capacity. If not, you need to set server weights so that machines with more capacity receive more requests.
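
    For example (the hostnames and weights here are illustrative, not from the original), a machine with roughly twice the capacity gets twice the weight:

    upstream test_pool {
        server web1 weight=2;   # roughly twice the capacity of web2
        server web2 weight=1;
    }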

Some metrics to check during testing are:

  • CPU and memory load – Look at the percentage of total capacity used, for both CPU and memory. If all servers aren’t equally loaded, traffic is not being distributed efficiently.

  • Server response time – If the time is consistently higher for some servers than others, somehow “heavier” requests (requiring more computation or calls to a database or other services) are getting directed to them in an unbalanced way. Try adjusting the weights, because the imbalance might be caused by incorrect weights rather than by a problem with the load‑balancing technique.

  • Total time to respond to the client – Again, consistently higher times for some servers suggest they’re getting a disproportionate share of time‑consuming requests. And again, you can try adjusting weights to see if that eliminates the issue.

  • Errors and failed requests – You need to make sure that the number of failed requests and other errors during the tests is not larger than is usual for your site. Otherwise you’re basing your decision on error conditions instead of realistic traffic. For some errors, the server can send its response more quickly than when the request succeeds. For HTTP response code 404 (File Not Found), for example, the server probably returns the error much more quickly than it could deliver the actual file if it existed. With the Least Connections and Least Time load‑balancing algorithms, this can lead the load balancer to favor a server that is actually not working well.
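
If failing servers skew your results, one mitigation worth knowing about is nginx’s passive health checking: the max_fails and fail_timeout parameters to the server directive take a repeatedly failing server out of rotation for a while. A minimal sketch (hostnames and values are illustrative):

upstream backend {
    least_conn;

    # After 3 failed attempts within 30 seconds, consider the server
    # unavailable for the next 30 seconds (passive health check).
    server web1 max_fails=3 fail_timeout=30s;
    server web2 max_fails=3 fail_timeout=30s;
}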

Pros, Cons, and Use Cases

Hash and IP Hash
The Hash and IP Hash load‑balancing techniques create a fixed association between a given type of client request (captured in the hash value) and a certain server. You might recognize this as session persistence – all requests with a given hash value always go to the same server.

The biggest drawback of these methods is that they are not guaranteed to distribute requests in equal numbers across servers, let alone balance load evenly. The hashing algorithm evenly divides the set of all possible hash values into “buckets”, one for each server in the upstream group, but there’s no way to predict whether the requests that actually occur will have hashes that are evenly distributed.

So it makes sense to use Hash or IP Hash when the benefit of maintaining sessions outweighs the possibly bad effects of unbalanced load. They are the only forms of session persistence available in open source NGINX.

There are a couple of cases where IP Hash – and Hash when the client IP address is in the key – don’t work:

  • When the client’s IP address can change during the session, for example when a mobile client switches from a WiFi network to a cellular one.
  • When the requests from a large number of clients are passing through a forward proxy, because the proxy’s IP address is used for all of them.
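
One workaround in these cases is to hash on something more stable than the client address, such as a session cookie. A sketch, assuming the application sets a JSESSIONID cookie (the hostnames are illustrative):

upstream backend {
    # The mapping survives an IP address change as long as the
    # client keeps presenting the same cookie.
    hash $cookie_jsessionid;

    server web1;
    server web2;
}

Requests that arrive without the cookie all hash to the same value, so this approach suits applications that set the cookie on the first response.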

Changing the set of upstream servers usually forces recalculation of at least some of the mappings, breaking session persistence. You can reduce the number of recalculated mappings somewhat:

  • For the Hash method, include the consistent parameter in the hash directive
  • For the IP Hash method, before temporarily removing a server from the upstream group, add the down parameter to its server directive, as for web2 in the following example. The mappings are not recalculated, on the assumption that the server will soon return to service.
    upstream backend {
        ip_hash;

        server web1;
        server web2 down;
        server web3;
    }

Round Robin
The general consensus is that Round Robin works best when the characteristics of the servers and requests are unlikely to cause some servers to become overloaded relative to others. Some of the conditions are:

  • All the servers have about the same capacity. This requirement is less important if differences between servers are accurately represented by server weights.
  • All the servers host the same content.
  • Requests are pretty similar in the amount of time or processing power they require. If there’s a wide variation in request weight, a server can become overloaded because the load balancer happens to send it a lot of heavyweight requests in quick succession.
  • Traffic volume is not heavy enough to push servers to near full capacity very often. If servers are already heavily loaded, it’s more likely that Round Robin’s rote distribution of requests will push some servers “over the edge” into overload as described in the previous bullet.
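
When these conditions hold, no special directive is needed, because Round Robin is nginx’s default method (hostnames are illustrative):

upstream backend {
    # No load-balancing directive: nginx defaults to Round Robin.
    server web1;
    server web2;
    server web3;
}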

Least Connections
Least Connections is the most suitable load‑balancing technique for the widest range of use cases, and particularly for production traffic.

Least Connections also effectively distributes workload across servers according to their capacity. A more powerful server fulfills requests more quickly, so at any given moment it’s likely to have a smaller number of connections still being processed (or even waiting for processing to start) than a server with less capacity. Least Connections sends each request to the server with the smallest number of current connections, and so is more likely to send requests to powerful servers.

(Weighted) Least Connections

This method distributes load among the upstream servers by selecting the one with the fewest active connections. If the upstream servers do not all have the same processing power, indicate this with the weight parameter to the server directive; the algorithm takes the server weights into account when comparing connection counts.

upstream backend {
    least_conn;
    server 192.168.1.1:8080 weight=1;
    server 192.168.1.2:8080 weight=2;
}

(Weighted) Round Robin

The load balancer runs through the list of upstream servers in sequence, assigning each new request to the next server in turn. The weight parameter configures a server’s weight; the default value is 1. The higher the weight, the more requests are allocated to the server, so set the weight according to the server’s actual processing capacity.

upstream backend {
    server 192.168.1.1:8080 weight=1;
    server 192.168.1.2:8080 weight=2;
}

(Weighted) Generic Hash

With the Hash method, for each request the load balancer calculates a hash that is based on the combination of text and NGINX variables you specify, and associates the hash with one of the servers. It sends all requests with that hash to that server, so this method establishes a basic kind of session persistence.

upstream backend {
    hash $scheme$request_uri;
    server 192.168.1.1:8080 weight=1;
    server 192.168.1.2:8080 weight=2;
}

(Weighted) IP Hash

With IP Hash, a given client IP address is always mapped to the same upstream server. Nginx uses the first three octets of an IPv4 address, or the entire IPv6 address, as the hashing key, so the same pool of IP addresses is always mapped to the same upstream server. This mechanism isn’t designed to ensure a fair distribution, but rather a consistent mapping between client and upstream server. That is very useful for sessions – for example, when PHP stores sessions in files located on each web server, which would otherwise be difficult to synchronize.

Why use only the first three octets of an IPv4 address? The method is optimized for ISP clients that are assigned IP addresses dynamically from a subnetwork (/24) range. After a reboot or reconnection, the client’s address often changes to a different one in the same /24 range, but the connection still represents the same client, so there’s no reason to change the mapping to the server.

If, however, the majority of the traffic to your site is coming from clients on the same /24 network, IP Hash doesn’t make sense because it maps all clients to the same server. In that case (or if you want to hash on all four octets for another reason), instead use the Hash method with the $remote_addr variable.
hash $remote_addr;

upstream backend {
    ip_hash;
    server 192.168.1.1:8080 weight=1;
    server 192.168.1.2:8080 weight=2;
}

(Weighted) Consistent Hash

Consistent hashing: adding the consistent parameter to the hash directive enables ketama consistent hashing, which minimizes remapping when servers are added or removed. The key – $consistent_key in the example below – is a user-defined variable that is specified dynamically.

upstream backend {
    hash $consistent_key consistent;
    server 192.168.1.1:8080 weight=1;
    server 192.168.1.2:8080 weight=2;
}