Wednesday, November 16, 2011

Amazon Elastic Load Balancer (ELB) performance characteristics

So at my new job, I get to use AWS stuff a lot. We have many, many servers, usually sitting behind a load-balancer of some sort. Amazon's documentation on these things isn't very clear, so I'm trying to figure out what the damned things are doing.

First off - a good thing. These Load Balancers are really easy to use. Adding a new instance is a few clicks away using the Console.

Another good thing - you can spread across multiple Availability Zones to avoid trouble when an entire zone goes down (as happened less than a year ago). But here's where it gets ugly.

It seems to split traffic evenly across zones - even if the zones don't hold a balanced number of instances - and it appears to pick a zone based on some hash of the source address. So, if you have two instances in an Autoscale group across two Availability Zones, and you hammer your array from one IP really hard, bad things ensue. The AZ you're hitting maxes out its CPU, while the AZ you're not hitting sits nearly 100% idle. That averages out to 50% utilization - not enough to trigger a scaling event (with my thresholds, anyway).
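To make that arithmetic concrete, here's a tiny sketch (the helper name and the numbers are mine, purely illustrative):

```javascript
// Illustrative only: if the ELB splits traffic evenly per zone, and all
// the heavy traffic hashes to one zone, the fleet-wide average hides a
// pegged instance.
function averageUtilization(perZoneLoad) {
  var sum = perZoneLoad.reduce(function (a, b) { return a + b; }, 0);
  return sum / perZoneLoad.length;
}

// One AZ maxed out, the other idle: the average is only 0.5, which
// won't trip a scale-up alarm set at, say, 70%.
var avg = averageUtilization([1.0, 0.0]);
```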

So, in short, if you have fojillions of people from all over hitting your services indiscriminately, I'm sure it'll be fine. But if, like me, you have thousands of people from all over, some hitting really hard from one particular IP, it may not be a good idea to spin up more than one AZ. And spinning up four AZs seems silly - you have a better chance of at least one of them going bad, and until your load balancer figures that out, one out of every n requests will fail.

Another thing I've noticed - the ELBs seem to enforce a strict timeout on accessing the back-end service. If they don't get a response within 30 seconds, they drop the connection and hit the service again. I had a service that was nearly getting DoS'ed by the load balancer that way. Make sure you have sane timeouts.

So the next thing I was curious about was whether or not the ELB would do any batching-together of any of the returned data - would it act like some kind of reverse-proxy? I wrote a tiny Node server which spat out a little data, waited 10 seconds, spat out some more, waited another 10 seconds, then finished. Here it is:



#!/usr/local/bin/node
"use strict";

var http = require('http');

// Spit out a little data, wait 10 seconds, spit out some more, wait
// another 10 seconds, then finish - to see whether the ELB batches
// any of it together.
http.createServer(function (req, res) {
  console.warn("CONNECTION RECEIVED - waiting 10 seconds: " + req.url);
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.write("initial output...\n");
  setTimeout(function () {
    console.warn("first stuff written for " + req.url);
    res.write("Here is some stuff before an additional delay\n");
    setTimeout(function () {
      console.warn("second stuff written for " + req.url);
      res.end('Hello World\n');
    }, 10000);
  }, 10000);
}).listen(80);

So - I couldn't tell any difference between hitting the server directly and hitting the load balancer (telnetting straight to port 80). It acted the same way. There was definitely no batching.

What about if I have a slow-client, or something that's trying to upload something - will it get batched together on the POST side?

I modified my code to do a 'slow POST', and it behaved similarly - I couldn't tell the difference between going through the load balancer and hitting the instance directly.

I also wrote code to generate a large (1MB) response dynamically on the server side, then wrote a client that would receive a little, pause the stream, resume it after a few seconds, pause it again, and so on. The one difference I noticed between accessing the server directly and going through the load balancer was that the server tended to give me data in 1024-byte chunks, whereas the load balancer gave me blocks closer to 1500 bytes. Weird! Well, maybe not - I *do* know for sure that the LB re-terminates the connection at the TCP level, since the source IP address changes. I was writing the data in 1KB blocks, so maybe each write turned into exactly one 1024-byte packet, while the LB, when re-streaming my TCP data, coalesced it into larger segments. Or so it would seem.

So it looks like I can't get rid of the reverse-proxy that sits on top of some of our Ruby code. Oh well.