Narayan Newton

Distributed Load Testing With Locust.io: Turning One Locust Into A Plague

Locust.io is a great tool for applying load in a controlled manner and measuring response. However, historically speaking nobody has really cared about a solo locust. They just aren't that concerning in the singular. Likewise, load applied from a single point to a moderately complicated infrastructure is both easy to block (or rate limit) and also not very representative of a real world situation. (Aside from the people you inevitably end up talking to who archive entire sites so they can have a local copy.)

So, to make your test more accurate and to differentiate yourself from that friend who keeps adding disks to their personal server so they can have a local copy of the internet, it is important to have a distributed test. In this article, I am going to cover one way to automate this process, using SaltStack and Locust.io. They are both written in Python, both very cool tools, and by combining their powers you can absolutely ensure nothing grows in a field.

First, let us address where we ended up in our previous article. We have a simple loadtest that hits some logged in pages and some anonymous pages. Is it a good load test? I don't know, actually...maybe? Quite a lot of load testing is understanding what the expected load on a site is, what the expected usage patterns are, etc. It probably isn't that great of a test, though. So either we can fix it or we can run it massively in parallel? The choice here is obvious: To The Cloud!

For this example, we are going to use my current configuration management system of choice, SaltStack. Salt and many cloud providers actually work quite well together, because Salt has a tool (salt-cloud) that interfaces with cloud APIs and allows you to spin up sets of VMs and bootstrap your configuration tree on them. This is very convenient for us, as what we want to do is quickly bootstrap a set of 10+ VMs and start a locust test runner on each. For this example, we are going to use EC2.

There are going to be three moving parts here, the locust test plan (and test runner), the salt configuration tree and the SaltCloud configuration files:

  • The salt-cloud utility is going to start and bootstrap our VMs
  • The salt configuration tree is going to install locust, configure the VM, and start locust
  • The locust test plan and locust master process are going to accept connections from the test runner on each VM and manage our loadtest

Let us address each of these individually.

SaltCloud

For this demo we are going to use a vanilla CentOS7 VM as our master node. The first step on this node is to install salt and locust. You can get more information on this topic here: https://docs.saltstack.com/en/latest/topics/installation/rhel.html and if you are using CentOS7 on EC2, you can likely just install the epel-release package and then proceed to install the salt packages.

NOTE: Be sure to install the salt-cloud package as well as the normal ones.

Once salt is installed, we need to configure salt-cloud and pull in your EC2 keys. The first thing you should consider is what ID and KEY to use from AWS. I would highly advise not using your root credentials. Also, it should be noted we are about to throw around the word KEY quite a bit, but there are two keys we are going to be discussing. One is the AWS Access Key and Secret combination for your EC2 account. This key/secret combo is what authorizes you to use Amazon's API to spin up new VMs under your account. The other key we are going to discuss is the ssh key you are going to use to login to these new VMs. From here on out, I am going to refer to the Amazon Key as KEY and the SSH Key as SSHKEY.

NOTE: We are assuming the VM you are using to run your salt and locust master node is also in EC2 and in the same region. I would specifically create a new VM for this process so that you can create a new security group for it and a new SSH keypair for it and its loadtesting children.

The first step here is creating a new AWS user so that we are not using your root account credentials. Go into your account security settings in AWS ("My Security Credentials") and create a new user. Be sure to enable programmatic access for this user, since that is how we are going to use it. The next step here is defining permissions for your new user. I definitely recommend looking through the existing policies, as this screen could be an entire article in and of itself.

I am going to grant my user full access to EC2, as I don't have anything else in this account. The role I grant is AmazonEC2FullAccess. Be careful here, this role will give this user a lot of access to your account. Not as much as the root user, but still more than you would likely want to give this user in a long-term situation, or with an account that has other things in it.

This is a demo account, so I don't particularly care, but you may. If you are using a real AWS account, consider NOT using a real AWS account, and also limit this user more carefully. Once you create this user, you will be given the access key and secret for it. Be sure to copy these down as we are about to use them to configure salt-cloud.

Once you install the salt-cloud package, if you look in /etc/salt you will see several cloud related include directories (cloud.SOMETHING.d). In general what you do here is define a provider (in cloud.providers.d) and then a profile for the type of VM you will be launching (cloud.profiles.d) and then a map of the VMs you want to deploy (maybe in cloud.maps.d, some packages do not have this directory). Let us define the provider first:

/etc/salt/cloud.providers.d/ec2.conf

ec2-us-west-2-public:
  minion:
    master: <our master node's IP (the internal IP)>
  id: '<our AWS Key>'
  key: '<our AWS secret>'
  private_key: '<the path to our AWS SSHKEY that we downloaded upon the creation of the master node>'
  keyname: '<the SSHKEY name in AWS>'
  ssh_interface: private_ips
  securitygroup: <our security group in EC2>
  location: us-west-2
  availability_zone: us-west-2a
  provider: ec2
  del_root_vol_on_destroy: True
  del_all_vols_on_destroy: True
  rename_on_destroy: True

Some notes on the above. First, the above assumes we are using us-west. It should be fairly obvious how to change that. Second, note the provider: ec2 line. This entire article assumes EC2, but there are other providers. Finally, it is very important you consider the last 3 lines of this provider configuration when load testing. You do not want to leave volumes hanging around if you are launching, shutting down and re-launching VMs all the time. That is going to add up quickly.

Also, VMs on EC2 take a surprisingly long and ill-defined time to shut down. Sometimes it happens quickly, sometimes it really does not. If your VMs are named load1 through load10 and you shut down the first set of 10 and want to start 10 new VMs, you will run into a naming conflict if load9 is still shutting down. The 'rename_on_destroy' parameter works around this by renaming all the VMs with a generated ID before shutting them down. This allows them to spin down in their own time, without blocking new launches.

NOTE: You will have to copy the SSHKEY that you used to create this VM (and which will be tied to all your loadtest VMs) to the VM, so that salt-cloud can use it to login to the VMs. Be sure to put it somewhere secure on the VM and chmod it correctly.

Now, let's define a profile for our VMs:

/etc/salt/cloud.profiles.d/test.conf

ec2_west_micro_dev:
  provider: ec2-us-west-2-public
  image: ami-d2c924b2
  size: t2.micro
  ssh_username: centos
  sync_after_install: grains

As you can see, after defining a provider most of the work is done. We specify the provider, choose an ami, a size, tell it about the centos ssh username and make sure it syncs the "grains", i.e. the "facts" that Salt uses to define its environment. (Grains of Salt…It should be noted SaltStack is from the Python community so the puns are a feature.)

Finally, let's define our map of VMs:

/etc/salt/cloud.maps.d/load.map

ec2_west_micro_dev:
  - load1
  - load2
  - load3
  - load4

So, the good news is these configuration files definitely get progressively more simple. The above is simply defining which profile to use, and then the names of the VMs. The number of names implies the number of VMs. As a final step, be sure to go into the EC2 security group for these VMs and allow traffic from the group itself. I like to keep the master node and the loadtest VMs in the same security group for this reason: it simplifies management.
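If you want 10+ VMs, typing out the names by hand gets tedious. Here is a minimal sketch of generating the map file text, assuming the profile name and the loadN naming scheme used above. The build_map helper is made up for this example and is not part of salt-cloud.

```python
import sys


def build_map(profile, count, prefix="load"):
    """Return salt-cloud map file text listing `count` VMs under `profile`."""
    if count < 1:
        raise ValueError("count must be at least 1")
    lines = ["%s:" % profile]
    for n in range(1, count + 1):
        # One list entry per VM name: load1, load2, ...
        lines.append("  - %s%d" % (prefix, n))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Writes a 10 VM map to stdout; redirect it to your map file.
    sys.stdout.write(build_map("ec2_west_micro_dev", 10))
```

Redirecting the output to /etc/salt/cloud.maps.d/load.map would give you the same file as above, just longer.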

We should now be ready to go! Of course, these VMs won't actually do anything and we aren't running Salt itself yet. So, let us address that.

SaltStack Configuration Management Tree

I really like SaltStack. I've used many, many configuration management systems. I started with CFEngineV2 and have gone through Puppet, Chef, Ansible...I even started writing one once (I deleted it when I sobered up). There is something about how Salt is designed, its documentation and its approach to orchestration that I really like. We are going to see none of these today, because our configuration "tree" is one step above a shell script. However, I wanted to take a moment to say I do recommend taking a look at Salt. It is a cool tool.

Our tree contains the following, all under /srv/salt (the path for the default salt tree can vary by distribution):

top.sls

base:
  '*':
    - locust

/locust/init.sls:

mandatory_packages:
  pkg:
    - installed
    - names:
      - python-devel
      - python-virtualenv
      - gcc
      - gcc-c++
      - screen

/home/centos/locust:
  virtualenv.managed:
    - requirements: salt://locust/requirements.txt
    - require:
      - pkg: python-virtualenv
      - pkg: python-devel

start_locust:
  cmd.run:
    - name: /bin/screen -S locust -d -m locust -f /root/testplan.py --slave --master-host=<IP OF OUR MASTER NODE (internal IP)>
    - cwd: /home/centos/locust
    - env:
      - PATH: "/home/centos/locust/bin:$PATH"
    - require:
      - file: /root/testplan.py

/root/testplan.py:
  file.managed:
    - source: salt://locust/testplan.py

requirements.txt

locustio
bs4
pyzmq

testplan.py

from locust import HttpLocust, TaskSet, task, events
from bs4 import BeautifulSoup
import random

def is_static_file(f):
    # Only paths under /sites/default/files count as static assets here.
    return "/sites/default/files" in f

def fetch_static_assets(session, response):
    # Collect each matching src URL once, then request it.
    resource_urls = set()
    soup = BeautifulSoup(response.text, "html.parser")

    for res in soup.find_all(src=True):
        url = res['src']
        if is_static_file(url):
            resource_urls.add(url)
        else:
            print("Skipping: " + url)

    for url in resource_urls:
        #Note: If you are going to tag different static file paths differently,
        #this is where I would normally do that.
        session.client.get(url, name="(Static File)")

class AnonBrowsingUser(TaskSet):
    @task(10)
    def frontpage(l):
        response = l.client.get("/")
        fetch_static_assets(l, response)

class AuthBrowsingUser(TaskSet):
    def on_start(l):
        # Log in once per simulated user, using the form token from the login page.
        response = l.client.get("/user/login", name="Login")
        soup = BeautifulSoup(response.text, "html.parser")
        drupal_form_id = soup.select('input[name="form_build_id"]')[0]["value"]
        r = l.client.post("/user/login", {"name":"nnewton", "pass":"hunter2", "form_id":"user_login_form", "op":"Log+in", "form_build_id":drupal_form_id})

    @task(10)
    def frontpage(l):
        response = l.client.get("/", name="Frontpage (Auth)")
        fetch_static_assets(l, response)

class WebsiteAuthUser(HttpLocust):
    task_set = AuthBrowsingUser

class WebsiteAnonUser(HttpLocust):
    task_set = AnonBrowsingUser

What the above does is fairly guessable even without knowing salt. The top.sls file is the main file for a salt configuration tree and defines the environments (there is only one here, base) and then applies modules to hosts. In this top.sls we are applying the locust module to *, i.e. everything. The locust module installs some packages, defines a python virtualenv and has that virtualenv pull in locust.

Finally, it starts a screen session to run locust. We do this so that you can ssh into the VMs you spawn and connect to the screen session in order to view the output of your test runners in case something goes wrong. A big improvement here would be to have this running so that the test runner output was being logged and sent to a central loghost. I have not done this yet, but it shouldn't be difficult.

Once our salt tree is in place, we can start the salt master with systemctl start salt-master. We can now try launching some VMs.

You can do this by running the following commands:

salt-cloud --update-bootstrap # It is a good idea to update the bootstrap script salt-cloud uses
salt-cloud -m <path to your map file>

You can shut down the VMs with the following:

salt-cloud -m <path to your map file> -d

Once the VMs are started, you can run your salt tree on them by running salt '*' state.highstate. This will tell each VM to converge to the state defined in the tree. If you want to re-start the locust screen session after it exits, this is also the command to run. If you want to test which VMs are connected to the salt master, you can run the following to generate a roll call:

$ salt '*' test.ping
load1:
	True
load2:
	True
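
If you have more than a few VMs, you may want to compare the roll call against the names in your map rather than eyeball it. The following is a rough sketch that only understands the plain-text output format shown above. The function names parse_ping and missing_minions are made up for this example and are not part of Salt.

```python
def parse_ping(output):
    """Parse `salt '*' test.ping` text output into {minion_name: bool}."""
    results = {}
    current = None
    for line in output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace() and line.rstrip().endswith(":"):
            # An unindented "name:" line starts a new minion entry.
            current = line.rstrip()[:-1]
        elif current is not None:
            # The indented line below it is that minion's answer.
            results[current] = line.strip() == "True"
            current = None
    return results


def missing_minions(expected, output):
    """Return the expected minion names that did not answer True."""
    answered = parse_ping(output)
    return [name for name in expected if not answered.get(name, False)]


sample = "load1:\n    True\nload2:\n    True\n"
print(missing_minions(["load1", "load2", "load3", "load4"], sample))
```

With the two-VM roll call above and a four-VM map, this reports load3 and load4 as missing.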

Locust.io

Finally we get to the star of the show. As compared to the other tools here, locust seems very simple to use. All we are going to do is start it with the --master argument in addition to the testplan and host to be tested. This tells locust that instead of running the loadtest itself, it is going to listen for connections from the other locust instances and then manage them running the loadtest (and collect the results). If you connect your browser to port :8089 on the VM running the locust master, you can watch as the other locust instances connect. When they are all ready, you can run your loadtest.

How many threads (users) you run with depends on how many each of your VMs can support. That will vary by the power of the VMs in question and how complicated your loadtest is. For this test, let's start with 100 per VM, for a total of 400 users across the four VMs in our map.
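
The arithmetic is simple, but it is worth writing down when you resize your map. A small sketch, assuming every VM carries the same number of users; both function names are hypothetical.

```python
def swarm_size(vm_count, users_per_vm):
    """Total simulated users when each VM runs users_per_vm users."""
    return vm_count * users_per_vm


def vms_needed(target_users, users_per_vm):
    """Smallest number of VMs that can carry target_users (ceiling division)."""
    if target_users < 0 or users_per_vm < 1:
        raise ValueError("target_users must be >= 0 and users_per_vm >= 1")
    return (target_users + users_per_vm - 1) // users_per_vm


print(swarm_size(10, 100))    # 10 VMs at 100 users each -> 1000
print(vms_needed(1050, 100))  # 1050 users need 11 VMs at 100 each
```

The number you enter in the locust web UI is the total, so this is mostly useful for deciding how many names to put in the map.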

When we press start, the test will run and each test VM will spin up the requisite threads and start testing. Results will feed back to the locust GUI as normal. In fact, most everything will act exactly like you're running a single instance. You can download results the same, view in-progress results the same, etc. However, if you review the log of the site being tested you will see that the load is being applied from all our different VMs.

In fact, it is likely that when you start locust you will see the locust processes connect fairly quickly. It will look somewhat like this:

[2017-05-10 18:42:30,883] ip-172-31-5-154/INFO/locust.main: Starting web monitor at *:8089
[2017-05-10 18:42:30,884] ip-172-31-5-154/INFO/locust.main: Starting Locust 0.7.5
[2017-05-10 18:42:30,928] ip-172-31-5-154/INFO/locust.runners: Client 'ip-172-31-47-48_1319d972c6c8e3c6dce380f7d870435c' reported as ready. Currently 1 clients ready to swarm.
[2017-05-10 18:42:30,934] ip-172-31-5-154/INFO/locust.runners: Client 'ip-172-31-33-192_74a2ff7658002d801a5defdc707b3594' reported as ready. Currently 2 clients ready to swarm.
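
If you capture the master's output to a file, you can check how many test runners have reported in before you start the swarm. This is a sketch that assumes the log wording shown above (Locust 0.7.5); other versions may phrase the message differently, and the ready_clients name is made up for this example.

```python
import re

# Matches the count in lines like "Currently 2 clients ready to swarm."
READY_RE = re.compile(r"Currently (\d+) clients ready to swarm")


def ready_clients(log_text):
    """Return the most recently reported ready-client count, or 0 if none."""
    counts = [int(m.group(1)) for m in READY_RE.finditer(log_text)]
    return counts[-1] if counts else 0


sample = (
    "[2017-05-10 18:42:30,928] ip-172-31-5-154/INFO/locust.runners: "
    "Client 'ip-172-31-47-48_1319d972c6c8e3c6dce380f7d870435c' reported as ready. "
    "Currently 1 clients ready to swarm.\n"
    "[2017-05-10 18:42:30,934] ip-172-31-5-154/INFO/locust.runners: "
    "Client 'ip-172-31-33-192_74a2ff7658002d801a5defdc707b3594' reported as ready. "
    "Currently 2 clients ready to swarm.\n"
)
print(ready_clients(sample))
```

Comparing that count to the number of names in your map tells you whether every VM made it to the master.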

When you stop the locust master process, the locust processes on the VMs will exit. You can restart them with salt, as we noted in the previous section. This setup should allow you to run distributed load tests in EC2 and apply far more load, and more representative load, to an infrastructure. Many aspects of the above setup are suboptimal, but most of them could be easily improved upon. In particular, increasing reporting from the test running VMs would be very useful and perhaps could be addressed in a later article.
