Drupal Load Testing
How to Start Drupal Load Testing with JMeter and Azure
The What
Let me preface this blog by stating that I am not a Drupal admin. However, I see Drupal enough to make me want to dive into it's architecture and understand how different components may help or hurt performance at high load, which got me to the question of "how do I actually load test Drupal".
In this post I want to cover how we can use a sitemap to run JMeter Load Tests against Drupal. This should lay the groundwork for being able to accurately understand how different architectural changes impact your site's performance.
The Why
Drupal is a popular CMS, but for better or worse Drupal is sometimes a mystery in terms of how it executes and performs. Cloud engineers and architects want to make sure their architectures are optimized and scaled to handle load, but usually there is no way to validate whether their design changes will truly improve their site before releasing it publicly. However, this isn't optimal since by then it's too late should we realize that our changes still don't improve performance at high-load times. On the flip side, we may realize that we are over-spending on resources to scale out for a load we may simply never receive. Ultimately, we would like a way to run experiments and understand our changes before we promote these changes to production.
The How
Prerequisite - Docker to Run Drupal
Before diving into the load test, I need a Drupal site that I can work with. I've decided to run this locally with the following Docker Compose file. You do not need to run your Drupal site as I do with containers, this is just the way I decided to get started quickly. I also use the demo unami site to create content when I navigate to the UI of the site and run the initial install:
version: '2'
services:
mysql:
image: mysql
environment:
MYSQL_ROOT_PASSWORD: example
MYSQL_DATABASE: drupaldb
volumes:
- 'mysql_data:/var/lib/mysql'
drupal:
image: drupal:10
ports:
- '80:80'
depends_on:
- mysql
volumes:
mysql_data:
driver: local
Question: What Paths Exist in the Site
It would be ideal that we load test the site by hitting all the unique paths exposed. How do we find this list of paths exposed in the site? One way may be to leverage a sitemap - this should document all the paths of our site.
Now, how do we generate a sitemap? After a random search of Drupal modules, the following simple sitemap module looks promising. We expect that this module will produce a sitemap.xml file that includes all the paths within our site. Let's test that hypothesis.
First, install the module with composer. I had to exec into the container to run this command:
composer require 'drupal/simple_sitemap:^4.1'
Install Drupal Simple Sitemap Module
Next, enable the module in the admin UI and configure the following settings:


You should then see the sitemap when navigating to the sitemap.xml path:

Next Question: How Can We Run a Load Test Using the Sitemap Paths
An open-source tool that is used for load testing is Apache JMeter. This tool is also the backbone for the Azure Load Testing Service. The key to answering this question boils down to "How do I make a JMeter script that crawls the sitemap and then iterates over each url?".
This is a handy link I found to get an example JMeter script that already has a sitemap crawler developed. This builds a test that will iterate through each link in the sitemap. A few updates were made to the script included below (take note that you need to update the variable values before running the test):
<?xml version="1.0" encoding="UTF-8"?>
<jmeterTestPlan version="1.2" properties="5.0" jmeter="5.4.3">
<hashTree>
<TestPlan guiclass="TestPlanGui" testclass="TestPlan" testname="Crawl Sitemap Test Plan" enabled="true">
<stringProp name="TestPlan.comments"></stringProp>
<boolProp name="TestPlan.functional_mode">false</boolProp>
<boolProp name="TestPlan.tearDown_on_shutdown">true</boolProp>
<boolProp name="TestPlan.serialize_threadgroups">false</boolProp>
<elementProp name="TestPlan.user_defined_variables" elementType="Arguments" guiclass="ArgumentsPanel" testclass="Arguments" testname="User Defined Variables" enabled="true">
<collectionProp name="Arguments.arguments"/>
</elementProp>
<stringProp name="TestPlan.user_define_classpath"></stringProp>
</TestPlan>
<hashTree>
<Arguments guiclass="ArgumentsPanel" testclass="Arguments" testname="User Defined Variables" enabled="true">
<collectionProp name="Arguments.arguments">
<elementProp name="ConcurrentUsers" elementType="Argument">
<stringProp name="Argument.name">ConcurrentUsers</stringProp>
<stringProp name="Argument.value">25</stringProp>
<stringProp name="Argument.metadata">=</stringProp>
<stringProp name="Argument.desc">specify number of concurrent users/threads (default value 1)</stringProp>
</elementProp>
<elementProp name="RampUp" elementType="Argument">
<stringProp name="Argument.name">RampUp</stringProp>
<stringProp name="Argument.value">10</stringProp>
<stringProp name="Argument.metadata">=</stringProp>
<stringProp name="Argument.desc">how long to take to ramp up to the Concurrent Users (default value 1)</stringProp>
</elementProp>
<elementProp name="NumberOfExecutions" elementType="Argument">
<stringProp name="Argument.name">NumberOfExecutions</stringProp>
<stringProp name="Argument.value">1</stringProp>
<stringProp name="Argument.metadata">=</stringProp>
<stringProp name="Argument.desc">how many times to repeat for each users (default value 1)</stringProp>
</elementProp>
<elementProp name="Domain" elementType="Argument">
<stringProp name="Argument.name">Domain</stringProp>
<stringProp name="Argument.value"></stringProp>
<stringProp name="Argument.metadata">=</stringProp>
<stringProp name="Argument.desc">specify the domain of your site</stringProp>
</elementProp>
<elementProp name="SitemapUrl" elementType="Argument">
<stringProp name="Argument.name">SitemapUrl</stringProp>
<stringProp name="Argument.value">sitemap.xml</stringProp>
<stringProp name="Argument.metadata">=</stringProp>
<stringProp name="Argument.desc">set this to your sitemap url</stringProp>
</elementProp>
<elementProp name="SitemapUrlRegEx" elementType="Argument">
<stringProp name="Argument.name">SitemapUrlRegEx</stringProp>
<stringProp name="Argument.value">//url/loc</stringProp>
<stringProp name="Argument.metadata">=</stringProp>
<stringProp name="Argument.desc">this is regular expression to extract urls from the sitemap</stringProp>
</elementProp>
<elementProp name="DateTime" elementType="Argument">
<stringProp name="Argument.name">DateTime</stringProp>
<stringProp name="Argument.value">${__time(yyyymmdd-hhmm)}</stringProp>
<stringProp name="Argument.metadata">=</stringProp>
<stringProp name="Argument.desc">datetime stamp format included in the name of the exported results file</stringProp>
</elementProp>
<elementProp name="ResultsPath" elementType="Argument">
<stringProp name="Argument.name">ResultsPath</stringProp>
<stringProp name="Argument.value">c:\jmeter\results</stringProp>
<stringProp name="Argument.metadata">=</stringProp>
<stringProp name="Argument.desc">path to store summary report results file</stringProp>
</elementProp>
</collectionProp>
</Arguments>
<hashTree/>
<HeaderManager guiclass="HeaderPanel" testclass="HeaderManager" testname="HTTP Header manager" enabled="true">
<collectionProp name="HeaderManager.headers">
<elementProp name="User-Agent" elementType="Header">
<stringProp name="Header.name">User-Agent</stringProp>
<stringProp name="Header.value">Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36</stringProp>
</elementProp>
</collectionProp>
</HeaderManager>
<hashTree/>
<CookieManager guiclass="CookiePanel" testclass="CookieManager" testname="HTTP Cookie Manager" enabled="true">
<collectionProp name="CookieManager.cookies"/>
<boolProp name="CookieManager.clearEachIteration">true</boolProp>
<boolProp name="CookieManager.controlledByThreadGroup">false</boolProp>
</CookieManager>
<hashTree/>
<CacheManager guiclass="CacheManagerGui" testclass="CacheManager" testname="HTTP Cache Manager" enabled="false">
<boolProp name="clearEachIteration">true</boolProp>
<boolProp name="useExpires">false</boolProp>
<boolProp name="CacheManager.controlledByThread">false</boolProp>
</CacheManager>
<hashTree/>
<ThreadGroup guiclass="ThreadGroupGui" testclass="ThreadGroup" testname="Traverse Sitemap Thread Group" enabled="true">
<stringProp name="ThreadGroup.on_sample_error">continue</stringProp>
<elementProp name="ThreadGroup.main_controller" elementType="LoopController" guiclass="LoopControlPanel" testclass="LoopController" testname="Loop Controller" enabled="true">
<boolProp name="LoopController.continue_forever">false</boolProp>
<stringProp name="LoopController.loops">${NumberOfExecutions}</stringProp>
</elementProp>
<stringProp name="ThreadGroup.num_threads">${ConcurrentUsers}</stringProp>
<stringProp name="ThreadGroup.ramp_time">${RampUp}</stringProp>
<boolProp name="ThreadGroup.scheduler">false</boolProp>
<stringProp name="ThreadGroup.duration"></stringProp>
<stringProp name="ThreadGroup.delay"></stringProp>
<boolProp name="ThreadGroup.same_user_on_next_iteration">false</boolProp>
</ThreadGroup>
<hashTree>
<HTTPSamplerProxy guiclass="HttpTestSampleGui" testclass="HTTPSamplerProxy" testname="Get Sitemap HTTP Request" enabled="true">
<elementProp name="HTTPsampler.Arguments" elementType="Arguments" guiclass="HTTPArgumentsPanel" testclass="Arguments" enabled="true">
<collectionProp name="Arguments.arguments"/>
</elementProp>
<stringProp name="HTTPSampler.domain">${Domain}</stringProp>
<stringProp name="HTTPSampler.port"></stringProp>
<stringProp name="HTTPSampler.protocol">https</stringProp>
<stringProp name="HTTPSampler.contentEncoding"></stringProp>
<stringProp name="HTTPSampler.path">${SitemapUrl}</stringProp>
<stringProp name="HTTPSampler.method">GET</stringProp>
<boolProp name="HTTPSampler.follow_redirects">true</boolProp>
<boolProp name="HTTPSampler.auto_redirects">false</boolProp>
<boolProp name="HTTPSampler.use_keepalive">true</boolProp>
<boolProp name="HTTPSampler.DO_MULTIPART_POST">false</boolProp>
<boolProp name="HTTPSampler.image_parser">true</boolProp>
<boolProp name="HTTPSampler.concurrentDwn">true</boolProp>
<stringProp name="HTTPSampler.embedded_url_re"></stringProp>
<stringProp name="HTTPSampler.connect_timeout"></stringProp>
<stringProp name="HTTPSampler.response_timeout"></stringProp>
</HTTPSamplerProxy>
<hashTree>
<XPathExtractor guiclass="XPathExtractorGui" testclass="XPathExtractor" testname="Get Site Urls Xpath Extractor" enabled="true">
<stringProp name="XPathExtractor.default">not found</stringProp>
<stringProp name="XPathExtractor.refname">SiteUrls</stringProp>
<stringProp name="XPathExtractor.matchNumber">-1</stringProp>
<stringProp name="XPathExtractor.xpathQuery">${SitemapUrlRegEx}</stringProp>
<boolProp name="XPathExtractor.validate">false</boolProp>
<boolProp name="XPathExtractor.tolerant">false</boolProp>
<boolProp name="XPathExtractor.namespace">false</boolProp>
</XPathExtractor>
<hashTree/>
</hashTree>
<ForeachController guiclass="ForeachControlPanel" testclass="ForeachController" testname="Traverse Site Urls Controller" enabled="true">
<stringProp name="ForeachController.inputVal">SiteUrls</stringProp>
<stringProp name="ForeachController.returnVal">SiteUrl</stringProp>
<boolProp name="ForeachController.useSeparator">true</boolProp>
</ForeachController>
<hashTree>
<HTTPSamplerProxy guiclass="HttpTestSampleGui" testclass="HTTPSamplerProxy" testname="${SiteUrl}" enabled="true">
<elementProp name="HTTPsampler.Arguments" elementType="Arguments" guiclass="HTTPArgumentsPanel" testclass="Arguments" enabled="true">
<collectionProp name="Arguments.arguments"/>
</elementProp>
<stringProp name="HTTPSampler.domain"></stringProp>
<stringProp name="HTTPSampler.port"></stringProp>
<stringProp name="HTTPSampler.protocol"></stringProp>
<stringProp name="HTTPSampler.contentEncoding"></stringProp>
<stringProp name="HTTPSampler.path">${SiteUrl}</stringProp>
<stringProp name="HTTPSampler.method">GET</stringProp>
<boolProp name="HTTPSampler.follow_redirects">true</boolProp>
<boolProp name="HTTPSampler.auto_redirects">false</boolProp>
<boolProp name="HTTPSampler.use_keepalive">true</boolProp>
<boolProp name="HTTPSampler.DO_MULTIPART_POST">false</boolProp>
<boolProp name="HTTPSampler.image_parser">true</boolProp>
<boolProp name="HTTPSampler.concurrentDwn">true</boolProp>
<stringProp name="HTTPSampler.embedded_url_re"></stringProp>
<stringProp name="HTTPSampler.connect_timeout"></stringProp>
<stringProp name="HTTPSampler.response_timeout"></stringProp>
</HTTPSamplerProxy>
<hashTree/>
</hashTree>
<ResultCollector guiclass="SummaryReport" testclass="ResultCollector" testname="Summary Report" enabled="true">
<boolProp name="ResultCollector.error_logging">false</boolProp>
<objProp>
<name>saveConfig</name>
<value class="SampleSaveConfiguration">
<time>true</time>
<latency>true</latency>
<timestamp>true</timestamp>
<success>true</success>
<label>true</label>
<code>true</code>
<message>true</message>
<threadName>true</threadName>
<dataType>true</dataType>
<encoding>false</encoding>
<assertions>true</assertions>
<subresults>true</subresults>
<responseData>false</responseData>
<samplerData>false</samplerData>
<xml>false</xml>
<fieldNames>true</fieldNames>
<responseHeaders>false</responseHeaders>
<requestHeaders>false</requestHeaders>
<responseDataOnError>false</responseDataOnError>
<saveAssertionResultsFailureMessage>true</saveAssertionResultsFailureMessage>
<assertionsResultsToSave>0</assertionsResultsToSave>
<bytes>true</bytes>
<sentBytes>true</sentBytes>
<url>true</url>
<threadCounts>true</threadCounts>
<idleTime>true</idleTime>
<connectTime>true</connectTime>
</value>
</objProp>
<stringProp name="filename">\\${ResultsPath}\\${DateTime}_SitecoreCrawlSummaryReport.csv</stringProp>
</ResultCollector>
<hashTree/>
<ResultCollector guiclass="ViewResultsFullVisualizer" testclass="ResultCollector" testname="View Results Tree" enabled="true">
<boolProp name="ResultCollector.error_logging">false</boolProp>
<objProp>
<name>saveConfig</name>
<value class="SampleSaveConfiguration">
<time>true</time>
<latency>true</latency>
<timestamp>true</timestamp>
<success>true</success>
<label>true</label>
<code>true</code>
<message>true</message>
<threadName>true</threadName>
<dataType>true</dataType>
<encoding>false</encoding>
<assertions>true</assertions>
<subresults>true</subresults>
<responseData>false</responseData>
<samplerData>false</samplerData>
<xml>false</xml>
<fieldNames>true</fieldNames>
<responseHeaders>false</responseHeaders>
<requestHeaders>false</requestHeaders>
<responseDataOnError>false</responseDataOnError>
<saveAssertionResultsFailureMessage>true</saveAssertionResultsFailureMessage>
<assertionsResultsToSave>0</assertionsResultsToSave>
<bytes>true</bytes>
<sentBytes>true</sentBytes>
<url>true</url>
<threadCounts>true</threadCounts>
<idleTime>true</idleTime>
<connectTime>true</connectTime>
</value>
</objProp>
<stringProp name="filename"></stringProp>
</ResultCollector>
<hashTree/>
</hashTree>
</hashTree>
</hashTree>
</jmeterTestPlan>
JMeter Script to Crawl Sitemap and Iterate through Each URL
When we run it locally with JMeter, it looks like it's working well!


Final Step: How do I Run The Test in Azure Load Testing
Now that you have the JMeter script and have executed it locally, you can upload it to Azure Load Testing to run a test from the cloud. The benefits include using more execution engines for increased load and also capturing nice results and metrics alonside your Drupal deployment.
First, create an Azure Load Testing resource and then create a new test by uploading a JMeter script:


Continue through the test creation, adding what is necessary. From there you should see the test begin execution.

Summary
This hopefully helps to get a testing framework setup for your Drupal site that you can then use to evaluate different changes. Another use for this script is to pre-warm any caches you may have - you could run this with a single thread and hit all pages on your site.