JIRA Data Center – load testing JIRA software

JIRA, an Atlassian product, is a software development tool used by agile teams to plan, track and release software.

Recently, Automation Consultants set out to learn more about JIRA’s performance characteristics by running some simple performance tests aimed at finding the first bottleneck. The exercise consisted of setting up a JIRA instance on typically sized hardware and applying increasing load to it until its performance began to degrade. The question was: as the load increased, which aspect of the system would reach capacity first – CPU, memory or something else?

Another aspect of JIRA we wanted to investigate was its ability to use clustered application servers (i.e. more than one server running in parallel). Would having two servers increase the system’s capacity? This required using the Data Center edition of JIRA and a load balancer to spread the load between the two servers. The equipment used for this JIRA Data Center test consisted of four virtual machines: one for the reverse proxy/load balancer, two for the JIRA application servers and one for the database. The servers all ran Microsoft Windows Server 2012, and Microsoft SQL Server was used for the database. Crowd was installed on one of the JIRA application servers, and a shared file system was used to store attachments.
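The cluster configuration itself is not shown in this write-up, but for illustration, each JIRA Data Center node is normally identified and pointed at the shared home directory by a cluster.properties file in its local home directory. The node ID and UNC path below are placeholders rather than values from this test:

# cluster.properties on the first application server (the second node would use jira.node.id = node2)
# backslashes are doubled because this is a Java properties file
jira.node.id = node1
jira.shared.home = \\\\fileserver\\jira-shared-home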

It should be emphasised that these tests were simple and did not investigate many of the factors which occur in the real world, such as the effect of add-ons, integration with other Atlassian applications, complex usage patterns, large attachments, large issues and many more. They do however provide a starting point from which to understand JIRA’s performance characteristics.

Understanding JIRA’s performance characteristics

The initial configuration of each application server was 2 virtual CPU cores, 800MB of heap in the Java Virtual Machine and 4GB of RAM in total. All tests were performed using the Nginx reverse proxy and load balancer and Microsoft’s Visual Studio load and performance testing tools. The load-balancing algorithm was based on a hash of the remote client address, so that once an IP address had connected to Nginx it would always be assigned to the same application server.
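As a rough sketch of this arrangement, the relevant part of the Nginx configuration would look something like the following. The ip_hash directive provides the source-address affinity described above; the server names and ports are placeholders, not values from the test environment:

# nginx.conf excerpt (illustrative only)
upstream jira_cluster {
    ip_hash;                                  # pin each client IP to the same backend node
    server jira-node1.example.local:8080;
    server jira-node2.example.local:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://jira_cluster;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}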

The first test was with 40 simulated concurrent users. Each simulated user ran a script performing some typical user actions. The chosen actions were: log in, create an issue, browse a project, view the dashboard, add a comment to an issue, run a search for all issues, run a search limited to 50 issues, view the dashboard again, and log out. Rather than have all the users log in at once, which would place a large and unrealistic load spike on the system, the users were set to log in in batches of two at ten-second intervals.

In this first test, the system held up well against the applied load. The load was increased in later tests to 200 users and then to 300 users. At 200 users, the performance degraded by about 25%. It was found that increasing the JVM heap size had little or no effect, but increasing the number of processor cores to four on each application server improved performance such that it was similar to that seen with 40 users. In other words, with 4 CPU cores in each application server, the system supported 200 users as well as it did 40 users with only 2 CPU cores.

Varying the JVM heap size had little or no effect at these levels of load. The application server is based on Apache Tomcat, which runs the Atlassian server code in a Java container. The application code allocates its memory from the heap of the Java Virtual Machine rather than directly from the operating system, so to vary the amount of memory available to the application server it is necessary to vary the size of the JVM heap rather than the server’s total RAM. In future tests, it could be interesting to reduce the size of the JVM heap until a performance degradation occurs; this would reveal the minimum amount of memory needed for a given level of load.
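For reference, in a typical JIRA installation on Windows the heap limits used in these tests are set in the setenv.bat file in the installation’s bin directory (setenv.sh on Linux). The snippet below is a sketch mirroring the 800MB baseline used here; the exact file location depends on the installation:

rem <jira-install>\bin\setenv.bat excerpt (sketch)
rem these values become the JVM's -Xms and -Xmx arguments for Tomcat
set JVM_MINIMUM_MEMORY=800m
set JVM_MAXIMUM_MEMORY=800m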

In these tests, the system did not appear to be constrained by disk i/o, network i/o or database limitations.

Test set-up

CPU specification: 2 virtual cores per server
RAM: 4GB – 12GB dynamic allocation per server
Number of IP addresses used for load-balancer switching: 3

User behaviour: once a user has completed their activity they log out and the next available user in the list logs on. The activities of a user are as follows (a rough sketch of this scenario as a script is given after the table):

Transaction name   Actions taken
Load login         Enters the URL and loads the login page
Login              Enters the user’s credentials and logs into JIRA
Create issue       Creates an issue in a project
Browse project     Clicks a project to view it
View dashboard     Returns to the dashboard in JIRA
Search by ID       Searches for an issue by its ID
Add comment        Adds a comment to the issue that was just searched for
Search all         A search that returns all issues
Search 50          A search that returns only 50 issues
View dash 2        Returns to the JIRA dashboard a second time
Log out            The user logs out
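The tests themselves were Visual Studio coded web tests driving JIRA’s web interface, and that code is not reproduced here. Purely as an illustration of the scenario, the same sequence could be approximated against JIRA’s REST API as below; the base URL, credentials, project key and issue field values are placeholders:

# Rough Python sketch of one simulated user's scenario via the JIRA REST API.
# Not the actual Visual Studio web test used in these tests.
import requests

BASE = "http://jira.example.local"   # load balancer address (placeholder)
session = requests.Session()

# Load the login endpoint and log in (creates a session cookie)
session.post(BASE + "/rest/auth/1/session",
             json={"username": "testuser01", "password": "secret"})

# Create an issue in a project
issue = session.post(BASE + "/rest/api/2/issue", json={
    "fields": {"project": {"key": "TEST"},
               "summary": "Load test issue",
               "issuetype": {"name": "Task"}}}).json()

# Browse a project and view the dashboard
session.get(BASE + "/rest/api/2/project/TEST")
session.get(BASE + "/secure/Dashboard.jspa")

# Search for the issue by its key, then add a comment to it
session.get(BASE + "/rest/api/2/issue/" + issue["key"])
session.post(BASE + "/rest/api/2/issue/" + issue["key"] + "/comment",
             json={"body": "Automated load-test comment"})

# Search for all issues, then a search limited to 50 results
session.get(BASE + "/rest/api/2/search", params={"jql": "order by created"})
session.get(BASE + "/rest/api/2/search",
            params={"jql": "order by created", "maxResults": 50})

# View the dashboard a second time, then log out
session.get(BASE + "/secure/Dashboard.jspa")
session.delete(BASE + "/rest/auth/1/session")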


Test 1: Testing 40 users for stability

Virtual CPUs: 2 per server
JVM max memory size: 800MB per server
Run duration: 30 minutes
Initial user count: 4
Max user count: 40
Step duration: 10 seconds
Step ramp time: 0
Step user count: 2
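With these settings the load ramps up from 4 users in steps of 2 every 10 seconds, so the full 40 users are reached after (40 − 4) / 2 × 10 = 180 seconds, leaving roughly 27 minutes of the run at full load.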


Results: this test was run to confirm that the configuration was stable and that no errors were occurring. Running 40 users for 30 minutes produced 14 errors in total and 4 failed transactions out of 683, i.e. 0.59% of all tests. This was deemed an acceptable level for the purpose of deriving performance results.

Processor time (% utilisation) of Server 1 and Server 2


Web test summary

Test            Scenario    Total  Passed  Failed  Tests/sec  Avg test time (s)  95% test time (s)
WebTest1Coded   Scenario1   683    679     4       0.38       88.0               89.4

Test 2: Testing 40 users with a different JVM size

Virtual CPUs: 2 per server
JVM max memory size: 3000MB per server
Run duration: 30 minutes
Initial user count: 4
Max user count: 40
Step duration: 10 seconds
Step ramp time: 0
Step user count: 2


The maximum JVM heap size was increased from 800MB to 3000MB to see whether this had any significant effect at a low load.

Processor time (% utilisation) of Server 1 and Server 2


Test            Scenario    Total  Passed  Failed  Tests/sec  Avg test time (s)  95% test time (s)
WebTest1Coded   Scenario1   683    679     4       0.38       88.0               89.4

There is little to no difference in processor time or in the number of transactions passed when the JVM memory size is increased.

Test 3: Testing 200 users

Virtual CPUs: 2 per server
JVM max memory size: 800MB per server
Run duration: 30 minutes
Initial user count: 4
Max user count: 200
Step duration: 10 seconds
Step ramp time: 0
Step user count: 2


Processor time (% utilisation) of Server 1 and 2


Results: Server 1 triggers a threshold violation and is approaching maximum capacity. Server 2 may not have been overloaded because Crowd, which was installed on Server 2, makes its login requests faster, or because the load balancer was sending more requests to Server 1.

Web test summary

Test            Scenario    Total  Passed  Failed  Tests/sec  Avg test time (s)  95% test time (s)
WebTest1Coded   Scenario1   2,257  2,236   21      1.25       102                122

Despite the threshold violation there are few errors, with only 21 failures out of 2,257 tests, or 0.93%. The results show that the 95% test time has increased from roughly 90 seconds to 122 seconds, but this is expected given the significant increase in the number of users on the system.

Test 4: Testing 200 users with a different JVM memory size

Virtual CPUs: 2 per server
JVM max memory size: 3000MB per server
Run duration: 30 minutes
Initial user count: 4
Max user count: 200
Step duration: 10 seconds
Step ramp time: 0
Step user count: 2


Processor time (% utilisation) of Server 1 and Server 2


Results: again, increasing the JVM memory size had no noticeable impact on the results of the test; the bottleneck appears to be the processor. Despite one of the processors violating its threshold, the results are still acceptable, with only 0.81% of tests failing. The next test shows the effect of increasing processing power.

Test            Scenario    Total  Passed  Failed  Tests/sec  Avg test time (s)  95% test time (s)
WebTest1Coded   Scenario1   2,355  2,336   19      1.31       97.8               112.3

Test 5: Testing 200 users with 4 virtual CPUs

Virtual CPUs: 4 per server
JVM max memory size: 3000MB per server
Run duration: 30 minutes
Initial user count: 4
Max user count: 200
Step duration: 10 seconds
Step ramp time: 0
Step user count: 2


Processor time (% utilisation) of Server 1 and 2


Web test summary

Test            Scenario    Total  Passed  Failed  Tests/sec  Avg test time (s)  95% test time (s)
WebTest1Coded   Scenario1   2,546  2,517   29      1.41       90.2               94.5

Results: in this test, the number of virtual CPUs was doubled to 4 per server. The time taken for each test, which reflects the average response time seen by a user (Test Time), returned to a level similar to that previously seen with 40 users and 2 CPUs per server. The maximum CPU utilisation dropped to about 80%. This shows that with more CPU capacity, a greater number of users can be handled with similar response times.

Test 6: Testing 300 users with 4 virtual CPUs

Virtual CPUs: 4 per server
JVM max memory size: 3000MB per server
Run duration: 30 minutes
Initial user count: 4
Max user count: 300
Step duration: 10 seconds
Step ramp time: 0
Step user count: 2


Processor time (% utilisation) of Server 1 and Server 2

Web test summary

Test            Scenario    Total  Passed  Failed  Tests/sec  Avg test time (s)  95% test time (s)
WebTest1Coded   Scenario1   2,816  2,774   42      1.56       99.0               127.4

In this test, the number of simulated users was increased by 50% to 300. The peak CPU utilisation rose to a maximum of approximately 85%. The test time (a proxy for the response time seen by the user) increased to 99.0 seconds, similar to that seen with 200 users and two cores. This indicates that doubling the processor capacity does not lead to a doubling of user capacity.

Summary

The main conclusion of this work is that the first performance bottleneck likely to be encountered with JIRA, assuming a standard hardware allocation, is CPU capacity. Increasing the RAM in these tests did not significantly improve performance. This should be treated as no more than a rule of thumb, however, because the resource requirements of JIRA can be greatly increased by certain plugins, such as those which generate resource-intensive reports. The tests also confirmed that JIRA can be scaled up by the use of more than one application server.

Further tests could vary the types of transactions included, e.g. different kinds of updates and reads as well as issue creation, and measure the individual response times of each transaction. Leading plugins, such as Zephyr, Tempo Timesheets, eazyBI and ScriptRunner, could also be tested. Finally, tests could be run with large volumes of data in the database and with complex customisations, such as complex workflows, numerous agile boards and numerous dashboards.

Edited for web by Jordan Platt.
