Can machine learning unite capacity models and load testing?

With the move to DevOps and high-paced development, there is a greater and more frequent need to specify test environments to ensure that systems are working efficiently; yet the ability of enterprises to model and manage capacity accurately is immature.

Performance testers are theoretically well-placed to help, but they may be naturally cautious about modeling capacity, since testing functions can run up significant annual costs in capacity usage alone.

You’ll have heard plenty about AI (artificial intelligence) and ML (machine learning) of late, and with good reason – what were once delicate, complex and downright costly technologies and tools are rapidly maturing into usable toolsets in a wide range of verticals. Analyst firms predict huge markets for AI and ML. Indeed, the number of enterprises implementing AI grew 270 percent in the past four years and tripled in the past year, according to industry analyst Gartner’s 2019 CIO Survey. The results showed that organizations across all industries use AI in a variety of applications but, on the downside, struggle with acute talent shortages.

“Four years ago, AI implementation was rare, only 10 percent of survey respondents reported that their enterprises had deployed AI or would do so shortly. For 2019, that number has leaped to 37 percent — a 270 percent increase in four years,” said Chris Howard, distinguished research vice president at Gartner. “If you are a CIO and your organization doesn’t use AI, chances are high that your competitors do, and this should be a concern.”

AI and the testing community

Technologies like these have rarely made a big impact in the testing community, but that is changing fast. At Edge Testing, we have been applying ML-style mathematics and pattern recognition, along with performance engineering, to manage and model capacity.

The fact is that capacity forecasting has been a perennial issue across the board, exacerbated by rapid technological change. The traditional solutions have been simple – either over-provision and take the hit on operational efficiency, or risk under-provision and accept potentially serious operational risks. Neither of these common strategies has solved the underlying issue, which requires the data to be interrogated effectively and rapidly enough to create actionable insights.

Our approach has been to use computer science techniques and advanced mathematics to develop autonomous systems that can automatically manage capacity throughout the testing process. To begin with, we trialed conventional computational rules but found that a more bespoke approach was required to gain actionable data. Although the concept of capacity management is relatively well-established, creating workable models and building reliable code for a complex, modern cloud testing function has not been so straightforward. However, the results have been dramatic.

In one example, we applied load modeling for a client that was specifying new hardware for its application and database tiers as part of initial development. We found that the existing calculations and values had led to a significant capacity excess, worth nearly £8 million. The client operates a classical hypervisor and VM architecture and had initially requested 56 CPUs for each of its 16 application servers, at a total of £10 million. By mathematically calculating the weight of each transaction, we reduced the required number of CPUs to 12 per application server, a revised total of £2.3 million and a saving of £7.7 million.
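
To make the idea of weighting transactions concrete, here is a minimal Python sketch of how per-server CPU demand might be estimated from a transaction mix. It is not the model used on the engagement described above. The transaction names, arrival rates, per-transaction CPU costs and the 60 percent utilization target are all hypothetical values chosen purely for illustration, and the `Transaction` and `required_cpus_per_server` names are invented for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """One business transaction type in the workload (illustrative only)."""
    name: str
    peak_rate_per_sec: float  # hypothetical peak arrival rate
    cpu_seconds_each: float   # hypothetical CPU cost ("weight") per transaction


def required_cpus_per_server(transactions, servers, target_utilization=0.6):
    """Estimate the CPUs each server needs for a given transaction mix.

    Total demand is the sum of rate * weight over all transaction types,
    expressed in CPU-seconds per second (i.e. busy CPUs). That demand is
    spread evenly across the servers and then divided by the utilization
    target to leave headroom. The even spread and the default 0.6 target
    are assumptions of this sketch, not figures from the article.
    """
    if servers <= 0:
        raise ValueError("servers must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")

    total_demand = sum(
        t.peak_rate_per_sec * t.cpu_seconds_each for t in transactions
    )
    demand_per_server = total_demand / servers
    return math.ceil(demand_per_server / target_utilization)


# Hypothetical workload: these numbers are made up for the example.
example_mix = [
    Transaction("login", peak_rate_per_sec=200, cpu_seconds_each=0.05),
    Transaction("search", peak_rate_per_sec=500, cpu_seconds_each=0.12),
    Transaction("checkout", peak_rate_per_sec=80, cpu_seconds_each=0.40),
]

if __name__ == "__main__":
    cpus = required_cpus_per_server(example_mix, servers=16)
    print(f"Estimated CPUs per application server: {cpus}")
```

In practice, the per-transaction weights would come from measurements taken during load tests rather than from estimates, which is why this kind of calculation sits naturally with the performance testing function.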

In another case, a client was using a JVM-based load injection tool within a CI/CD pipeline, and the cost of generating load via this common mechanism was around £750K a year. Implementing a new method of load injection via a new injection architecture resulted in a £640K reduction in the persistent hosting cost.

In fact, by applying autonomous functions that automatically manage capacity, we reduced our clients’ annual AWS hosting cost rate by £1 million. We believe these impressive figures are just the tip of the iceberg. Of course, this is not a panacea – the algorithms only really deliver when doing consistent regression testing, so one-off test scenarios can’t be accommodated.

As Gartner also highlighted, internal talent shortages are often a barrier when getting to grips with internal data, especially when viewed through the lens of testing and cloud capacity modeling, both complex areas that create their own data silos. This makes the outsourcing of some data management functions increasingly desirable even for large enterprises, and thus the demand for outsourced capacity management should be significant – the early results speak for themselves.

Written by Matthew Clarke, Edge Testing’s Senior Performance Engineer and Test Specialist