The overarching goal of any performance engagement is to validate the client's solution performance, diagnose performance defects, and assist in remedying them. Arriving at this goal is a little more complex than simply hitting your solution with an arbitrary amount of load and validating that the site responds within a reasonable timeframe.
The core outcomes of the planning, requirements-gathering, and communication phase of a project are:
- Non-Functional Requirements which accurately represent the required state of the final solution.
- Workload models which exercise the system relative to how it will be used in the real world.
- A test environment and infrastructure which meet your needs for appropriately testing the solution.
Without gathering the necessary data and establishing clear requirements at the beginning of a project, it is difficult to define Non-Functional Requirements, create test scenarios that validate them, and develop workload models that add value for the client. Creating your workload models and test scenarios can be relatively simple if a client has defined their Non-Functional Requirements beforehand. If they haven't, you will need to identify their peak load requirements, their target baseline/average load, the number of concurrent users for each workload, and the subset of business cases to be exercised.
Gathering detailed information for analysis is the first step in defining or validating a client’s Non-Functional Requirements, and determining how you will test the Non-Functional Requirements.
Non-Functional Requirements defined by a client typically look like the following:
- The New Application Platform should meet performance requirements up to a concurrent load of 100 users creating documents and browsing x, y, z pieces of functionality.
- The Solution security service must respond to authorisation or authentication requests within 50 milliseconds.
- The Solution integration service must maintain an average response time of < 200ms per transaction for transaction volumes of less than 100 transactions per second.
- 95% of simple transactions must respond within 300ms.
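A percentile requirement like the last one above can be checked mechanically against load-test results. A minimal sketch in Python, using the nearest-rank percentile method and illustrative (made-up) response times:

```python
import math

def percentile(samples, pct):
    """Return the value at the given percentile (nearest-rank method)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical response times in milliseconds; in practice these come
# from your load-test tool's raw results.
response_times_ms = [120, 180, 210, 250, 260, 280, 290, 295, 310, 640]

p95 = percentile(response_times_ms, 95)
print(f"p95 = {p95} ms, requirement met: {p95 <= 300}")
```

Note how a single slow outlier can push the 95th percentile well past the threshold even when most transactions are fast, which is exactly why percentile requirements are preferred over averages.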
These provide a starting point for gathering data and deciding how we will test the solution. However, the requirement that "The New Application Platform should meet performance requirements up to a concurrent load of 100 users", along with others, is vague. It falls to the Performance Engineer to validate these requirements and to determine what those 100 concurrent users will be doing.
Doing this requires either further definition of the anticipated load from the client, or analytics on which to base your findings. This could be data as simple as a Google Analytics export. The same data will also be used to validate existing Non-Functional Requirements and to develop realistic workload models for your performance tests.
Figure 1. Google Analytics, high level overview. 30 days of data: April 1st – May 1st, 2007.
The graph above describes user activity day by day, with figures below the graph showing the totals for various statistics throughout the month. During this period there are 5 spikes in traffic, each beginning on a Monday (the 2nd, 9th, 16th, 23rd, and 30th), peaking mid-week, then tapering off over Friday, Saturday, and Sunday. Wednesday, April 4th was the busiest day, with 950 sessions.
From this data, we can determine some key pieces of information:
- Concurrent Sessions (Users)
- Desired Throughput
- User Pacing/Thinktime
Delving further into the analytics would help identify which critical business transactions/pages are used the most, and what percentage of users perform any particular action. However, with this data we can use Little's Law to derive the basic workload models for our Baseline, Peak, Stress and Soak tests.
Little’s Law can be defined simply as follows:
The long-term average number of customers in a stable system N is equal to the long-term average effective arrival rate, λ, multiplied by the average time a customer spends in the system, W. Expressed algebraically: N = λW.
To use Little's Law effectively in defining meaningful workload models, and to understand the average number of users in our system under test, we need to define:
λ = Arrival Rate
T = Average Time on Site (playing the role of W above)
Concurrent Users = λ x T
Arrival Rate – The rate at which users enter the system. If peak load statistics show 500 new requests per second, the arrival rate is λ = 500 per second (equivalently, the average time between arrivals is 1/500th of a second). You can also derive the arrival rate from T, the average time spent in the system (often described as Residency), which will be available from analytics or system logs. The number of unique users who entered the system within that residency period defines Peak Concurrency, and by Little's Law your arrival rate is:
λ = Peak Concurrency / T.
Exit Rate – The rate at which customers leave the system. Little's Law assumes a stable system, i.e. the arrival and exit rates are equal.
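The relationship between arrival rate, residency, and concurrency can be sketched as a pair of functions. The numbers here are illustrative assumptions, not the article's example:

```python
# Little's Law: N = lambda * W. In a stable system, given any two of
# concurrency, arrival rate, and residency, the third follows.

def arrival_rate(concurrent_users, avg_time_in_system_s):
    """lambda = N / W -- users arriving per second."""
    return concurrent_users / avg_time_in_system_s

def concurrency(arrival_rate_per_s, avg_time_in_system_s):
    """N = lambda * W -- average users in the system."""
    return arrival_rate_per_s * avg_time_in_system_s

lam = arrival_rate(50, 120)        # 50 concurrent users, 120 s residency
print(lam)                         # ~0.417 arrivals per second
print(concurrency(lam, 120))       # recovers the original 50 users
```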
For an established system, the arrival rate and the average time spent in the system can be determined by analysing traffic: parsing web server logs to count unique sessions per hour; using Google Analytics (as above), which reports how many users entered the system, how long they were active, and how many pages they browsed before leaving; or using other tools such as Splunk. If the client expects an uptick in traffic following a new site launch or an expansion, this data can be extrapolated to accommodate the increased load.
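The log-parsing approach can be sketched as follows. The log format and session field here are hypothetical; adjust the regular expression to match your real access logs:

```python
# Estimate unique sessions per hour from web server access log lines.
import re
from collections import defaultdict

# Assumed combined-log-style line with a trailing session=<id> token.
LINE = re.compile(r'\[(\d{2})/(\w{3})/(\d{4}):(\d{2}):\d{2}:\d{2}.*?session=(\w+)')

def sessions_per_hour(lines):
    """Map (year, month, day, hour) -> count of distinct session IDs."""
    buckets = defaultdict(set)
    for line in lines:
        m = LINE.search(line)
        if m:
            day, mon, year, hour, session = m.groups()
            buckets[(year, mon, day, hour)].add(session)
    return {k: len(v) for k, v in buckets.items()}

sample = [
    '1.2.3.4 - - [04/Apr/2007:10:15:01 +0000] "GET / HTTP/1.1" 200 512 session=abc',
    '1.2.3.5 - - [04/Apr/2007:10:20:44 +0000] "GET /x HTTP/1.1" 200 128 session=def',
    '1.2.3.4 - - [04/Apr/2007:10:59:59 +0000] "GET /y HTTP/1.1" 200 256 session=abc',
    '1.2.3.6 - - [04/Apr/2007:11:00:03 +0000] "GET / HTTP/1.1" 200 512 session=ghi',
]
print(sessions_per_hour(sample))
# {('2007', 'Apr', '04', '10'): 2, ('2007', 'Apr', '04', '11'): 1}
```

Counting distinct session IDs per bucket, rather than raw requests, is what gives you sessions (users) rather than throughput.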
Data from our example:
Arrival Rate (λ) = 950 sessions per day (the peak observed), i.e. 950 / 86,400 ≈ 0.011 sessions per second
Average Time on Site (T) = 126 seconds
Concurrent Users = 950 (user sessions) x 126 seconds (residency) / 86,400 seconds per day ≈ 1.4
This establishes that, on average, there are generally only one to two concurrent users on the site.
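The calculation above can be reproduced in a few lines:

```python
# Recompute the example: 950 sessions on the peak day, 126 s average
# residency, spread over a 24-hour day (86,400 seconds).
sessions_per_day = 950
residency_s = 126

lam = sessions_per_day / (24 * 3600)   # arrivals per second
concurrent = lam * residency_s         # N = lambda * W

print(round(concurrent, 2))            # 1.39
```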
The example used describes a lightly used system, but these methods for gathering and correctly using data to create your Performance tests will work no matter the size of your organisation.