An Analysis of Analytics [Part 1]

It seems everybody loves metrics and has their own set of favorites. In the field of software development, many metrics have come into favor and later fallen out of it. Without doubt, they can be used to produce interesting numbers and pretty graphs; but do they really provide value?

To determine net value, one must examine the costs incurred along with the benefits received to determine the ROI [Return on Investment]. For most corporate situations, this directly translates to money; for others the investment and benefit may be measured in other terms, but the calculation of ROI still translates to a quantifiable measure of “Was it worth it?”.

A big part of the challenge in determining ROI is that it is rarely known in advance what will actually happen, or what the metrics will reveal. In many ways this is similar to the situation with insurance policies, medical checkups, and property inspections. Under the best of circumstances they are all expenditures that provide no actual return on the investment. Under other conditions they are critical to having a manageable situation rather than an unmitigated disaster. Prudent individuals and organizations select a subset of the available items (hopefully) based on a risk analysis and rarely (if ever) make use of every possible offering.

Selecting a set of metrics and analytics benefits greatly from performing a set of calculations based on similar cost/risk/benefit criteria. Since these calculations are often predictive in nature, a probability of occurrence needs to be assigned to the elements in the calculations. While the end result will still be a subjective judgment call, having these numbers makes it much easier to focus on the items most likely to provide benefit and to discard those likely to result in nothing but sunk costs.

Let's break down the costs into a few categories:

  1. Cost of Acquiring the Raw Information. This aspect is time sensitive: if the information is not captured at the time of occurrence, it may be very difficult or impossible to reconstruct later. If one opts not to acquire the data, there are likely to be hidden “lost-opportunity costs” if it is later desired to analyze the information.
  2. Cost of Maintaining the Information. While pure storage costs have become negligible in most circumstances, there are costs involved in ensuring that the data remains in a form that allows meaningful analysis over time.
  3. Cost of Analyzing the Information. This is typically the largest of the costs. It is not just the processing of raw data into some type of summary form, but the human time determining what impact the numbers have and what actions need to be taken to achieve the desired goals.

Next, we can look at a breakdown matrix of possible timelines (only a subset is covered at this point):

  1. No Data Acquired – Analysis would have had no impact. In this scenario, not incurring costs turns out to return the maximum tangible ROI.
  2. No Data Acquired – Analysis would have had a positive impact (either improvement or problem mitigation). In this scenario there is a lost opportunity to achieve a better ROI.
  3. Data Analyzed – No impact on outcome. This scenario represents a negative tangible ROI but does provide peace of mind and confirmation that all is well.
  4. Data Analyzed – Positive impact (either improvement or problem mitigation) achieved as a result. This is the most commonly desired scenario, providing a positive tangible ROI.
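
The four scenarios above can be turned into a rough expected-value comparison. The following is only a sketch under stated assumptions: the `MetricCandidate` class, its field names, the single `p_impact` probability, and every dollar figure are hypothetical and chosen purely for illustration. They are not taken from any real project.

```python
from dataclasses import dataclass


@dataclass
class MetricCandidate:
    """One metric under consideration; every number is a made-up estimate."""
    name: str
    acquire_cost: float   # cost of acquiring the raw information
    maintain_cost: float  # cost of maintaining the information
    analyze_cost: float   # cost of analyzing the information
    p_impact: float       # estimated probability that analysis changes the outcome
    benefit: float        # estimated value gained if it does

    @property
    def total_cost(self) -> float:
        return self.acquire_cost + self.maintain_cost + self.analyze_cost

    def expected_net_value(self) -> float:
        # Scenarios 3 and 4: the cost is always paid,
        # the benefit arrives only with probability p_impact.
        return self.p_impact * self.benefit - self.total_cost

    def expected_lost_opportunity(self) -> float:
        # Scenarios 1 and 2: nothing is spent,
        # but the probability-weighted benefit is forgone.
        return self.p_impact * self.benefit


# Hypothetical candidates with invented estimates.
candidates = [
    MetricCandidate("lines of code per day", 500.0, 200.0, 1500.0, 0.125, 8000.0),
    MetricCandidate("build-failure trends", 500.0, 200.0, 1500.0, 0.25, 12000.0),
]

# Rank candidates so the ones most likely to pay off come first.
ranked = sorted(candidates, key=lambda c: c.expected_net_value(), reverse=True)
for c in ranked:
    print(f"{c.name}: expected net value {c.expected_net_value():.2f}")
```

With these invented inputs, the build-failure metric comes out ahead (800 versus -1200). The ranking is only as good as the probability and benefit estimates fed into it, so the final choice remains a judgment call.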

Of course these are large bucket summaries, and much better information can be gathered by assigning numeric values to the costs along with probabilities and values to the outcomes. It is also recommended to do this over a period of time, with individual numbers assigned to each period [perhaps per iteration/sprint], and the results aggregated over a longer interval [perhaps over a year].
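
As a minimal sketch of the per-period approach, the snippet below assigns a cost, a probability, and a benefit to each sprint and then aggregates them over a longer interval. The `PeriodEstimate` type, the `aggregate` function, and all figures are assumptions made for illustration only.

```python
from typing import Dict, List, NamedTuple


class PeriodEstimate(NamedTuple):
    """Estimates for one period (for example, one sprint). All values are invented."""
    label: str
    cost: float      # total spent on acquiring, maintaining, and analyzing data
    p_impact: float  # estimated probability the analysis had a positive impact
    benefit: float   # estimated value of that impact if it occurred


def aggregate(periods: List[PeriodEstimate]) -> Dict[str, float]:
    """Sum costs and probability-weighted benefits across all periods."""
    total_cost = sum(p.cost for p in periods)
    expected_benefit = sum(p.p_impact * p.benefit for p in periods)
    return {
        "total_cost": total_cost,
        "expected_benefit": expected_benefit,
        "expected_net": expected_benefit - total_cost,
    }


# Hypothetical per-sprint estimates.
sprints = [
    PeriodEstimate("sprint 1", 400.0, 0.25, 2000.0),
    PeriodEstimate("sprint 2", 400.0, 0.5, 1000.0),
    PeriodEstimate("sprint 3", 400.0, 0.125, 8000.0),
]

summary = aggregate(sprints)
print(summary)
```

Keeping the per-period numbers separate, rather than storing only the total, makes it possible to see whether one unusual sprint is responsible for most of the aggregate result.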

Repeated experience has indicated that the most effective approach is to drive down the costs associated with acquiring and maintaining the data and then gather as much data as practical. This greatly reduces the chances of lost opportunities (due to lack of data) and provides the ability to perform the analysis as indicated.

In a future post, I will dig deeper into some of the data, analytics/metrics, and the ever-present danger of misinterpreting the results.
