Software Quality in Growing Development Organizations (Part 2)

In my previous post I discussed the importance of software quality in growing development organizations, with a focus on how to measure the opportunities for defects to occur. In this post, I’ll round out the topic by discussing code quality, QA metrics, and pulling these metrics together to understand overall quality.

Measuring Code Quality

Code quality is usually measured with metrics like unit test coverage, code complexity, or the number of violations of certain rules assessed through static analysis. These measurements are helpful in developing a picture of quality, but bear in mind that on their own they are not a complete picture. I’ve seen plenty of failed projects that had very favorable code quality metrics but completely missed the mark relative to their requirements. I’ve also seen successful projects where the code quality looked poor based on tool measurements but in reality was not.

A classic example where these disparities can occur is with unit test coverage. A project with 75% coverage looks great on paper, but if the 25% that is not covered is the most complex or most frequently changed part of the code, the 75% coverage metric is nearly meaningless. Alternatively, 25% coverage may look terrible on paper, but if that 25% is the most complex part of the code, the project is probably in pretty good shape.
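
To make the arithmetic concrete, here is a minimal sketch in Python of how weighting coverage by complexity changes the picture. The module names and figures are entirely hypothetical, and change frequency could be substituted for complexity as the weight:

    # Sketch: a single raw coverage number can mislead. Weighting each
    # module's coverage by its complexity gives a view closer to real risk.
    modules = [
        # (name, coverage, total cyclomatic complexity) -- hypothetical
        ("billing_engine", 0.10, 420),   # complex core, barely tested
        ("admin_screens",  0.95, 60),
        ("report_export",  0.90, 80),
    ]

    raw = sum(cov for _, cov, _ in modules) / len(modules)
    weighted = (sum(cov * cx for _, cov, cx in modules)
                / sum(cx for _, _, cx in modules))

    print(f"raw average coverage:  {raw:.0%}")       # 65%
    print(f"complexity-weighted:   {weighted:.0%}")  # 31%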

The goal of measuring code quality is not to manage to an absolute number by dictating a minimum code coverage percentage or maximum complexity score. Instead, these metrics are indicators of where to examine the code created by the team more deeply and apply experienced judgement to determine if there is a problem. Measured over time on the same code base, code quality metrics establish a baseline that can also alert you to changes in quality.

Making Sense of QA Metrics

The QA process can generate lots of data, but many organizations have trouble teasing the meaning out of that data or are missing a few key metrics needed to develop a complete picture of project quality. Defect tracking systems track all sorts of details, but I recommend focusing on these data points (sketched as a simple record after the list):

  • Defect status
  • Defect open & close date
  • Defect severity using objective, predefined criteria
  • Number of scripts blocked by the defect
  • Number of times the defect failed retest
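
To make these data points concrete, here is a minimal sketch of a defect record as it might be exported from a tracker. The field names are hypothetical and not tied to any particular tool:

    from dataclasses import dataclass
    from datetime import date
    from typing import Optional

    @dataclass
    class Defect:
        """One row exported from the defect tracker (hypothetical schema)."""
        status: str                 # e.g. "open", "in retest", "closed"
        opened: date
        closed: Optional[date]      # None while the defect is still open
        severity: int               # 1 = completely blocks a major function
        blocked_scripts: int        # test scripts this defect prevents from running
        retest_failures: int        # times a "fixed" defect failed QA retest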

With these data points you can produce a simple dashboard of three charts that help you understand a variety of root causes for poor quality:

Defect Pie Chart

The breakdown of defects by severity is important to understanding how serious your defects are. Assuming a severity scale where severity one indicates a problem that completely prevents a major function of an application from working, you would expect to see relatively few severity one defects. Seeing a large number of severity one defects likely indicates a major, systemic problem in the software. You also expect to see more severe defects earlier in the testing process. If severe defects are uncovered late in the testing process, it may indicate a problem with the sequence of testing or that major changes were introduced late in the development process.
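
As a sketch, the pie chart is just a count of open defects grouped by severity. The severity values below are made up, and matplotlib is assumed only for illustration:

    from collections import Counter
    import matplotlib.pyplot as plt

    severities = [1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4]  # hypothetical data

    counts = Counter(severities)
    labels = [f"Severity {s}" for s in sorted(counts)]
    sizes = [counts[s] for s in sorted(counts)]

    plt.pie(sizes, labels=labels, autopct="%1.0f%%")
    plt.title("Defects by Severity")
    plt.show()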

Defect Bar Chart

Defect retest failure count is an often-overlooked metric that is incredibly valuable. Simply put, this is the number of times a defect was reported as fixed by development but failed QA upon retesting. In the chart above, five defects never failed retest, meaning they were corrected the first time development tried to fix them. Four defects failed retest once, meaning development thought they had fixed the problem, returned it to QA to be retested, and QA had to send the defect back to development one more time for resolution. Two defects failed retest twice, and one defect failed retest three or more times.
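
Here is a minimal sketch of how those buckets are computed from per-defect retest failure counts. The twelve values below reproduce the hypothetical chart (five zeros, four ones, two twos, one three-plus):

    from collections import Counter

    retest_failures = [0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 3]  # per defect

    # Group everything at three or more failures into a single "3+" bucket
    buckets = Counter(min(n, 3) for n in retest_failures)
    for n in range(4):
        label = "3+" if n == 3 else str(n)
        print(f"failed retest {label} time(s): {buckets.get(n, 0)} defects")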

This metric is important because it exposes root causes of poor quality that can otherwise be difficult to detect. Normally you would expect most defects to be fixed without failing retest. Some relatively small percentage of defects may fail retest once or twice. If you see a lot of defects failing retest, particularly three or more times, it usually indicates one of the following situations is occurring:

  • Requirements are poorly defined, so developers and QA disagree about what the software should be doing
  • Developers are under pressure to resolve defects and are not spending the appropriate amount of time understanding and resolving them
  • In complex environments with multiple interacting systems, developers may not understand the real root cause of the defect, and communication breaks down when triaging a defect that involves two or more systems

The rework loop caused by defects repeatedly failing retest is also extremely costly to development and QA teams. It is almost always a good idea to monitor this metric closely. As soon as a defect fails retest once or twice, give it special attention to prevent the rework cycle from continuing.

Defect Performance Line Graph

Measuring the net defect trend line is important, especially in environments where software delivery is date-driven (as opposed to quality-driven). You can easily plot the total number of open defects, the number of new defects opened, and the number of defects resolved by date to create a defect trend line. If there is a trend of more defects being opened than being resolved, chances are good that you haven’t reached the peak of defect discovery yet. If the trend is closing more defects than are being opened, you can look at the trend to determine roughly when you can expect to have all the defects resolved and make a judgement about the ability to meet a release date. You can also use this as a tool to determine the number of developers you need working on defects to attain a particular defect closure rate.
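
Here is a minimal sketch of the arithmetic behind the trend line and the closure estimate. The daily counts are invented; in practice they fall out of each defect’s open and close dates:

    from datetime import date, timedelta

    opened   = [8, 7, 6, 5, 4, 3, 2, 2]   # hypothetical new defects per day
    resolved = [2, 3, 4, 5, 5, 5, 5, 4]   # hypothetical defects closed per day
    start = date(2015, 6, 1)              # arbitrary start of the test cycle

    open_trend, total = [], 0
    for day, (o, r) in enumerate(zip(opened, resolved)):
        total += o - r
        open_trend.append((start + timedelta(days=day), total))

    # Average net closure rate over the last three days, then extrapolate
    recent = [open_trend[i][1] - open_trend[i - 1][1] for i in (-3, -2, -1)]
    rate = -sum(recent) / len(recent)     # net defects closed per day
    last_day, still_open = open_trend[-1]
    if rate > 0:
        eta = last_day + timedelta(days=round(still_open / rate))
        print(f"{still_open} open; at ~{rate:.1f} net closures/day, zero around {eta}")

The same arithmetic answers the staffing question from the other direction: divide the open defect count by the net closure rate a given team size sustains to see whether a release date is reachable.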

Pulling It All Together

As we’ve seen, there are a variety of metrics that help measure various aspects of the software development process. Unfortunately there is no one metric that tells you everything you need to know about quality, and managing to a number won’t guarantee quality. Instead, treat quality metrics like a flashlight that helps you illuminate the inner workings of a software development organization. The software quality metrics covered in this article can’t solve every software development quality problem, but they will help you get pointed in the right direction to address the root cause.

Chris Hart

CTO

Chris Hart is a co-founder and CTO of Levvel. He has more than 15 years of technology leadership experience and has led software development, infrastructure, and QA organizations at multiple Fortune 100 companies. In addition to his enterprise experience, Chris has helped start or grow multiple early-stage technology companies. In the five years before starting Levvel, Chris focused on financial technology solutions in the consumer, commercial, and wealth management space. His technical expertise and enterprise-scale, global program management background help Levvel’s clients transform their businesses.