When do you consider a software metric useful?

Do you use software metrics in your project? Which ones? Why do you use those software metrics?

The answer to question one is probably 'Yes'. The answer to question two may vary, but hopefully the answer to question three is: "because I find them useful".

For me, the usefulness of a software metric is determined by two properties. On the one hand the software metric should be a correct quantification of what I want to measure, while on the other hand the value of the metric should provide enough information to make a decision.

To verify whether a metric measures what you want it to measure you can examine the value of a metric for a small number of cases, or you can conduct a more quantitative experiment to understand the statistical behavior of a the metric on a large group of systems/components/units. The nice thing about such an experiment is that you can conduct it in a relatively safe lab-environment using open-source systems.

Because of its relative easiness this type of evaluation has been done extensively over the past years.  Virtually every scientific paper on software metrics includes at least one or two case studies, but often researchers also examines the statistical relationship between the value of the (newly proposed) metric and other desirable attributes. For example, we did this for our Component Balance and Dependency Profiles metrics.

To understand whether a metric can be effectively used in a decision making process is more complicated. First, you need to ensure that the metric is available for a large number of projects for an extended period of time. Secondly, you need to observe the people involved in the projects and record discussions/decisions involving the metric. Lastly, the gathered data needs to be analyzed to extract usage patterns and identify areas for improvement.

This second type of evaluation requires quite some time, patience, access to a wide range of software projects in various stages of development, and you need to be able to communicate with the people involved in these projects. Basically, you need to find a company which allows you to conduct this type of research, which might be the reason why I did not find any study which evaluates software metrics in this way.

You can probably guess which company allowed me to conduct this research. Indeed, within the environment of the Software Improvement Group me and my co-authors were allowed to study the usefulness of our architectural metrics. The full details of the evaluation design and the results are available in our ICSE 2013 SEIP paper:
which is going to be has been presented at the ICSE conference in San Fransisco! The slides of this presentation can be found by clicking this link.

Naturally, I am very proud of this paper. In particular because it takes the evaluation of the software metrics one step beyond the usual statistical validation. What do you think, should all metrics be validated like this or should we look at other aspects as well?