When do you consider a software metric useful?

Do you use software metrics in your project? Which ones? Why do you use those software metrics?

The answer to question one is probably 'Yes'. The answer to question two may vary, but hopefully the answer to question three is: "because I find them useful".

For me, the usefulness of a software metric is determined by two properties. On the one hand the software metric should be a correct quantification of what I want to measure, while on the other hand the value of the metric should provide enough information to make a decision.

To verify whether a metric measures what you want it to measure, you can examine the value of the metric for a small number of cases, or you can conduct a more quantitative experiment to understand the statistical behavior of the metric on a large group of systems/components/units. The nice thing about such an experiment is that you can conduct it in a relatively safe lab environment using open-source systems.

Because it is relatively easy, this type of evaluation has been done extensively over the past years. Virtually every scientific paper on software metrics includes at least one or two case studies, and researchers often also examine the statistical relationship between the value of the (newly proposed) metric and other desirable attributes. For example, we did this for our Component Balance and Dependency Profiles metrics.
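
To make "examining the statistical relationship" a bit more concrete, here is a minimal sketch of one common way to do it: a Spearman rank correlation between metric values and some other attribute, such as expert ratings. This is an illustration only. The choice of Spearman, the function names and the sample numbers are my assumptions for this example, not necessarily what was used in the papers mentioned above.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value equals the current one (a tie).
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: the Pearson correlation of the two rank lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two values")
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    mean_x, mean_y = mean(rank_x), mean(rank_y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    spread_x = sum((a - mean_x) ** 2 for a in rank_x)
    spread_y = sum((b - mean_y) ** 2 for b in rank_y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / (spread_x * spread_y) ** 0.5


# Hypothetical data: one metric value and one expert rating per system.
metric_values = [0.82, 0.41, 0.67, 0.15, 0.93, 0.58]
expert_ratings = [4, 2, 4, 1, 5, 3]

if __name__ == "__main__":
    print(f"Spearman correlation: {spearman(metric_values, expert_ratings):.2f}")
```

A value close to 1 or -1 suggests that the metric orders the systems in (almost) the same way as the other attribute does. It says nothing yet about whether people can actually make decisions based on the metric.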

Understanding whether a metric can be used effectively in a decision-making process is more complicated. First, you need to ensure that the metric is available for a large number of projects for an extended period of time. Second, you need to observe the people involved in the projects and record discussions/decisions involving the metric. Lastly, the gathered data needs to be analyzed to extract usage patterns and identify areas for improvement.

This second type of evaluation requires quite some time, patience, access to a wide range of software projects in various stages of development, and the ability to communicate with the people involved in these projects. Basically, you need to find a company that allows you to conduct this type of research, which might be the reason why I did not find any study that evaluates software metrics in this way.

You can probably guess which company allowed me to conduct this research. Indeed, within the environment of the Software Improvement Group my co-authors and I were allowed to study the usefulness of our architectural metrics. The full details of the evaluation design and the results are available in our ICSE 2013 SEIP paper:
which is going to be presented at the ICSE conference in San Francisco. So if our tutorial was not enough, you now have an additional reason to attend this nice conference. Note that there are just a few more days left before the early bird discount ends!

Naturally, I am very proud of this paper, in particular because it takes the evaluation of the software metrics one step beyond the usual statistical validation. What do you think: should all metrics be validated like this, or should we look at other aspects as well?

Detecting Cross-language Dependencies Generically

Most systems I see are not written in just a single technology. Instead, dedicated technologies are used for the front-end (think JavaScript, HTML, JSP), the server-side business logic (think Java, C#, C, insert-your-favorite-programming-language-here) and the data storage (think any flavor of SQL or NoSQL).

To get an initial understanding of how all of these technologies work together in the system, you need to determine what the dependencies between these technologies are. Where does the JavaScript connect to the server? And how is the stuff we see on the front-end connected to the stuff stored in the database?

For the most common cases there is tool support available to detect these dependencies (semi-)automatically. However, extending these tools with support for a new language is not trivial, because the techniques used require you to deeply understand the full language of the new technology.

Until now...

Last year I had the pleasure of co-supervising Theodoros Polychniatis for his internship at the Software Improvement Group on the subject of Detecting dependencies across programming languages. After diving into the available literature and soliciting requirements from consultants he managed to create a prototype tool which implements a relatively simple algorithm to detect dependencies across source-code modules written in different technologies. 

The first evaluations show that the algorithm achieves a relatively good recall (i.e., it finds the dependencies that you want) and a reasonable precision (i.e., it finds only actual dependencies). As always, the behavior of the algorithm greatly depends on the parameters used, but the initial results are positive enough to continue exploring this idea.
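
For readers less familiar with these two terms, here is a small sketch of how recall and precision can be computed when you have a manually extracted set of dependencies to compare against. Note that this is not the detection algorithm from the thesis; the function name and the file names below are made up purely for illustration.

```python
def precision_recall(detected, expected):
    """Compare detected dependencies against a manually established set.

    Both arguments are collections of (source, target) pairs.
    Returns a (precision, recall) tuple of floats between 0 and 1.
    """
    detected, expected = set(detected), set(expected)
    true_positives = detected & expected
    # Precision: which fraction of the reported dependencies is real?
    precision = len(true_positives) / len(detected) if detected else 0.0
    # Recall: which fraction of the real dependencies was reported?
    recall = len(true_positives) / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical dependencies that were extracted by hand.
expected = {
    ("login.js", "LoginServlet.java"),
    ("LoginServlet.java", "users.sql"),
    ("menu.jsp", "MenuBean.java"),
}

# Hypothetical output of a detection tool: two hits, two false alarms.
detected = {
    ("login.js", "LoginServlet.java"),
    ("LoginServlet.java", "users.sql"),
    ("menu.jsp", "Logger.java"),
    ("index.html", "style.css"),
}

if __name__ == "__main__":
    precision, recall = precision_recall(detected, expected)
    print(f"precision: {precision:.2f}, recall: {recall:.2f}")
```

In this made-up example half of the reported dependencies are real, and two out of the three real dependencies are found. Tuning the parameters of a detection algorithm typically trades one of these numbers against the other.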

Apart from producing the thesis Theodoros was also the driving force behind the paper:

This paper will be presented (and published) at the 17th European Conference on Software Maintenance and Reengineering (CSMR 2013). If you are considering attending CSMR, make sure you drop by the Software Quality and Maintainability (SQM) workshop for a chat and a very interesting keynote!

As a closer, here is the abstract of the paper:
In order to evaluate large, heterogeneous information systems (i.e., comprising modules developed in diverse programming languages) a method to detect dependencies among these modules is needed. Although there is a variety of methods that can detect dependencies within a single programming language, the available cross-language detection methods use extensive language specific information to parse and analyze modules written in different languages.
In this paper, a new method for detecting cross-language dependencies is proposed. This method is generic, yet accurate and can support new languages with minimal effort. To evaluate the method, a tool was created and a series of experiments was conducted on a small case study for which dependencies had been extracted manually. The evaluation shows that the method is effective, extensible and easily explainable.

Tutorial: Software metrics - Pitfalls & Best Practices

The International Conference on Software Engineering (ICSE) is one of the largest (if not the largest) conferences on software engineering in the world. This year, the conference will take place in the Hyatt Regency, San Francisco, U.S.A.

At this conference, Arie van Deursen, Joost Visser and I will be organizing a three-hour tutorial. To quote our proposal: 

Using software metrics to keep track of the progress and quality of products and processes is a common practice in industry. Additionally, designing, validating and improving metrics is an important research area. Although using software metrics can help in reaching goals, the effects of using metrics incorrectly can be devastating. 

In this tutorial we leverage 10 years of metrics-based risk assessment experience to illustrate the benefits of software metrics, discuss different types of metrics and explain typical usage scenarios. Additionally, we explore various ways in which metrics can be interpreted using examples solicited from participants and practical assignments based on industry cases. During this process we will discuss four common pitfalls of using software metrics.

In particular, we explain why metrics should be placed in a context in order to maximize their benefits. A methodology based on benchmarking to provide such a context is discussed and illustrated by a model designed to quantify the technical quality of a software system. Examples of applying this model in industry are given and challenges involved in interpreting such a model are discussed. 

This tutorial provides an in-depth overview of the benefits and challenges involved in applying software metrics. At the end you will have all the information you need to use, develop and evaluate metrics constructively.  

When I first wrote this post it was still unclear whether the tutorial would take place before or after the main conference. Update: the tutorial is scheduled for May 21; see the ICSE program for more details. More information about the registration can be found here.

Meanwhile, please feel free to share your thoughts, remarks or questions on this topic via the comments or any other means of communication!

A last post?

It took some time to actually post it here, but I am very pleased to say that the paper "Dependency Profiles for Software Architecture Evaluations" by Bouwers, van Deursen and Visser has been accepted at the Early Research Achievements-track of the 27th IEEE International Conference on Software Maintenance.

Before dumping the abstract I want to confess that I have given in and created a Twitter account. As you might have noticed, my updates have been infrequent at best. This is mainly because it takes me a long time before I start writing. Let's see whether this Twitter thing makes this easier!

Abstract:
In this paper we introduce the concept of a “dependency profile”, a system level metric aimed at quantifying the level of encapsulation and independence within a system. We verify that these profiles are suitable to be used in an evaluation context by inspecting the dependency profiles for a repository of almost 100 systems. Furthermore we outline the steps we are taking to validate the usefulness and applicability of the proposed profiles.

WICSA 2011

And yet another publication to announce! I am very happy to tell you all that the paper "Quantifying the Analyzability of Software Architectures" by Bouwers, Correia, van Deursen and Visser has been accepted at the 9th Working IEEE/IFIP Conference on Software Architecture!

Abstract:
The decomposition of a software system into components is a major decision in any software architecture, having a strong influence on many of its quality aspects. A system’s analyzability, in particular, is influenced by its decomposition into components. But into how many components should a system be decomposed to achieve optimal analyzability? And how should the elements of the system be distributed over those components?
In this paper, we set out to find answers to these questions with the support of a large repository of industrial and opensource software systems. Based on our findings, we designed a metric which we call Component Balance. In a case study we show that the metric provides pertinent results in various evaluation scenarios. In addition, we report on an empirical study that demonstrates that the metric is strongly correlated with ratings for analyzability as given by experts.


(and yes, maybe twitter is not such a bad idea if I keep on writing these short posts :)