SRC-2019

View the Project on GitHub biabs1/SRC-2019

Is Mutation Score a Fair Metric?

Abstract

Mutation score can be used to compare test suites by their ability to detect mutants. However, it is not known whether the mutation score, which summarizes the detection ratios of different mutation types, is a fair metric for such a comparison. In this paper, we present an empirical study of 10 open-source projects that compares developer-written and automatically generated test suites in terms of mutation score and of the detection ratios of 7 mutation types. Our results indicate that the mutation score is a fair metric, but they also suggest equivalence among mutants that PIT generates with different mutation operators.
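To make the relationship between the two measures concrete, the following minimal sketch computes a mutation score and per-type detection ratios from a list of mutants. It is an illustration only, not the analysis script used in the study. It assumes the mutation score is the fraction of detected mutants over all generated mutants, and the function names, the type labels `TYPE_A` and `TYPE_B`, and the sample data are hypothetical.

```python
from collections import defaultdict


def mutation_score(mutants):
    """Fraction of mutants detected, over all mutants (assumed definition).

    `mutants` is a list of (mutation_type, detected) pairs.
    """
    if not mutants:
        raise ValueError("at least one mutant is required")
    detected = sum(1 for _, was_detected in mutants if was_detected)
    return detected / len(mutants)


def detection_ratios(mutants):
    """Detection ratio per mutation type: detected / generated for that type."""
    generated = defaultdict(int)
    detected = defaultdict(int)
    for mutation_type, was_detected in mutants:
        generated[mutation_type] += 1
        if was_detected:
            detected[mutation_type] += 1
    return {t: detected[t] / generated[t] for t in generated}


# Hypothetical data: two test suites run against the same four mutants.
suite_a = [("TYPE_A", True), ("TYPE_A", True),
           ("TYPE_B", False), ("TYPE_B", False)]
suite_b = [("TYPE_A", True), ("TYPE_A", False),
           ("TYPE_B", True), ("TYPE_B", False)]

for name, suite in (("suite_a", suite_a), ("suite_b", suite_b)):
    print(name, mutation_score(suite), detection_ratios(suite))
```

In this made-up data both suites reach the same mutation score while their per-type detection ratios differ, which is the kind of situation in which a single summary score could hide differences between suites.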

This page provides the experimental material and the statistical analysis used in this experiment.

Experimental Material

Test Generation Tools

Mutation Testing Tool

Case Study Applications

We used 10 projects from the Apache Commons repository. These projects already include test suites written manually by their developers.

Applications with regression test suites for each test generation technique

We manually removed all tests that did not pass.

Data Analysis Scripts

Data Analysis Results