Strong vs Weak Mutation Testing

The point of both strong and weak mutation testing tools is to tell you something about the quality of your tests. Weak mutation measures how well your tests stimulate the possible behaviours of the program you are testing. Strong mutation also measures how thoroughly your tests check the outputs of the program.

Mutation testing makes a small change to the program, producing a "mutant", and then asks the tester to find a test that causes the mutant to fail (produce incorrect results). A test that does so is said to "kill" the mutant. Strong mutation looks at the final output to decide whether the mutant is killed, whereas weak mutation looks only at the state of the program immediately after the mutated statement. Their results differ when we have "masking", that is, when an incorrect state does not result in incorrect output. So strong mutation analysis can tell you more about your test suites than weak mutation, but weak mutation analysis is often much quicker to perform than strong mutation analysis.
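The masking case can be sketched in a few lines of Python. This is an illustration only: the function run, its mutated flag and the "a > 10" rule are hypothetical names and logic invented for the example, not taken from any real tool.

```python
def run(a, b, mutated=False):
    """Hypothetical program under test; mutated=True swaps <= for <.

    Returns (state, output): the value right after the mutated
    expression, and the final result that a caller would see.
    """
    in_range = (a < b) if mutated else (a <= b)
    # Later logic can mask a wrong in_range: large values pass anyway.
    output = in_range or a > 10
    return in_range, output


def kills(a, b):
    """Return (weakly_killed, strongly_killed) for one test input."""
    state, output = run(a, b)
    mutant_state, mutant_output = run(a, b, mutated=True)
    return state != mutant_state, output != mutant_output


print(kills(1, 2))    # (False, False): the mutated expression is unaffected
print(kills(20, 20))  # (True, False): the state differs but the output is masked
print(kills(5, 5))    # (True, True): the wrong state reaches the output
```

The input (20, 20) is the masking case: weak mutation counts the mutant as killed, strong mutation does not.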

Weak Mutation Testing

Consider this example of weak mutation testing, taken from one of Brian Marick's papers with minor edits:

In weak mutation coverage we suppose that a program contains a particular simple kind of fault. One such fault might be using <= instead of the correct < in an expression like this:

if (A <= B)

Given this program, a weak mutation coverage system would produce a message like

"gcc.c", line 488: operator <= might be <

This message would be produced until the program was executed over a test case such that (A <= B) has a different value than (A < B).

That is, a weak mutation testing tool tells you when every test runs the code such that (A <= B) has the same value as (A < B). This might indicate a missing test: you should probably have a test in which the two expressions have different values, which happens only when A = B.
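A minimal sketch of how such a check could be recorded, assuming the comparison is routed through a small probe object. The class WeakMutationProbe and the function in_bounds are hypothetical; a real weak mutation tool would instrument the program itself rather than rely on hand-written calls.

```python
class WeakMutationProbe:
    """Tracks whether a <= b has ever differed from a < b at one site."""

    def __init__(self, site):
        self.site = site
        self.distinguished = False

    def le(self, a, b):
        """Evaluate a <= b, noting whether the < variant would disagree."""
        if (a <= b) != (a < b):
            self.distinguished = True
        return a <= b

    def report(self):
        """Return the warning, or None once some test has told the two apart."""
        if self.distinguished:
            return None
        return f"{self.site}: operator <= might be <"


probe = WeakMutationProbe('"gcc.c", line 488')


def in_bounds(a, b):
    # Hypothetical code under test, with the comparison instrumented.
    return probe.le(a, b)


in_bounds(1, 2)
in_bounds(3, 2)
first_report = probe.report()   # still warns: A == B has not been tried
in_bounds(2, 2)
second_report = probe.report()  # None: the two operators now disagree
print(first_report)
print(second_report)
```

Note that the probe never looks at what the tests assert. It only records the values that reach the comparison.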

Strong Mutation Testing as done by Jester

Jester (a strong mutation testing tool) tells you something subtly different. Jester reports whether the tests still pass when the expression (A <= B) is replaced by (A < B). If they do, a test might be missing in which it makes a difference that the operator is "<=" rather than "<". Note that with Jester, it matters whether the tests pass or not. If the tests execute code that causes A=B but that has no effect on whether the tests pass, then weak mutation is satisfied, but strong mutation is not.
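The idea can be sketched as follows. This is not how Jester is implemented (Jester edits the source code and reruns the real test suite); here the mutant is simulated by passing a different comparison operator, and every name in the block is hypothetical.

```python
import operator


def make_can_fit(compare):
    """Build the code under test around a given comparison operator."""
    def can_fit(size, capacity):
        return "fits" if compare(size, capacity) else "too big"
    return can_fit


def suite_without_boundary(can_fit):
    assert can_fit(1, 2) == "fits"
    assert can_fit(3, 2) == "too big"


def suite_with_boundary(can_fit):
    suite_without_boundary(can_fit)
    assert can_fit(2, 2) == "fits"  # only <= passes this check


def survives(suite, program):
    """True if every assertion in the suite passes against the program."""
    try:
        suite(program)
    except AssertionError:
        return False
    return True


original = make_can_fit(operator.le)  # size <= capacity
mutant = make_can_fit(operator.lt)    # size < capacity

print(survives(suite_without_boundary, mutant))  # True: the change goes unnoticed
print(survives(suite_with_boundary, mutant))     # False: the mutant is killed
```

A surviving mutant is the signal that a test is missing, in this case the boundary test with size equal to capacity.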

So what's the difference?

Weak mutation is a coverage measure, i.e. it measures something about the code that is run as a result of running your tests. Strong mutation testing measures whether your tests actually check that the code behaves the way it is written, i.e. whether they would notice if the code were changed.

Having your tests run your code is necessary for your tests to stand any chance of being any good, but it's not enough. You could have tests that have no assertions that still get 100% coverage based on any measure of coverage, including weak mutation testing. Coverage tools don't measure whether your tests actually test anything; they just measure that your tests run the code in whatever way the coverage is measuring.
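As a rough illustration of an assertion-free suite, the sketch below records the weak mutation criterion by hand and simulates the mutant with a flag. The names can_fit, observed_equal and the two suites are hypothetical.

```python
from functools import partial

observed_equal = []  # inputs for which <= and < would disagree


def can_fit(size, capacity, mutated=False):
    """Hypothetical code under test; mutated=True swaps <= for <."""
    if (size <= capacity) != (size < capacity):
        observed_equal.append((size, capacity))
    ok = (size < capacity) if mutated else (size <= capacity)
    return "fits" if ok else "too big"


def suite_no_asserts(fn):
    # Runs the code, including the boundary case, but checks nothing.
    for size in (1, 2, 3):
        fn(size, 2)


def suite_with_asserts(fn):
    assert fn(1, 2) == "fits"
    assert fn(2, 2) == "fits"
    assert fn(3, 2) == "too big"


def mutant_survives(suite):
    """True if the suite still passes when run against the mutant."""
    try:
        suite(partial(can_fit, mutated=True))
    except AssertionError:
        return False
    return True


suite_no_asserts(can_fit)
weak_satisfied = bool(observed_equal)

print(weak_satisfied)                       # True: the A == B case was executed
print(mutant_survives(suite_no_asserts))    # True: nothing checks the result
print(mutant_survives(suite_with_asserts))  # False: an assertion fails
```

The assertion-free suite satisfies the weak criterion yet lets the mutant survive, which is the gap that strong mutation is meant to expose.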

Strong mutation testing tells you whether your tests are strong enough to detect changes to the code.

So which is better?

One very significant benefit of a coverage tool (including a weak mutation testing tool) compared to a strong mutation testing tool is that a coverage tool will run much, much faster. Most coverage tools add very little to the time it takes to run your test suite. Jester requires your test suite to be run for every mutation, which could be thousands of times. Even with a really fast test suite and small code base, Jester can take a really long time to run.
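A back-of-the-envelope estimate makes the gap concrete. The figures used here (2000 mutants, a 30 second suite, 10% instrumentation overhead) are assumptions chosen for illustration, not measurements of Jester or of any coverage tool.

```python
def strong_mutation_hours(num_mutants, suite_seconds):
    """Rough lower bound: one full suite run per mutant, ignoring rebuild time."""
    return num_mutants * suite_seconds / 3600


def coverage_hours(suite_seconds, overhead=1.1):
    """One instrumented run; the 10% overhead is an assumed figure."""
    return suite_seconds * overhead / 3600


# Hypothetical project: 2000 mutants and a test suite that takes 30 seconds.
print(strong_mutation_hours(2000, 30))  # a little under 17 hours
print(coverage_hours(30))               # well under a minute
```

Under these assumptions the cost of strong mutation grows linearly with the number of mutants, while the coverage run stays close to the cost of a single test run.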

If your code isn't executed by your tests, then that can be indicated in a very user friendly way by a simple code coverage tool. If Jester is used on code that isn't executed by your tests, it will just indicate all the mutations that Jester can make to that code (and take a long time to do that) and it won't be as obvious that the reason is that the code simply isn't executed by the tests. However, code coverage tools don't tell you about the quality of your tests directly; you can fool a code coverage tool (deliberately or accidentally) by having tests that execute your code but have no assertions. But you can't fool a strong mutation tool in the same way.

So a good recommendation is to start with a traditional code coverage tool to measure how well your tests cover the code of the program under test. Once your code coverage becomes reasonably high (perhaps 80% or more), then use strong mutation analysis to find cases where your tests are not strong enough to detect possible errors (mutants) in the program.

Summary

Statement coverage or similar simple coverage measures are a good place to start. They will tell you things like "none of your tests execute line 80".

Weak mutation testing tells you more about the stimuli that your tests send to the program. It will tell you things like "none of your tests execute the A<B on line 80 with the values of A and B being equal".

Strong mutation testing also tells you about how well your tests detect errors in the program. It will tell you things like "none of your tests detect the difference between A<B and A<=B on line 80".
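The three levels can be put side by side in one toy diagnosis. This sketch assumes that every test calls the comparison on line 80 exactly once and that all tests pass on the original program; line_80, diagnose and the test-tuple format are hypothetical.

```python
def line_80(a, b, mutated=False):
    """Hypothetical 'line 80': A < B, or A <= B when mutated."""
    return (a <= b) if mutated else (a < b)


def diagnose(tests):
    """tests is a list of (a, b, expected); expected=None means no assertion."""
    # Statement coverage: is the line executed at all?
    if not tests:
        return "none of your tests execute line 80"
    # Weak mutation: does any input make the two operators disagree?
    if not any(line_80(a, b) != line_80(a, b, mutated=True)
               for a, b, _ in tests):
        return ("none of your tests execute the A<B on line 80 "
                "with the values of A and B being equal")
    # Strong mutation: does any asserting test fail on the mutant?
    if not any(expected is not None
               and line_80(a, b, mutated=True) != expected
               for a, b, expected in tests):
        return ("none of your tests detect the difference "
                "between A<B and A<=B on line 80")
    return "mutant killed"


print(diagnose([]))
print(diagnose([(1, 2, True)]))
print(diagnose([(2, 2, None)]))
print(diagnose([(2, 2, False)]))
```

Each call adds a little more to the test suite and moves the diagnosis one level further down, ending with a test that both reaches the boundary case and checks the result.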

r2 - 11 Sep 2007 - 17:05:04 - Mark Utting