University of Washington

Improving Effectiveness of Automated Software Testing in the Absence of Specifications

This dissertation presents techniques for improving the effectiveness of automated software testing in the absence of specifications, evaluates the efficacy of these techniques, and proposes directions for future research.

Software testing is currently the most widely used method for detecting software failures. When testing a program, developers need to generate test inputs for the program, run these test inputs on the program, and check the test execution for correctness. It is well recognized that software testing is expensive, and that automated software testing is important for reducing the laborious human effort that testing requires. There are at least two major technical challenges in automated testing: generating sufficient test inputs and checking the test execution for correctness. Program specifications can be valuable in addressing these two challenges. Unfortunately, specifications are often absent from programs in practice.

This dissertation presents a framework for improving the effectiveness of automated testing in the absence of specifications. The framework supports a set of related techniques. First, it includes a redundant-test detector that identifies redundant tests among automatically generated test inputs. These redundant tests increase testing time without increasing the ability to detect faults or our confidence in the program. Second, the framework includes a non-redundant-test generator that employs state-exploration techniques to generate non-redundant tests in the first place and uses symbolic-execution techniques to further improve the effectiveness of test generation. Third, because it is infeasible for developers to inspect the execution of a large number of generated test inputs, the framework includes a test selector that selects a small subset of test inputs for inspection; these selected test inputs exercise new program behavior that has not been exercised by manually created tests. Fourth, the framework includes a test abstractor that produces succinct state-transition diagrams for inspection; these diagrams abstract and summarize the behavior exercised by the generated test inputs. Finally, the framework includes a program-spectra comparator that compares the internal program behavior exercised by regression tests executed on two program versions, exposing behavioral differences beyond differences in program outputs.
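To make the notion of a redundant test concrete, the following minimal sketch shows one possible way a redundant-test detector could work. It rests on an assumption that this abstract does not state: a test is treated as redundant when every method call it makes has already been exercised, by earlier tests, on an equivalent receiver-object state with the same arguments. The class IntStack, its state() method, and the function filter_redundant are hypothetical names introduced only for illustration.

```python
class IntStack:
    """Hypothetical class under test, used only for illustration."""

    def __init__(self):
        self.items = []

    def push(self, x):
        self.items.append(x)

    def pop(self):
        if self.items:
            self.items.pop()

    def state(self):
        # Hashable snapshot of the object state (an assumed representation).
        return tuple(self.items)


def filter_redundant(tests):
    """Keep only tests that exercise at least one unseen
    (receiver state, method name, arguments) combination.

    Each test is a list of (method_name, args_tuple) pairs that is run
    on a fresh IntStack. The redundancy criterion is an assumption.
    """
    seen = set()
    kept = []
    for test in tests:
        obj = IntStack()
        exercises_new_behavior = False
        for name, args in test:
            key = (obj.state(), name, args)
            if key not in seen:
                seen.add(key)
                exercises_new_behavior = True
            getattr(obj, name)(*args)  # execute the call on the object
        if exercises_new_behavior:
            kept.append(test)
    return kept


t1 = [("push", (1,)), ("pop", ())]
t2 = [("push", (1,)), ("pop", ()), ("push", (1,))]
t3 = [("push", (1,)), ("push", (2,))]
non_redundant = filter_redundant([t1, t2, t3])
```

Under this assumed criterion, t2 is dropped: its final push is applied to an empty stack, a combination that t1 has already exercised, so running t2 adds testing time without exercising new behavior. The same bookkeeping also suggests how a generator could avoid producing such tests in the first place, by extending a call sequence only from object states that have not yet been explored.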

The framework has been implemented, and empirical results show that the techniques developed within it improve the effectiveness of automated testing by detecting a high percentage of redundant tests among test inputs generated by existing tools, generating non-redundant test inputs that achieve high structural coverage, reducing the inspection effort needed to detect problems in the program, and exposing more behavioral differences during regression testing.