IBM's Quality Assurance and Testing Services (QA&TS) practice recently briefed NelsonHall on its approach to managing test cases and to reducing the number of test cases to design and automate, going beyond pair-wise testing.
Before we jump into the details of IBM QA&TS’ offering, let’s step back for a few moments: software testing remains a very labor-intensive activity. To use a manufacturing plant comparison, a lot of human effort remains at the assembly line level, creating outputs in the form of test case designs and automated scripts. Over the past decade, the testing service industry has put a lot of effort into driving efficiencies around test case design and automation, mostly through reusable artefacts (for instance, test cases and test scripts, frameworks, scriptless frameworks and point solutions; the list is vast).
Another way of limiting the effort spent on output (i.e. test script design and execution) is to limit the input. Methods of input limitation include:
- Test process improvement advisory services (to improve test manufacturing processes)
- Making testing requirements clearer to testers, i.e. removing ambiguity
- Reducing the number of test cases to design and eventually execute. For this, most software testing service vendors have used a statistical approach, pair-wise testing, to optimize the number of test cases.
To understand the inner dynamics of pair-wise testing: each test scenario has attributes. In the example of an online purchase transaction that involves payment and shipping, the attributes are the payment option and the shipping option. Each attribute has data options (e.g. payment options are credit card, PayPal or other). In an exhaustive approach, test cases are systematically created to reflect all combinations of attribute options (e.g. credit card payment with express shipping; credit card payment with regular shipping; credit card payment with air shipping, etc.). Each additional data option in an attribute multiplies the number of test cases to be executed. The statistical philosophy of pair-wise testing holds that this exhaustive approach, based on combining every data option, is not worth it: the number of additional defects found will not rise significantly. This holds for two attributes, and even more so for three or more attributes (e.g. payment, shipping, billing address, shipping address). Testing combinations of two or more attributes can nevertheless help raise test coverage (i.e. the percentage of functions in the code that is tested).
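To make the difference concrete, here is a minimal Python sketch that compares exhaustive test case creation with a pair-wise selection for the online purchase example. The option values for the billing and shipping addresses, the function names and the simple greedy selection are assumptions made for illustration; this is not the algorithm used by IBM or any other vendor, and a greedy pass does not guarantee the smallest possible set.

```python
from itertools import combinations, product

# Hypothetical attributes and data options for the online purchase scenario.
ATTRIBUTES = {
    "payment": ["credit card", "PayPal", "other"],
    "shipping": ["express", "regular", "air"],
    "billing_address": ["domestic", "international"],
    "shipping_address": ["same as billing", "different"],
}


def exhaustive_cases(attributes):
    """One test case per combination of data options across all attributes."""
    names = list(attributes)
    return [dict(zip(names, values)) for values in product(*attributes.values())]


def pairs_of(case):
    """All pairs of (attribute, option) that one test case exercises together."""
    return set(combinations(sorted(case.items()), 2))


def pairwise_cases(attributes):
    """Greedy selection: repeatedly keep the case covering the most uncovered pairs."""
    candidates = exhaustive_cases(attributes)
    uncovered = set().union(*(pairs_of(c) for c in candidates))
    selected = []
    while uncovered:
        best = max(candidates, key=lambda c: len(pairs_of(c) & uncovered))
        selected.append(best)
        uncovered -= pairs_of(best)
    return selected


if __name__ == "__main__":
    print("exhaustive:", len(exhaustive_cases(ATTRIBUTES)))
    print("pair-wise:", len(pairwise_cases(ATTRIBUTES)))
```

With these assumed options, the exhaustive set has 36 test cases (3 × 3 × 2 × 2). The pair-wise set cannot be smaller than 9, since payment and shipping alone form 9 option pairs, but it still exercises every pair of options at least once.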
This approach is not entirely new: large testing services vendors, e.g. Wipro with its StORM methodology, and smaller ones, e.g. Ciber with its Optimal Pathing methodology, have used it, relying on pair-wise testing or sometimes on Combinatorial Test Design (CTD). CTD, combined with business rules, has other benefits: it not only identifies gaps in test coverage, but also identifies those gaps early in the testing life cycle.
Nevertheless, pair-wise testing and CTD are largely underused by client organizations and rarely mentioned by testing services vendors.
Back to IBM QA&TS, which is emphasizing CTD and trying to get it out of its statistical ghetto. Why?
Largely because IBM is now finding client organizations with large estates of test cases (up to 15,000 for a large critical application with several releases per year). These test cases are not documented and not well understood.
IBM's first step is to understand what is available by categorizing these test cases into clusters, using Test Case Clustering, internal IBM Research IP developed in Haifa, Israel. Test Case Clustering also tags the existing test cases so that they can be more easily searched.
The second step is to determine the lowest number of test cases required for achieving 100% test coverage or any desired test coverage target. The company uses an accelerator, Test Case Optimizer (TCO), to help compute this optimal number of test cases. TCO is based on CTD with business rules added (e.g. in our online purchase test scenario example, air transportation as a shipping option is not available for local delivery of the purchased goods).
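The effect of a business rule on the combination space can be sketched in a few lines of Python. The attribute names, the added "delivery" attribute and the helper functions below are hypothetical and only illustrate the idea of pruning invalid combinations; they say nothing about how TCO itself is implemented.

```python
from itertools import product

# Hypothetical attributes; "delivery" is added here to express the rule from the text.
ATTRIBUTES = {
    "payment": ["credit card", "PayPal", "other"],
    "shipping": ["express", "regular", "air"],
    "delivery": ["local", "national"],
}


def air_not_for_local(case):
    """Business rule: air shipping is not available for local delivery."""
    return not (case["shipping"] == "air" and case["delivery"] == "local")


BUSINESS_RULES = [air_not_for_local]


def valid_cases(attributes, rules):
    """Yield only the combinations that satisfy every business rule."""
    names = list(attributes)
    for values in product(*attributes.values()):
        case = dict(zip(names, values))
        if all(rule(case) for rule in rules):
            yield case


if __name__ == "__main__":
    cases = list(valid_cases(ATTRIBUTES, BUSINESS_RULES))
    print(len(cases), "valid combinations")
```

Under these assumptions, 3 of the 18 raw combinations pair air shipping with local delivery, so 15 remain. Applying rules before any pair-wise or CTD reduction avoids designing test cases for situations that cannot occur.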
There is more to TCO than computing the optimal number of test cases. In a demo in the context of an agile project, TCO went through user stories and generated test cases, identifying attributes and their test data options. TCO can also generate test cases in the context of waterfall projects, based on testing requirements. That is a major step.
IBM takes a common-sense approach with TCO: the tool can create new test cases from scratch or can reuse test cases that the client already has.
What are the results? IBM QA&TS suggests that within its clients’ installed estate, 30% to 70% of their test cases are redundant. If we take a level of 33%, near the bottom end of this range, for an organization with 3,000 test cases for one large and critical application, that’s roughly 1,000 test cases that can be removed right away.
There is more to it. The level of test coverage is well under what clients are targeting. IBM has no benchmarking data, but suggests that 50% to 80% of test cases are not necessary under CTD methods for the same level of coverage. Back to our client example with 15,000 test cases: IBM suggests that at least 5,000 test cases can be removed. The impact on time and effort is significant, in terms of creating those unnecessary test cases and scripts, but also in terms of maintaining them.
Back to our initial assembly line discussion: with fewer test cases, i.e. less input, the client has less output to manufacture and makes savings on the manufacturing process. Now there is a missing step in IBM’s test optimization story: the automation of the generated test cases into test scripts. IBM QA&TS is working on this. The approach relies on behavior-driven development (BDD) and on an open source BDD tool, Cucumber, with its specification language, Gherkin. Because of the level of standardization and formalization brought by BDD, IBM QA&TS believes it will be able to automatically generate the test scripts from these test cases. Execution of those test cases is next, whether through Selenium or using the object-oriented language Ruby.
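As an illustration of why BDD's formalization lends itself to generation, the sketch below renders structured test cases as Gherkin text. `Feature`, `Scenario`, `Given`, `And`, `When` and `Then` are standard Gherkin keywords; the function name, the dictionary layout and the step wording are assumptions made for this example and do not reflect IBM's work in progress.

```python
def to_gherkin(feature, cases):
    """Render test cases (dicts with 'payment' and 'shipping') as Gherkin text."""
    lines = [f"Feature: {feature}", ""]
    for number, case in enumerate(cases, start=1):
        lines += [
            f"  Scenario: Purchase {number}",
            # Step wording is hypothetical; real steps must match step definitions.
            f"    Given a customer paying by {case['payment']}",
            f"    And {case['shipping']} shipping is selected",
            "    When the customer confirms the order",
            "    Then the order is accepted",
            "",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    example_cases = [
        {"payment": "credit card", "shipping": "express"},
        {"payment": "PayPal", "shipping": "air"},
    ]
    print(to_gherkin("Online purchase", example_cases))
```

The generated text is only the specification half: each step still needs a matching step definition (for instance in Ruby, driving Selenium) before Cucumber can execute it.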
Obviously, there are a lot of conditions and restrictions; this is largely work in progress. Nevertheless, we are getting closer to full dev and test life cycle automation, at both the input and output levels of our SDLC assembly line. Software testing badly needs to become much less labor-intensive.