Test Automation

Since testing is such an expensive part of any project, one might reasonably suppose that many tools exist to automate it.  Indeed, there are a few testing products, most notably the online data entry capture and replay systems mentioned above.  However, general-case automated testing is difficult because normal development testing always requires correctness testing: there is no comparison standard against which to regress, and how do you program a piece of software to determine that another program is correct?  A practical answer to that question would allow significant progress in this area, but sadly none has been found to date.

Automation of migration testing is a different story.  Whereas normal development almost always involves a change of business rules, a migration project very carefully avoids changing any business rules (or at least it does so if it wishes to minimize the overall cost of the project).  Changing the infrastructure under which those rules execute is the essence of a migration project, whether it be a platform change, a compiler change, a database change, or all three.  So long as the business rules do not change, it is possible to fully automate migration testing using a rigorous comparison testing methodology.

4.1        Source Code Instrumentation

Source code instrumentation is a process by which software (usually referred to as a “parser”) introduces new program logic into the source code of a given program.  Typically, this new logic is functionally neutral in that it does not change the business logic of the program, but the new logic is used for some analytic purpose to understand the operation of the program itself.  Coverage analysis is an example of using source code instrumentation to determine what logic in a program is and is not executed.  Similarly, performance analysis within a program can be established by timing each paragraph, each branch path, or each instruction in the program and thereby pinpointing where excessive resource consumption is originating.
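As a minimal sketch of coverage instrumentation, the following Python example inserts a functionally neutral probe before every statement of a small function and records which lines execute.  It is an illustration only: the names `ProbeInserter`, `_probe`, `run_with_coverage` and the `discount` sample are hypothetical, and a real migration parser would operate on the project's own source language rather than on Python.

```python
import ast


class ProbeInserter(ast.NodeTransformer):
    """Insert a call to _probe(<line number>) before each nested statement."""

    def generic_visit(self, node):
        super().generic_visit(node)  # instrument inner blocks first
        if isinstance(node, ast.Module):
            return node  # leave module-level statements unprobed
        for field in ("body", "orelse"):
            stmts = getattr(node, field, None)
            if isinstance(stmts, list) and stmts and isinstance(stmts[0], ast.stmt):
                setattr(node, field, self._with_probes(stmts))
        return node

    @staticmethod
    def _with_probes(stmts):
        probed = []
        for stmt in stmts:
            call = ast.Call(
                func=ast.Name(id="_probe", ctx=ast.Load()),
                args=[ast.Constant(value=stmt.lineno)],
                keywords=[],
            )
            probed.append(ast.Expr(value=call))  # the added, neutral logic
            probed.append(stmt)                  # the original statement
        return probed


def run_with_coverage(source, func_name, *args):
    """Instrument `source`, call one function, return (result, executed lines)."""
    tree = ProbeInserter().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    executed = []
    namespace = {"_probe": executed.append}
    exec(compile(tree, "<instrumented>", "exec"), namespace)
    return namespace[func_name](*args), executed


# Hypothetical program under analysis.
SAMPLE = """\
def discount(amount):
    rate = 0
    if amount > 100:
        rate = 10
    return amount - amount * rate // 100
"""

small_result, small_lines = run_with_coverage(SAMPLE, "discount", 50)
large_result, large_lines = run_with_coverage(SAMPLE, "discount", 200)
```

In this sketch the first call never reaches line 4, so the recorded line list shows that the discount branch was not exercised, while the computed results are the same as the uninstrumented function would return.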

Automated migration testing utilizes source instrumentation in several forms, initially for capture of test data, and then for one or more different replay scenarios depending on what part of the program is being tested.  This summarizes the whole process:

[Figure: overview of the capture and replay instrumentation process]
In an automated migration the old sources are processed through a parsing engine that modifies the program source in a controlled fashion to produce the new sources for the new environment.  Automated testing augments this by using two forms of instrumentation against the old sources, and one or more forms of instrumentation against the new sources.  In the next sections, we will discuss the different source code instrumentation sets and their purposes.

4.2        Capture Processing

In capture instrumentation, the old sources have logic added that captures the before and after contents of all data areas relating to I/O operations, plus all relevant state information.  In addition, the dynamic logic path through the program is captured for use in coverage analysis and logic path testing.  All of the captured information is written to a capture file for subsequent use in replay processing.  Note that “I/O” in this sense refers to any operation that causes data to move into or out of the module under test.  Thus, a date call to the operating system, a subroutine call, or parameters passed to a called module are just as much I/O as a READ, WRITE, SQL or CICS command.
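The capture step can be sketched as a wrapper around each I/O operation that records the data area before and after the call.  This is a minimal illustration under assumed names: `CaptureLog`, `read_customer` and `get_date` are hypothetical stand-ins, the record layout is invented for the example, and an in-memory list takes the place of the capture file.

```python
import copy


class CaptureLog:
    """Hypothetical in-memory stand-in for the capture file."""

    def __init__(self):
        self.records = []

    def capture(self, verb, io_func, data_area):
        """Run one I/O operation and record the data area before and after it."""
        before = copy.deepcopy(data_area)
        status = io_func(data_area)  # the real I/O; may change data_area in place
        self.records.append({
            "seq": len(self.records) + 1,
            "verb": verb,
            "before": before,
            "after": copy.deepcopy(data_area),
            "status": status,
        })
        return status


def read_customer(area):
    """Stand-in for a READ: fills the record area and returns a file status."""
    area["name"] = "ACME"
    area["balance"] = 125
    return "00"


def get_date(area):
    """Stand-in for a date call to the operating system (fixed value for the example)."""
    area["today"] = "2008-07-19"
    return "00"


log = CaptureLog()
area = {"key": "C001", "name": "", "balance": 0}
log.capture("READ", read_customer, area)
log.capture("DATE", get_date, area)
```

Treating the date call exactly like the READ reflects the broad definition of I/O used here: anything that moves data into or out of the module is recorded the same way.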

In effect, for the purpose of validation and business rule replay, the capture file comprises the complete batch or online test script.  All inputs and all outputs are recorded automatically, as well as all execution parameters, without the opportunity for manual error in securing the scripts and without the involvement of subject matter experts.  Note that this all but eliminates the greatest sources of both logistical effort and logistical error in the definition of test scripts.

For I/O replay and datacomm replay, any data store that is randomly updated (typically database files and indexed files) must also be secured at the start of data capture and coordinated with the capture data.  Optionally, they can be secured after data capture as well.  None of the sequential files need be secured, as their full contents will be recorded in the capture file, including printer output and all terminal traffic.  When multiple programs (including subroutines) run concurrently during capture, each captured I/O must carry at least a high-resolution date/time stamp so that the order of I/Os issued can be sequenced uniquely.  If the date/time stamp is insufficient to ensure uniqueness, a sequencing broker must be introduced to force serialization of the I/Os.
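One possible shape for such a sequencing broker is sketched below, using Python threads to stand in for concurrently executing programs.  The class name `SequencingBroker`, its `issue` method, and the program names are assumptions made for illustration; the source does not describe how the broker is actually implemented.

```python
import threading
import time


class SequencingBroker:
    """Serialize I/Os from concurrent programs and give each a unique sequence."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_seq = 1
        self.journal = []

    def issue(self, program, verb, io_func):
        """Run io_func while holding the lock so that no two I/Os overlap."""
        with self._lock:
            seq = self._next_seq
            self._next_seq += 1
            result = io_func()
            self.journal.append({
                "seq": seq,                 # unique even if two time stamps collide
                "time_ns": time.time_ns(),  # high-resolution date/time stamp
                "program": program,
                "verb": verb,
            })
        return result


def run_program(broker, name, count):
    """Hypothetical program that issues `count` WRITEs through the broker."""
    for _ in range(count):
        broker.issue(name, "WRITE", lambda: "00")


broker = SequencingBroker()
threads = [
    threading.Thread(target=run_program, args=(broker, name, 50))
    for name in ("PROGA", "PROGB", "SUBR1")
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

The sequence number, rather than the time stamp, is what guarantees a unique ordering in this sketch; the time stamp is kept only as supporting information.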

4.3        Validation Replay

Validation replay is the inverse of the capture processing.  The old sources have logic added to suppress all normal I/O and to take all input I/O data from the capture file and to compare any output I/O against the previous output I/O as stored in the capture file.  The logic path followed in the capture program is compared against the logic path followed during replay processing. No external I/O takes place at all except to the replay parameter file and to the replay report file. 
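The replay side can be sketched as an I/O layer that performs no external I/O at all: reads are served from the captured records and writes are only compared against them.  The names `ReplaySession`, `post_fee`, `post_fee_broken` and the record layout are hypothetical, and the deliberately broken variant is included only to show how a discrepancy would surface.

```python
class ReplaySession:
    """Serve inputs from captured records and compare outputs against them."""

    def __init__(self, records):
        self._records = iter(records)
        self.discrepancies = []

    def _next(self, verb):
        record = next(self._records)
        if record["verb"] != verb:
            self.discrepancies.append(("verb", record["seq"], record["verb"], verb))
        return record

    def read(self, verb):
        """Input I/O: return the captured data instead of reading a real file."""
        return dict(self._next(verb)["data"])

    def write(self, verb, data):
        """Output I/O: suppress the real write and compare with the capture."""
        record = self._next(verb)
        if record["data"] != data:
            self.discrepancies.append(("data", record["seq"], record["data"], data))


# Hypothetical capture data for one READ followed by one WRITE.
CAPTURE = [
    {"seq": 1, "verb": "READ", "data": {"id": "C001", "balance": 125, "fee": 5}},
    {"seq": 2, "verb": "WRITE", "data": {"id": "C001", "balance": 120}},
]


def post_fee(io):
    customer = io.read("READ")
    io.write("WRITE", {"id": customer["id"],
                       "balance": customer["balance"] - customer["fee"]})


def post_fee_broken(io):
    customer = io.read("READ")
    io.write("WRITE", {"id": customer["id"],
                       "balance": customer["balance"] + customer["fee"]})  # wrong sign


def replay(program, records):
    session = ReplaySession(records)
    program(session)
    return session.discrepancies


clean = replay(post_fee, CAPTURE)
broken = replay(post_fee_broken, CAPTURE)
```

Because the program receives its I/O layer as a parameter, the same program logic runs unchanged in both capture and replay, which is the property the instrumentation is meant to preserve.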

This initial replay step ensures that the captured data was secured without processing or handling errors, and also produces a cumulative coverage analysis report by program.  If any program requires additional coverage, then the capture program only has to be re-executed with the additional data conditions, and the additional coverage data added to that already captured for that program.
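Cumulative coverage can be pictured as the union of the branch paths recorded across capture runs.  The following sketch assumes each run yields a set of path numbers and that the full list of paths in the program is known; the function name and the report fields are hypothetical.

```python
def cumulative_coverage(all_paths, runs):
    """Combine the path sets of several capture runs into one coverage report."""
    known = set(all_paths)
    covered = set().union(*runs) & known
    return {
        "covered": sorted(covered),
        "missing": sorted(known - covered),
        "percent": 100.0 * len(covered) / len(known),
    }


ALL_PATHS = [1, 2, 3, 4, 5]          # hypothetical branch paths in one program
first_run = {1, 2, 3}
additional_run = {4, 5}              # re-execution with extra data conditions

before = cumulative_coverage(ALL_PATHS, [first_run])
after = cumulative_coverage(ALL_PATHS, [first_run, additional_run])
```

Only the additional run has to be captured; its paths are simply added to those already recorded for the program.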

The capture must always occur on the source platform, using the original sources, compiler, database and other components.  Depending on the source and target environments, the validation replay may have to occur on the same platform, or it can occur on an inexpensive Intel workstation, for significantly greater efficiency and lower cost.

4.4        Business Rule Replay

Batch programs have two logical layers: the business rule layer and the data access layer.  Online programs have three: the same two as batch, plus the user interface layer.  Business rule replay tests the business rule layer of both batch and online programs.  The new sources are instrumented with the same logic as was inserted into the old sources for validation replay, then the replay is executed.  This will occur in the target environment in many cases, but in some an Intel workstation can be used for business rule replay as well as validation replay.

Any errors introduced into the program business logic by the migration process, or any discrepancies created by the change in environment or the change in compilers will appear on the replay report.  These discrepancies take two forms: data discrepancy (including state information discrepancy) and logic path discrepancy.

A data discrepancy is found when any referenced field has even a single bit error.  The following is an example of a data discrepancy taken from a replay report:

 Input#        236 Verb# 0150 Verb TRANSFER  Record PARW8580-W01-REC
                              Return Code 0000 File Status
  
Compare=BEFORE Input Seq 006130 COBOL Seq 006128

0001-0002  Cap  ..
 Hi 4Bits       00
 Lo 4Bits       0D
                ----+----1----+----2----+----3----+----4----+----5
0001-0002  Rep  ..
 Hi 4Bits       00
 Lo 4Bits       00
!!!!!!!!!       =X
                ----+----1----+----2----+----3----+----4----+----5

In the captured data, this 2 byte field had a value of X’000D’, but in the replay it had a value of X’0000’.  All fields, regardless of size, are compared and any discrepancy displayed in this manner.  An “X” indicates the byte or bytes that differ in the field.

Similarly, a logic path discrepancy produces an entry in the replay report like the following example:


Path Error   Input#         63
   Capture Path# 000009  Cap Src Seq # 001019
   Replay Path # 000011  Cap Src Rec # 001026 Rep Src Rec # 004783

In this example, during the capture execution the program took branch path #9, whereas in the replay execution the program took branch path #11.  Sometimes a logic path discrepancy does not result in a data discrepancy, but indicates that there is a possible latent error that might be expressed as a data error under other conditions.  For example, the wrong calculation might be used but the data involved included multiplying by zero, so the result was the same in either calculation.
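The multiply-by-zero case can be made concrete with a small sketch.  The two pricing functions, the path numbers 9 and 11, and the migration error itself (a ">=" that became ">") are all hypothetical and chosen only to mirror the example above.

```python
from itertools import zip_longest


def old_price(qty, unit_price, path):
    """Original logic: a 10% discount applies from 10 units upward."""
    if qty >= 10:
        path.append(9)   # discount branch
        return qty * unit_price * 90 // 100
    path.append(11)      # full price branch
    return qty * unit_price


def migrated_price(qty, unit_price, path):
    """Migrated logic with an assumed error: '>=' became '>'."""
    if qty > 10:
        path.append(9)
        return qty * unit_price * 90 // 100
    path.append(11)
    return qty * unit_price


def first_path_divergence(capture_path, replay_path):
    """Return (step, captured path, replayed path) at the first difference."""
    pairs = zip_longest(capture_path, replay_path)
    for step, (cap, rep) in enumerate(pairs, start=1):
        if cap != rep:
            return step, cap, rep
    return None


# Unit price of zero: the data agrees although the wrong branch was taken.
cap_path, rep_path = [], []
cap_zero = old_price(10, 0, cap_path)
rep_zero = migrated_price(10, 0, rep_path)
latent = first_path_divergence(cap_path, rep_path)

# Non-zero unit price: the same error now shows up in the data as well.
cap_value = old_price(10, 5, [])
rep_value = migrated_price(10, 5, [])
```

With a zero unit price both functions return 0, so only the path comparison reveals the error; with a unit price of 5 the results are 45 and 50 and a data discrepancy appears too.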

At other times a logic path discrepancy will pinpoint the exact point in a program that was the ultimate cause of a data discrepancy, allowing the source of the problem to be found in seconds rather than hours.

4.5        I/O Replay

I/O replay is one of two optional rule sets that can be applied to the new sources.  I/O replay tests the I/O interface logic only, not the business rule logic.  Since in this case the database and indexed file update I/O’s will actually be expressed, it is necessary to secure the before image of the database (and optionally the after image as well) along with the capture file.  The before image must be converted to the target database prior to initiating I/O replay.  Optionally, the after image can be compared to the resulting database after completing I/O replay.

In the case of a single batch program execution with no concurrent programs or subroutines participating in the I/Os, business rule replay and I/O replay testing can be combined in a single replay execution, which is sometimes logistically convenient.

With batch or online programs executing multiple concurrent programs, the various participating programs are instrumented so that only the I/O routines will execute during replay.  The serialized I/Os are then extracted from the capture data repository in temporal sequence across all programs, and a special set of captured data is created specifically for I/O replay.  An I/O replay driver program reads the captured data and passes it to the appropriate program via a CALL.  The called program then executes the I/O against the database or indexed file and compares the results with the results obtained from the original I/O.  In this way, all the I/Os can be tested without the need for application program logic to be invoked.  As an interesting side benefit, this I/O driver arrangement can be used to create an I/O stress test to facilitate database tuning.
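A driver of this kind can be sketched as a merge of the per-program capture streams by sequence number, followed by a dispatch of each record to that program's I/O routine.  Everything named here is an assumption for illustration: the record layout, the program names `ORDERS` and `STOCK`, the `make_io_routine` helper, and the dictionary that stands in for the target database.

```python
import heapq


def replay_io(captures, io_routines):
    """Replay captured I/Os in temporal order and collect result mismatches."""
    merged = heapq.merge(*captures.values(), key=lambda rec: rec["seq"])
    mismatches = []
    for rec in merged:
        routine = io_routines[rec["program"]]      # the CALL to the program
        actual = routine(rec["verb"], rec["data"])
        if actual != rec["result"]:
            mismatches.append((rec["seq"], rec["program"], rec["result"], actual))
    return mismatches


def make_io_routine(database):
    """Build an I/O-only routine that works against a dictionary 'database'."""
    def routine(verb, data):
        if verb == "WRITE":
            database[data["key"]] = data["value"]
            return "00"
        return database.get(data["key"])           # READ
    return routine


# Hypothetical capture data; each program's records are already in seq order.
CAPTURES = {
    "ORDERS": [
        {"seq": 1, "program": "ORDERS", "verb": "WRITE",
         "data": {"key": "A", "value": 1}, "result": "00"},
        {"seq": 3, "program": "ORDERS", "verb": "READ",
         "data": {"key": "B"}, "result": 2},
    ],
    "STOCK": [
        {"seq": 2, "program": "STOCK", "verb": "WRITE",
         "data": {"key": "B", "value": 2}, "result": "00"},
        {"seq": 4, "program": "STOCK", "verb": "READ",
         "data": {"key": "A"}, "result": 1},
    ],
}

target_db = {}
routine = make_io_routine(target_db)
mismatches = replay_io(CAPTURES, {"ORDERS": routine, "STOCK": routine})
```

The temporal merge matters in this sketch: the READ at sequence 3 only returns the captured value because the WRITE issued by the other program at sequence 2 has already been replayed.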

It depends on the nature of the migration whether I/O replay will be the most important part of testing or whether business rule replay will predominate.  Typically, language migrations and platform migrations will find business rule replay to be more important, whereas database migrations will find I/O replay to be more important.  Some projects will find both to be of equal importance.

4.6        Datacomm Replay

Where the architecture of the target data communications environment allows it, the input and output data communications messages can be extracted in a manner very similar to I/O replay, and a datacomm driver utility used to exercise the online programs in the same temporal sequence as the original messages.  This arrangement can also be used to create a stress test of the online environment to determine the maximum throughput of the target configuration. 

4.7        Summary

Using a capture/replay methodology, it is possible to fully automate migration regression testing, including branch and path coverage analysis.  By separately testing the business rule, I/O and user interface layers of programs, all aspects of the program execution can be accurately tested.  However, this separate testing methodology only works because 100% of the data and state information is captured and then compared during replay testing.  Any attempt to use less than 100% will cause the tests to fail in short order.

As a practical matter, it should be clear that this methodology produces a much higher degree of accuracy than any attempt at manual testing, and further that the methodology provides its own audit by having fully integrated coverage analysis. It should also be clear that the degree of detail and perfection of process required for this testing to succeed cannot be implemented with fallible human beings conducting the testing and performing the comparisons.  Only full automation can succeed.  As a result, this is not a method that augments manual testing, which is how some test automation tools are used.  This is a method that renders manual testing largely irrelevant, at least at the level of unit and system testing where most project testing effort is expended.

In the final analysis, this method must also be less expensive than anything but the most basic of typical testing, since it largely or completely eliminates the most labor-intensive parts of migration project testing, which also happen to be the most tedious and error-prone aspects of the testing process.  If the automated testing is also integrated with the automated migration process, it may be possible to catch more errors earlier in the cycle, reducing most client and vendor costs while improving the accuracy of the results.

 


Last modified: 07/19/08