Orthogonal array

Orthogonal array (OA) based testing is a systematic, statistical way of testing. Orthogonal arrays can be applied in user interface testing, system testing, regression testing, configuration testing and performance testing.

Orthogonal vectors exhibit the following properties:
 * Each vector conveys information different from that of any other vector in the sequence, i.e., each vector conveys unique information, which avoids redundancy.
 * On linear addition, the signals can be separated easily.
 * The vectors are statistically independent of each other.
 * When linearly added, the resultant is the arithmetic sum of the individual components.

Benefits include:
 * Provides uniformly distributed coverage of the test domain.
 * Produces a concise test set with fewer test cases.
 * Covers all pairwise combinations of the test parameters.
 * Arrives at complex combinations of all the variables.
 * Is simpler to generate and less error-prone than test sets created manually.
 * Reduces testing cycle time.


Limitations include:
 * It does not guarantee extensive coverage of the test domain.
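The trade-off between a small test set and pairwise coverage can be illustrated with a minimal Python sketch. It assumes a hard-coded 9-row, 4-column array with 3 levels (the usual L9 layout) and maps it onto hypothetical configuration parameters; the parameter names and values are invented for illustration only.

```python
from itertools import combinations, product

# A 9-run, 4-column, 3-level orthogonal array of strength 2 (L9 layout).
L9 = [
    [0, 0, 0, 0], [0, 1, 1, 1], [0, 2, 2, 2],
    [1, 0, 1, 2], [1, 1, 2, 0], [1, 2, 0, 1],
    [2, 0, 2, 1], [2, 1, 0, 2], [2, 2, 1, 0],
]

# Hypothetical test parameters, three values each (illustrative only).
PARAMS = {
    "browser": ["Firefox", "Chrome", "Safari"],
    "os": ["Linux", "Windows", "macOS"],
    "locale": ["en", "de", "ja"],
    "network": ["wifi", "ethernet", "cellular"],
}


def build_test_cases(array, params):
    """Translate each array row into one test case (parameter -> value)."""
    names = list(params)
    return [
        {name: params[name][level] for name, level in zip(names, row)}
        for row in array
    ]


def covers_all_pairs(cases, params):
    """Check that every value pair of every two parameters occurs in some case."""
    for a, b in combinations(params, 2):
        seen = {(case[a], case[b]) for case in cases}
        if seen != set(product(params[a], params[b])):
            return False
    return True


cases = build_test_cases(L9, PARAMS)
exhaustive = 3 ** len(PARAMS)  # size of the full factorial test set
```

Here 9 test cases cover every pair of values, whereas the exhaustive set would need 81 cases. Combinations of three or more specific values are not all covered, which is the limitation noted above.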

Orthogonal Array Types & Generation
Contents provided by: http://people.scs.fsu.edu/~burkardt/datasets/oa/oa.html

An orthogonal array A is a matrix of n rows and k columns, with every element being one of the q symbols 0 through q-1. The array has strength t if, in every n by t submatrix, the q^t possible distinct rows all appear the same number of times. This number is the index of the array, commonly denoted lambda. Clearly,

lambda * q^t = n

Geometrically, if one were to "plot" the submatrix with one plotting axis for each of the t columns and one point in t dimensional space for each row, the result would be a grid of q^t distinct points. There would be lambda "overstrikes" at each point of the grid.

The notation for such an array is OA ( n, k, q, t ).
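The definition can be checked mechanically. The following Python sketch is one possible way to do so, not part of the original programs; the function name oa_index and the list-of-rows representation are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def oa_index(array, q, t):
    """Return the index lambda if `array` is an OA of strength t, else None.

    `array` is a list of n rows, each a list of k symbols in 0..q-1.
    """
    n, k = len(array), len(array[0])
    if any(not 0 <= x < q for row in array for x in row):
        return None  # a symbol falls outside 0..q-1
    if n % q**t != 0:
        return None  # lambda * q^t = n cannot hold
    lam = n // q**t
    for cols in combinations(range(k), t):
        # Count how often each t-tuple occurs in this n by t submatrix.
        counts = Counter(tuple(row[c] for c in cols) for row in array)
        if len(counts) != q**t or any(count != lam for count in counts.values()):
            return None
    return lam


# OA( 4, 3, 2, 2 ): four runs, three columns, two symbols, strength 2.
oa = [[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]]
```

For this small array, oa_index(oa, 2, 2) gives lambda = 1 and oa_index(oa, 2, 1) gives lambda = 2, while strength 3 is ruled out because 4 is not a multiple of 2^3.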

If n <= q^(t+1), then the n rows "should" plot as n distinct points in every n by (t+1) subarray. When this fails to hold, the array has the "coincidence defect".
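A direct test for the coincidence defect is to look for repeated rows in every choice of t+1 columns. The sketch below is a minimal illustration; the function name and the list-of-rows representation are assumptions, not taken from the original programs.

```python
from itertools import combinations


def has_coincidence_defect(array, t):
    """Return True if some n by (t+1) subarray contains a repeated row."""
    k = len(array[0])
    for cols in combinations(range(k), t + 1):
        projected = [tuple(row[c] for c in cols) for row in array]
        if len(set(projected)) < len(projected):
            return True  # two runs coincide in these t+1 columns
    return False


# OA( 4, 3, 2, 2 ): n = 4 <= 2^3 and all rows are distinct.
oa = [[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]]

# Stacking the array on itself keeps strength 2 (lambda = 2, n = 8 <= 2^3),
# but every row now appears twice, so the defect is present.
doubled = oa + oa
```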

Owen (1992,1994) describes some uses for randomized orthogonal arrays in numerical integration, computer experiments and visualization of functions. Those papers cite further literature that provides additional explanation. A strength 1 randomized orthogonal array is a Latin hypercube sample, essentially so or exactly so, depending on the definition used for Latin hypercube sampling. The arrays constructed here have strength 2 or more, it being much easier to construct arrays of strength 1.

The randomization is achieved by independent uniform permutation of the symbols in each column.
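A minimal Python sketch of this randomization step follows. The function name and the optional rng argument are assumptions made for illustration; the only idea taken from the text is that each column gets its own independent, uniformly drawn permutation of the q symbols.

```python
import random


def randomize_columns(array, q, rng=None):
    """Relabel the symbols of each column with an independent random permutation."""
    if rng is None:
        rng = random.Random()
    k = len(array[0])
    perms = []
    for _ in range(k):
        perm = list(range(q))
        rng.shuffle(perm)  # uniform random permutation of 0..q-1
        perms.append(perm)
    # Column j uses its own permutation perms[j].
    return [[perms[j][row[j]] for j in range(k)] for row in array]


oa = [[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]]
randomized = randomize_columns(oa, q=2, rng=random.Random(0))
```

Because each column is only relabelled, every pair of columns of the result still contains each pair of symbols equally often, so the strength is unchanged.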

To investigate a function f of d variables, one has to have an array with k greater than or equal to d. One may also have a maximum value of n in mind and a minimum value for the number q of distinct levels to investigate. It is entirely possible that there is no array of strength t greater than 1 that is compatible with these conditions. The programs here provide some choices to pick from, hopefully without too much of a compromise.

The constructions used are based on published algorithms that exploit properties of Galois fields. Because of this, the number of levels q must be a prime power. That is

q = p^r, where p is prime and r is a positive integer.
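Whether a desired number of levels is admissible can be checked with a few lines of Python. This is a simple trial-division sketch with a hypothetical function name, adequate for the small values of q used here.

```python
def prime_power(q):
    """Return (p, r) with q == p**r and p prime, or None if q is not a prime power."""
    if q < 2:
        return None
    p = 2
    while p * p <= q:
        if q % p == 0:
            break  # p is the smallest prime factor of q
        p += 1
    else:
        p = q  # no divisor up to sqrt(q), so q itself is prime
    r = 0
    while q % p == 0:
        q //= p
        r += 1
    # q is a prime power exactly when nothing is left after removing p.
    return (p, r) if q == 1 else None


admissible = [q for q in range(2, 20) if prime_power(q) is not None]
```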

The Galois field arithmetic for the prime powers is based on tables published by Knuth and Alanen (1964). The resulting fields have been tested by the methods described in Appendix 2 of that paper and they passed.

Motivation

Visualization: Given a function in d dimensions, one might want to run it at N points and then use interactive data analysis on the output. For example, if the function computes switching speed and breakdown voltage of a semiconductor device given d=10 process settings, one might select those points with a large enough breakdown voltage and a large enough speed and then make plots in lower dimensions of the corresponding process settings. Full 10 dimensional grids are infeasible for this. Latin hypercube samples can miss settings where there are strong effects in the corners. For example, if one setting is oxidation temperature and another is oxidation time, we expect that the corners (hi,hi) and (lo,lo) will be significant. It can happen that Latin hypercube samples have gaps in these corners. There are 4*C(d,2)=2d(d-1) "bivariate" corners to investigate, and random designs or Latin hypercube samples can easily miss some of them.

Integration: The sum of function values over a Latin hypercube sample is a good estimate of the integral over the input cube, for functions that are nearly additive. Stein shows how this works. Owen gives a central limit theorem for the estimate and shows how to estimate the variance. The sum over an orthogonal array of strength 2 gives a similarly good estimate of the integral for functions dominated by main effects and two factor interactions among the input variables. Owen discusses this.

Computer Experiments: In many computer experiments the visualization methods outlined above are sufficiently informative. Sometimes one would like to find response surface models for the function, perhaps for predictive purposes, or for interpolation to a finer visualization data set. Least squares regression methods can be applied here. The population coefficients are determined from certain integrals over the input space. These can be estimated by the sum over a sample (independent, Latin hypercube, orthogonal array). More accurate sample integrals imply more accurately estimated response surfaces.

Multivariate Nonparametric Regression: These arrays might form good designs for fitting models such as MARS (Friedman).

GF Arrays

Suppose that q, the number of distinct values per axis, is a prime or a prime raised to a power. Then there exists a Galois field with q elements, GF(q). Using this field one can construct OA( q^2,q+1,q,2 ), the orthogonal array with N=q^2 rows, k=q+1 columns, q symbols and strength 2. The construction is a special case of the one given in Raghavarao 2.4 and appears to be very old. Let column 1 be q 0's, q 1's,...,q (q-1)'s and let column 2 be q repetitions of (0,1,...,q-1). Then for 1 <= j <= q-1, column j+2 is (column 1) + j * (column 2), where the addition and multiplication take place in GF(q). If q is prime these are simply addition and multiplication modulo q. Random permutation of the labels within columns preserves the orthogonality, breaks up the "planes" and provides a basis for randomization inference.
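For prime q the construction reduces to arithmetic modulo q and can be sketched in a few lines of Python. The function name is an assumption, and the sketch deliberately rejects prime powers such as 4, 8 or 9, which would need genuine Galois field tables instead of modular arithmetic.

```python
def gf_array_prime(q):
    """Build OA( q^2, q+1, q, 2 ) for prime q using arithmetic modulo q."""
    if q < 2 or any(q % d == 0 for d in range(2, int(q**0.5) + 1)):
        raise ValueError("this sketch only handles prime q")
    rows = []
    for a in range(q):        # column 1: q 0's, q 1's, ..., q (q-1)'s
        for b in range(q):    # column 2: q repetitions of 0, 1, ..., q-1
            row = [a, b]
            for m in range(1, q):
                # columns 3 .. q+1: (column 1) + m * (column 2) modulo q
                row.append((a + m * b) % q)
            rows.append(row)
    return rows


array3 = gf_array_prime(3)   # 9 runs, 4 columns, 3 levels
array5 = gf_array_prime(5)   # 25 runs, 6 columns, 5 levels
```

Every pair of columns in the result contains each of the q^2 symbol pairs exactly once, which is the strength 2 property with lambda = 1.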

The files gf.02, gf.03, gf.04 ... contain these arrays with q=2,3,4,5,7,8,9,11,13,16. Note that gf.q contains q^3+q^2 numbers so files with q=17,25,27,32 come in their own shar files.

When q is not a prime power, the largest attainable number of columns k can be much less than q+1. In general, if c mutually orthogonal Latin squares of side q can be found, then an orthogonal array of k=c+2 columns may be constructed. For q=6, k=3 is the limit.

These designs are 1/q^(q-1) fractional replications of q^(q+1) factorials with resolution III. That is, they might be denoted q_III^[(q+1)-(q-1)].

In practice one might use the first 5 columns of gf.16 to get 256 runs in 5 variables. If some input variables only take 2 values (on/off say) then they can be coded as a column of gf.q for any even number q by mapping 0..q/2-1 to on and q/2..q-1 to off.
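Both practical steps are simple list operations. The Python sketch below uses hypothetical helper names and a short hand-written column of levels rather than the real gf.q files.

```python
def first_columns(array, d):
    """Keep only the first d columns of an array stored as a list of rows."""
    return [row[:d] for row in array]


def to_two_level(column, q, first_half="on", second_half="off"):
    """Code a q-level column as two values: 0..q/2-1 -> on, q/2..q-1 -> off."""
    if q % 2 != 0:
        raise ValueError("q must be even to split the levels into two halves")
    return [first_half if level < q // 2 else second_half for level in column]


# A column with q = 4 levels, each appearing equally often.
levels = [0, 1, 2, 3, 0, 1, 2, 3]
switches = to_two_level(levels, q=4)
```

Since every level occurs equally often in a column of the array, the recoded column has as many "on" runs as "off" runs.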

Taguchi's L9 has the same form as gf.03, and L25 has the form of gf.05. The array gf.32 could be called L1024.

Files you may copy include:

File      Levels  Runs  Variables
gf.02     2       4     3
gf.03     3       9     4
gf.04     4       16    5
gf.05     5       25    6
gf.07     7       49    8
gf.08     8       64    9
gf.09     9       81    10
gf.11     11      121   12
gf.13     13      169   14
gf.16     16      256   17
(gf.17)   17      289   18
(gf.25)   25      625   26
(gf.32)   32      1024  33

AK Arrays

With N=q^2 rows only q+1 columns are possible. Addelman and Kempthorne show how to get 2q+1 columns in N=2q^2 rows. As before, q is a prime or a power of a prime. The files ak.02,...,ak.11 contain these designs for q=2,3,5,7,9,11. The algorithm for even q is not as easy to code, but the bb2 arrays below fill that role. The array L18 has the same form as ak.03. This array was known to Bose and Bush before Addelman and Kempthorne, and is alluded to in a note added in proof to Plackett and Burman. These designs are OA( 2q^2,2q+1,q,2 ).

Files you may copy include:

File      Levels  Runs  Variables
ak.02     2       8     5
ak.03     3       18    7
ak.05     5       50    11
ak.07     7       98    15
ak.09     9       162   19
(ak.11)   11      242   23

BB Arrays

Bose and Bush show how to construct OA( lambda*q^2, lambda*q, q, 2 ) where q is a prime power and lambda is a power of the same prime. The designs bb2.02,...,bb2.16 are of this form with lambda=2, except that they are augmented with a (2q+1)st column using a method Bose and Bush discuss. So they are of the form OA( 2q^2, 2q+1, q, 2 ) where q is a power of 2, augmenting the designs available from Addelman and Kempthorne.

Files you may copy include:

File      Levels  Runs  Variables
bb2.02    2       8     5
bb2.04    4       32    9
bb2.08    8       128   17
(bb2.16)  16      512   33

Extensions

It is possible to construct arrays of strength 3. These might be useful for the same purposes as arrays of strength 2, but they require much larger numbers N of runs.

If you want q levels and q is not a power of a prime, the MacNeish-Mann theorem might help. If q=25*4, you can get an array with 5 columns of 4 symbols (gf.04) and an array with 26 columns of 25 symbols (gf.25). Take the first 5 columns of gf.25. Then make an array with 5 columns of 100 symbols in which the kth column is obtained by taking each element of the kth column of gf.25 and replacing it by a vector of length 16, formed by taking 4 times the element of gf.25 and adding each of the 16 values in the kth column of gf.04. The result has 625*16 = 10000 = 100^2 rows.
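The same product idea can be tried on a smaller case. The Python sketch below combines a 3-level and a 2-level array into a 6-level one; the function names are assumptions, and the helper that builds the input arrays uses arithmetic modulo q, so it is valid only for prime q.

```python
def gf_array_prime(q):
    """Strength 2 array with q^2 rows and q+1 columns; assumes q is prime."""
    return [
        [a, b] + [(a + m * b) % q for m in range(1, q)]
        for a in range(q)
        for b in range(q)
    ]


def product_array(big, small, q_small):
    """Combine two strength 2 arrays into one with q_big * q_small symbols.

    Each element of `big` is replaced by a vector with one entry per row of
    `small`: q_small times the element, plus the value in the same column
    of `small`. Only the columns present in both arrays are kept.
    """
    k = min(len(big[0]), len(small[0]))
    return [
        [q_small * row_big[j] + row_small[j] for j in range(k)]
        for row_big in big
        for row_small in small
    ]


# 9 runs with 3 levels times 4 runs with 2 levels -> 36 runs with 6 levels.
oa6 = product_array(gf_array_prime(3), gf_array_prime(2), q_small=2)
```

The result has 36 = 6^2 rows and 3 columns, and every pair of columns contains each of the 36 symbol pairs exactly once.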