User:David Cruise/Homogeneity

In statistics, the word homogeneity is sometimes used as a synonym of homoscedasticity, but this article treats a different topic.

Homogeneity in statistics and data analysis pertains to properties of logically consistent data matrices. Within this framework, the coefficient of homogeneity indicates the degree to which data approximate Guttman implicatory scales.

The term scale can be traced to the Latin word scala, meaning ladder, steps, or stairway. Homogeneous, internally consistent data matrices form step-like structures (cf. Fig. 3). The lack of homogeneity indicates the degree to which data structures depart from this ideal lattice form. The conceptual differences between homogeneity and internal consistency reliability are often poorly understood. These differences are best elucidated by contrasting the limiting cases of both indices.

Tautologous lattices
The normal (tautologous) structure for binary variables p, q, and r, shown in Fig. 1 for five variables, can be defined as Y = (p 1 q) & (q 1 r), where 1 signifies the logical operator of tautology.

Since Y is a unit vector (every entry equals one), tautologous structures do not have to be rectified. The intercorrelations of the p, q, and r variables form the identity matrix shown below.



$$ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$

The coefficient of the internal consistency reliability and the coefficient of homogeneity for tautologous lattices are both equal to zero.
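The identity matrix above can be reproduced with a minimal sketch. It assumes that the tautologous structure contains each of the 2^3 possible response patterns of p, q, and r exactly once; the helper name `pearson` is introduced here for illustration only.

```python
from itertools import product
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

# Assumption: every pattern (p, q, r) occurs once, because Y is true
# for all of them under the tautology operator.
patterns = list(product([0, 1], repeat=3))
columns = list(zip(*patterns))  # one tuple of 8 values per variable

# Correlate every variable with every other variable.
corr = [[pearson(a, b) for b in columns] for a in columns]
for row in corr:
    print([round(v, 3) for v in row])
```

Because every combination of values is equally represented, the off-diagonal covariances cancel and only the diagonal entries remain.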

Equivalential lattices
The parallel (equivalential) structure for binary variables p, q, and r can be defined as Y = (p = q) & (q = r).

When the data are rectified (according to the values of the rectifying variable Y), the result is a step scale such as that shown in Fig. 2 for eight variables.

Intercorrelations for this type of abstract data are shown below:



$$ \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} $$

For the equivalential lattices, the coefficient of internal consistency reliability equals one and the coefficient of homogeneity is less than one.

Implicational lattices
The hierarchical (implicational) structure for binary variables p, q, and r can be defined as Y = (p -> q) & (q -> r).



When the data are rectified (according to the values of the rectifying variable Y), the result is the implicational (Guttman) scale such as that shown in Fig. 3 for eight variables.

Intercorrelations for this type of abstract data are shown below:



$$ \begin{bmatrix} 1.000 & 0.577 & 0.333 \\ 0.577 & 1.000 & 0.577 \\ 0.333 & 0.577 & 1.000 \end{bmatrix} $$

For the implicational lattices, the coefficient of the internal consistency reliability is less than one and the coefficient of homogeneity equals one.
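The implicational correlation matrix can be checked with a short sketch. It assumes that rectification amounts to retaining only those response patterns for which Y is true, each counted once; the helpers `pearson` and `implies` are illustrative names, not part of any published procedure.

```python
from itertools import product
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def implies(a, b):
    """Material implication a -> b for 0/1 values."""
    return (not a) or bool(b)

# Assumption: keep only the patterns where Y = (p -> q) & (q -> r) holds.
kept = [(p, q, r) for p, q, r in product([0, 1], repeat=3)
        if implies(p, q) and implies(q, r)]
print(kept)  # the four patterns forming the step-like (Guttman) scale

columns = list(zip(*kept))
corr = [[pearson(a, b) for b in columns] for a in columns]
for row in corr:
    print([round(v, 3) for v in row])
```

Under these assumptions the adjacent variables correlate at 1/√3 (about 0.577) and the outer pair at 1/3 (about 0.333), in agreement with the matrix above.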

Tetrad criterion
The above correlation matrix complies with Spearman's tetrad criterion, which is characteristic of hierarchical unidimensional structures. The tetrad criterion is based on products and differences of four (from the Greek prefix τετρα-, four) elements of a correlation matrix. For the above matrix of correlations, the tetrad criterion can be tested as 0.577 × 0.577 − 1.000 × 0.333. If the result equals zero or is close to zero, as in this instance, the tetrad criterion is met.

Spearman tetrads are in fact the 2 × 2 minors of a matrix. In factor analysis, the number of common factors is one less than the order of the lowest-order minor that vanishes. In the case of implicatory scales, even minors of order two (tetrads) vanish, so these data structures are unidimensional.
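The tetrad quoted above can be evaluated as a 2 × 2 minor. The sketch below is only an illustration: the helper `minor2` is a hypothetical name, and the exact entries 1/√3 and 1/3 are assumed to be the values behind the rounded 0.577 and 0.333.

```python
from math import sqrt

def minor2(m, rows, cols):
    """Determinant of the 2 x 2 submatrix of m picked by two rows and two columns."""
    i, j = rows
    k, l = cols
    return m[i][k] * m[j][l] - m[i][l] * m[j][k]

# Correlation matrix with the rounded entries given in the text.
rounded = [[1.000, 0.577, 0.333],
           [0.577, 1.000, 0.577],
           [0.333, 0.577, 1.000]]

# Assumption: the rounded entries stand for 1/sqrt(3) and 1/3.
a, b = 1 / sqrt(3), 1 / 3
exact = [[1, a, b],
         [a, 1, a],
         [b, a, 1]]

# Rows 1-2 and columns 2-3 (0-based: rows 0,1 and columns 1,2) give
# r12 * r23 - r13 * r22, the tetrad computed in the text.
tetrad_rounded = minor2(rounded, (0, 1), (1, 2))
tetrad_exact = minor2(exact, (0, 1), (1, 2))

print(f"rounded entries: {tetrad_rounded:.6f}")
print(f"exact entries:   {tetrad_exact:.2e}")
```

With the rounded entries the difference is small but not zero; with the assumed exact entries it vanishes up to floating-point error.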

Coefficient of homogeneity
The original coefficient of homogeneity, wrapped in complex algebraic considerations, was introduced in 1948 by Loevinger. Interest in the homogeneity of data was revived during the closing decades of the twentieth century by Cliff (1977) and by Krus and Blackman (1988). On the basis of the theoretical analysis outlined above, Krus and Blackman defined the coefficient of homogeneity as



$$ h_{xx} = \frac{MS_I - MS_{RES}}{MS^*_I - MS^*_{RES}} $$

where MS stands for mean square, I for individuals, and RES for the residual term of the analysis of variance. The asterisk (*) indicates that these indices were obtained from the data matrix in which the variance of the variables was maximized. This coefficient of homogeneity is numerically equivalent to both Loevinger's and Cliff's conceptualizations of the coefficient of homogeneity. Since Hoyt's (1941) formula for the internal consistency reliability is



$$ r_{xx} = \frac{MS_I - MS_{RES}}{MS_I} $$

the Krus and Blackman formulation of the coefficient of homogeneity brings both the coefficient of internal consistency reliability and the coefficient of homogeneity within the framework of the analysis of variance.
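Hoyt's formula can be illustrated with a minimal sketch of a two-way (individuals by variables) analysis of variance without replication. It assumes that each retained response pattern of p, q, and r stands for one individual; the function name `hoyt_reliability` and the three example matrices are introduced here for illustration. The coefficient of homogeneity is not computed, because the construction of the variance-maximized matrix is not spelled out in this article.

```python
from itertools import product

def hoyt_reliability(data):
    """Hoyt's r_xx = (MS_I - MS_RES) / MS_I for an individuals-by-variables matrix."""
    n, k = len(data), len(data[0])
    grand = sum(map(sum, data)) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(col) / n for col in zip(*data)]
    # Sums of squares: total, individuals (rows), variables (columns).
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_ind = k * sum((m - grand) ** 2 for m in row_means)
    ss_var = n * sum((m - grand) ** 2 for m in col_means)
    ss_res = ss_total - ss_ind - ss_var
    # Mean squares use the usual degrees of freedom.
    ms_ind = ss_ind / (n - 1)
    ms_res = ss_res / ((n - 1) * (k - 1))
    return (ms_ind - ms_res) / ms_ind

# Assumption: one individual per response pattern admitted by each structure.
all_patterns = list(product([0, 1], repeat=3))
tautologous = all_patterns
equivalential = [t for t in all_patterns if t[0] == t[1] == t[2]]
# For 0/1 values, p -> q is the same as p <= q.
implicational = [t for t in all_patterns if t[0] <= t[1] <= t[2]]

for name, data in [("tautologous", tautologous),
                   ("equivalential", equivalential),
                   ("implicational", implicational)]:
    print(name, hoyt_reliability(data))
```

Under these assumptions the reliability is approximately 0 for the tautologous structure, 1 for the equivalential structure, and 0.75 for the implicational structure, consistent with the limiting cases described in the earlier sections.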

The image below shows the results of a data analysis, described elsewhere, in which the calculation of the coefficient of homogeneity was an integral part. The slices of the pie show the proportions of variance obtained by the analysis of variance design for the obtained and ordered data sets. The original residual variance component (.088) was further partitioned into the component due to the lack of ordinality (.060) and the residual component proper (.028).



These results were interpreted as indicating that about 6% of the total variance reflected “illogical” relationships between the data elements.