P.Mean: What is the effect of an unmeasured covariate? (created 2009-06-09).

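Suppose you want to conduct an analysis of covariance, but you have data on some but not all of the covariates. What do you miss out on because of the unmeasured covariate? To understand this, we need to venture into the world of partitioned matrices. If you have a matrix partitioned as

$$X = \begin{bmatrix} A & B \end{bmatrix}$$

where A holds the measured covariates and B the unmeasured covariate, then the symmetric matrix $X^T X$ is

$$X^T X = \begin{bmatrix} A^T A & A^T B \\ B^T A & B^T B \end{bmatrix}.$$

The inverse of this matrix is

$$(X^T X)^{-1} = \begin{bmatrix} (A^T P_B^\perp A)^{-1} & -(A^T P_B^\perp A)^{-1} A^T B (B^T B)^{-1} \\ -(B^T P_A^\perp B)^{-1} B^T A (A^T A)^{-1} & (B^T P_A^\perp B)^{-1} \end{bmatrix}$$

where

$$P_A^\perp = I - A (A^T A)^{-1} A^T$$

and

$$P_B^\perp = I - B (B^T B)^{-1} B^T$$

represent the matrices which project a vector onto the space perpendicular to the column space of A and B, respectively. This result can be found on the Wikipedia page on the block matrix pseudoinverse.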
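A quick numerical check can make the partitioned inverse less abstract. The sketch below is only an illustration: it assumes NumPy is available, uses simulated data, and the helper name `perp_projector` is made up for this example. It builds the inverse block by block from the perpendicular projection matrices and compares it with a direct inversion.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
A = rng.normal(size=(n, 2))  # measured covariates (simulated, hypothetical)
B = rng.normal(size=(n, 1))  # unmeasured covariate (simulated, hypothetical)
X = np.hstack([A, B])

def perp_projector(M):
    """Return I - M (M'M)^{-1} M', which projects onto the space
    perpendicular to the column space of M."""
    return np.eye(M.shape[0]) - M @ np.linalg.solve(M.T @ M, M.T)

P_A = perp_projector(A)
P_B = perp_projector(B)

# Diagonal blocks of the partitioned inverse
S_A = np.linalg.inv(A.T @ P_B @ A)
S_B = np.linalg.inv(B.T @ P_A @ B)

# Assemble the full inverse from its four blocks
top = np.hstack([S_A, -S_A @ A.T @ B @ np.linalg.inv(B.T @ B)])
bottom = np.hstack([-S_B @ B.T @ A @ np.linalg.inv(A.T @ A), S_B])
block_inverse = np.vstack([top, bottom])

# Compare with inverting X'X directly
direct_inverse = np.linalg.inv(X.T @ X)
formula_matches = np.allclose(block_inverse, direct_inverse)
print(formula_matches)
```

The comparison uses `np.allclose` rather than exact equality because the two routes accumulate floating point error differently.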

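The formula for the regression coefficients is

$$\hat{\beta} = (X^T X)^{-1} X^T y$$

which, when X is partitioned into A and B, equals

$$\begin{bmatrix} \hat{\beta}_A \\ \hat{\beta}_B \end{bmatrix} = \begin{bmatrix} (A^T P_B^\perp A)^{-1} A^T P_B^\perp y \\ (B^T P_A^\perp B)^{-1} B^T P_A^\perp y \end{bmatrix}$$

with $P_A^\perp$ and $P_B^\perp$ the perpendicular projection matrices for A and B described above.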
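To see that the partitioned form gives the same numbers as an ordinary least squares fit, here is a minimal sketch. It assumes NumPy, simulated data, and made-up true coefficients; `perp_projector` is a hypothetical helper name. B is deliberately generated to be correlated with A, so neither special case discussed below applies.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
A = rng.normal(size=(n, 2))                    # measured covariates (simulated)
B = 0.5 * A[:, [0]] + rng.normal(size=(n, 1))  # covariate correlated with A
y = A @ np.array([1.0, -2.0]) + 3.0 * B[:, 0] + rng.normal(size=n)

def perp_projector(M):
    """Return I - M (M'M)^{-1} M' (hypothetical helper for this sketch)."""
    return np.eye(M.shape[0]) - M @ np.linalg.solve(M.T @ M, M.T)

# Ordinary least squares on the full matrix X = [A B]
X = np.hstack([A, B])
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# The same coefficients, one block at a time
P_A = perp_projector(A)
P_B = perp_projector(B)
beta_A = np.linalg.solve(A.T @ P_B @ A, A.T @ P_B @ y)
beta_B = np.linalg.solve(B.T @ P_A @ B, B.T @ P_A @ y)

beta_partitioned = np.concatenate([beta_A, beta_B])
coefficients_agree = np.allclose(beta_full, beta_partitioned)
print(coefficients_agree)
```

Each block amounts to regressing y on one set of columns after the other set has been projected out.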

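There are two special cases to consider. If the unmeasured covariate is balanced across levels of A, then

$$A^T B = 0$$

and if the unmeasured covariate is uncorrelated with the response y, then

$$B^T y = 0.$$

If both of these conditions are met, then the regression coefficients for the partitioned case would be

$$\begin{bmatrix} \hat{\beta}_A \\ \hat{\beta}_B \end{bmatrix} = \begin{bmatrix} (A^T A)^{-1} A^T y \\ 0 \end{bmatrix}$$
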
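which is equivalent to using only the information in A. If only the first condition is met, then $A^T B = 0$ makes every cross term between A and B vanish, and the regression coefficients are

$$\begin{bmatrix} \hat{\beta}_A \\ \hat{\beta}_B \end{bmatrix} = \begin{bmatrix} (A^T A)^{-1} A^T y \\ (B^T B)^{-1} B^T y \end{bmatrix}$$

so the coefficients for A are still the ones you would get from using A alone.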
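The first special case can be illustrated numerically. This is a sketch under stated assumptions: it uses NumPy and simulated data, and it reads "balanced across levels of A" as exact orthogonality between the columns of A and B, which is forced here by removing from a random vector its projection onto A.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
A = rng.normal(size=(n, 2))  # measured covariates (simulated)
z = rng.normal(size=(n, 1))

# Subtract the projection of z onto the columns of A so that A'B = 0
B = z - A @ np.linalg.solve(A.T @ A, A.T @ z)
y = A @ np.array([1.0, -2.0]) + 3.0 * B[:, 0] + rng.normal(size=n)

# Fit with A alone, then with both A and B
beta_A_only, *_ = np.linalg.lstsq(A, y, rcond=None)
beta_full, *_ = np.linalg.lstsq(np.hstack([A, B]), y, rcond=None)

# Coefficient from regressing y on B alone
beta_B_alone = np.linalg.solve(B.T @ B, B.T @ y)

a_unchanged = np.allclose(beta_full[:2], beta_A_only)
b_matches = np.allclose(beta_full[2:], beta_B_alone)
print(a_unchanged, b_matches)
```

Leaving B out does not change the estimated coefficients for A in this setting, although the residual variation in y would still be larger without it.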

A test for the effectiveness of the statistical adjustment could be made if B were known in a random subset of the data. This could occur in a situation where B is not truly unknown, but rather is very expensive to measure. There would not be sufficient budget to measure B for all cases, but it could be done for a randomly selected set of cases. I will detail those results in a future webpage.