P.Mean: What is the effect of an unmeasured covariate? (created 2009-06-09).

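Suppose you want to conduct an analysis of covariance, but you have data on some but not all of the covariates. What do you miss out on because of the unmeasured covariate? To understand this, we need to venture into the world of partitioned matrices. If you have a symmetric matrix of the form

$$\begin{bmatrix} A'A & A'B \\ B'A & B'B \end{bmatrix}$$

then the inverse of this matrix is

$$\begin{bmatrix} (A'P_B^\perp A)^{-1} & -(A'P_B^\perp A)^{-1} A'B (B'B)^{-1} \\ -(B'P_A^\perp B)^{-1} B'A (A'A)^{-1} & (B'P_A^\perp B)^{-1} \end{bmatrix}$$

where

$$P_A^\perp = I - A(A'A)^{-1}A' \quad \text{and} \quad P_B^\perp = I - B(B'B)^{-1}B'$$

represent the matrices which project a vector onto the space perpendicular to the column space of A and of B, respectively. This result can be found on the Wikipedia page on the block matrix pseudoinverse.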
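As a quick numerical sanity check of the partitioned inverse, here is a minimal sketch in Python with NumPy. The matrices A and B are random and purely illustrative, and the helper names `perp_projector` and `partitioned_inverse` are my own, not taken from any library. The sketch assumes A and B each have full column rank and that the combined matrix [A B] does too.

```python
import numpy as np

def perp_projector(M):
    """Return I - M (M'M)^{-1} M', which projects onto the space
    perpendicular to the column space of M (M must have full column rank)."""
    n = M.shape[0]
    return np.eye(n) - M @ np.linalg.solve(M.T @ M, M.T)

def partitioned_inverse(A, B):
    """Build the inverse of [[A'A, A'B], [B'A, B'B]] block by block."""
    P_A = perp_projector(A)
    P_B = perp_projector(B)
    S_A = np.linalg.inv(A.T @ P_B @ A)  # top-left block
    S_B = np.linalg.inv(B.T @ P_A @ B)  # bottom-right block
    top_right = -S_A @ A.T @ B @ np.linalg.inv(B.T @ B)
    bottom_left = -S_B @ B.T @ A @ np.linalg.inv(A.T @ A)
    return np.block([[S_A, top_right], [bottom_left, S_B]])

# Illustrative random matrices (an assumption, not data from the article).
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 3))
B = rng.normal(size=(50, 2))
X = np.hstack([A, B])

direct = np.linalg.inv(X.T @ X)        # invert the full cross-product matrix
blockwise = partitioned_inverse(A, B)  # assemble the inverse from the blocks
max_abs_diff = float(np.max(np.abs(direct - blockwise)))
print("largest absolute difference:", max_abs_diff)
```

The two results should agree up to floating point rounding, which suggests the block formula is being applied consistently.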

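The formula for the regression coefficients is

$$\hat\beta = (X'X)^{-1}X'y$$

which, when X is partitioned as $X = [A \;\; B]$ (A holding the measured terms and B the unmeasured covariate), equals

$$\begin{bmatrix} \hat\beta_A \\ \hat\beta_B \end{bmatrix} = \begin{bmatrix} (A'P_B^\perp A)^{-1} A'P_B^\perp y \\ (B'P_A^\perp B)^{-1} B'P_A^\perp y \end{bmatrix}$$

with $P_A^\perp$ and $P_B^\perp$ the perpendicular projection matrices described above.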
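The partitioned form of the regression coefficients can be compared against an ordinary least squares fit. The sketch below uses simulated data; the sample size, the true coefficients, and the helper name `perp_projector` are all assumptions made for illustration. Each block of coefficients is computed from the other block's perpendicular projection matrix, I - M(M'M)^{-1}M'.

```python
import numpy as np

def perp_projector(M):
    """Return I - M (M'M)^{-1} M' for a full-column-rank matrix M."""
    n = M.shape[0]
    return np.eye(n) - M @ np.linalg.solve(M.T @ M, M.T)

# Simulated data (hypothetical): intercept plus a two-group indicator in A,
# and one covariate in B that is deliberately correlated with the group.
rng = np.random.default_rng(1)
n = 200
A = np.column_stack([np.ones(n), rng.integers(0, 2, size=n)])
B = rng.normal(size=(n, 1)) + 0.5 * A[:, [1]]
y = A @ np.array([1.0, 2.0]) + 1.5 * B[:, 0] + rng.normal(size=n)

# Ordinary least squares on the full design matrix X = [A B].
X = np.hstack([A, B])
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# Partitioned form: each block uses the projector of the other block.
P_A = perp_projector(A)
P_B = perp_projector(B)
beta_A = np.linalg.solve(A.T @ P_B @ A, A.T @ P_B @ y)
beta_B = np.linalg.solve(B.T @ P_A @ B, B.T @ P_A @ y)
beta_partitioned = np.concatenate([beta_A, beta_B])

print(beta_full)
print(beta_partitioned)
```

Both printed vectors should match up to rounding error.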


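There are two special cases to consider. If the unmeasured covariate is balanced across levels of A, then

$$A'B = 0$$

and if the unmeasured covariate (with centered columns) is uncorrelated with the response y, then

$$B'y = 0$$

If both of these conditions are met, then the regression coefficients for the partitioned case would be

$$\begin{bmatrix} (A'A)^{-1}A'y \\ 0 \end{bmatrix}$$

which is equivalent to using only the information in A. If only the first condition is met, then the regression coefficients are

$$\begin{bmatrix} (A'A)^{-1}A'y \\ (B'B)^{-1}B'y \end{bmatrix}$$

so the coefficients for A are still the same as those obtained from A alone.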
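The two special cases can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated, and balance is forced exactly by removing from a random covariate its projection onto A (so that A'B = 0). A second covariate is also made orthogonal to y (so that B'y = 0). The helpers `residualize` and `fit` are hypothetical names introduced here.

```python
import numpy as np

def residualize(v, M):
    """Subtract from v its least squares projection onto the columns of M."""
    coef, *_ = np.linalg.lstsq(M, v, rcond=None)
    return v - M @ coef

def fit(X, y):
    """Ordinary least squares coefficients for y regressed on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated data (hypothetical): intercept plus a two-group indicator.
rng = np.random.default_rng(2)
n = 100
A = np.column_stack([np.ones(n), rng.integers(0, 2, size=n)])

# Case 1: B is balanced across A (A'B = 0) but still related to y.
B_bal = residualize(rng.normal(size=(n, 1)), A)
y = A @ np.array([1.0, 2.0]) + 1.5 * B_bal[:, 0] + rng.normal(size=n)

# Case 2: B is balanced across A and also orthogonal to y (B'y = 0).
B_both = residualize(rng.normal(size=(n, 1)), np.column_stack([A, y]))

beta_A_only = fit(A, y)                     # ignores B entirely
beta_bal = fit(np.hstack([A, B_bal]), y)    # A'B = 0 only
beta_both = fit(np.hstack([A, B_both]), y)  # A'B = 0 and B'y = 0

print(beta_A_only)
print(beta_bal)
print(beta_both)
```

In both cases the first two coefficients should match the fit that uses A alone. The coefficient for B should be essentially zero only when B is also orthogonal to y.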

A test for the effectiveness of the statistical adjustment could be made if B were known in a random subset of the data. This could occur in a situation where B is not truly unknown, but rather is very expensive to measure. There would not be sufficient budget to measure B for all cases, but it could be done for a randomly selected set of cases. I will detail those results in a future webpage.

This work was supported in part by a grant from the American Nurses Association. The content is licensed under a Creative Commons Attribution 3.0 United States License. This page was written by Steve Simon and was last modified on 2010-04-12.