P.Mean: Comparing a set of microarray experiments to a model experiment  (created 2008-11-01).

I have a matrix of effect sizes from numerous microarray experiments.  For example, in one matrix I have 200 genes (rows) and 107 experiments (columns).  In addition, I also have a sort of “model experiment” which contains the values in which I am most interested. I am trying to determine which genes are not statistically different from their “model experiment” value.  An example may make this clearer:

      Model  test1  test2  test3
Gene1     4      4      5      3
Gene2    10     14     -2      0
Gene3   -12    -13    -14    -15
Gene4     1      0      0      7

In the example above, it would appear that the average values from the “test” experiments for genes 1 and 3 are “not significantly different” from the “Model” experiment.  However, I’m not sure how to test this systematically across all genes.  I have tried to do a Z-test to test the hypothesis that the mean from the “test” experiments is not different from the “model” value, then only take those genes with the highest p-values, but this gives only a very small number of genes, and just doesn’t feel right in my gut. Is there a better way?  Any help you can offer would be wonderful!

I'm not an expert on microarray experiments, but the one thing I worry about is the lack of replication in this experiment. I realize these microarrays are expensive, but have you considered the possibility of replicating some of the conditions (particularly replication of the model condition)? If the model experiment had a few anomalous results, you could end up with a seriously incorrect interpretation.

Also, it's impossible to work through the analysis of a complex microarray experiment in an email exchange. I'm just hoping that you get some ideas to help you get started on your own.

I'll presume that test1, test2, etc. all represent a replication of sorts or that the individual tests have been normalized somehow.

A good general approach to a microarray experiment is to calculate a simple statistic for each row of data, and also calculate a p-value associated with that statistic. Then adjust the p-values for multiple comparisons.

In your application, a simple statistic would subtract the model column from each of the remaining columns. Then test whether the 106 differences for a gene represent a sample from a population with a mean of zero. There is a simple one-sample t-test for this. Get the p-value and multiply it by 200, the number of genes, capping the result at 1. This is the Bonferroni adjustment. Any gene whose adjusted p-value is still smaller than 0.10 is evaluated further.
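
Here is a minimal sketch of that recipe in Python, run on the four-gene table from the question (so the multiplier is 4 rather than 200). The function names are made up for illustration, and the t-distribution p-value is computed by simple numerical integration only to keep the example free of dependencies. If SciPy is available, scipy.stats.ttest_1samp should give the same p-values with less code.

```python
import math
import statistics

def t_two_sided_p(t, df):
    """Two-sided p-value for a t statistic, via Simpson's rule on the t density."""
    a = abs(t)
    if a == 0.0:
        return 1.0
    if math.isinf(a):
        return 0.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    steps = 2 * max(500, math.ceil(50 * a))  # even count, step width <= 0.01
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central_area = total * h / 3  # P(0 < T < |t|)
    return min(1.0, max(0.0, 1.0 - 2.0 * central_area))

def bonferroni_t_tests(model, tests, threshold=0.10):
    """Per gene: one-sample t-test of (test - model) against zero, then Bonferroni."""
    n_genes = len(model)
    results = []
    for m, row in zip(model, tests):
        diffs = [x - m for x in row]
        mean = statistics.fmean(diffs)
        se = statistics.stdev(diffs) / math.sqrt(len(diffs))
        if se == 0.0:  # all differences identical
            t = 0.0 if mean == 0.0 else math.copysign(math.inf, mean)
        else:
            t = mean / se
        p = t_two_sided_p(t, len(diffs) - 1)
        adjusted = min(1.0, p * n_genes)  # Bonferroni: multiply by the number of genes
        results.append((t, p, adjusted, adjusted < threshold))
    return results

# The four-gene table from the question: model value, then test1..test3.
model = [4, 10, -12, 1]
tests = [[4, 5, 3], [14, -2, 0], [-13, -14, -15], [0, 0, 7]]
for gene, (t, p, adj, flagged) in enumerate(bonferroni_t_tests(model, tests), start=1):
    print(f"Gene{gene}: t={t:.2f}  p={p:.3f}  adjusted={adj:.3f}  flagged={flagged}")
```

With only three test columns per gene there is very little power, so nothing in the toy table survives the adjustment. Gene 3 comes closest, with an unadjusted p-value of about 0.07, because its three test values all sit slightly below the model value with very little spread.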

Note that I didn't use the traditional alpha level here. In a microarray experiment where any significant finding is followed up with some confirmatory assays, adhering to a very strict alpha level may not make sense.

You might consider using an approach like the false discovery rate rather than Bonferroni. I give a simple example of this at my old website.
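
As a rough illustration of how a false discovery rate adjustment differs from Bonferroni, here is a sketch of the Benjamini-Hochberg step-up adjustment. Benjamini-Hochberg is my choice of FDR method for this sketch, the function name is hypothetical, and the five p-values are made up purely for illustration.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up p-values, not taken from any real experiment.
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
bh = benjamini_hochberg(p_values)
bonferroni = [min(1.0, p * len(p_values)) for p in p_values]
for p, b, f in zip(p_values, bonferroni, bh):
    print(f"p={p:.3f}  Bonferroni={b:.3f}  BH={f:.5f}")
```

At a cutoff of 0.10, Bonferroni keeps the first two of these made-up p-values and Benjamini-Hochberg keeps the first four, which is the usual pattern: controlling the false discovery rate is less strict than controlling the chance of any false positive at all.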

Randomization tests are attractive for microarray experiments as well. Suppose that you ran the model experiment and 106 other test experiments, but you forgot to label the 107 experiments. To hide this from your boss, you just randomly assigned labels to the 107 experiments. If you calculated any test statistic from this data, it would clearly be noise.

Let's suppose that your lab partner found the unlabeled experiments and randomly assigned the 107 labels in a different order. He or she would also get a result that was clearly noise. The interesting thing is that you now have two different estimates of the noise in your experiment. You could start to characterize variation in the noise. Do this 107 times, each time letting a different experiment play the role of the model experiment. One of those 107 labelings is the true one; the other 106 are bogus.

Draw a histogram of the statistic from the 106 bogus labelings and locate the value from the true labeling on that histogram. If the true value is in the middle of the histogram, this is a gene where there is no evidence that the expression level differs between the model experiment and the remaining 106 experiments. If the true value is at or near the extremes of the histogram, then this represents a gene where the expression levels of the 106 tests differ from the model.
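
The relabeling idea for a single gene might be sketched like this. The statistic used here (mean of the other experiments minus the stand-in model value) and the function name are my assumptions; any statistic computed the same way for every labeling would work. Instead of drawing the histogram, the code reports the share of labelings at least as extreme as the true one.

```python
import statistics

def relabeling_p_value(values, model_index=0):
    """values: every experiment for one gene, with the model at model_index.

    Each experiment takes a turn as the stand-in model. Returns the statistic
    for every labeling and the share of labelings (true one included) whose
    statistic is at least as extreme as the true labeling's.
    """
    def statistic(j):
        others = [v for k, v in enumerate(values) if k != j]
        return statistics.fmean(others) - values[j]

    stats = [statistic(j) for j in range(len(values))]
    observed = abs(stats[model_index])
    # Small tolerance so exact ties are not lost to floating-point rounding.
    extreme = sum(1 for s in stats if abs(s) >= observed - 1e-12)
    return stats, extreme / len(stats)

# Rows from the table in the question: model first, then test1..test3.
genes = {
    "Gene1": [4, 4, 5, 3],
    "Gene2": [10, 14, -2, 0],
    "Gene3": [-12, -13, -14, -15],
    "Gene4": [1, 0, 0, 7],
}
for name, values in genes.items():
    stats, p = relabeling_p_value(values)
    print(name, [round(s, 2) for s in stats], "p =", p)
```

With four experiments the smallest p-value this can produce is 1/4, so the toy table cannot show anything striking. With 107 experiments the floor is 1/107, a little under 0.01.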

Randomization allows for a very nice global test. Calculate 200 p-values from the true experiment and pick the smallest p-value. If that p-value were larger than 0.10, you would be done, of course, because this is clearly a negative experiment. A value smaller than 0.10 may be real or it may be due to the fact that you are running a bunch of tests simultaneously. Find out what the smallest of 200 p-values would be for each of the bogus labelings, and draw a histogram. Where does the true minimum of 200 p-values lie in this histogram?
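
A sketch of this global test follows, with one simplification that is my own: every gene's t-test has the same degrees of freedom, so the smallest of the p-values always belongs to the gene with the largest absolute t statistic. The code therefore tracks the maximum |t| for each labeling instead of the minimum p-value. The function names are hypothetical.

```python
import math
import statistics

def max_abs_t(matrix, model_col):
    """Largest |t| over genes when column model_col plays the model."""
    best = 0.0
    for row in matrix:
        diffs = [v - row[model_col] for k, v in enumerate(row) if k != model_col]
        mean = statistics.fmean(diffs)
        se = statistics.stdev(diffs) / math.sqrt(len(diffs))
        if se == 0.0:  # all differences identical
            t = 0.0 if mean == 0.0 else math.inf
        else:
            t = abs(mean / se)
        best = max(best, t)
    return best

def global_relabeling_test(matrix, model_col=0):
    """Compare the true labeling's max |t| with that of every relabeling."""
    stats = [max_abs_t(matrix, j) for j in range(len(matrix[0]))]
    observed = stats[model_col]
    # Share of labelings (true one included) at least as extreme as the truth.
    p_global = sum(1 for s in stats if s >= observed) / len(stats)
    return stats, p_global

# The table from the question: columns are Model, test1, test2, test3.
matrix = [
    [4, 4, 5, 3],
    [10, 14, -2, 0],
    [-12, -13, -14, -15],
    [1, 0, 0, 7],
]
stats, p_global = global_relabeling_test(matrix)
print([round(s, 2) for s in stats], "global p =", p_global)
```

In the toy table, three of the four labelings produce a maximum |t| at least as large as the true labeling does, so the global test gives no hint that the model column is special.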

Now there's some stuff I don't understand about this microarray experiment, and one question I would ask is whether all the test experiments have to deviate in the same direction from the model. Suppose you had a gene where the model result was 50 and the test experiments gave half of the results around 10 and the other half of the results around 90. On average, this gene is behaving fine, but I suspect that you'd want this gene to be highlighted by any statistical procedure. There are lots of other issues like this that could only be identified by the give and take of a face-to-face consultation.
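
To make the 10-versus-90 example concrete, here is a tiny sketch with made-up numbers. The signed mean difference, which is what the t-test looks at, comes out at zero, while the mean absolute difference from the model does not. Using the absolute difference is only one possible alternative statistic and is my suggestion for illustration, not something taken from the original question.

```python
import statistics

def signed_and_absolute_shift(model, tests):
    """Return (mean difference, mean absolute difference) from the model value."""
    diffs = [x - model for x in tests]
    mean_diff = statistics.fmean(diffs)
    mean_abs_diff = statistics.fmean([abs(d) for d in diffs])
    return mean_diff, mean_abs_diff

# Made-up gene: model at 50, half the tests near 10 and half near 90.
model = 50
tests = [10, 12, 8, 11, 9, 90, 88, 92, 89, 91]
mean_diff, mean_abs_diff = signed_and_absolute_shift(model, tests)
print("mean difference:", mean_diff)               # 0.0
print("mean absolute difference:", mean_abs_diff)  # 40.0
```

A statistic like the mean absolute difference can be dropped into the relabeling approach in place of the signed mean difference, since randomization does not care which statistic is used.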

Of course, you're getting this advice for free, so even if it is lousy advice, it's still a bargain!

This work is licensed under a Creative Commons Attribution 3.0 United States License. This page was written by Steve Simon and was last modified on 2010-04-01.