StATS: S+ArrayAnalyzer web seminar (June 22, 2004).

Michael O'Connell and Richard Park gave a nice web seminar on S+ArrayAnalyzer, a software program for the analysis of microarray data that is marketed by Insightful Corporation. This company makes a lot of very nice software.

The S+ArrayAnalyzer software is built on the open source Bioconductor project. It remains faithful to the Bioconductor implementation of expression sets, so code written for Bioconductor will work in S+ArrayAnalyzer. S+ArrayAnalyzer adds extra slots, consistent accessor methods, and a graphical user interface. It also offers Affymetrix API support and an SPXML library for graphics.

You can run S+ArrayAnalyzer algorithms within the Spotfire DecisionSite application. Details are available at Spotfire S-PLUS Server Solution [pdf].

The speakers described two experiments. The first experiment looked at granulocyte differentiation in a series of mice, with measurements at days 0, 1, 2, ..., 6 and four mice evaluated on each day. The goal was to identify genes that are differentially expressed while minimizing the number of false positives.

The second experiment looked at young versus old animals at 0, 0.5, 1, 2, and 4 hours after surgically induced injury. There were three animals of each age at each time point. The goal was to see the effect of age on recovery.

S+ArrayAnalyzer can read the CEL and CHP formats, as well as the AADM links, used by Affymetrix chips. It can also read a variety of formats for two-color spotted arrays.

Initial exploratory methods include MvA plots (Bland-Altman plots), box plots, image plots of spatial expression, and RNA degradation plots. I had not heard about the RNA degradation plot before. This plot aligns all the Affymetrix probes from the 5' end of the gene to the 3' end. Since RNA degradation starts at the 5' end, any degradation would appear as a trend in the plot, with lower expression values on the 5' end. A brief description of this plot appears on page 17 of the pdf handout, Introduction to Affymetrix GeneChip Data Analysis, by Han-Ming Wu. In Bioconductor's affy package, the AffyRNAdeg function computes the summaries for this graph and plotAffyRNAdeg draws it.
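The arithmetic behind two of these plots is simple enough to sketch. The Python below is only an illustration, not the S+ArrayAnalyzer or Bioconductor code: it computes the M (log ratio) and A (average log intensity) values for an MvA plot, and a simplified RNA degradation profile, namely the mean log intensity at each probe position with a least-squares slope. The function names and the intensities are hypothetical, and the real AffyRNAdeg function does additional scaling that is left out here.

```python
import math
from statistics import mean

def mva_values(x, y):
    """Return (M, A) pairs for two arrays of positive intensities.

    M is the log2 ratio and A is the average log2 intensity.
    """
    pairs = []
    for xi, yi in zip(x, y):
        lx, ly = math.log2(xi), math.log2(yi)
        pairs.append((lx - ly, (lx + ly) / 2))
    return pairs

def degradation_profile(probe_sets):
    """Mean log2 intensity at each probe position, ordered 5' to 3'.

    Assumes each inner list holds one probe set's intensities,
    already sorted from the 5' end to the 3' end.
    """
    n_positions = min(len(p) for p in probe_sets)
    return [mean(math.log2(p[k]) for p in probe_sets)
            for k in range(n_positions)]

def slope(values):
    """Least-squares slope of values against positions 0, 1, 2, ..."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((k - x_bar) * (v - y_bar) for k, v in enumerate(values))
    den = sum((k - x_bar) ** 2 for k in range(n))
    return num / den

# Hypothetical intensities: three probe sets, four probes each.
probe_sets = [
    [64, 128, 256, 512],
    [32, 64, 128, 256],
    [128, 256, 512, 1024],
]
profile = degradation_profile(probe_sets)  # rises toward the 3' end
trend = slope(profile)                     # positive slope: 5' end is lower
ma = mva_values([256, 512], [128, 512])    # two genes on two chips
```

A clearly positive slope is the pattern described above: lower values at the 5' end, which is what degradation would produce.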

Affymetrix chips have a set of mismatch probes that attempt to adjust for background and cross-hybridization. There are several ways to incorporate the mismatch probes. The approach used by Affymetrix is called MAS 5 and is described at

Alternative approaches for handling the mismatch probes appear in the following references:

Differential expression is tricky because of the large number of genes tested. To minimize the number of false positives, you need to use an approach that controls the family-wise error rate. The best known approach is the Bonferroni correction, but this is very conservative. Alternatives to Bonferroni include
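To make the conservatism concrete, here is a minimal Python sketch that compares Bonferroni-adjusted p-values with those from Holm's step-down procedure. Holm is used only as one well-known alternative that also controls the family-wise error rate. It is my choice for illustration and not necessarily a method covered in the seminar. The p-values are hypothetical.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def holm(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, ...
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity so adjusted values never decrease.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

# Hypothetical raw p-values for four genes.
p = [0.001, 0.01, 0.03, 0.04]
bonf = bonferroni(p)
holm_adj = holm(p)
```

Holm's adjusted values are never larger than Bonferroni's, so it rejects at least as many hypotheses at the same error rate.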

Alternatively, you can consider an approach that controls the false discovery rate. Some references for this approach are:
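As a rough illustration of false discovery rate control, the Python sketch below computes Benjamini-Hochberg adjusted p-values. Benjamini-Hochberg is the most widely used such procedure, but this page does not say which method the seminar used, so treat the choice and the hypothetical p-values as assumptions.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest.
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        candidate = p_values[i] * m / (rank + 1)
        # Enforce monotonicity so adjusted values never increase
        # as the raw p-values get smaller.
        running_min = min(running_min, candidate)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four genes.
p = [0.001, 0.01, 0.03, 0.04]
q = benjamini_hochberg(p)
```

Calling a gene significant when its adjusted value is below 0.05 aims to keep the expected proportion of false positives among the selected genes at or below 5%, which is a weaker but often more useful guarantee than family-wise control.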

Cluster analysis will sort the genes into groups that behave similarly.

A heat map will allow you to see how similarly the genes within each cluster behave.
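The idea behind these two steps can be sketched in a few lines of Python. This is a toy average-linkage clustering that uses one minus the Pearson correlation as the distance. The seminar's actual clustering method is not described on this page, so the algorithm, the gene names, and the expression values are all assumptions made for illustration.

```python
from statistics import mean

def correlation_distance(a, b):
    """One minus the Pearson correlation of two expression profiles.

    Assumes neither profile is constant (the variance must be nonzero).
    """
    a_bar, b_bar = mean(a), mean(b)
    cov = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    var_a = sum((x - a_bar) ** 2 for x in a)
    var_b = sum((y - b_bar) ** 2 for y in b)
    return 1.0 - cov / (var_a * var_b) ** 0.5

def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]

    def cluster_distance(c1, c2):
        # Average distance over all pairs of genes, one from each cluster.
        return mean(correlation_distance(profiles[g], profiles[h])
                    for g in c1 for h in c2)

    while len(clusters) > n_clusters:
        pairs = [(cluster_distance(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical log expression profiles across five time points.
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_b": [2.0, 4.1, 6.0, 8.2, 10.0],
    "gene_c": [5.0, 4.0, 3.1, 2.0, 1.0],
    "gene_d": [9.0, 7.0, 5.2, 3.0, 1.1],
}
clusters = average_linkage(profiles, n_clusters=2)

# Row order for a heat map: genes in the same cluster sit next to each other.
row_order = [gene for cluster in clusters for gene in cluster]
```

Drawing the heat map with rows in `row_order` places the rising genes together and the falling genes together, so each cluster shows up as a block of similar color patterns.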

The final step is annotation, which tries to place the genes in context and link to freely available web resources like

For further details look at the handout for this web seminar [pdf].

This page was written by Steve Simon while working at Children's Mercy Hospital. Although I do not hold the copyright for this material, I am reproducing it here as a service, as it is no longer available on the Children's Mercy Hospital website.