Steve Simon


[StATS]: R-squared (created 1999-08-18)

Dear Professor Mean

You’re performing quadratic regressions on a calculator? I hope the poor thing doesn’t overheat.

You know more than you think you do. R squared

Short explanation

In brief

More details

Consider a data set where we are trying to predict the lymphocyte count (per cubic mm) as a quadratic function of reticulocytes.

R squared is computed by looking at two sources of variation, SStotal and SSerror.

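Think of SStotal as the error in prediction if you did not use any information about reticulocytes. In that case, your best prediction for every observation would be the mean lymphocyte count, and SStotal is the sum of the squared deviations from that mean.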


Here is the SPSS output of a sample exercise from page 461 of Rosner (1992). We use a quadratic function of reticulocytes to predict lymphocytes.


Figure 3.1. SPSS output for the quadratic regression.

The output shows a value of R squared of 0.39. How is this number computed?

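The formula for R squared is

$$R^2 = 1 - \frac{SS_{\text{error}}}{SS_{\text{total}}}$$

Another formula for R squared is

$$R^2 = \frac{SS_{\text{regression}}}{SS_{\text{total}}}$$

where SSregression is the difference between SStotal and SSerror.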
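To see how SStotal, SSerror, and SSregression fit together, here is a minimal Python sketch. It fits a quadratic by least squares and computes R squared both ways. The data values are hypothetical, made up purely for illustration (they are not the Rosner data), and the helper names `fit_quadratic` and `sums_of_squares` are my own, not anything from SPSS.

```python
def fit_quadratic(x, y):
    """Return least-squares coefficients [b0, b1, b2] for y = b0 + b1*x + b2*x**2."""
    rows = [[1.0, xi, xi * xi] for xi in x]
    # Augmented normal equations (X'X | X'y), stored as a 3 x 4 matrix.
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def sums_of_squares(x, y):
    """Return (SStotal, SSerror) for a quadratic least-squares fit of y on x."""
    b0, b1, b2 = fit_quadratic(x, y)
    fitted = [b0 + b1 * xi + b2 * xi * xi for xi in x]
    mean_y = sum(y) / len(y)
    # SStotal: prediction error when every observation is predicted by the mean.
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    # SSerror: prediction error when using the fitted quadratic.
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return ss_total, ss_error


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [12, 9, 15, 11, 20, 16, 27, 22, 35, 30]

ss_total, ss_error = sums_of_squares(x, y)
ss_regression = ss_total - ss_error

r2_from_error = 1 - ss_error / ss_total
r2_from_regression = ss_regression / ss_total

print(f"SStotal = {ss_total:.1f}, SSerror = {ss_error:.1f}")
print(f"R squared = {r2_from_error:.3f} (or equivalently {r2_from_regression:.3f})")
```

The two formulas always agree, because SSregression is defined as SStotal minus SSerror.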

We know that SSerror is 2,207,364.8. In SPSS



A value of 0.39 is a low

What is adjusted R squared?

Notice that SPSS also produces a statistic called adjusted R squared. This statistic adjusts for the degrees of freedom in the model


and here is a double check of the results.


We might prefer to use the adjusted R squared if we are comparing our quadratic model to other models of varying complexity
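As a rough illustration of the degrees of freedom adjustment, here is a small Python sketch. It assumes the common textbook definition, in which each sum of squares is divided by its degrees of freedom before taking the ratio: adjusted R squared = 1 - (SSerror / (n - p - 1)) / (SStotal / (n - 1)), where n is the number of observations and p is the number of predictors. The sums of squares and the sample size below are hypothetical, chosen only so that the unadjusted R squared comes out to 0.39. They are not taken from the SPSS output.

```python
def adjusted_r_squared(ss_total, ss_error, n, p):
    """Adjusted R squared for n observations and p predictors (intercept not counted)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than model parameters")
    ms_error = ss_error / (n - p - 1)  # error mean square
    ms_total = ss_total / (n - 1)      # total mean square
    return 1 - ms_error / ms_total


# Hypothetical values, for illustration only.
ss_total, ss_error = 100.0, 61.0
n = 30
p = 2  # a quadratic model has two predictors: the linear and the squared term

r2 = 1 - ss_error / ss_total
adj_r2 = adjusted_r_squared(ss_total, ss_error, n, p)

print(f"R squared = {r2:.3f}, adjusted R squared = {adj_r2:.3f}")
```

Whenever the model has at least one predictor and does not fit perfectly, the adjusted value is smaller than the unadjusted one. The gap widens as p grows relative to n, which is why the adjusted version is useful when comparing models of differing complexity.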

More complicated models

Some regression and ANOVA models incorporate a random factor. These models do not have an obvious way to compute R squared.

With a random factor

The R squared for within variation is a measure of how much the model helps when trying to predict a new observation on one of the subjects already in your study. The R squared for total variation is a measure of how much the model helps when trying to predict a new observation on a new subject.

In theory

I have a paper somewhere in my bibliography that talks about this


R squared measures the relative prediction power of your model. It compares the variability of the residuals in your model (SSerror) to the variability of the dependent measure (SStotal). If the variability of the residuals is small relative to the variability of the dependent measure, then your model has good predictive power.

Further reading

Any good textbook on regression should have lots of details on R squared. Draper and Smith (1998) discuss R squared in chapters 2 and 4. Most introductory level books will also discuss R squared. Rosner (1992) talks about R squared (though not in the context of quadratic regression) in chapter 11.

  1. **Applied Regression Analysis.** Draper and Smith (1998).
  2. **Fundamentals of Biostatistics.** Rosner (1992).

This page was written by Steve Simon while working at Children’s Mercy Hospital. Although I do not hold the copyright for this material, I am reproducing it here as a service, as it is no longer available on the Children’s Mercy Hospital website. Need more information? I have a page with general help resources. You can also browse for pages similar to this one at [Category: Linear regression](../category/LinearRegression.html).