StATS: What is a conditional probability?

Conditional probability represents the chance that one event will occur given that a second event has already occurred. Conditional probability allows you to examine how different treatments or exposures influence the probability of events like disease or mortality. It also provides a useful way to evaluate diagnostic tests.

You can best understand conditional probability in the context of a 2 by 2 table.

Definition of a 2 by 2 table.

Many of the variables that we encounter in medical research are binary. That is, they have two possible values. Examples of binary variables are alive/dead, diseased/healthy, male/female, treated/control.

A 2 by 2 table is a listing of all the possible combinations for a pair of binary variables. The data are laid out in a grid, and the numbers in the grid represent the number of occurrences of each combination of the two variables.

In the layout below, we list the variable X in the columns. We arbitrarily label the two columns X+ and X-. We list the variable Y in the rows with arbitrary labels Y+ and Y-.

            X+     X-     Total
    Y+      a      b      e
    Y-      c      d      f
    Total   g      h      n

The values a through h and n represent the counts for various combinations of X and Y. For example, "a" represents the number of times that X+ and Y+ occur together, and "e" represents the number of times that Y+ occurs (regardless of the value of X). The total number of observations is n.
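As a minimal sketch of how the totals relate to the interior counts, the snippet below builds e, f, g, h, and n from a, b, c, and d. The cell labels follow the formulas used later on this page (for instance P[X+|Y+]=a/e), and the four counts are made-up numbers chosen only for illustration.

```python
# Interior counts of a 2 by 2 table, using this page's cell labels.
# The four numbers are hypothetical and chosen only for illustration.
a, b = 12, 8    # row Y+: counts for X+ and X-
c, d = 5, 15    # row Y-: counts for X+ and X-

e = a + b       # row total: number of times Y+ occurs
f = c + d       # row total: number of times Y- occurs
g = a + c       # column total: number of times X+ occurs
h = b + d       # column total: number of times X- occurs
n = a + b + c + d   # grand total

# The row totals and the column totals each add up to n.
assert e + f == n
assert g + h == n

# Print the table in the same layout as the grid above.
print(f"{'':<8}{'X+':>6}{'X-':>6}{'Total':>8}")
print(f"{'Y+':<8}{a:>6}{b:>6}{e:>8}")
print(f"{'Y-':<8}{c:>6}{d:>6}{f:>8}")
print(f"{'Total':<8}{g:>6}{h:>6}{n:>8}")
```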

Example

In a study of breast feeding in pre-term infants, the infants were randomized into two groups, an NG tube feeding group (treatment) and a bottle fed group (control). The researchers wished to see if using the NG tube for feeding rather than using a bottle would increase the likelihood of breastfeeding at discharge and afterwards.

[Table: number of infants by feeding group (rows) and feeding status at discharge (columns)]

In the table above, the rows represent the feeding group (NG Tube versus Bottle) and the columns represent feeding status at discharge (Exclusive breast feeding versus Partial/No breast feeding). We see, for example, that 20 infants in the control (Bottle) group were exclusively breast feeding at discharge.

Cell probabilities.

If we divide each entry in the table by n, the total sample size, we get cell probabilities.

            X+     X-     Total
    Y+      a/n    b/n    e/n
    Y-      c/n    d/n    f/n
    Total   g/n    h/n    1

The probabilities in the total row/column represent unconditional probabilities. The interior probabilities represent the probabilities of the intersection between two events.

Example

Here is an example of cell probabilities, using the data presented above.

[Table: cell probabilities for feeding group by feeding status at discharge]

Notice that the (unconditional) probability that an infant was assigned to the Bottle group was 52.8%. The probability that an infant was assigned to the Bottle group and was an exclusive breast feeder at discharge was 22.5%.

Because each probability is rounded separately, the entries in a table like this may not add up exactly to the row and column totals.
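The division by n can be sketched in a few lines of Python. The four counts below are an assumption: they are back-calculated from figures quoted on this page (20 exclusively breast feeding controls, a 22.5% cell probability, and the 76.2%, 42.6%, and 58.4% probabilities given in the conditional probability example), not read from the original table, so treat them as illustrative.

```python
# Counts by (feeding group, feeding status at discharge).
# ASSUMPTION: these counts are reconstructed from percentages quoted in
# the text; the original table is not available here.
counts = {
    ("NG Tube", "Exclusive"): 32,
    ("NG Tube", "Partial/No"): 10,
    ("Bottle", "Exclusive"): 20,
    ("Bottle", "Partial/No"): 27,
}

# Total sample size.
n = sum(counts.values())

# Cell probabilities: each count divided by n.
cell_prob = {key: count / n for key, count in counts.items()}

# Unconditional probabilities: sum the cell probabilities across each
# row (feeding group) and each column (feeding status).
group_prob = {}
status_prob = {}
for (group, status), p in cell_prob.items():
    group_prob[group] = group_prob.get(group, 0.0) + p
    status_prob[status] = status_prob.get(status, 0.0) + p

for (group, status), p in cell_prob.items():
    print(f"P[{group} and {status}] = {p:.1%}")
for group, p in group_prob.items():
    print(f"P[{group}] = {p:.1%}")
for status, p in status_prob.items():
    print(f"P[{status}] = {p:.1%}")
```

The unrounded cell probabilities always sum to 1. Only the rounded, printed values can drift from the totals.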

Computing 2 by 2 tables in SPSS.

From the menus in SPSS, select ANALYZE | DESCRIPTIVE STATISTICS | CROSSTABS. In the dialog box, insert the variable names in the ROW and COLUMN boxes.

[SPSS output: crosstabulation of feeding group by feeding status at discharge]

The table shown above is an example of the output using SPSS.

To compute cell probabilities in SPSS, click on the CELLS button in the dialog box. This calls up a new dialog box. Check the box next to TOTAL PERCENTAGES and uncheck the box next to OBSERVED.

[SPSS output: crosstabulation showing total percentages]

Listed above is an example of what the SPSS output would look like.

Conditional Probabilities.

Conditional probabilities represent the probability that an event will occur when we restrict our attention to a specific row (or sometimes a specific column) of a 2 by 2 table.

We use a vertical bar, "|", to denote conditional probability. The notation

P[U|V]

is interpreted as the probability that U will occur given that V has occurred. For example,

P[Cancer|Smoking]

is the probability of developing lung cancer when we restrict our attention to smokers only. We compute conditional probabilities by dividing by the row totals (or sometimes the column totals).

            X+     X-     Total
    Y+      a/e    b/e    1
    Y-      c/f    d/f    1
    Total   g/n    h/n    1

In the table shown above, we divide by the row totals. This answers the question: what is the probability of a certain value of X when Y is restricted to a certain value?

P[X+|Y+]=a/e.
P[X-|Y+]=b/e.
P[X+|Y-]=c/f.
P[X-|Y-]=d/f.

The last row gives unconditional probabilities. Unconditional probabilities represent probabilities without any restrictions.

P[X+]=g/n.
P[X-]=h/n.
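Dividing by a row total agrees with the usual definition of conditional probability as a joint probability divided by an unconditional probability. A short worked check, using this page's cell labels and shown for the first formula only (the other three follow the same pattern):

```latex
P[X^{+} \mid Y^{+}]
  = \frac{P[X^{+} \text{ and } Y^{+}]}{P[Y^{+}]}
  = \frac{a/n}{e/n}
  = \frac{a}{e}
```

The n cancels, which is why the conditional probabilities can be computed directly from the counts without first converting to cell probabilities.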

Example

The following table gives probabilities for breast feeding status given the feeding group that the infant was first assigned to.

[Table: probability of each feeding status at discharge, conditional on feeding group]

The probability of exclusive breast feeding when we restrict our attention to the NG Tube group is 76.2%. When we restrict our attention instead to the bottle fed group, the probability drops to 42.6%. The unconditional probability of exclusive breast feeding is 58.4%, somewhere between the two conditional probabilities.
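The row-wise division can be sketched as follows. The counts are an assumption, back-calculated from the percentages quoted on this page rather than taken from the original table, and the helper name row_conditional is hypothetical.

```python
# Counts by feeding group (rows) and feeding status at discharge (columns).
# ASSUMPTION: these counts are reconstructed from percentages quoted in
# the text; the original table is not available here.
counts = {
    "NG Tube": {"Exclusive": 32, "Partial/No": 10},
    "Bottle": {"Exclusive": 20, "Partial/No": 27},
}


def row_conditional(table):
    """Divide each count by its row total, giving P[column | row]."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {col: count / row_total for col, count in cells.items()}
    return result


cond = row_conditional(counts)

# Unconditional probability of exclusive breast feeding: column total / n.
n = sum(sum(cells.values()) for cells in counts.values())
exclusive_total = sum(cells["Exclusive"] for cells in counts.values())
p_exclusive = exclusive_total / n

for group, probs in cond.items():
    print(f"P[Exclusive | {group}] = {probs['Exclusive']:.1%}")
print(f"P[Exclusive] = {p_exclusive:.1%}")
```

The unconditional probability is a weighted average of the two conditional probabilities, with weights equal to the share of infants in each group, which is why it must fall between them.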

Using SPSS to compute conditional probabilities.

You can get SPSS to compute conditional probabilities. As before, select ANALYZE | DESCRIPTIVE STATISTICS | CROSSTABS from the menu. In the dialog box, click on the CELLS button. Check the box next to ROW PERCENTAGES to condition on the row categories. Check the box next to COLUMN PERCENTAGES to condition instead on the column categories.

[SPSS output: crosstabulation showing row percentages for the breast feeding study]

Shown above is an example of conditional probabilities using the breast feeding study data.

A second example of conditional probability.

The data listed below come from a study of infant mortality for white infants in New York City in 1974. The infants are categorized by birth weight (low birth weight is 2500 grams or less) and by whether they survived for at least one year.

[Table: number of infants by birth weight and one-year survival]

This table shows that the total sample size is 72,730 infants.

[Table: probability of one-year survival, conditional on birth weight]

Notice that the probability of survival is much higher among the normal birth weight infants than among the low birth weight infants (99.4% compared to 88.1%).

Source: Fleiss JL. Statistical Methods for Rates and Proportions (1981). New York, NY: John Wiley and Sons, Inc. Page 77.

This page was written by Steve Simon while working at Children's Mercy Hospital. Although I do not hold the copyright for this material, I am reproducing it here as a service, as it is no longer available on the Children's Mercy Hospital website.