P.Mean: Venn diagrams with proportional areas (created 2008-09-23).

Someone asked me to come up with a graphic summary of a data set that includes three binary factors, each of which can be present or absent in any combination. Typically this can be illustrated with a Venn diagram, the intersection of three circles, but I wondered if you could do a Venn diagram with areas proportional to the actual probabilities.

Wikipedia has a nice picture of a Venn diagram at en.wikipedia.org/wiki/Image:Venn_diagram_cmyk.svg.

This diagram has all the circles of equal size. I wanted to make the circles larger for the larger probabilities and smaller for the smaller probabilities, rather than use a diagram where all the circles are equal. I also wanted to make the areas of the various intersections proportional to the probabilities of those intersections.

There's a nice website that will draw a proportional Venn diagram using circles, but it admits that the solution is only approximate.

This is a Java applet at www.cs.kent.ac.uk/people/staff/pjr/EulerVennCircles/EulerVennApplet.html

It appears to choke on this particular problem because the three-way intersection is empty. Still, it appears to do reasonably well with some of the other problems I've tried it with.

It just about killed me to do it, but I figured out how to use two squares and a tilted rectangle for a simple case.

If A, B, and C represent the presence of the three factors and a, b, and c represent the absence of the three factors, then you can write eight different conditions (abc, Abc, aBc, ABc, abC, AbC, aBC, ABC). Some people will drop the lower case letters and list seven regions (A, B, AB, C, AC, BC, ABC) with an implicit eighth region equal to the remainder of the probability.
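
The labeling scheme above is easy to generate mechanically. Here is a minimal Python sketch; the function names region_labels and short_label are my own illustrations, not part of any package. It lists the eight conditions in the same order as above and then drops the lower case letters to get the seven named regions.

```python
from itertools import product

def region_labels(factors="ABC"):
    """List every present/absent combination, upper case meaning present."""
    labels = []
    for bits in product([False, True], repeat=len(factors)):
        # product() varies the last position fastest; reversing the tuple
        # makes the first factor vary fastest, matching abc, Abc, aBc, ...
        present = bits[::-1]
        labels.append("".join(
            f.upper() if p else f.lower() for f, p in zip(factors, present)
        ))
    return labels

def short_label(label):
    """Drop the lower case letters, so 'AbC' becomes 'AC'."""
    return "".join(ch for ch in label if ch.isupper())

labels = region_labels()
# Skip the first label (abc), which is the implicit eighth region.
short = [short_label(label) for label in labels[1:]]
print(labels)
print(short)
```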

The probabilities are abc = 38%, Abc = 18%, aBc = 24%, ABc = 15%, abC = 3%, AbC = 1%, aBC = 1%, and ABC = 0%. The approach I used is not very general. If it works at all, it will only be when the individual probabilities are small.
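
As a quick check on the percentages listed above, here is a small Python sketch that confirms they sum to 100% and adds up the total probability for each factor. The dictionary name probs and the helper marginal are just names I made up for this illustration.

```python
from math import isclose

# The eight percentages from the text, written as proportions.
probs = {
    "abc": 0.38, "Abc": 0.18, "aBc": 0.24, "ABc": 0.15,
    "abC": 0.03, "AbC": 0.01, "aBC": 0.01, "ABC": 0.00,
}

def marginal(probs, factor):
    """Sum the regions where the factor is present (its upper case letter appears)."""
    return sum(p for label, p in probs.items() if factor in label)

total = sum(probs.values())
if not isclose(total, 1.0):
    raise ValueError(f"probabilities sum to {total}, not 1")

for factor in "ABC":
    print(f"P({factor}) = {marginal(probs, factor):.2f}")
```

The totals work out to 34% for A, 40% for B, and 5% for C, so the C factor is much rarer than the other two.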

Here are the steps I follow.

1. Set up a square grid from 0 to 1 in both the x and y directions.
2. Draw the ABc region (orange) as a square in the center of the region.
3. Draw the Abc region (red) extending from the lower left corner of the ABc region and going in a square to the right and above.
4. Draw the aBc region (yellow) extending from the upper right corner of the ABc region and going in a square down and to the left.
5. There are four remaining regions (abC, AbC, aBC, and ABC). Draw an isosceles right triangle for the smallest of these regions extending from the upper left corner of the ABc region.
6. Draw similar triangles for the remaining three regions.
7. The combination of the four triangles is a square. Extend the square in the direction to match the next smallest region.
8. Continue extending in the other two directions.
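
The geometry in steps 1 through 5 can be sketched in Python. This is only one reading of the steps, with several assumptions: the red and orange regions together form one square that shares its lower left corner with the ABc square, the yellow and orange regions together form another square that shares its upper right corner with the ABc square, and the small C regions are left out of those areas because the steps do not say how to account for them. The function names two_square_layout and triangle_leg are hypothetical. Steps 6 through 8 are not attempted here.

```python
from math import sqrt

def two_square_layout(p_overlap, p_first, p_second):
    """Return (x_min, y_min, x_max, y_max) boxes on the unit grid.

    p_overlap: area of the centered overlap square (step 2)
    p_first:   area of the square sharing the overlap's lower left corner (step 3)
    p_second:  area of the square sharing the overlap's upper right corner (step 4)
    """
    if p_first < p_overlap or p_second < p_overlap:
        raise ValueError("each square must be at least as large as the overlap")
    s = sqrt(p_overlap)
    lo = (1 - s) / 2              # center the overlap square in the grid
    hi = lo + s
    side1, side2 = sqrt(p_first), sqrt(p_second)
    overlap = (lo, lo, hi, hi)
    first = (lo, lo, lo + side1, lo + side1)       # grows right and up
    second = (hi - side2, hi - side2, hi, hi)      # grows left and down
    for box in (first, second):
        if min(box) < 0 or max(box) > 1:
            raise ValueError("square does not fit in the unit grid")
    return overlap, first, second

def triangle_leg(p):
    """Leg of an isosceles right triangle with area p, since area = leg**2 / 2."""
    return sqrt(2 * p)

# Areas under the assumptions above: ABc = 0.15, Abc + ABc = 0.33, aBc + ABc = 0.39.
overlap, first, second = two_square_layout(0.15, 0.33, 0.39)
print(overlap, first, second)
print(triangle_leg(0.01))   # leg for a 1% region in step 5
```

The ValueError for a square that leaves the grid corresponds to the problems at steps 3 and 4. The workarounds that use rectangles or a non-centered ABc region are not part of this sketch.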

Problems may occur in steps 3, 4, 7, or 8. There are simple workarounds for problems at steps 3 and 4 that involve using rectangular extensions rather than squares and using a non-centered ABc region. Problems in steps 7 and 8 are trickier. There are some pathological cases (such as when diagonally opposite regions in step 5 are both zero) that would obviously fail.

I suspect that problems are more likely to occur when the probabilities are large and come close to summing to 1.