Page 3 - Lurking Variables and Simpson's Paradox

When comparing the conditional probabilities of two events, it is important to consider other variables that may influence the result. A lurking variable is a variable that has an important effect on outcomes but has not been accounted for in the data.

Example

This table compares the deaths at two hospitals:

Hospital Deaths

Outcome    Hospital U    Hospital R
Lived            2350           485
Died              150            15
Total            2500           500

The death rates are 6% at Hospital U (150 of 2500) and 3% at Hospital R (15 of 500).
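As a quick check of this arithmetic, here is a minimal Python sketch that recomputes the overall death rates from the counts in the table. The names `overall`, `death_rate`, and `rates` are illustrative choices, not part of the original page.

```python
# (died, total) for each hospital, taken from the table above.
overall = {"U": (150, 2500), "R": (15, 500)}


def death_rate(died, total):
    """Return the death rate as a percentage of all patients."""
    return 100.0 * died / total


# Illustrative helper names; only the counts come from the page.
rates = {hospital: death_rate(d, t) for hospital, (d, t) in overall.items()}

for hospital, rate in rates.items():
    print(f"Hospital {hospital}: {rate:.2f}%")
# Hospital U: 6.00%
# Hospital R: 3.00%
```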

Does this mean Hospital R is better?

Hospital U is a large urban hospital with a trauma unit treating injured and very sick patients. Hospital R is a regional hospital that primarily performs non-emergency elective procedures. Patient condition may be an important factor to consider.

This table compares the condition of the patients at both hospitals:

Hospital Deaths

                     Hospital U          Hospital R
Patient Condition    Good     Poor       Good     Poor
Lived                 495     1855        441       44
Died                    5      145          9        6
Total                 500     2000        450       50

For patients in good condition: U's death rate = 1%, R's death rate = 2%

For patients in poor condition: U's death rate = 7.25%, R's death rate = 12%
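The same calculation can be repeated within each patient condition. The sketch below is one possible way to do it in Python. The dictionary layout and the names `counts` and `stratified` are assumptions made for illustration, while the numbers come from the table above.

```python
# (died, total) by hospital and patient condition, from the table above.
counts = {
    ("U", "Good"): (5, 500),
    ("U", "Poor"): (145, 2000),
    ("R", "Good"): (9, 450),
    ("R", "Poor"): (6, 50),
}

# Death rate (percent) within each hospital/condition group.
stratified = {key: 100.0 * died / total for key, (died, total) in counts.items()}

for condition in ("Good", "Poor"):
    u = stratified[("U", condition)]
    r = stratified[("R", condition)]
    lower = "U" if u < r else "R"
    print(f"{condition}: U = {u:.2f}%, R = {r:.2f}%, lower death rate at Hospital {lower}")
# Good: U = 1.00%, R = 2.00%, lower death rate at Hospital U
# Poor: U = 7.25%, R = 12.00%, lower death rate at Hospital U
```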


Hospital U has better results in both categories, yet worse results overall, because a disproportionate number of its patients are in poor condition: 2000 of 2500 (80%) at Hospital U versus 50 of 500 (10%) at Hospital R. The condition of the patient is a lurking variable.
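One way to see why the overall figures reverse is to write each hospital's overall death rate as a weighted average of its per-condition rates, where the weights are the shares of patients in each condition. The following Python sketch does this under that framing. The function name `overall_rate` and the `strata` layout are hypothetical, chosen only for this illustration.

```python
# (died, total) per patient condition for each hospital, from the tables above.
strata = {
    "U": {"Good": (5, 500), "Poor": (145, 2000)},
    "R": {"Good": (9, 450), "Poor": (6, 50)},
}


def overall_rate(hospital_strata):
    """Overall death rate (percent) as a share-weighted average of group rates."""
    total_patients = sum(total for _, total in hospital_strata.values())
    rate = 0.0
    for condition, (died, total) in hospital_strata.items():
        share = total / total_patients   # fraction of patients in this condition
        rate += share * (died / total)   # weight the group rate by that share
    return 100.0 * rate


for hospital, groups in strata.items():
    total_patients = sum(total for _, total in groups.values())
    poor_share = 100.0 * groups["Poor"][1] / total_patients
    print(f"Hospital {hospital}: {poor_share:.0f}% of patients in poor condition, "
          f"overall death rate {overall_rate(groups):.2f}%")
```

Hospital U's average is dominated by its poor-condition group, and Hospital R's by its good-condition group, so the overall comparison mostly reflects the patient mix rather than the quality of care.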

 
The situation described above illustrates a principle known as Simpson's Paradox:
When data are aggregated over a lurking variable, a comparison that holds within every subgroup may reverse.



Copyright © 1999 CyberGnostics, Inc. All rights reserved.