June 6, 2011

Cherry Picking

Suppose you tested a classroom of 30 kids on about 40 separate variables. Chances are you would find that some of those variables line up by sheer chance. In other words, you could find a strong correlation between two variables that are actually independent. If you came up with even more variables, you could find even stronger correlations. For instance, four of the five kids who wore glasses might also have names beginning with J. This is a little like shooting bullets at a barn and then drawing targets around the bullet holes: the Texas Marksman's Fallacy (better known as the Texas Sharpshooter Fallacy). Correlation is not causation, but sometimes correlation is not even correlation.
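To make the classroom example concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration: the 40 "variables" are just independent random numbers, the function names are made up, and Pearson's r stands in for whatever measure of correlation you prefer. It generates pure noise for 30 kids and then goes looking for the strongest pairwise correlation among all the pairs.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def strongest_chance_correlation(n_kids=30, n_vars=40, seed=0):
    """Return (|r|, i, j) for the most correlated pair of pure-noise variables."""
    rng = random.Random(seed)
    # Each variable is independent random noise: no real relationship exists.
    data = [[rng.gauss(0, 1) for _ in range(n_kids)] for _ in range(n_vars)]
    return max(
        (abs(pearson(data[i], data[j])), i, j)
        for i, j in itertools.combinations(range(n_vars), 2)
    )


n_pairs = math.comb(40, 2)  # 40 variables give 780 distinct pairs to compare
r, i, j = strongest_chance_correlation()
print(f"{n_pairs} pairs checked; strongest is variables {i} and {j}, |r| = {r:.2f}")
```

With 780 pairs to choose from, the winning pair usually shows a correlation far from zero even though nothing here is related to anything else. Picking that pair out afterwards is the target drawn around the bullet hole.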

2 comments:

Andrew Shields said...

Thanks for teaching me about the Texas Marksman's Fallacy. I'd heard the joke before, but I hadn't heard of the fallacy.

Shedding Khawatir said...

This is actually true of all statistical tests, which is why you should always be skeptical if an article runs a lot of them on the same sample (and even more so if the sample is small). You are supposed to correct the significance level as the number of tests increases (lowering the threshold, so that a result is less likely to come out significant by chance). Unfortunately, this isn't always done, and readers aren't always aware of how this affects the claims of the research.
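As a rough illustration of that correction, the sketch below uses the Bonferroni adjustment, which is one common choice and not necessarily the one the commenter had in mind. The function names are hypothetical, and the family-wise error formula assumes the tests are independent, which pairwise correlations on one sample are not strictly.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after a Bonferroni correction."""
    return alpha / n_tests


def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive, assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests


alpha = 0.05
n_tests = math.comb(40, 2)  # every pair among 40 variables: 780 tests

print("expected false positives, uncorrected:", alpha * n_tests)
print("chance of at least one false positive:", familywise_error_rate(alpha, n_tests))
print("corrected per-test threshold:", bonferroni_threshold(alpha, n_tests))
```

At the usual 0.05 level, 780 uncorrected tests on pure noise would be expected to produce about 39 "significant" results, so at least one false positive is all but guaranteed.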