Ecological Fallacy

Ecological Fallacy” is a phenomena in statistics where correlations for groups of data points are differ from the correlations for ungrouped data.  Here a quote from the Wikipedia,

“Another example is a 1950 paper by William S. Robinson that coined the term.[5] For each of the 48 states + District of Columbia in the US as of the 1930 census, he computed the illiteracy rate and the proportion of the population born outside the US. He showed that these two figures were associated with a negative correlation of −0.53 — in other words, the greater the proportion of immigrants in a state, the lower its average illiteracy. However, when individuals are considered, the correlation was +0.12 — immigrants were on average more illiterate than native citizens. Robinson showed that the negative correlation at the level of state populations was because immigrants tended to settle in states where the native population was more literate. He cautioned against deducing conclusions about individuals on the basis of population-level, or “ecological” data. In 2011, it was found that Robinson’s calculations of the ecological correlations are based on the wrong state level data. The correlation of −0.53 mentioned above is in fact −0.46.[6]