I’ve been paying a lot of attention lately to how statistics are released to the public. In particular, when are confidence intervals used, and when are they dropped? When are numbers presented as fact, and when are they acknowledged to be fuzzy?

The only time you consistently see confidence intervals reported, in the general press, is for poll results. As in: 66% of respondents believed this poll to be self-reflexive, with a margin of error of plus or minus 5%.
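That plus-or-minus number has a simple origin. As a hedged sketch (assuming a simple random sample and the usual normal approximation, with a hypothetical sample size chosen for illustration), the margin of error for a poll proportion is:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a poll proportion p
    estimated from a simple random sample of size n
    (normal approximation to the binomial)."""
    return z * math.sqrt(p * (1 - p) / n)

# A hypothetical poll reporting p = 0.66 from about n = 385 respondents
# lands near the familiar "plus or minus 5 points".
print(round(margin_of_error(0.66, 385), 3))
```

Note that the oft-quoted ±5% is just this formula evaluated near p = 0.5 with n in the few-hundreds; real polls also carry sampling-design and nonresponse error that this calculation ignores.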

Weather reports often have percentages involved, but it would be a stretch to call these confidence intervals. For example, when the attractive meteorologist on Channel 7 tells you that there is an 80% chance of rain tomorrow, they are presenting that as a fact. Behind that number is a computer simulation that may or may not be able to estimate a confidence interval around that 80% number.

Government statistics and estimates, no matter how bad or biased, almost never come with confidence intervals attached. Gross Domestic Product of the U.S. in 2010? Estimated at $14.64 trillion. Confidence interval for that estimate? Probably so bad you don’t even want to know. Or maybe you do?

Are there times when you’ve been surprised to see a confidence interval reported or missing?

Tags: mainstream media, numbers, statistics

This is a good point. Whenever I check out the ECB monthly bulletin or the BoE inflation report, they always show their macroeconomic forecasts with confidence regions (a fuzzy shaded area) around the historical series, widening in the forecast period.

@John Hall:

Good that they show confidence intervals for the forecasts. For those kinds of future estimates I always want not just a CI, but also an empirical rating for the CI. In other words, what % of the time did the results actually fall within their predicted range?
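That empirical rating is straightforward to compute once you have a history of predicted intervals and realized outcomes. A minimal sketch (the interval bounds and simulated outcomes here are made up for illustration):

```python
import random

def empirical_coverage(intervals, outcomes):
    """Fraction of realized outcomes that fell inside their
    predicted (lo, hi) interval -- the 'empirical rating'."""
    hits = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, outcomes))
    return hits / len(outcomes)

# Toy check: a well-calibrated 95% interval should cover ~95% of outcomes.
random.seed(0)
outcomes = [random.gauss(0, 1) for _ in range(10_000)]
intervals = [(-1.96, 1.96)] * len(outcomes)
print(empirical_coverage(intervals, outcomes))
```

If a forecaster's nominal 95% bands only cover, say, 80% of realized outcomes, the bands are too narrow and the stated confidence level is overselling the forecast.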

Your weather report example reminded me that hurricane forecasts are often presented with “error cones,” and sometimes with multi-banded error cones. I suspect that the underlying prediction models that construct that 80% value are built with the same sort of stochastic model: a hybrid of physical principles (airmasses, solar convection, Coriolis, and all that jazz) and repeated stochastic simulation. If so, there would be the possibility of reporting some sort of range for multiple runs of the model at different time horizons. A quick search brings up this NOAA website as the first hit: http://www.nssl.noaa.gov/users/brooks/public_html/prob/Probability.html
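The multi-run idea can be sketched in a few lines. This is a toy stand-in, not an actual weather model: each run is a random walk, and the spread across runs at each horizon traces out a widening band, the same shape as an error cone on a hurricane track map.

```python
import random

# Hypothetical ensemble: many runs of a noisy model, then report the
# central band of outcomes at each forecast horizon.
random.seed(1)
HORIZON, RUNS = 5, 1000
paths = []
for _ in range(RUNS):
    x, path = 0.0, []
    for _ in range(HORIZON):
        x += random.gauss(0, 1)   # one stochastic step of the toy model
        path.append(x)
    paths.append(path)

for t in range(HORIZON):
    vals = sorted(p[t] for p in paths)
    lo, hi = vals[25], vals[-26]  # central ~95% band across the 1000 runs
    print(f"t={t + 1}: [{lo:+.2f}, {hi:+.2f}]")
```

The bands get wider as the horizon grows, which is exactly why the cone fans out: the further ahead the forecast, the more the stochastic runs disagree.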

Just encountered your article on confidence intervals, and to be honest I don’t think the general public particularly notices whether the data being presented to them is accurate, or that the sample size, for example, may affect the CI. I think it’s unfortunate, but common sense and questioning seem to be lacking nowadays.