In the coming election season there will be a lot of polls in the news. With the Presidential election likely to be close, the words “margin of error” will often be heard. And grossly misinterpreted!

“Jones 46%, Smith 43%, but that’s within the poll’s 3% margin of error–a statistical dead heat.”

Wrong.

Here’s the real deal:

A perfect poll would show the true split of electoral opinion. An actual poll provides an estimate of that “true” picture. Statisticians calculate “confidence intervals” to assess the potential difference between the two. A typical confidence level is 95%. With a 3% margin of error, that means we can be 95% confident that the difference between the poll result and reality is not greater than 3 percentage points.
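For anyone who wants to see where a figure like 3% comes from, here is a rough sketch using the usual normal approximation for a sample proportion. The sample size of 1,067 is an assumption chosen for illustration (the hypothetical poll above does not state one), and the function name `margin_of_error` is made up for this example. It also assumes a simple random sample, which real polls only approximate.

```python
import math

Z_95 = 1.96  # standard normal critical value for a two-sided 95% interval


def margin_of_error(p, n, z=Z_95):
    """Half-width of the normal-approximation interval for a proportion."""
    return z * math.sqrt(p * (1.0 - p) / n)


# Assumption: 1,067 respondents in a simple random sample (not from the post).
n = 1067

# Pollsters usually quote the worst case, which occurs at p = 0.5.
moe = margin_of_error(0.5, n)

# Interval around Jones's 46% using that quoted margin.
jones = 0.46
low, high = jones - moe, jones + moe

print(f"margin of error: {moe:.3f}")
print(f"Jones interval: {low:.3f} to {high:.3f}")
```

With these assumed inputs the margin comes out at about three percentage points, so the interval for Jones runs from roughly 43% to 49%.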

So in our Jones versus Smith example, it’s not in any sense a “dead heat” or “statistical tie,” and it’s not just as likely that Smith, not Jones, is ahead. In fact, there’s only a 5% chance that Smith is actually ahead.

I believe the news media’s margin of error in reporting this stuff will be 100%. In other words, I’m 95% confident that they will report it wrong every time.

Why? Don’t they know better? Why don’t the polling organizations themselves scream about this? Because it’s just too subtle, too complicated, too difficult a point to convey in a soundbite. Ya gotta keep it simple for da yahoos. Or so at least they think.

(That’s how George Bush failed so miserably in making the case for invading Iraq. He thought the American public was too dumb to grasp a complex, nuanced, multi-faceted, subtle argument (which certainly could have been made). So, to keep it simple, he focused on just one thing, WMD and terrorists. Unfortunately, that one thing turned out to be the one thing that was incorrect.)

July 9, 2008 at 11:51 pm

I agree that statistics can be complicated and nuanced, but I think you’re a bit off. Or at the very least, you’re oversimplifying matters in no better a manner than that of the news media.

First, let’s assume the margin of error they report is actually two standard deviations. This means we are 95% confident that Jones’s actual percentage is between 43% and 49%. But if we’re comparing Smith and Jones, that’s not really what we’re interested in. We’re interested in a difference-of-means hypothesis test.

Let the null hypothesis be that there is no difference between the candidates’ percentages. In order to reject the null, and support the hypothesis that there is a statistically significant difference in the candidates’ support, we need to pick a significance level; let’s say we choose 5% (for a critical z-score of 1.96, assuming a normal distribution). We subtract the percentages and divide by a standard deviation (1.5%, since 3% is two standard deviations). (46-43)/1.5 = 2 exactly. Since 2 > 1.96, we conclude that there’s a significant difference between the two candidates’ support (and that Jones is probably ahead). We do not conclude that “there is only a 5% chance that Smith is actually ahead.”
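The arithmetic above can be sketched in a few lines of Python. The first part simply reproduces the calculation as described, treating 1.5% as the standard deviation of the gap. The second part is an assumption that is not in the comment: if both shares come from one simple random sample of n respondents, the two estimates are negatively correlated and the standard error of the gap is larger. The sample size of 1,067 and the helper name `two_sided_p_value` are hypothetical, chosen only for illustration.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Part 1: the calculation as described, with 3% taken as two standard deviations.
jones, smith = 46.0, 43.0
std_dev = 3.0 / 2.0
z_comment = (jones - smith) / std_dev
p_comment = two_sided_p_value(z_comment)
print(f"as described: z = {z_comment:.2f}, p = {p_comment:.4f}")

# Part 2 (assumption, not from the comment): both shares are estimated from
# the same simple random sample, so Var(p1 - p2) = (p1 + p2 - (p1 - p2)^2) / n.
n = 1067  # hypothetical sample size giving roughly a 3-point margin of error
p1, p2 = jones / 100.0, smith / 100.0
se_diff = math.sqrt((p1 + p2 - (p1 - p2) ** 2) / n)
z_same_sample = (p1 - p2) / se_diff
p_same_sample = two_sided_p_value(z_same_sample)
print(f"same sample:  z = {z_same_sample:.2f}, p = {p_same_sample:.4f}")
```

The two parts can give quite different test statistics, so whether a three-point lead clears the 1.96 threshold depends on which standard deviation the reported margin of error actually describes.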

Please forgive and correct any errors in the above analysis. My point here is only to demonstrate that your simplification of the statistics of polling is no better than that of the greater news media.

FSR COMMENT: I thank Frank for his input. I am indeed not a statistician, and perhaps I should have simply pointed out that in the 46-43 hypothetical, it’s NOT a “statistical tie” as the media usually say, and left it at that. I think Frank would have to agree with that, and hence that what I said is NOT “no better than the greater news media.”

July 11, 2008 at 3:04 pm

Again, I must laugh at polling.

In reality, polls are only accurate by random chance.

People don’t make up their minds until they actually vote. Asking their opinions before that time is wasteful and inefficient. We already know that 40% vote Republican and 40% vote Democrat because that is their party affiliation and they nearly always succumb to peer pressure. That is a major function of the party – produce peer pressure.

So, every poll seeks to break the issues down into party lines SPECIFICALLY to produce a certain result – and the polls are always timed to happen AFTER certain current events which are fresh in the minds of the sampled people.

Of course President Bush gets the lowest poll results ever right now. The Iraq war is going badly, the economy is going badly, and his bumbling public speaking gets mocked daily (by people like David Letterman, my personal favorite). But… is he really that unpopular? Of course not! If we were attacked again and successfully weathered the attack, his poll numbers would jump into the positive range almost overnight!

Polls are garbage. Stop reading them. They are used to CHANGE public opinion, not to measure it.

By the way, if you give me enough money and time, I can produce a Legitimate poll that shows President Bush with a 50% approval rating RIGHT NOW. Make the check out to B. R. Peterson and include at least seven zeros in the amount area.

July 11, 2008 at 3:08 pm

Oh, and I’ve been polled over the phone, and I have intentionally lied to give the opposite of the answer the pollsters were looking for.

Na-na-na-boo-boo!!!