Twisting The Truth: A Statistical Dead Heat? I don't think so
I just dropped by CNN.com when I noticed an interesting headline: "Obama, McCain in statistical dead heat". Well, of course such a headline jumped out at me, so I thought I'd check it out.
With just over four months remaining until voters weigh in at the polls, the CNN/Opinion Research Corporation survey out Tuesday indicates that among registered voters nationwide, Obama holds a 5-point advantage over the Arizona senator, 50 percent to 45 percent.
...
CNN polling director Keating Holland notes that Tuesday's survey confirms what a string of national polls released this month have shown: Obama holds a slight advantage over McCain, though not a big enough one to constitute a statistical lead.
Wait a second, hold on... Obama leads by five points, and it's a statistical dead heat? So that must mean the MOE is greater than or equal to 5, correct?
The poll, conducted June 26-29, surveyed 906 registered voters and carries a margin of error of plus or minus 3.5 percentage points.
Now, ok, the article does say that when factoring in Bob Barr and Ralph Nader, Obama holds a 46-43 lead. But that's certainly not what the headline states. And it's not the main point of the article.
But we know that CNN is just blowing smoke. If we look at Pollster, Obama has the largest lead over McCain that he's ever had, and doesn't seem to be slowing down. RCP shows the same.
So what can we conclude? CNN wants this race to look closer than it actually is. I know this isn't really a huge surprise to most, but I thought I'd just put it out there.
Should we be overconfident about Obama's chances? Of course not. But the facts don't lie, and the two candidates are certainly not in a statistical dead heat. Thanks for the brilliant, and accurate, reporting, CNN.
Comments (20)
Did you factor in the black boxes? Failing to do so is what messed up John Kerry....
July 2, 2008 1:45 AM | Reply | Permalink
Excellent pick-up. I noticed the same thing when I saw the report. Throughout the primaries I never heard any news agency declare that either Clinton or Obama were in a statistical dead heat when a candidate was ahead by 5 points.
Again, a great pick-up. Enjoyed reading your post.
July 2, 2008 3:55 AM | Reply | Permalink
It's déjà vu all over again.
July 2, 2008 5:57 AM | Reply | Permalink
I have a purely statistical question as it relates to these polls. When it says the MOE is 3.5%, Obama has 50% and McCain has 45%, is the 3.5% for the spread or for the individual numbers? If it's for the individual numbers, then it seems there's some overlap as Obama's numbers could be as low as 46.5% and McCain's as high as 48.5%.
I have some statistical background, but when I do my statistics, it's always on individual measurements and not forced choices, so I'm not sure how the MOE is calculated here.
July 2, 2008 10:05 AM | Reply | Permalink
Most likely CNN's reporters just don't understand statistics very well.
The margin of error is a 95% confidence interval around the candidates' percentages. In a two-candidate race, it is also essentially a 95% confidence interval around the spread between the candidates (you're assuming any votes lost by candidate X go to candidate Y, so the spread really is just another way of looking at the one independent variable).
But note that "margin of error" only reflects random sampling error: that is, assuming you've picked a truly unbiased random sample of the population, then 95% of the time a candidate's true polling in the full population would be within the margin of error. As it happens, this range is determined only by the size of your polling sample, and is independent of the size of the population. Real polls, however, do probably have subtle biases (although polling firms do their best to minimize them), and they're still just a snapshot of current opinion. And opinion will, of course, change between now and election day.
But the bottom line is that if your lead in a 2-candidate race is more than the poll's margin of error, then it's pretty safe to say you're actually ahead right now. So CNN's headline is misleading.
July 2, 2008 11:11 AM | Reply | Permalink
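The point in the comment above, that the margin of error depends only on sample size, can be checked with a minimal sketch. It assumes a simple random sample and the usual normal approximation at the worst case p = 0.5; the function name margin_of_error is made up for illustration and is not CNN's method.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a normal-approximation 95% interval for one proportion.

    Assumes a simple random sample of size n. The population size does not
    appear anywhere in the formula.
    """
    return z * math.sqrt(p * (1.0 - p) / n)

# Sample sizes taken from this thread: a 10-person poll, CNN's 906
# registered voters, and 906 plus another 600.
for n in (10, 906, 1506):
    print(f"n={n:>5}: +/- {100 * margin_of_error(n):.1f} points")
```

For 906 respondents this textbook figure comes out a little under the 3.5 points CNN published; the article does not say how CNN arrived at its number.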
Forgive me for being dense, but I'm not quite understanding you here. If the 95% confidence interval around candidate A is ±3.5%, and the 95% confidence interval around candidate B is ±3.5%, then it seems the confidence interval around their spread would be ±7%. E.g., the confidence interval around McCain in the current example would be from 41.5% to 48.5% and the confidence interval around Obama would be in the 46.5% to 53.5%. The biggest possible spread would be 12% and the smallest would be -2%, so it'd be 5% ± 7%. Naturally, the errors would not only be not statistically independent, but would correlate with an r of almost -1, so that the 41.5% would tend to go with the 53.5% and the 48.5% would tend to go with the 46.5% (and, of course, 53.5% going along with 48.5% would be downright impossible).
I feel that I am missing something here, but I don't know what it is.
July 2, 2008 11:36 AM | Reply | Permalink
If there really are only two legitimate choices then the values are p (Obama) and q=1-p (McCain). In that case, 1-p is not an independent variable, and the confidence interval around p is the only confidence interval.
The trouble arises when there are more than two legitimate choices (undecided, won't say, etc.). In this case, there is partial independence in the confidence intervals.
In that case, the best way to look at the survey is as a t-test where the hypothesis is that p-q=0. Since p and q are not entirely linked (although they are also not entirely independent), the standard error would be based on adding the separate variances for p and q and finding the square root of their combined value (the actual formula is more complex).
This is a long winded way of saying that the confidence interval for p-q is wider than the confidence interval for p or q, although not (I don't think-my math intuition with squared values and square roots sometimes goes wrong) twice as wide.
Fosberry is wrong, although he is leaning in the right direction.
July 3, 2008 12:19 AM | Reply | Permalink
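The question of how wide the interval on the spread is can be worked through numerically. This is a sketch under the assumption that responses are multinomial counts from one simple random sample, in which case Var(p - q) = (p + q - (p - q)^2) / n; the function names are illustrative only.

```python
import math

def single_margin(p, n, z=1.96):
    """95% margin of error for one candidate's share."""
    return z * math.sqrt(p * (1.0 - p) / n)

def spread_margin(p, q, n, z=1.96):
    """95% margin of error for the lead p - q.

    The two shares come from the same sample, so they are negatively
    correlated; for multinomial counts the variance of the difference
    is (p + q - (p - q)**2) / n.
    """
    variance = (p + q - (p - q) ** 2) / n
    return z * math.sqrt(variance)

# CNN's numbers: Obama 50, McCain 45, 906 registered voters.
print(f"single: +/- {100 * single_margin(0.50, 906):.1f} points")
print(f"spread: +/- {100 * spread_margin(0.50, 0.45, 906):.1f} points")
```

Under these assumptions the margin on the spread is about 6.3 points, nearly twice the single-candidate margin. When the two shares sum to exactly 100 percent, the formula reduces to exactly twice the single-candidate margin.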
Hocking, you're right. MOE is the radius of the 95% confidence interval, meaning the 3.5 points extends in both the positive and negative directions for each candidate's percentage.
That said, 906 people is too small a sample size, and 3.5 points too large a MOE, for CNN to be reporting this to the entire nation.
They could cut that MOE to about 2.5 points by interviewing another 600 people. That would add a lot of statistical power to their poll. And it's frickin CNN for goodness' sake, in an election year; they can get a couple thousand data points in a 4 day window, if they try.
And if they were still feeling lazy about it, the least they could've done is mash their results with other reliable pollsters and make a metapoll.
July 2, 2008 12:04 PM | Reply | Permalink
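The estimate above that another 600 interviews would bring the margin down to about 2.5 points can be checked by inverting the margin-of-error formula. This is a rough sketch assuming a simple random sample and worst-case p = 0.5; required_sample_size is a hypothetical helper name.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for one proportion from a sample of size n."""
    return z * math.sqrt(p * (1.0 - p) / n)

def required_sample_size(target_moe, p=0.5, z=1.96):
    """Smallest n whose margin of error is at or below target_moe."""
    return math.ceil(p * (1.0 - p) * (z / target_moe) ** 2)

# 906 original respondents plus the 600 suggested in the comment.
print(f"n=1506 gives +/- {100 * margin_of_error(906 + 600):.2f} points")

# Respondents needed to reach a 2.5-point margin under this approximation.
print("n for 2.5 points:", required_sample_size(0.025))
```

Because the margin shrinks with the square root of n, each additional point of precision costs far more interviews than the last one did.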
July 2, 2008 12:13 PM | Reply | Permalink
You're absolutely right about how they should've increased the power of their study.
A more accurate title would've been: "We didn't poll enough people to reliably give you an answer on this".
If I randomly polled 10 people on Bush's favorable vs. unfavorable ratings, it'd also probably be a "statistical dead heat".
July 2, 2008 12:33 PM | Reply | Permalink
Not necessarily, 10 people at WalMart would be different than 10 people at the Co-op Farmer's Market.
This is the real reason why >1000 opinions is too small. And as I've said before, I don't take polls and my father will talk to anyone...
July 3, 2008 12:18 PM | Reply | Permalink
A more accurate title would've been: "We didn't poll enough people to reliably give you an answer on this".
Haha! Nice one, Ben. =)
July 2, 2008 1:52 PM | Reply | Permalink
On FiveThirtyEight.com, they actually have a story concerning this. It quotes the National Council on Public Polls as saying,
Certainly, if the gap between the two candidates is less than the sampling error margin, you should not say that one candidate is ahead of the other. You can say the race is "close," the race is "roughly even," or there is "little difference between the candidates." But it should not be called a "dead heat" unless the candidates are tied with the same percentages. And it certainly is not a “statistical tie” unless both candidates have the same exact percentages.
And just as certainly, when the gap between the two candidates is equal to or more than twice the error margin – 6 percentage points in our example – and if there are only two candidates and no undecided voters, you can say with confidence that the poll says Candidate A is clearly leading Candidate B.
When the gap between the two candidates is more than the error margin but less than twice the error margin, you should say that Candidate A "is ahead," "has an advantage" or "holds an edge." The story should mention that there is a small possibility that Candidate B is ahead of Candidate A.
July 2, 2008 1:55 PM | Reply | Permalink
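The wording guidance quoted above amounts to a three-way rule, which can be sketched as a small function. The function name and return labels are made up for illustration, and the quote does not say what to do when the gap exactly equals the margin, so treating that case as "close" is an assumption.

```python
def describe_lead(gap, moe):
    """Map a poll gap and its margin of error (both in points) to wording.

    Follows the quoted rule of thumb. A gap exactly equal to the margin
    is not covered by the quote; grouping it with "close" is an assumption.
    """
    gap = abs(gap)
    if gap == 0:
        return "tied"             # the only case the quote allows as a "dead heat"
    if gap <= moe:
        return "close"            # do not say either candidate is ahead
    if gap < 2 * moe:
        return "ahead"            # small chance the trailing candidate leads
    return "clearly leading"      # quote adds: two candidates, no undecideds

print(describe_lead(50 - 45, 3.5))  # CNN's head-to-head numbers
print(describe_lead(46 - 43, 3.5))  # with Barr and Nader included
```

By this rule the 50-45 result reads as "ahead" and the 46-43 result as "close", and neither qualifies as a dead heat.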
CNN and FOX are broadcast media's equivalent to the print media's tabloids (i.e. Star, Enquirer, et al.)
They manipulate and try to make news, not report facts as news - just the opposite.
Their polling criteria, ergo results, (like their 'news') delivers only the illusion of being valid.
July 2, 2008 2:21 PM | Reply | Permalink
Bien sur. Part of this was that we talk about the MSM trying to make the race look closer, more of a "dead heat" than it actually is. Eh, some could brush that off as bias and perception. But this kind of makes one have second thoughts about any "bias" or "perception" going into our minds. More likely it's in theirs.
July 2, 2008 2:29 PM | Reply | Permalink
The media do not report news. When are you silly geese going to get that straight? They produce public theater. Where is the drama in the race unless it is close? Why is a news report called a "story"? I'm a former journalist myself. Trust me, this is theater, nothing less.
July 2, 2008 5:55 PM | Reply | Permalink
Maybe this will help:
http://www.washingtonmonthly.com/archives/individual/2004_08/004536.php
I don't have time right now to plug the numbers into the 5 line Excel spreadsheet, but using the table provided by Kevin Drum, the odds that this poll is showing a real lead are 90%.
Statistical dead heat, my ass.
July 2, 2008 8:35 PM | Reply | Permalink
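The odds that a poll lead is real can also be estimated directly. This is a rough normal-approximation sketch, not the table from the linked post: it assumes a simple random sample, multinomial responses, and no systematic bias, and prob_lead_is_real is a hypothetical name.

```python
import math

def prob_lead_is_real(p, q, n):
    """Approximate probability that the true gap p - q is above zero.

    Uses the standard error of the difference of two shares from the same
    sample, sqrt((p + q - (p - q)**2) / n), and the normal CDF via math.erf.
    """
    se = math.sqrt((p + q - (p - q) ** 2) / n)
    z = (p - q) / se
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# CNN's head-to-head result and the four-way result, 906 registered voters.
print(f"50-45: {prob_lead_is_real(0.50, 0.45, 906):.2f}")
print(f"46-43: {prob_lead_is_real(0.46, 0.43, 906):.2f}")
```

On the head-to-head numbers this lands in the low-to-mid 90s, in the same neighborhood as the 90% read off the table, with the difference down to the approximations used.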
Well, it doesn't hurt for people to believe it's close if it keeps their interest up. Having a strong lead early could decrease fundraising, could decrease volunteers and the like. I am much more concerned that polling advantages be clear and strong when the last week of the election comes, when those folks who think their vote is a test that they get right if they vote for the winner make up their minds.
July 2, 2008 11:26 PM | Reply | Permalink
Whether there's an ideological or philosophical reason for doing what they did, it's still biased, faulty reporting.
July 3, 2008 12:46 AM | Reply | Permalink
1) You must have a land line phone.
2) You must be at home (not working) when it rings.
3) You have the free time and inclination to blabber on the phone to a machine for as long as it takes.
My guess is that a lot of Obama supporters such as myself work for a living and quite a few also use cell phones.
If the polling is not getting to you, you are not counted. No amount of sophisticated statistical analysis can make up for that.
This is one Obama supporter the polling won't reach.
How about you?
heh...
July 3, 2008 1:41 AM | Reply | Permalink