jchaus's Blog

Outlier AP Poll?


The Associated Press Poll today gave Obama a 1% lead over McCain.  This was substantially different from other polls released about the same time.  And one might think it is merely an outlier.

Polls often report a margin of error, say plus or minus 3%, which is typical of a sample size of around 1000 observations.  This means that the real value is likely to be within the range of the poll, but we don't know precisely where.  And note that here the full range is 6 points.  So if Obama has a score of 44%, the real value is presumed to be between 41% and 47%.  Most political junkies understand this now.
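For anyone who wants to check the "plus or minus 3% at around 1000 observations" rule of thumb, here is a minimal sketch in Python. It assumes a simple random sample, the worst case of a 50/50 split, and a 95% confidence level (z of about 1.96); the function name `margin_of_error` is just an illustrative choice. Real polls are weighted, so their effective margins are usually somewhat wider than this textbook figure.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the normal-approximation interval for a proportion.

    Assumes a simple random sample of size n; p=0.5 is the worst case,
    and z=1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Margin for a poll of about 1000 respondents, in percentage points.
moe_points = margin_of_error(1000) * 100
print(f"n=1000: plus or minus {moe_points:.1f} points")

# The interval around a reported 44%, using the rounded 3-point margin.
reported = 44
print(f"reported {reported}%: between {reported - 3}% and {reported + 3}%")
```

Note that the margin shrinks with the square root of the sample size, so cutting it in half takes roughly four times as many respondents.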

But there is another element of the calculations that is usually not stated in reporting the results.  That is the confidence level.  All of these reported levels of support for a candidate have a measure of likelihood that the true value lies within the plus or minus range reported.  And the most common confidence level is 95%.  Thus the proper way to read a poll that gives Obama a 44% rating is that "we can be 95% confident that the true rating is between 41% and 47%."  Notice that 95% is 19 times out of 20.  Thus about one poll in twenty will miss: the true value will lie outside its reported interval.

Next question: how do we know which polls are these (extreme) outliers?  Answer: we don't.  And that is precisely why tracking a large number of polls is an important exercise, because some of them will be outliers.  Averaging a number of polls helps to control for this, which is exactly what TPM does.
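To see why averaging helps, here is a small simulation sketch in Python. Everything in it is an assumption made for illustration: the true level of support (50%), the poll size (1000), the number of polls averaged (10), and the function names. It is not TPM's actual method, which may weight or select polls differently. It compares how widely single polls scatter around the truth with how widely averages of ten polls scatter.

```python
import random
import statistics

def simulate_poll(true_share, n, rng):
    """One simulated poll: the fraction of n random respondents backing the candidate."""
    hits = sum(1 for _ in range(n) if rng.random() < true_share)
    return hits / n

def simulate_average(true_share, n, k, rng):
    """The simple average of k independent simulated polls."""
    return statistics.mean(simulate_poll(true_share, n, rng) for _ in range(k))

rng = random.Random(2008)   # fixed seed so the run is repeatable
TRUE_SHARE = 0.50           # hypothetical true level of support

singles = [simulate_poll(TRUE_SHARE, 1000, rng) for _ in range(200)]
averages = [simulate_average(TRUE_SHARE, 1000, 10, rng) for _ in range(200)]

spread_single = statistics.pstdev(singles)
spread_average = statistics.pstdev(averages)
print(f"spread of single polls:     {spread_single * 100:.2f} points")
print(f"spread of 10-poll averages: {spread_average * 100:.2f} points")
```

One caution: averaging only shrinks random sampling error. If many polls share the same systematic bias, the average inherits it.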

Finally, there have been dozens and dozens of polls; hundreds in fact.  When there are so many polls it would be surprising if there were no (extreme) outliers.  And there should also be a few polls that overstate Obama's lead.  Again, it would be surprising if there were not.
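The "it would be surprising if there were no outliers" point can be put in numbers. The sketch below assumes each poll independently has a 5% chance of missing the true value (that is what a 95% confidence level means) and asks how likely it is that at least one of k polls misses. The independence assumption and the function names are mine, for illustration only.

```python
def prob_at_least_one_miss(k, miss_rate=0.05):
    """Chance that at least one of k independent polls misses the true value."""
    return 1 - (1 - miss_rate) ** k

def expected_misses(k, miss_rate=0.05):
    """Average number of polls, out of k, whose interval misses the true value."""
    return k * miss_rate

for k in (20, 100, 300):
    print(f"{k} polls: P(at least one miss) = {prob_at_least_one_miss(k):.3f}, "
          f"expected misses = {expected_misses(k):.0f}")
```

With 20 polls the chance of at least one miss is already about 64%, and with hundreds of polls it is a near certainty, with misses expected on both sides of the true value.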


6 Comments

user-pic

From Nate Silver at fivethirtyeight
_____________________________________________

Some Likely Voter Models are Suspect

There are eight current national polls that list separate sets of results for likely and registered voters. (In this case, for reasons that will be apparent momentarily, I am deliberately double-counting the two Gallup likely voter models). On average, Barack Obama leads by 9.8 points in the registered voter versions of these polls, but by 7.0 points in the likely voter versions -- nearly a 3-point difference.

Note, however, that the likely voter models appear to segregate themselves into two clusters. In one cluster, there is a rather large, 4-6 point difference between registered and likely voter results. In the other cluster, there is essentially no difference.

The first cluster coincides with Gallup's so-called "traditional" likely voter model, which considers both a voter's stated intention and his past voting behavior. The second cluster coincides with their "expanded" likely voter model, which considers solely the voter's stated intentions. Note the philosophical difference between the two: in the "traditional" model, a voter can tell you that he's registered, tell you that he's certain to vote, tell you that he's very engaged by the election, tell you that he knows where his polling place is, etc., and still be excluded from the model if he hasn't voted in the past. The pollster, in other words, is making a determination as to how the voter will behave. In the "expanded" model, the pollster lets the voter speak for himself.

Frankly, I find polls showing a 4-6 point gap between likely and registered voters to be utterly ridiculous. Why?

1. Among people who have already voted, Democrats lead overwhelmingly. Zogby pegs Barack Obama's advantage at 27 points among people who have already voted. The New York Times details how Democrats are overperforming, sometimes dramatically, in states where early voting is underway. (By the way, the New York Times' data on Florida is wrong, as it includes absentee ballot requests as well as early voters. According to an Open Left diarist, Democrats have a 24-point advantage among those who have actually voted early in Florida).

Pollsters ought to make certain that they're asking people whether they've already voted. Moreover, they ought to be putting these early voters through their likely voter models as a sanity check. That is, they should be testing to see whether a substantial number of people who have actually voted would in fact have been excluded by their likely voter screens. If the answer to this question is yes, they ought to be asking themselves whether their likely voter models have any basis in reality.

2. Enthusiasm is much higher among Democrats than among Republicans. The latest Diageo/Hotline numbers show that 72 percent of Democrats are enthusiastic about voting for their candidate, as opposed to 55 percent of Republicans.

3. Most likely voter models are unlikely to distinguish newly registered voters from what I would call lapsed registered voters. If someone is registered, and has been registered for a long while, but has not cast a ballot since they pulled the lever for Ross Perot in 1992, there is good reason to be skeptical about their intentions. On the other hand, voters who are newly registered have quite literally demonstrated their interest in the 2008 campaign; they are in fact quite likely to vote. Barack Obama's advantages are principally from among the newly-registered voter group.

4. There is an enormous discrepancy in the strength of the Republican and Democratic turnout operations. In past elections, such as 2004, this advantage favored the Republicans; in this one, it favors the Democrats. Barack Obama has somewhere between a 2:1 and a 4:1 advantage in field offices in most battleground states. He is relying almost exclusively on volunteers (the exceptions are a couple of cities like Philadelphia and Detroit, where Obama will most likely pay 'street money' to canvassers on Election Day). McCain, meanwhile, has already had to hire paid canvassers in Florida, and perhaps he will also in several other states.

5. Turnout among 'unlikely' voter blocs was substantially up during the Democratic primaries. Youth voters (18-29 year olds) increased their share of the Democratic electorate by 52 percent. Latino voters increased their share by 42 percent. Black voters increased their share by 8 percent.

I would like to issue a challenge to those pollsters like Franklin & Marshall and GfK which, in spite of all the facts above, are showing a substantial shift toward the Republicans when they apply their likely voter models. E-mail me -- my contact information is at the top of the page -- and tell me why you think what you're doing is good science.

There's More (he goes into some detail... good stuff...)

http://www.fivethirtyeight.com/2008/10/some-likely-voter-models-are-suspect.html

user-pic

I didn't tell it to paste the whole thing!!! I think there musta been a javascript in there somewhere....

Sorry Nate.

user-pic

Frankly, I hope the AP poll gets wide and extensive play. Complacency is the biggest threat we've got, 2 weeks out. Obama supporters should feel hungry; let McCain's supporters cheer up a bit and get careless.

user-pic

I do as well, but forget "just an outlier"; it is a disgrace to polling. 44% of the ENTIRE ELECTORATE were born-again/evangelical voters. When they actually TURNED OUT to vote, they constituted only 23% of the voting public. DISGRACE.

user-pic

Agreed. Plus, the purpose of all the nastiness from McCain is to scare it out of the base so they will vote. There's nothing like the specter of a terrorist in the WH to get the repubs out of their recliners. That's what makes me believe there may be a shift in the GOP direction.

user-pic

I like Nate Silver's comments. I must confess I am particularly interested in the biases that will be revealed once the votes are counted. In particular, I am interested in the extent to which cell phone users are not counted, or are poorly accounted for, in most of these polls.

Secondly, I strongly suspect there will be a MASSIVE increase in voter participation. Turnout will be a big jump from 2004 (59%), which was a big jump up from 2000. This is a mobilization. Such mobilizations are rare in American political history. Turnout in the 1880s reached 85% for presidential years (and think, these people were not college educated for the most part--whereas today we expect turnout to increase with education--but I digress). Turnout then declined steadily for a variety of reasons, including the introduction of the Australian ballot (secret ballots instead of publicly visible ballots, beginning in 1890) and the spread of the white-only primary in the South (also beginning in 1890, along with poll taxes and literacy tests). Turnout expanded significantly in 1932, which was the FDR realignment. It then declined slowly, dropping off more significantly around 1968, when distrust in government increased amidst the Vietnam War. Watergate merely increased this distrust.

The Reagan era showed a voter turnout of about 55% or so in most presidential elections, with a margin of error of plus or minus 3%. It declined slowly.

The Obama phenomenon will almost certainly smash recent records for turnout (just a hunch at this point--although polls suggest this is correct). And this will confound many pollsters, because new voters are generally not accounted for in their stratified samples.

The new voters will almost certainly be systematically different from the old voters. They will almost certainly be Obama voters. And we (I hope) will witness a LANDSLIDE of HISTORIC proportions (think in terms of Andrew Jackson and FDR).
