Gallup vs Gallup, Round 2
As detailed numbers become available from early voting, it has become extraordinarily clear that the surge in the numbers of early voters has been disproportionately fed by supporters of the junior senator from Illinois. The lopsided margins in partisan turnout have been reflected in several national polls. From Pew:
Obama holds a 53% to 34% lead among the sizable minority of voters (15%) who say they have already voted. Among those who plan to vote early but have not yet voted (16% of voters), 56% support Obama, while 37% support McCain.
Yesterday's ABC/WashPost poll provides a finer layer of granularity, showing that the higher turnout is being driven, at least in part, by Obama's GOTV operations in battleground states:
Vote preference among these early voters is 59-40 percent, Obama-McCain, widening to 68-31 percent in the 16 battleground states and 71-28 percent in the eight toss-up states.
That's why the initial release of numbers by the Gallup Organization proved so puzzling. Gallup found that "early voting generally reflects the same Obama lead evident in the overall sample." I've been hammering them for that finding ever since. Since there's a high degree of correlation between pollsters who find Obama doing disproportionately well among early voters and the size of the lead they assign him overall, I think that the problem is significant. It may well reflect problems with polling that would produce underestimates of his overall support.
On Wednesday, however, Gallup released new numbers, showing that early voters had increased from 11% to 18% of respondents. Here's the key passage:
The voter preferences of the group of 1,430 individuals who have already voted and who were interviewed by Gallup between Oct. 17 and Oct. 27 show a 53% to 43% Obama over McCain tilt. Among the group of those who say they have not yet voted, but will before Election Day, the skew towards Obama is more pronounced, at 54% to 40%. By comparison, those who are going to wait to vote on Nov. 4 manifest a narrower 50% to 44% Obama over McCain candidate preference. (Across all registered voters over this time period, Obama leads McCain by a 51% to 43% margin).
The new numbers also revealed the precise nature of Gallup's problem. Take a close look at what Gallup's doing here. Normally, tracking polls are compiled on a rolling basis - aggregating 3-5 days of data. In this case, Gallup has aggregated all eleven days of data. That increases its sample size, but, as it happens, decreases the accuracy of the results.
Here's the problem. There are (basically) two ways to vote early: by absentee ballot, or at a polling station that's open prior to election day. When Gallup began asking the question, almost no states had commenced in-person early voting. So the 7% of respondents who told Gallup they'd already voted must have been absentee voters who had already mailed their ballots. That's a population which tips whiter, older, and more Republican than the electorate at large. (We can confirm this not only from past elections, but also from those states that have released numbers which separate out the two forms of early voting.) And sure enough - Gallup reported that its initial sample skewed older. Early in-person voters, on the other hand, are proving this year to be overwhelmingly Democratic, disproportionately black, and supportive of Obama. Most of the 11% of voters who have cast ballots since Gallup started asking the question fall into this latter group. But Gallup isn't making any effort to separate out these two entirely dissimilar groups.
Normally, that might not matter. But aggregating data is only a sensible move when the underlying population you're trying to assess remains relatively constant. In this case, though, we know that's not true. In essence, Gallup has been recounting the same absentee voters (7%) and only slowly adding in the early voters (now 11%). So that initial group of absentee voters comprises 40% of those who have already voted, but fully 60% of Gallup's sample. That's why Gallup is reporting only a modest ten point (53-43%) lead for Obama. His actual lead among those who have already voted in the Gallup sample is probably about the same as his lead among those who report they still intend to vote early - 54-40%.
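The over-weighting described above can be sketched numerically. This is a rough illustration, not Gallup's method: it assumes equal daily samples, a constant 7% of respondents who had already voted absentee when the window opened, and in-person early voters growing linearly from 0% to 11% across the eleven days. The linear growth and the function name `absentee_shares` are assumptions made purely for illustration.

```python
def absentee_shares(absentee_pct=7.0, in_person_final_pct=11.0, days=11):
    """Return (final_day_share, pooled_share) of absentee voters among
    respondents who say they have already voted.

    Assumptions (not from Gallup): equal sample sizes each day, a constant
    absentee share, and in-person early voting rising linearly from 0.
    """
    if days < 2:
        raise ValueError("need at least two days to model growth")

    # In-person share on each day, rising linearly from 0 to the final value.
    in_person_by_day = [
        in_person_final_pct * d / (days - 1) for d in range(days)
    ]

    # On the last day alone, absentee voters are a minority of early voters.
    final_day_share = absentee_pct / (absentee_pct + in_person_by_day[-1])

    # Pooling all days counts the constant absentee group once per day,
    # while the in-person group is only partly present on earlier days.
    pooled_absentee = absentee_pct * days
    pooled_in_person = sum(in_person_by_day)
    pooled_share = pooled_absentee / (pooled_absentee + pooled_in_person)

    return final_day_share, pooled_share


final_day, pooled = absentee_shares()
print(f"Absentee share of early voters on the final day: {final_day:.0%}")
print(f"Absentee share of early voters in the pooled sample: {pooled:.0%}")
```

Under these assumptions the absentee group is a minority of those who have voted by the end of the window but a majority of the pooled sample, which is the same direction as the roughly 40% versus 60% figures above.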
So why is Gallup doing this?
I don't think that Frank Newport is biased with respect to this election, but I do think he has a vested interest in protecting Gallup's reputation. And Gallup has a very big problem. In its first release, it found that "there is little significant difference in the propensity to vote early between the Obama supporters and the McCain supporters interviewed in the aggregated sample." That was a very strange finding - a claim that 31% of Obama supporters and 29% of McCain supporters planned to vote early, for a total of 30% of voters. Since then, although the results in the daily tracking have actually been a little bit tighter, the split among early voters has widened, making it clear that Obama supporters are in fact far more likely to vote early than McCain backers. But by reporting the results the way it has, saying that early voters initially split like the rest of the electorate, now support Obama 53-43%, but that the gap among those who have yet to vote early is 54-40%, Gallup creates the illusion that we're witnessing shifting support. (The problem would be even worse if, as I suspect, Gallup is also aggregating its "yet to vote" pool over all eleven days, further exaggerating the largely-illusory split.)
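The arithmetic behind that first finding can be sketched as follows. This is an illustration only: it uses the 31% and 29% early-voting propensities quoted above and, purely for the sake of example, borrows the 51% to 43% registered-voter split from Gallup's later release. The function name is hypothetical, and undecided voters are ignored.

```python
def early_two_way_share(support_a, support_b, propensity_a, propensity_b):
    """Candidate A's share of the two-candidate vote among early voters.

    support_*: each candidate's overall support (any consistent unit).
    propensity_*: fraction of each candidate's supporters who vote early.
    """
    early_a = support_a * propensity_a
    early_b = support_b * propensity_b
    return early_a / (early_a + early_b)


# Overall two-way share, using the 51-43 split as an assumed baseline.
overall = 51 / (51 + 43)

# Two-way share among early voters if 31% and 29% of supporters vote early.
early = early_two_way_share(51, 43, 0.31, 0.29)

print(f"Obama's two-way share overall: {overall:.1%}")
print(f"Obama's two-way share among early voters: {early:.1%}")
```

With propensities that close together, the early-voter split lands within about two points of the overall split, which is what "little significant difference" would imply.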
But the real news here has to do with sampling and likely voter screens. Gallup has actually been producing three sets of numbers - registered voters, traditional likely voters (based on current intentions and past behavior), and expanded likely voters (based only on current intentions.) The early voting numbers are expressed as a percentage of registered voters who indicate that they are likely to vote - essentially, the expanded likely voter pool.
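To make the distinction between the two likely-voter screens concrete, here is a minimal sketch. It is not Gallup's actual model, whose details are not given here; the `Respondent` fields and both function names are hypothetical, and each screen is reduced to a simple yes/no test.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    # Hypothetical fields, reduced to booleans for illustration.
    registered: bool
    intends_to_vote: bool  # current intentions
    voted_before: bool     # past behavior


def passes_traditional_screen(r: Respondent) -> bool:
    """Traditional screen: current intentions AND past behavior."""
    return r.registered and r.intends_to_vote and r.voted_before


def passes_expanded_screen(r: Respondent) -> bool:
    """Expanded screen: current intentions only."""
    return r.registered and r.intends_to_vote


# A newly registered voter who plans to vote but has no voting history.
first_time_voter = Respondent(
    registered=True, intends_to_vote=True, voted_before=False
)

# Excluded by the traditional screen, included by the expanded one.
print(passes_traditional_screen(first_time_voter))
print(passes_expanded_screen(first_time_voter))
```

The only difference between the two predicates is the past-behavior term, so any respondent without a voting history can pass the expanded screen but never the traditional one.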
Let's assume for a moment that Gallup's early voter numbers (adjusted for their methodological error) are accurate. In other words, that 33% of registered voters will have voted by election day, and that they split something like 54-42%. Gallup didn't predict that result. The first time it looked at early voters, remember, it found "little significant difference" between their preferences and those of other voters. The obvious conclusion is that even Gallup's minimal "expanded" screen, which tested only for interest, has proven not to be predictive of actual voting behavior. Let me repeat that. When Gallup looked only at voters who expressed a sufficient level of "interest" to pass their screen, it predicted that McCain and Obama voters would cast ballots at the same rates. But now that 18% of registered voters have actually gone to the polls, it turns out that McCain and Obama voters are casting ballots at very different rates.
The oddest part? The expanded model actually shows higher levels of support for Obama than the raw numbers of registered voters do. So if Gallup isn't managing to capture the relative likelihood of Obama and McCain supporters to vote, it must either be the case that their sampling methodology is off, or that Gallup is overstating the likelihood of McCain supporters going to the polls. That's not as strange as it may initially seem. The traditional model rewards past voting behavior, a metric that gives McCain supporters an edge, but which apparently has not been predictive of early voting at all. The expanded model rewards interest. Isn't it possible that McCain supporters are more likely than Obama supporters to report their interest in this election and their intent to vote, without actually summoning up the time or effort to go to the polls? That's the likely result of a gap in enthusiasm or voter outreach, both of which Gallup has already documented.
I challenge Frank Newport to explain why his data doesn't suggest that even his "expanded" screen is proving to model actual voter behavior extremely poorly, and to understate the extent of Obama's advantage in this election. The last time I challenged Dr. Newport, he responded by providing an illuminating look at the internal processes of his polling operation. I hope that he will be similarly open this time around.
If you've enjoyed this, please share it with other readers by clicking the 'recommend this' link. You can find more analysis on my blog, or subscribe by clicking "Follow Me" on the right. As always, I welcome your comments and corrections, and thank you for your feedback.

Great stuff as always.
Correction for second to last paragraph:
I think I caught another missing word or typo earlier on but I got your meaning and just plowed through the rest of it without making a note.
October 30, 2008 1:11 PM
Many may say they are for mcShame by default. They are Republicans by history or whatever. They may even be against Bush and against the war. But for whatever reason they are skeptical of Barack but not really enthused for the alternative.
I myself think that Obama supporters are so eager to cast a vote that they would go through hell and high water just to demonstrate their desire for change and their hope that Obama can really turn around some things that are now so broken we are all suffering.
But the anger and resentment on the other side are more likely to lead to mayhem than a civil action like voting. Voting may not feel like it expresses what these people are feeling. They're betrayed and frustrated and angry. And mcShame is stirring up those feelings. But how does voting express them? How does voting seem like it's going to make a difference?
This doesn't explain statistics or voter choice. But maybe it speaks to a voter's attitudes. And when your campaign is in a way all about undermining the democratic process and civil discourse, how does voting fit in?
Whereas if your campaign is about recommitting ourselves to a civil and open society, where justice and equality are available to all, voting as a behavior is part of your duty as a citizen.
Thanks as ever, Fly! You seem to "fly" in here just when we need it.
October 30, 2008 2:00 PM
It seems to me that if there is an error in Gallup's methods, then it could be off in either direction. If there is a fundamental flaw, then the polls are worthless and trying to second guess what they mean is futile.
If you are correct and Gallup underestimates Obama's support, it's a problem shared by all of the polls, isn't it? Gallup shows one of the biggest gaps, I believe. Frankly, an underestimate for McCain's support concerns me a heckofa lot more than the reverse.
I should add: there is always the possibility that I didn't understand a word you wrote.
October 30, 2008 5:11 PM
Fly, here are some numbers from the Atlanta Journal-Constitution today that you and articleman might find helpful. Advance voting and voting by mail started Sept. 22nd.
BREAKDOWN OF WHO’S VOTING
TURNOUT DEMOGRAPHICS
Note: The gender numbers do not add up to 100 percent because the gender of voters listed in the racial category “Other” is not included in the data.
Wait times for voting run 2-4 hours at every poll in Metro Atlanta. The one exception was Monday, when some voters waited 8 hours because the state's computers either slowed down or crashed that day.
October 30, 2008 6:16 PM
I believe that, in the end, Gallup (and others) are unwilling to modify their 2004 voter models to match the NEW voter tallies around the country (which shift overwhelmingly Democratic) until there is tangible voting evidence that these new voters are actually going to SHOW UP AND VOTE.
(New registrations skyrocketed among 18-24 voters in 2004, but there was no 'skyrocketing' in their actual participation in the election. It increased, yes, but by nowhere near as much as their registrations did.)
If Gallup and others DID incorporate the new demographics, then you would likely see a 3-4% GAIN for Obama (as reflected in the Gallup II (Expanded Model) Poll).
But most pollsters who are in the business professionally are probably unwilling to 'go out on a limb' based on increased numbers of voters who may or may not show up Nov. 4 - and are unwilling to redefine the demographics based on 'enthusiasm' or 'professed loyalty' to the Democratic Party.
Just my .02 worth.
Great article!!
October 30, 2008 7:39 PM
NY Times/CBS poll (10/25-29/08):
Already voted:
Obama 55%, McCain 35%
Enthusiastic supporters:
Obama 67%, McCain 45%
So one possibility is that we are simply seeing an enthusiasm gap, compounded by better Obama early-GOTV, particularly in battleground states. The key question becomes whether the non-enthusiastic supporters will end up voting or not - there's really no way to predict that based on early voting.
Seems to me the real question is how well the various likely-voter models have predicted the "already voted" group. To the extent that any of the models would have excluded a voter that had already voted, we know that that particular likely voter model is too stringent.
October 30, 2008 8:00 PM
As I suggest above, we effectively have an answer for Gallup. Both of its models flubbed the test. They predicted that early voting would mirror election day voting, and it hasn't. It's been vastly better for Obama than for McCain. They've subsequently picked up some of that advantage, but while most polls are putting it around 20 points, they're lagging. So why did Gallup fail to predict this outcome, and why are they still slightly off?
I think it's a mistake to focus just on the voters who are being excluded. I don't have their internals, of course, but I suspect that Gallup was caught on the horns of a dilemma. Its traditional model excluded too many young and black voters on the basis of their past voting history. But its "expanded" model may have expanded too far, including too many Republicans.
Look at it this way. The traditional model has done a good job in the past, at least over the final week. For Republican voters, this race isn't that different from past cycles, and there's no particular reason to think that their behavior will diverge. So if you want to know which Republicans will actually vote, it makes sense to restrict on the basis of both interest and past behavior. For Democrats, on the other hand, the race is different - in particular, for black voters, and perhaps for young voters as well. So you want to drop the past behavior component, and perhaps even use a lower interest setting, at least for those groups. But Gallup can't (or more precisely, won't) apply different screens on a partisan basis. So its traditional screen is underpredicting Obama's support, and its expanded screen is overpredicting McCain's.
My guess is that Obama's actual advantage is a little higher than even the expanded screen is currently suggesting, because those numbers relatively overstate McCain's backing. But it's not a huge difference.
October 30, 2008 9:03 PM
This comment deserves to be a post itself. Of course a pollster should have a more expanded screen for an energized party than for a less energized party. Great insight.
October 31, 2008 12:16 AM
If you simply posit a high propensity to vote early (but not absentee) for black voters (driven by GOTV efforts, high enthusiasm and perhaps memories of long lines and voting barriers in high minority districts in past elections) wouldn't that explain what is happening in the Gallup poll?
Note there doesn't seem to be high early voting among young voters.
October 31, 2008 12:20 AM
Early voting skews older because of inclusion of mail-in ballots traditionally used by seniors. Registration of young voters is up. So the early voter actual numbers are higher even if you have seen percentages that aren't dazzling.
October 31, 2008 1:00 AM
And now McCain is cutting way back on his GOTV to put the money towards more TV ads?
http://www.dailykos.com/story/2008/10/31/03654/518/249/647590
I hate to use a sports analogy, but these last weeks remind me of the end of a basketball game where the team that's behind keeps fouling - hoping that the other team will miss the free throws and they can score quickly and then foul again - it's been known to work but it usually winds up increasing the margin of defeat.
In this case, McCain is committing one flagrant foul after another but if Obama keeps sinking the free throws he's going to wind up winning by a much larger margin than he would if McCain would just run a normal McCain 2000 campaign.
Degrading the vaunted GOP GOTV that helped Bush so much seems like the most desperate move since the Palin pick.
October 31, 2008 6:17 AM
In Colorado we have permanent mail-in voting status available. Whenever my wife and I have come in contact with Obama GOTV workers, whether standing in line for an event ticket, standing in line for an event entrance, or canvassing, they are urging everyone to sign up for mail-in voting and have the forms on hand to accomplish this. This is not your grandfather's absentee ballot. It is a concerted effort to relieve pressure at polling places and provide a paper trail in the era of computer voting. I really think too many things have changed too quickly to model this election accurately. Future models will draw heavily on this election but this one is a watershed event, in my opinion.
October 31, 2008 11:40 AM