Oh, the Logical Fallacies!
This is not news for anyone--Hillary's campaign and the mass media coverage thereof have been marked by one logical fallacy or inconsistency after another. These fallacies have marred even blogs, including TPM. Today things just went over the line for me; enough for me to be sitting and writing this after more than 15 hours in the office. The following is not meant to be proof (hence there are no source references), it is merely meant as an argument weakener, a plea to take a step back and analyze the news of the day in a colder analytical way.
I want to pose the following questions re the news of the day:
(1) We all saw the Gallup numbers showing Hillary running better against McCain than Obama does. I believe (and not only because I support Obama) that the survey could be misleading because:
(1.1) Polls have margins of error. Here, we have the possibility of compounding error. To state this simply, a poll shows with a certain high probability (an x% level of significance) that the results are correct (i.e., the candidate the poll points to as the preferred candidate is actually the preferred candidate). Now, here we are using four polls simultaneously to come to a conclusion. Say each has a probability of 10% of being significantly wrong. If the four errors are independent, the joint probability that at least one is wrong is about 35% (1 - 0.9^4 is roughly 0.34). Probabilistic uncertainty tends to snowball. Using several polls simultaneously unleashes this effect.
(1.2) An even more basic question is how reliable polls have been in this cycle (the above 10% is just an example picked out of thin air, an oversimplification of real life). If the answer is "not very reliable," why are we taking a poll with a possibly high compounded error at face value?
(1.3) To my knowledge, this is one of the first major polls to show us the result of Hillary being (slightly) ahead. Statistically it may be an outlier (a similar point to the ones above, but with a slightly different nuance). Just because it is the first/only (?) substantiation of a theory that Hillary and her surrogates have been using for months does not prove anything. Heck, a broken clock is right twice a day--if the campaign is saying the same thing over and over and the data vary, there will be some data confirming the point. But must we be blind to all data supporting the contrary point of view?
(1.4) Another troubling aspect of this poll is the measure. We see only total voter numbers, which have nothing to do with electoral college math (ask Bill Clinton, who never won a majority of the popular vote but happily served two terms). Thus we see that the poll is not very useful because it uses the wrong metric--it is like trying to figure out who is the better tennis player by observing the players play featherball--while there are some superficial similarities between the sports, skill in one is not necessarily a good predictor of success in the other. (Hey, to revisit the above, Al Gore won the popular vote in 2000 and still lost the election!)
(1.5) Oh, and by the way, it is entirely possible (although, in fairness, I do not think it is the case) that they are tied with the numbers as they stand, because what we see are two unweighted averages that we are tempted to add in our minds. Now, Obama trails McCain by a point in both "types" of states. Hillary leads by seven in one and trails by two in the other. Clearly Obama averages "trails by one" no matter what the weights of the states are. Hillary's overall picture is not so clear. If the "trails by two" states carry eight times the weight of the "leads by seven" states (states should be weighted by the number of electoral votes), she will be overall in a "trails by one" position, which is a tie with Obama! (Check my math if you want to, but I did major in math and I vouch that I have not knowingly misrepresented anything...)
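For anyone who does want to check the math in (1.5), here is a minimal Python sketch of the weighted-average argument. It takes Hillary's margins as a 7-point lead in one group of states and a 2-point deficit in the other, and Obama's as a 1-point deficit in both. The function names and the example weights (3 and 5) are made up for illustration; they are not from Gallup.

```python
def weighted_margin(margins, weights):
    """Weighted average of per-group margins (positive = lead, negative = deficit)."""
    total = sum(weights)
    return sum(m * w for m, w in zip(margins, weights)) / total


def break_even_ratio(lead, deficit, target):
    """Weight of the 'deficit' group, per unit weight of the 'lead' group,
    at which the overall weighted margin equals `target`.

    Solves (lead - deficit * r) / (1 + r) = target for r.
    """
    return (lead - target) / (deficit + target)


# Obama trails by 1 in both groups, so the weights do not matter.
# The weights 3 and 5 are arbitrary, hypothetical values.
obama = weighted_margin([-1, -1], [3, 5])

# Clinton: how heavily must the "trails by two" group be weighted
# for her overall margin to match Obama's "trails by one"?
ratio = break_even_ratio(lead=7, deficit=2, target=-1)
clinton = weighted_margin([7, -2], [1, ratio])

print("Obama overall margin:", obama)
print("Break-even weight ratio:", ratio)
print("Clinton overall margin at that ratio:", clinton)
```

With a 7-point lead the break-even weight works out to eight to one; with a 5-point lead it would be six to one. Either way the point stands: the two unweighted numbers alone do not tell you who is ahead overall.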
(2) The US is a democracy. Elections, not polls, are the deciding factor. Otherwise, Hillary is clearly losing by an ever-widening margin in a nationwide poll of polls (which has to carry a greater weight than a single Gallup poll)--one need not look further than www.pollster.com. Why should polls matter just when they are convenient? A lot of arguments the Hillary camp has made this campaign have been later abandoned for the exact opposite. Now, don't get me wrong, Barack is not a saint, but he has not committed such atrocious insults to my (and everyone else's) intelligence. What does she think we are? Dumb?
(3) By the way, the whole poll saga reminds me of a short story by Asimov which is worth reading--the name is "Franchise."
Again, this is just meant to be the starting point of a debate, not proof of a particular position. I hope I have been able to elucidate some of my more technical probability/statistics points. I welcome any and all questions and comments and promise to reply in due course.

Comments (7)
You have to remember that the Clinton campaign is still trying to build the narrative before the Rules and Bylaws Committee meeting.
If you listened to the conference call this morning, it was a continuing effort to state their case without any modifications. One journalist, from the Florida Sun Sentinel(?), asked about the validity of the Florida vote. Questions of this type were not answered. Most journalists at this last conference call are completely amazed at the disconnect.
For Gallup to issue this poll does damage to the general political discourse. It would seem that at this time their reputation would be at stake for taking such a careless view of polling. I don't think they are that worried about being wrong; they'll just file it away as "a mistake in variables". It's the support from the brand that the Clinton campaign needs. And you have to remember Mark Penn still has some advisory influence within the campaign.
With this added little tidbit, their argument with the superdelegates and the Rules and Bylaws Committee is just a little sweeter.
May 28, 2008 8:18 PM
edit:
Most journalists at this last conference call are completely amazed at the disconnect.
May 28, 2008 8:20 PM
Not to mention the fact that if a person being polled is asked to make a choice between a) McCain, b) Obama, and c) Clinton, they are freely able to choose between three, count them, three, candidates.
If people are only being asked to choose between Obama and McCain, then that would be more fair. I don't know how the questions are posed, I've never been polled. But until Hillary Clinton (or Barack, for that matter) is no longer a choice on the poll, these numbers mean nothing.
May 28, 2008 9:13 PM
Thanks for the comment Lis. In that line of thought, the order of asking the questions also matters (if I remember correctly, it can net a couple of percentage points in a real election, not to mention a poll). I wonder if they randomized the order in which they asked the respondents. The methodology must be on Gallup's website, but I am falling asleep on my keyboard as I am typing. I will try to look it up tomorrow.
May 29, 2008 7:34 PM
I just want to let everybody know, for historical reference, that this was the day the Steamship Empress of Ireland sank and 1,024 people drowned.
I just automatically think of May 29 as Empress of Ireland Sinking Day, in case that ever pops up in conversation and it seems inexplicable or inappropriate to you. That's just how I remember it's May 29.
So don't freak out or make a big deal out of it if I bring it up.
May 29, 2008 12:58 AM
When correcting people for logic (or math) errors, please know what you are talking about....
First, as the typical survey will compare two (or more) candidates, they are drawn from the same sample and the values are not independent. Thus, curiously, the assumption that there are "four polls" is incorrect. It is more likely that there are two polls (one comparing McCain and Obama and the other comparing McCain and Clinton). As these values are not independent, their probabilities are ALSO not independent (excluding "refused," "undecided," and "someone else," which is not a valid answer in a forced-choice comparison, the two are essentially r and 1-r, so they should have something very close to the same probability). That is why a poll will be reported with one probability number although there is more than one candidate (for purely technical reasons, the probabilities will vary between candidates if there are more than two valid responses).
(Actually, based on Gallup's website, the questions are asked in the same poll, so the questions are only quasi-"independent", but they would be treated as independent for this analysis. The surveys are aggregated from a 13-day period and could, therefore, be subject to systematic drift, but that is an entirely different matter.)
Second, the probability for the joint set is roughly 1 - ((1-p_1) * (1-p_2) * ... * (1-p_N)). In your hypothesized case (corrected), N is 2, so it is 1 - (1-0.1)^2 = 1 - 0.9^2 = 1 - 0.81 = 0.19 (19%), which, while not 10%, is far lower than 35%. It does appear that you would have been within 1% of correct if you had been correct about the 4 separate "polls."
Next, you make a fairly severe error in asserting that p is 10%. Gallup (see http://www.gallup.com/poll/107539/Hillary-Clintons-SwingState-Advantage.aspx) says that p is 1%; this is the result of a sample of over 11,000 people. Based on the math I showed before, the joint p for two questions should be about 2%.
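The joint-probability arithmetic in this thread is easy to check numerically. The following is a minimal Python sketch, assuming the individual errors are independent (which is exactly the assumption being disputed here, so read it as an illustration of the formula, not of the Gallup data). The function name is hypothetical.

```python
import math


def prob_at_least_one_wrong(error_probs):
    """P(at least one estimate is wrong), ASSUMING the errors are independent.

    Implements 1 - (1 - p_1)(1 - p_2)...(1 - p_N).
    """
    return 1 - math.prod(1 - p for p in error_probs)


# Four "polls" at a 10% error probability each (the original post's example).
four_at_10 = prob_at_least_one_wrong([0.10] * 4)

# Two polls at 10% each (the corrected count argued in this comment).
two_at_10 = prob_at_least_one_wrong([0.10] * 2)

# Two polls at 1% each (the figure attributed to Gallup in this comment).
two_at_1 = prob_at_least_one_wrong([0.01] * 2)

for label, value in [("4 x 10%", four_at_10),
                     ("2 x 10%", two_at_10),
                     ("2 x 1%", two_at_1)]:
    print(f"{label}: {value:.4f}")
```

These come out to roughly 34%, 19%, and 2% respectively, matching the figures quoted in the post and in this comment. If the errors are positively correlated, as they would be for questions drawn from the same sample, the true joint probability is lower than the independent-case number.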
Your use of the word outlier is inconsistent with its technical meaning - in excess of 2, 2.5, or 3 SD (depending on source) from the mean. Observations are outliers, estimates of means (including estimates of percents) are not. It is always possible that the estimate falls within the 1% probability area, but that would not be an outlier.
There is very little evidence that polls have been unreliable. Some polls have been of questionable validity; these mostly do not use well-established random sampling protocols. The rest tend to agree with each other and are, therefore, by definition reliable. They MAY also be valid measures of what they are designed to be measures of (voter intentions at the time of the survey). Since there is no other known way to measure these intentions, it is hard to know. What you likely mean is that they fail as forecasts of voter behavior some 3, 4, 5 or more months later. Not a big surprise. They aren't forecasts (although often incorrectly used that way); they are measures of voter intention at the time they are conducted.
I am afraid I tire of explaining your errors.
Let me be clear, I see no reason to think Obama will not be elected president in November. But I would prefer a defense of his viability that isn't so confused.
May 29, 2008 1:44 AM
Thanks for the comment, even if some of your attitude is pushing the limits of polite discourse. So:
- By four polls I mean: Obama vs. McCain in "Obama states," Obama vs. McCain in "Clinton states," Clinton vs. McCain in "Obama states," and Clinton vs. McCain in "Clinton states." Ergo, my assertion of "four polls."
- Your second point is a non-point and entirely based on the first. You essentially agree with the method (apologies for rounding, but there is only so much I am willing to type at 1AM).
- On number three... wait a second, where do I make an "assessment?" I thought the expression "an example picked out of thin air" was clear. This is a conceptual example, as the goal is to discuss theory, not exact numbers.
- The "Gallup poll" is a particular observation in the population of "polls." Although I admit I did not have in mind the technical definition of SD, I do not see how my usage has violated it, exactly.
- The general reliability point about polls is a larger query (and phrased as such above). How is asking a question an error?
- In all fairness, I thought "aren't forecasts (although often incorrectly used that way), they are measures of voter intention at the time they are conducted," while being somewhat obvious, added to the debate in a meaningful way.
- Another thing to note is the set of caveats I tried to make very clear in my post (apparently not clearly enough to be noted on first reading)--this was never meant as a defense of viability. This is an "argument weakener"; it does not prove a point, but seeks to develop alternative explanations that might lead to what is observed. As I said, more of a starting point for a discussion than a proof of any sort.
Have a good night!
May 29, 2008 7:29 PM