Quick Milestone Update
Before delving into the Bernie / Hillary data, just a quick update on some prior predictions. See http://mathematelection.blogspot.com/2015/08/the-first-gop-debate.html for the previous ones.
- August 13: Fiorina (R) passes Bush (R) at 250K likes (no prediction, although I think I may have misread things last time because around this date was when I thought Fiorina would pass Walker)
- Marco Rubio (R) has still not hit 1M likes (prediction: August 17). Since the spike in likes from the first GOP debate, he's cooled off to roughly 1,000 likes per night.
- August 18: Fiorina passes Santorum (R) at 265K (I think this was a misread too -- predicted Bush to do this on the 19th)
- August 22: Sanders passes Clinton at 1.2M likes (prediction: August 20 at 1.2M).
- Sanders is within 7000 likes of Rick Perry (R). Based on Sanders' typical increase of ~10000 likes per day, he'll pass Perry tomorrow unless something quite strange happens. Clinton will likely follow a couple of days later.
- August 22: Fiorina passes Bobby Jindal (R) with 278K (again, I think last time was a misread because Bush was supposed to pass Jindal on August 23 with 286K)
Back to the Story
Bernie passes Hillary. Based on the data so far, this was a long time coming. Since I started collecting data, Bernie has been closing the gap [for the record, I'm using first names because each of their respective campaigns primarily uses the candidate's first name]:
[As an aside, I discovered today, courtesy of an article from The Guardian several days ago, that Bernie actually has two official Facebook Pages, one of which passed Hillary long ago. However, the page I'm monitoring is the official campaign page {the other is for his presence as a U.S. Senator}, so I'm going to stick with it. As a related aside to the pages I'm following, I will not be monitoring the "candidate" Deez Nuts who recently polled at 9% in North Carolina until a verified page is created -- and this may not happen since the high school student running under this pseudonym is not actually eligible to legally become the POTUS due to age restrictions]
Since the first GOP debate (in which neither of these candidates participated), Bernie took off relative to Hillary. What intrigues me is the sudden change on August 19. All of a sudden he lost momentum relative to Hillary (or, equivalently, she gained momentum relative to him). What's interesting is that nothing major seems to have happened that day. Hillary has been facing (some) criticism following recorded remarks she made to #BlackLivesMatter activists. Bernie, two days prior, was getting (mostly) positive press for telling a reporter that his hair is not a serious issue, after being asked why it gets less scrutiny than Hillary's.
What happened?
According to a user on reddit, it may have been from Hillary's campaign buying Likes overseas. If you want to see what that user came up with, click on the link. I will not copy the infographic to this page because I have seen no indication as of yet that such figures are legitimate. I mention it because the timeframe seems to coincide with where things took a rather sudden turn in my data set. A word of caution about jumping to conclusions, though: in the extremely chaotic world of social media, this could also be just a fluke lasting a few days.
What I do know is data. From August 6 through August 19, the linear best-fit line for Hillary's lead over Bernie was the following, where "d" is the number of days since August 6:

Lead = 186,000 - 15,100 d
Obviously, this is a best fit, and so the slope and intercept are estimates. Considering the standard error of the fit, the intercept is 186,000 +/- 3000 and the slope is -15100 +/- 400. That means that on August 6, Hillary was roughly 186,000 Likes ahead of Bernie, with her lead dropping by about 15,000 Likes per day. The fit produced the coefficient of determination R^2=0.990. What does that mean? Formally, 99% of the variance in the data can be accounted for by the model listed above. Less formally, the data fall almost exactly on a line. If you don't believe the number, just scroll up and look at the fit. [I changed how "d" was expressed in the formula above to be days since August 6 rather than days before August 22, to provide clarity about what the intercept means in this case, but regardless the slope is the same either way]
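For anyone who wants to see where numbers like the slope, intercept, their standard errors and R^2 come from, here is a minimal sketch of an ordinary least-squares line fit in plain Python. The function name fit_line and the data are my own illustration: the "lead" values below are synthetic, generated from the reported fit plus an arbitrary +/- 4000 wiggle, and are not the actual daily Like counts.

```python
import math

def fit_line(d, y):
    """Ordinary least-squares fit of y = intercept + slope * d.

    Returns the estimates, their standard errors, and R^2.
    """
    n = len(d)
    d_mean = sum(d) / n
    y_mean = sum(y) / n
    sxx = sum((di - d_mean) ** 2 for di in d)
    sxy = sum((di - d_mean) * (yi - y_mean) for di, yi in zip(d, y))

    slope = sxy / sxx
    intercept = y_mean - slope * d_mean

    # Residual and total sums of squares give R^2.
    ss_res = sum((yi - (intercept + slope * di)) ** 2 for di, yi in zip(d, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot

    # Standard errors use the residual variance with n - 2 degrees of freedom.
    s2 = ss_res / (n - 2)
    se_slope = math.sqrt(s2 / sxx)
    se_intercept = math.sqrt(s2 * (1.0 / n + d_mean ** 2 / sxx))

    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "se_slope": se_slope,
        "r_squared": r_squared,
    }

# SYNTHETIC data (an assumption, not the real counts): d = 0 is August 6,
# d = 13 is August 19, with an alternating +/- 4000 wiggle around the line.
days = list(range(14))
lead = [186_000 - 15_100 * d + (4_000 if d % 2 == 0 else -4_000) for d in days]

fit = fit_line(days, lead)
for name, value in fit.items():
    print(f"{name}: {value:.3f}")
```

Feeding in the real daily lead values instead of the synthetic list should give figures comparable to the ones quoted above, assuming the original fit was also a plain least-squares line.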
Including the data since August 19, R^2 drops to 0.962. In practice, that means the data are still pretty much linear. As stated before, the last few days could be more of a fluke [for example, when Hillary's lead stagnated about a month ago for several days] and things may get back to normal later, but from a very simplified (perhaps not even really valid, just interesting) perspective, it seems to me that something has been going on for one of the campaigns (or both) over the past couple of days. The fact that this happened between the two leading candidates on the Democratic ticket right as their numbers of Likes on Facebook were equalizing is at the very least curious, and should probably elicit some skepticism about this data set to begin with. It is very much true that some companies exist to sell Likes on Facebook (such as fbskip.com), and what better time to buy than when your lead is falling? Whether that's going on now is currently just speculation. But I think there's good reason to wear very skeptical spectacles when reviewing this data.
Predictions:
(I'll be real careful this time) I'm still using a linear model fit of the data since the first debate on August 6. Predictions change in response to the new data.
- August 26: Clinton passes Perry at 1.2M Likes
- August 29: Rubio hits 1M Likes (I doubt this date is correct because he's been losing momentum) [See below]
- September 1: Bush passes Santorum at 265K
- September 4: Sanders passes Cruz at 1.4M
- September 18: Bush passes Jindal at 286K
- September 29: Sanders passes Huckabee (R) at 1.8M
- September 29: Bush passes Walker at 300K
- October 13: Sanders passes Paul (R) at 2.1M
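As a rough illustration of how a "passes on date X" prediction falls out of linear trends, here is a small sketch. The helper names days_until_pass and predicted_pass_date are hypothetical, and so are two of the inputs: I assume the leader's count is flat (leader_rate = 0) and that "today" is August 22, 2015. Only the 7000-Like gap and the ~10000 Likes per day come from the Sanders / Perry note above.

```python
import math
from datetime import date, timedelta

def days_until_pass(gap, chaser_rate, leader_rate=0.0):
    """Days for the chaser to erase `gap` Likes, given Likes-per-day rates.

    Returns None when the chaser is not closing the gap at all.
    """
    closing_rate = chaser_rate - leader_rate
    if closing_rate <= 0:
        return None
    return gap / closing_rate

def predicted_pass_date(today, gap, chaser_rate, leader_rate=0.0):
    """Round the crossing time up to the first whole day after the gap closes."""
    days = days_until_pass(gap, chaser_rate, leader_rate)
    if days is None:
        return None
    return today + timedelta(days=math.ceil(days))

# Sanders vs. Perry: 7000 Likes behind, gaining ~10000 per day.
# ASSUMPTIONS: Perry's count is treated as flat, and today is 2015-08-22.
today = date(2015, 8, 22)
print(days_until_pass(7_000, 10_000))
print(predicted_pass_date(today, 7_000, 10_000))
```

With those inputs the gap closes in under a day, which matches the "he'll pass Perry tomorrow" estimate. The dates in the list above come from fits to the full data since August 6, so they will not necessarily match a back-of-the-envelope calculation like this one.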
Interactive Plots Coming:
I'm going to keep working to get better interactive plots up. More on that probably next time. In the meantime, if there's a plot you want based on the graphics I've displayed anywhere on the blog (or on the existing interactive plots with old data), just let me know and I'll get it to you.