Saturday, August 22, 2015

Bernie Beats Hillary... Again?

Interesting times for the Democrats! As of today, Bernie Sanders (I) has passed Hillary Clinton (D) in terms of total likes on Facebook. After trailing by ~7000 likes for the past couple days (see below), Bernie finally passed Hillary. But is that the full story? It's hard to say.

Quick Milestone Update
Before delving into the Bernie / Hillary data, just a quick update on some prior predictions. See http://mathematelection.blogspot.com/2015/08/the-first-gop-debate.html for the previous ones.
  • August 13: Fiorina (R) passes Bush (R) at 250K likes (no prediction, although I think I may have misread things last time because around this date was when I thought Fiorina would pass Walker)
  • Marco Rubio (R) has still not hit 1M likes (prediction: August 17). Since the spike in likes from the first GOP debate, he's cooled off to roughly 1000 likes per night.
  • August 18: Fiorina passes Santorum (R) at 265K (I think this was a misread too -- predicted Bush to do this on the 19th)
  • August 22: Sanders passes Clinton at 1.2M likes (prediction: August 20 at 1.2M).
  • Sanders is within 7000 likes of Rick Perry (R); based on Sanders' typical increase of ~10000 likes a day, he'll pass Perry tomorrow unless something quite strange happens. Clinton will likely follow a couple of days later.
  • August 22: Fiorina passes Bobby Jindal (R) with 278K (again, I think last time was misread because Bush was supposed to pass Jindal on August 23 with 286K)
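The Sanders / Perry estimate in the list above is just a gap divided by a closing rate. Here is a minimal Python sketch of that arithmetic (Python, not the Mathematica used for the actual analysis). The 7000-like gap and Sanders' ~10000 likes a day come from the bullet above; Perry's daily gain of 1000 and the function name days_until_pass are assumptions made up for illustration.

```python
import math
from datetime import date, timedelta

def days_until_pass(gap, trailing_rate, leading_rate):
    """Whole days until the trailing page overtakes the page in front.

    gap is the current lead (in likes) of the page in front; the rates are
    likes gained per day. Returns None if the gap is not closing.
    """
    if gap <= 0:
        return 0
    closing_rate = trailing_rate - leading_rate
    if closing_rate <= 0:
        return None
    return math.ceil(gap / closing_rate)

TODAY = date(2015, 8, 22)

# leading_rate=1_000 is a hypothetical value for Perry, not from the data set.
days = days_until_pass(gap=7_000, trailing_rate=10_000, leading_rate=1_000)
pass_date = TODAY + timedelta(days=days)
print(days, pass_date)
```

As long as Perry's page grows by less than about 3000 likes a day, this still rounds to a single day, which is why the guess is fairly safe.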
Back to the Story
Bernie passes Hillary. Based on the data so far, this was a long time coming. Since I started collecting data, Bernie has been closing the gap [for the record, I'm using first names because each of their respective campaigns primarily uses the candidate's first name]:


[As an aside, I discovered today, courtesy of an article from The Guardian several days ago, that Bernie actually has two official Facebook Pages, one of which passed Hillary long ago. However, the page I'm monitoring is the official campaign page {the other is for his presence as a U.S. Senator}, so I'm going to stick with it. As a related aside to the pages I'm following, I will not be monitoring the "candidate" Deez Nuts who recently polled at 9% in North Carolina until a verified page is created -- and this may not happen since the high school student running under this pseudonym is not actually eligible to legally become the POTUS due to age restrictions]

Since the first GOP debate (in which neither of these candidates participated), Bernie took off relative to Hillary. What intrigues me is the sudden change August 19. All of a sudden he lost momentum relative to Hillary (or, equivalently, she gained momentum relative to him). What's interesting is that nothing major seems to have happened that day. Hillary has been (somewhat) facing criticism following remarks made to #BlackLivesMatter activists that were recorded. Bernie, two days prior, was getting (mostly) positive press for his remark to a reporter about how his hair is not a serious issue when asked about why his gets less scrutiny than Hillary's.

What happened?

According to a user on reddit, it may have been from Hillary's campaign buying Likes overseas. If you want to see what that user came up with, click on the link. I will not copy the infographic to this page because I have seen no indication as of yet that such figures are legitimate. I mention it because the timeframe seems to coincide with where things took a rather sudden turn in my data set. A word of caution about jumping to conclusions, though: in the extremely chaotic world of social media, this could also be just a fluke lasting a few days.

What I do know is data. From August 6 through August 19, the linear best-fit line for Hillary's lead compared to Bernie was the following, where "d" is the number of days since August 6.
Hillary's lead (in Likes) = 186,000 - 15,100 * d
Obviously, this is a best fit, so the slope and intercept are estimates. Considering the standard error of the fit, the intercept is 186,000 +/- 3000 and the slope is -15100 +/- 400. That means that on August 6, Hillary was roughly 186,000 Likes ahead of Bernie, with her lead dropping by roughly 15,000 Likes per day. The fit produced the coefficient of determination R^2=0.990. What does that mean? Formally, 99% of the variance in the data can be accounted for by the model listed above. Less formally, the data are pretty much exactly in a line. If you don't believe the number, just scroll up and look at the fit. [I changed how "d" was expressed in the formula above to be days since August 6 rather than days before August 22, to provide clarity about what the intercept means in this case, but regardless the slope is the same either way]
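To see what those coefficients imply, here is a small Python sketch that solves the fitted line for the day the lead reaches zero. It uses only the intercept, slope and standard errors quoted above. The bracket is a crude assumption of mine (both coefficients shifted by one standard error, ignoring any correlation between them), not a proper confidence interval.

```python
from datetime import date, timedelta

# Coefficients quoted in the post: lead(d) = INTERCEPT + SLOPE * d
INTERCEPT, INTERCEPT_ERR = 186_000, 3_000   # Likes on August 6
SLOPE, SLOPE_ERR = -15_100, 400             # Likes per day
START = date(2015, 8, 6)                    # d = 0

def crossing_day(intercept, slope):
    """Value of d at which the fitted lead reaches zero."""
    return -intercept / slope

best = crossing_day(INTERCEPT, SLOPE)
# Crude bracket: shift both coefficients by one standard error.
earliest = crossing_day(INTERCEPT - INTERCEPT_ERR, SLOPE - SLOPE_ERR)
latest = crossing_day(INTERCEPT + INTERCEPT_ERR, SLOPE + SLOPE_ERR)

# Calendar day during which the fitted line crosses zero.
crossing_date = START + timedelta(days=int(best))
print(round(earliest, 2), round(best, 2), round(latest, 2), crossing_date)
```

With these numbers the fitted line reaches zero a bit after d = 12, around August 18. The actual crossover listed above came on August 22, which fits the slowdown after August 19.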

Including the data since August 19, R^2 drops to 0.962. In practice, that means the data are still pretty much linear. As stated before, the last few days could be more of a fluke [for example, when Hillary's lead stagnated about a month ago for several days] and things may get back to normal later, but from a very simplified (perhaps not even really valid, just interesting) perspective, it seems to me that something has been going on for one of the campaigns (or both) over the past couple days. The fact that this happened between the two leading candidates on the Democratic ticket right as their number of Likes on Facebook was equalizing is at the very least curious, and should probably elicit some skepticism about this data set to begin with. It is very much true that some companies exist to sell Likes on Facebook (such as fbskip.com), and what better time to do so than when your lead is falling? Whether that's going on now is currently just speculation. But I think there's good reason to wear very skeptical spectacles when reviewing this data.
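For readers who want to see why a few off-trend days pull R^2 down, here is a self-contained Python sketch of an ordinary least-squares fit and its coefficient of determination. The series is synthetic (a made-up steady decline followed by a made-up slowdown), so it shows the mechanism only and does not reproduce the 0.990 and 0.962 figures.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(xs, ys):
    """Share of the variance in ys explained by the fitted line."""
    intercept, slope = linear_fit(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Synthetic lead: 14 days of steady decline, then 3 days of slowdown.
steady = [200_000 - 14_000 * d for d in range(14)]
slowdown = [16_000, 14_000, 12_000]

r2_steady = r_squared(list(range(14)), steady)
r2_with_tail = r_squared(list(range(17)), steady + slowdown)
print(r2_steady, r2_with_tail)
```

The first value comes out at essentially 1 because the synthetic points sit exactly on a line; adding the three slower days lowers it, even though the series still looks close to linear.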

Predictions:
(I'll be real careful this time) I'm still using a linear model fit of the data since the first debate on August 6. Predictions change in response to the new data. 
  • August 26: Clinton passes Perry at 1.2M Likes
  • August 29: Rubio hits 1M Likes (I doubt this date is correct because he's been losing momentum) [See below]
  • September 1: Bush passes Santorum at 265K
  • September 4: Sanders passes Cruz at 1.4M
  • September 18: Bush passes Jindal at 286K
  • September 29: Sanders passes Huckabee (R) at 1.8M
  • September 29: Bush passes Walker at 300K
  • October 13: Sanders passes Paul (R) at 2.1M
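Milestone dates like "Rubio hits 1M" are very sensitive to the daily rate that goes into the linear model. Here is a rough Python sketch of that sensitivity. The current total (985,000) and the fitted rate (2,200 a day) are hypothetical numbers chosen only so that the first case lands on August 29; the ~1000 a night figure is the recent rate mentioned in the milestone update.

```python
import math
from datetime import date, timedelta

def milestone_date(current_likes, likes_per_day, target, today):
    """Date a linear trend reaches target, or None if the page is not growing."""
    remaining = target - current_likes
    if remaining <= 0:
        return today
    if likes_per_day <= 0:
        return None
    return today + timedelta(days=math.ceil(remaining / likes_per_day))

TODAY = date(2015, 8, 22)
CURRENT = 985_000  # hypothetical total, not a number from the data set

# Hypothetical rate from a fit that still includes the post-debate spike.
fit_rate_date = milestone_date(CURRENT, 2_200, 1_000_000, TODAY)
# Recent pace of roughly 1000 likes per night.
recent_rate_date = milestone_date(CURRENT, 1_000, 1_000_000, TODAY)
print(fit_rate_date, recent_rate_date)
```

Under these assumptions the same milestone moves by more than a week depending on which rate is used, which is the reason for doubting the August 29 date.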


Interactive Plots Coming:
I'm going to keep working to get better interactive plots up. More on that probably next time. In the meantime, if there's a plot you want based on the graphics I've displayed anywhere on the blog (or on the existing interactive plots with old data), just let me know and I'll get it to you.

Wednesday, August 12, 2015

The First GOP Debate

The first debate for Republican candidates was August 6, just about a week ago. Since then, there have been many interesting changes in the Facebook Like dataset.

Debate Winners:
This is one of the first times to see if major changes in the political landscape are reflected by Facebook Likes. CNN and The Huffington Post seem to agree that Carly Fiorina (R), Marco Rubio (R), and Ben Carson (R) were the winners of the debates (Fiorina was in the "happy hour" debate while Rubio and Carson were in the "primetime" debate).

Let's take a look at the data from August 7:


Based on the percentage overnight change, it certainly seems that Fiorina and Carson were major winners. In fact, Carson's overnight increase of 124,341 Likes set a new record (at least since I started this project). [Although, he broke his own record the next day with 145,526 Likes] It definitely seems like percent change is one of the best ways to track public opinion shifting, rather than total Likes or even absolute overnight change. Trump consistently does well overnight, but it's difficult to ascertain whether this is because of agreement with his views on issues or because people are simply interested in following him due to controversy. However, the percentage bumps seem to jibe with the "declared winners" of the Debate (to some extent anyway). What's interesting is that the Facebook data seems to not really show much of an increase for Rubio. In fact, Bernie Sanders (I) has seen more of a bump since the debate (the large bump for Carson is August 7).
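As a quick illustration of why percent change and absolute change can tell different stories, here is a minimal Python sketch. Carson's gain of 124,341 is from the paragraph above, and the ~1.8M base is the rounded total from the August 6 milestone in this post, so the percentage is approximate. The small page is entirely hypothetical.

```python
def overnight_percent_change(previous_total, overnight_gain):
    """Overnight gain as a percentage of the previous day's total."""
    return 100.0 * overnight_gain / previous_total

# Base of ~1.8M is rounded, so treat the result as approximate.
carson_pct = overnight_percent_change(1_800_000, 124_341)

# Hypothetical small page: far fewer new Likes, but a larger percentage.
small_page_pct = overnight_percent_change(250_000, 25_000)

print(round(carson_pct, 1), small_page_pct)
```

A page with a fifth of the absolute gain can still show the bigger percentage bump, which is why the two measures rank the "winners" differently.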


I created a new statistic this time that shows, I think, who the front-runners seem to be in terms of momentum: the percent increase in Likes since the start of the data set. While Donald Trump (R) still leads in terms of total Likes, Fiorina now actually has the highest percent change, up there with Bernie Sanders (I) and Carson. We'll see how this plays out, but I wouldn't be surprised if those four become the major contenders later on (but that's just a personal opinion).
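The momentum statistic is simple to compute from a time series of totals. The sketch below is in Python, with candidate names and numbers that are placeholders rather than values from the data set.

```python
def percent_increase_since_start(totals):
    """Percent growth from the first recorded total to the latest one."""
    first, latest = totals[0], totals[-1]
    return 100.0 * (latest - first) / first

# Placeholder series (oldest first); not real data.
likes = {
    "Candidate A": [2_000_000, 2_300_000, 3_000_000],
    "Candidate B": [100_000, 180_000, 250_000],
    "Candidate C": [900_000, 950_000, 1_000_000],
}

momentum = {name: percent_increase_since_start(t) for name, t in likes.items()}
ranking = sorted(momentum, key=momentum.get, reverse=True)
print(ranking)
```

In this made-up example the page with the most total Likes (Candidate A) is not the one with the most momentum (Candidate B), the same pattern described for Trump and Fiorina.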



The Sanders Bump:
One strange artifact of the debate (and other recent press) is that Sanders is taking off relative to Hillary Clinton (D). Over the past week or so, Sanders has rapidly decreased Clinton's lead. Using all the data collected so far in a linear model, Sanders should pass Clinton in approximately 43 days. Using just the past week (given his actual increased political momentum after the debate and the #BlackLivesMatter interruption in Seattle at one of his rallies), Sanders should pass Clinton in just a little over a week. Interesting times are ahead for the Democrats.
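The gap between the two estimates comes down to which window of data feeds the linear fit. Here is a Python sketch of that effect on a synthetic series (a made-up slow decline followed by a made-up fast one). It will not reproduce the 43-day or one-week figures; it only shows that fitting the recent window gives a much earlier crossover.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def days_until_zero(xs, ys):
    """Days after the last observation until the fitted line reaches zero."""
    intercept, slope = linear_fit(xs, ys)
    return -intercept / slope - xs[-1]

# Synthetic lead: 23 days shrinking by 2,000 a day, then 7 days by 15,000 a day.
days = list(range(30))
lead = [400_000 - 2_000 * d for d in range(23)]
lead += [lead[-1] - 15_000 * k for k in range(1, 8)]

days_all = days_until_zero(days, lead)               # fit on everything
days_recent = days_until_zero(days[-7:], lead[-7:])  # fit on the last week only
print(round(days_all, 1), round(days_recent, 1))
```

Which window is "right" is a judgment call: the short one reacts quickly to real changes in momentum, but it also reacts to flukes.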



Milestones:
Here are some milestones that candidates have hit recently, along with when they were predicted to happen (see my previous post).

August 3: Bernie Sanders (I) passes Marco Rubio (R) with 925K Likes (prediction: July 29 with 918K)
August 6: Ben Carson (R) passes Mike Huckabee (R) with 1.8M Likes (prediction: August 14 with 1.8M)
August 8: Ben Carson (R) passes Rand Paul (R) with 2M Likes (no prediction)
August 9: Donald Trump (R) hits 3M Likes (prediction: August 10)
August 10: Bernie Sanders (I) hits 1M Likes (no prediction)
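One way to score the predictions is to count how many days each was off. The Python sketch below uses the three milestones above that had a prediction; a positive number means the milestone happened later than predicted.

```python
from datetime import date

# (milestone, actual date, predicted date), taken from the list above
milestones = [
    ("Sanders passes Rubio", date(2015, 8, 3), date(2015, 7, 29)),
    ("Carson passes Huckabee", date(2015, 8, 6), date(2015, 8, 14)),
    ("Trump hits 3M", date(2015, 8, 9), date(2015, 8, 10)),
]

errors = [(actual - predicted).days for _, actual, predicted in milestones]
mean_abs_error = sum(abs(e) for e in errors) / len(errors)

for (name, _, _), err in zip(milestones, errors):
    print(f"{name}: {err:+d} days")
print(f"mean absolute error: {mean_abs_error:.1f} days")
```

The errors come out to +5, -8 and -1 days, so the predictions were off by a little under five days on average.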

New Predictions:
These are based on just the data since the debate. The newest predictions for the party nominees are Ben Carson and Bernie Sanders.

August 14: Fiorina passes Walker (R) at 290K
August 17: Rubio hits 1M
August 19: Bush (R) passes Santorum (R) at 265K
August 20: Sanders passes Perry (R) and Clinton at 1.2M
August 23: Clinton passes Perry at 1.2M
August 25: Bush passes Jindal (R) at 286K
September 4: Sanders passes Cruz (R) at 1.5M
September 8: Carson passes Trump at 3.97M
September 9: Carson hits 4M
September 11: Christie (R) passes Graham (R) at 136K

Sunday, August 2, 2015

Major (code) Updates!

This time, I don't really have many updates on the actual election front. Things have been fairly stagnant over the past week or more.

However.

I've been changing quite a bit of my process and have some cool things to share:

1) Wolfram Data Drop
As stated previously, I've been using Wolfram Mathematica to process the FB Like data. I'm working to move things more and more to the cloud and make things more and more accessible. A friend of mine told me about Wolfram Data Drop (cloud-based storage, publicly available, often used as an interface for the "Internet of Things" -- maybe you store your minute-by-minute pulse measurements from a FitBit or something there). So now my data is accessible here.

2) Cloud Visualizations
Another cool thing is deploying some of the things I've built to the cloud. I'm still having a little bit of trouble producing things quite the way I want, but I do have a way to present all the data up until today in an interactive format:

To view graphs like below, go here.

To view a pie chart like below, go here.

To view the actual data like below, go here.

And finally, to see the estimated total likes of each candidate from now until election day (based on a best-fit line of data collected so far), go here.
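The election-day estimate is a straight-line extrapolation. Here is a minimal Python sketch of the idea, assuming election day is November 8, 2016 and using a hypothetical starting total and daily rate (neither is taken from the data set).

```python
from datetime import date

ELECTION_DAY = date(2016, 11, 8)
TODAY = date(2015, 8, 2)

def projected_likes(current_total, likes_per_day, today, target_day):
    """Extend a constant daily gain from today out to target_day."""
    days_ahead = (target_day - today).days
    return current_total + likes_per_day * days_ahead

# Hypothetical inputs: 1,000,000 Likes today, gaining 10,000 per day.
days_ahead = (ELECTION_DAY - TODAY).days
estimate = projected_likes(1_000_000, 10_000, TODAY, ELECTION_DAY)
print(days_ahead, estimate)
```

Stretching a few weeks of data across more than 460 days magnifies any error in the slope, so estimates like this are best read as "if nothing changes" scenarios.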

I'll be working on making these faster soon. There are a lot of calculations happening in each of these, which is why they may take a while to load. Until then, I wish you happy times in exploring all the data!