Today I presented my work on this blog at the Wolfram Technology Conference. So that happened.
I'm part of the Wolfram Student Ambassador Program, and was invited to share the sorts of things I've been doing. I've been a nervous wreck about it for weeks. The people at the conference are industry leaders, many of whom have been working in Mathematica for many years (compared to my ~3-4 years). But y'know what?
It was great!
I think roughly 60 people attended my talk, and it went really well. *phew* Several people came up to me after the talk to say how much they enjoyed it, and that was about the biggest self-esteem boost I've had since getting into graduate school.
The slides for the talk are available here:
Mathematica Notebook: https://dl.dropboxusercontent.com/u/4972364/WTC_2015/Presentation_beta.nb
(needs the following GIF: https://dl.dropboxusercontent.com/u/4972364/WTC_2015/blog.gif)
The Mathematica Notebook version (if you own a copy of Mathematica) is preferred -- the formatting is better and it's fully interactive. In either case, it gives you some insight into how I create the visualizations for this blog and some of the technology behind it -- specifically the Wolfram Data Drop, the Manipulate function, and CloudDeploy.
I'll have some updates for interactive things over the next couple weeks. Going to make it a lot easier for you to play with the data on your own. I'll also have some other updates from the coding side of things.
Some questions from the Q&A are worth mentioning here as well as things I'd like to do in the future:
Have you thought about using Twitter / Google Trends?
Short answer: yes, but not yet. I'm definitely hoping to analyze these sorts of data streams as well, seeing how well they match up to each other and how well they match up to poll-based public opinion estimates.
What about the selection bias of Facebook?
This question (and others like it) points to a very real flaw (or at least a major assumption) inherent in this analysis: Facebook, Twitter, and even Google give a skewed representation of the population. A lot of likely voters simply aren't going to have a Facebook page with a lot of information. Furthermore, just because someone follows a candidate doesn't mean they'll vote for them, nor does not following a candidate indicate a lack of support. I'm hoping to look at past elections and find similar Facebook Like data to see how well these sorts of things actually match up. If there's enough overlap, maybe I can make accurate predictions in spite of the biases, or at least try to correct for them.
(not asked, but something I mentioned) What about extrapolation? Surely a linear model is inappropriate here.
Absolutely. Right now, I'm making very naive predictions that are surely wrong. That's not a statement made out of false humility. It's really quite a stupid way of looking at things. That doesn't make it bad. Just very likely to be inaccurate as time goes on. The question is how to bring in more sophisticated models (specifically regarding discrete events like the debates). The computation is straightforward. The challenge is the theory. And honestly, I don't (yet) have a ton of experience with this sort of data. So, it's coming down the pipeline, just maybe not for a while.
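To make the "naive" part concrete, here is a rough sketch of what a straight-line extrapolation amounts to. It is written in Python rather than Mathematica, it is not the code behind this blog, and the day and Like counts are made-up placeholders. The function names `fit_line` and `extrapolate` are just illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def extrapolate(xs, ys, x_future):
    """Extend the fitted straight line out to x_future."""
    slope, intercept = fit_line(xs, ys)
    return intercept + slope * x_future


# Hypothetical data: day index and total Likes for one candidate.
days = [0, 1, 2, 3, 4]
likes = [1000, 1100, 1200, 1300, 1400]

# The line just keeps climbing at the same rate forever.
print(extrapolate(days, likes, 30))  # 4000.0
```

The fitted line has no way to level off, and it knows nothing about a sudden jump after something like a debate, which is why the projection gets less trustworthy the further out it goes.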
Anyway, I need to get some sleep for the conference tomorrow, followed by flights, working on my neuroscience homework, and studying for midterms for Thursday. Oy.