A good coffee should be dark, intense and rich in taste, right? That’s what you and I will say when asked. Based on consumer research, coffee companies will create new coffee roasts that are extra dark and strong and full of flavour. Then you and I are going to completely ignore those and go buy something mild, probably with a lot of milk and sugar in it. Because everybody lies.
There is a difference between action and intent. We may mean what we say, but we do not follow through. What we actually do is what we really, secretly and often subconsciously mean. That is why everybody lies. When we voice our preferences and opinions, even to ourselves, we always feel pressure to smooth the edges, to be acceptable to others, to fit in. Even when there are no legal or moral restrictions at all, we adapt our opinions to what we perceive to be the public opinion, an effect known as social desirability bias.
Ultimately, Everybody Lies is a book about Big Data, and about revealing truths by mining it. Stephens-Davidowitz brings colorful anecdotes from horse races, political campaigning, sex, marketing and other fields to illustrate the power and potential of Big Data analysis.
Did you know that Google searches for racist jokes correlated well with Trump votes (while poll results did not)? Sometimes, Big Data is the only way to uncover a truth nobody would willingly admit. A lot of data (statisticians would call it “independent observations”) makes it possible to observe rare events and to find hidden trends and connections, even in small subgroups (the author speaks of “zooming in”). Also, the data is anonymous, so nobody has a reason to filter their opinions, allowing data scientists to observe those less comfortable relations.
After singing the praises of Big Data, Stephens-Davidowitz does not shy away from explaining the pitfalls, as a good scientist should. People tend to forget that correlation is not causation, and this opens the door for clueless or abusive data mining. The author showcases wonderful examples of false correlations (there are entire websites about spurious correlations), of data manipulation and of potential exploitation by corporations and governments.
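To make the correlation-versus-causation point concrete, here is a minimal Python sketch. It is my own toy illustration, not an example from the book: the variables (`temperature`, `ice_cream`, `sunburn`) and all numbers are made up. Two quantities that never influence each other still correlate strongly because both depend on a hidden third factor, and the correlation largely disappears once that factor is accounted for.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def residuals(ys, xs):
    """What is left of ys after removing its linear dependence on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


# Hypothetical data: temperature drives both series; they never touch each other.
rng = random.Random(0)  # fixed seed so the run is repeatable
temperature = [rng.uniform(10, 35) for _ in range(500)]
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
sunburn = [0.5 * t + rng.gauss(0, 1.5) for t in temperature]

# Naive view: ice cream sales seem to "explain" sunburn cases.
r_raw = pearson(ice_cream, sunburn)

# Controlling for the confounder: correlate what temperature does not explain.
r_controlled = pearson(residuals(ice_cream, temperature),
                       residuals(sunburn, temperature))

print(f"raw correlation:        {r_raw:.2f}")
print(f"controlled correlation: {r_controlled:.2f}")
```

The raw correlation comes out high, while the controlled one hovers around zero. With real-world data the hidden factor is rarely this easy to name, which is exactly why mining for correlations without a hypothesis is risky.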
Stephens-Davidowitz is an economist, a field that is otherwise based on theories that fail even the most basic reality checks (or so it seems). Data Science can be the tool that puts economics and social science on truly scientific ground. So far, I like where this is heading, and hopefully data scientists of the future will use Big Data to model, explain and predict human behavior better than ever before.
Everybody likes exquisite coffee. Everybody hates discrimination. Everybody prefers vanilla sex. Everybody makes sound financial decisions. Everybody Lies is a good, short, entertaining and scientific book to get you started with Data Science. If you need a primer on Big Data, this book is for you.
- Big Data gathered from anonymous sources reveals the true motives and opinions of people.
- Data Science can be used for good purposes, abused for bad ones, and is sometimes not applicable at all.
- Test your assumptions.
Seth Stephens-Davidowitz is an economist and data scientist. You can read all about him on his website.
Bloomsbury UK; Edition: Export/Airside (26 May 2017)