What’s the Difference Between Science and Storytelling in Data Science?

It’s easy to come up with stories and plausible explanations. What’s hard is to be right.

Olivier Binette https://olivierbinette.github.io (Duke University)
2022-03-27

Note: this is a cross-post from LinkedIn.

As scientists (and statisticians 🤓), we use data to explain things like natural phenomena and business outcomes.

We come up with nice little stories that are easy to communicate and easy for the audience to digest.

But coming up with stories is only half of the work. It’s easy to find patterns in data and come up with plausible explanations.

What’s hard is to be right. For that, you need a scientific approach.

Call your stories and explanations hypotheses. Then, make sure you have a plan to test them.

If your explanation is right, what else should be going on? Is that other thing indeed happening? Make your explanations falsifiable and then rigorously try to prove yourself wrong.
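To make this concrete, here is a minimal sketch of one way to try to prove yourself wrong. Suppose the story is "the new checkout page increased order values." If that is right, the treated group should beat the control group by more than random relabeling of customers would produce. The function name `permutation_p_value` and all the numbers below are hypothetical, made up for illustration; a permutation test is just one of many possible checks, not the only valid one.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_permutations=10_000, seed=0):
    """One-sided permutation test for mean(treated) > mean(control).

    Returns the estimated probability of seeing a difference in means at
    least as large as the observed one if group labels were assigned at random.
    """
    rng = random.Random(seed)  # fixed seed so the check is reproducible
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign the group labels
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction keeps the estimate strictly positive.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical order values, invented for this example only.
treated = [12.1, 11.8, 12.6, 12.9, 12.3, 12.7]
control = [11.9, 12.0, 11.7, 12.2, 11.8, 12.1]

p_value = permutation_p_value(treated, control)
print(f"p-value: {p_value:.3f}")
```

A small p-value means random relabeling rarely reproduces the observed gap, so the story survived this particular attempt to break it. A large one means the pattern could easily be noise, and the story should go back to the drawing board.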

Don’t stop at the explanations or at what seems to make sense. Go through the whole scientific process, down to the documentation of what you’ve done to validate your hypotheses.

This is what will make you a reliable scientist rather than just a storyteller. 👩‍🔬

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.