In the first two conversations of Discussing Data Science, my guests used some intriguing metaphors in talking about their perspective on data.

Christian Djurhuus introduced me to the metaphor of the data void to talk about data gaps in biomedicine. My co-founder, Brendan Guerin, introduced me to the metaphor of the data mosaic to talk about building his analysis of an investment opportunity.

I like both of these metaphors individually, but in this article, I reflect on what it would mean to put them together, and how these metaphors fit into our broader philosophy of science.

The Limits of Evidence-Based Decision-Making

It seems to me that the nature of evidence-based (or data-driven) decision-making is nicely captured by Brendan's metaphor of the data mosaic: To inform a decision with the best evidence possible, we try to build a picture. We gather and analyze data and assemble the results as the tiles of the mosaic. The more tiles we have, the more complete our picture. The more complete our picture, the more confidence we have in our understanding of the evidence, and hopefully, the more success we have following our decision.

The limitations of evidence-based decision-making are nicely captured by the metaphor of the data void: We should recognize that most of the tiles that are necessary to complete the picture are missing. At the beginning of the mosaic assembly (i.e., data gathering) process, this is not a problem, because the process always starts with a lot of uncertainty, gaps, blanks, what-have-you. But as we begin putting the tiles together, we start to see how little of the picture we will actually be able to fill to our satisfaction, because the data doesn't exist, or we don't have time to find it, or we lack the means to access it.

So what can we do in this situation? How do we assemble a compelling data mosaic in a data void?

We build the best mosaic we can, within time and resource constraints, and base our decision on that necessarily fragmented, fuzzy, incomplete picture.

Second-Order Decision-Making

I suspect that the above state—accepting the incomplete mosaic and trying to make the best of it—is where most of us who value evidence-based decisions end up.

But let's push this line of thought a little further. It may be the case that we never have all the data we'd like, but that doesn't mean we can't do some second-order analysis of our decision-making process itself.

So let's ask ourselves: How do we know if, in the end, we've made a good decision? How do we know if our decision moved us toward greater value or away from it? How do we know if our decision got us to the best possible outcome? Or just to a slightly better outcome? Or to a worse outcome?

In other words, how do we know if our decision reflects a wise or prudent “navigation of the data void”?

How do we know if we made a mistake in assembling the mosaic? What could we improve about our decision-making process for next time? Are there techniques or strategies we might employ to generate a clearer and more complete mosaic despite the data void?

All of these second-order questions are about our capabilities (or the capabilities of an organization) to reason and act effectively on the basis of data. I think we can perhaps boil them down to one big question: How much mastery do we have of the knowledge in our domain?

The Dream of Knowledge Mastery

Knowing how much knowledge mastery one has is difficult. But for an evidence-driven organization, it seems like the ideal answer would be something like this:

"Our organization has as complete a mastery of knowledge in our domain as possible. We know this because we have all the data that is accessible to us—all cleaned and structured in a way that is informative for our business. The data is logically linked, cubed, and right at our fingertips. When we need insights, we know where (or to whom) to go, and we get those insights immediately."

On the one hand, I suspect few organizations can truthfully give that answer. Indeed, prudently navigating the data void is not easy—and I think the technology to do it is still somewhat new and evolving. [Obligatory, shameless plug time: The tech to achieve this kind of mastery is what Prism provides its clients.]

On the other hand: This ideal answer should not strike us as an absurd or impossible dream. On the contrary, if we strive to make evidence-based decisions, then this is the answer we need to be giving—or at the very least, the answer we should be working toward.

But this brings me to what is, I think, a very encouraging insight: Recognizing the data void and working toward this ideal evidence-based perspective is fundamentally just a mindset. Or if you'd prefer a more technical, philosophical phrasing: It is simply adopting a set of epistemological premises.

Knowledge mastery means accepting that there will always be gaps in your data mosaic. It means being self-aware and strategic about the heuristics that you must use to judge and act effectively despite the gaps and the uncertainty.

And there's more good news: Once you give up the premise that knowledge mastery requires certainty, the world really does open up in a new way. Anxiety about uncertainty gives way to increased curiosity and confidence, because you begin to see that the data void is no longer a barrier to your good judgment. By exploring the second-order questions (like those I've canvassed above), you move to a higher level of knowledge, where you can better understand how you know and learn, and how you can improve your own learning and decision-making.

To close on a personal note: This shift from fear and anxiety in the face of uncertainty to excitement and confidence was one of the great joys of my philosophical training. It is the root of everything we do at Prism. Sharing this joy and bringing this confidence and power to our clients is why I love doing what we do.