Data Correlation, Data Causality or Data Explanation? Part 2

If we take a helicopter view of data management, we see all kinds of different methods to read data and determine its worth. The question of how trustworthy the data is comes up often, and there are many different ways to answer it. How about data correlation, data causality or data explanation? Which method fits best, and when?

The method we try to clarify in this article is data causality.

Definition of Data Causality

Causality (also referred to as causation,[1] or cause and effect) is the agency or efficacy that connects one process (the cause) with another process or state (the effect), where the first is understood to be partly responsible for the second, and the second is dependent on the first. In general, a process has many causes, which are said to be causal factors for it, and all lie in its past.

In practice

I am still not a mathematician, a clinical researcher or a data scientist, but I find this subject very interesting, especially when we look at these different approaches in different markets with different goals. Explaining the real value of data is hard for many of us, as mentioned in an earlier article about data on the balance sheet.

Let’s take a look at data causality. Data causality is most common in the exact sciences, such as healthcare and other clinical research. There is no room for error when you test a medicine or something similar. One of the key elements of research based on causality is that you work with a selected group of people who represent the “N”, also known as a sample. This sample has to be representative of the larger group and trustworthy. Causal research comes in two types: experimentation, for example in a laboratory, and statistical research.

In this case data causality is about the “Why” question.
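
To make the experimentation type a bit more concrete, here is a minimal sketch in Python of a randomized experiment on a sample of N people. Everything in it is an assumption for illustration only: the function name run_trial, the sample size, the baseline values and the "true effect" of the treatment are all made up, and a real clinical study involves far more than this. The idea it shows is simply that when people are assigned to the treated or the control group at random, the difference between the two group averages can be read as the effect of the treatment.

```python
import random
from statistics import mean


def run_trial(n=1000, true_effect=2.0, seed=42):
    """Simulate a randomized experiment on a hypothetical sample of n people.

    Returns the difference between the average outcome of the treated
    group and the average outcome of the control group.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    treated, control = [], []
    for _ in range(n):
        # Hypothetical outcome a person would have without treatment.
        baseline = rng.gauss(10.0, 1.0)
        # Random assignment: a coin flip decides the group, so the two
        # groups differ only by chance and by the treatment itself.
        if rng.random() < 0.5:
            treated.append(baseline + true_effect)
        else:
            control.append(baseline)
    return mean(treated) - mean(control)


estimated_effect = run_trial()
print(round(estimated_effect, 2))
```

With a large enough sample, the printed estimate should land close to the assumed true effect of 2.0. With a very small N it can be far off, which is one reason the sample has to be chosen so carefully.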