Statistical Causality


  • Lecture


  • 2020/2021


Marco Malvaldi


Tuesday, May 4, 2021 - 09:00 to Tuesday, May 25, 2021 - 11:00



This short course is organized for Ph.D. students in Data Science and other programs of the organizing institutions.
Registration is mandatory.
Other interested people can register, but their admission is subject to approval (based on the total number of participants).
Registration form:

Statistical Causality - Asking “What if?” to data


May 4th 2021, 09:00 - 11:00

A gentle introduction: forecasting the future, explaining the past
What causes what? A (necessary) philosophical digression
Why you should care about causality: Simpson’s paradox
Elements of probability theory and Bayes’ theorem.
Difference between p(A|B) and p(A|do(B)): spurious correlation - or, selling ice cream does not cause wildfires.
How to define causality: B causes A if p(A|do(B)) > p(A|B). Is it the altitude that causes the temperature, or is it the temperature that causes the altitude?
Directed acyclic graphs, or DAGs for lazy people. A DAG is not a Bayesian net.
How to build a DAG. Conditional independence, or: is my DAG correct?
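
As a rough illustration of the gap between p(A|B) and p(A|do(B)), here is a minimal sketch of the ice cream example. It assumes a hypothetical toy model in which hot weather drives both ice cream sales and wildfires; the variable names and all probabilities are invented for illustration and are not taken from the course material.

```python
# Hypothetical toy model: hot weather (H) drives both ice cream sales (I)
# and wildfires (F); ice cream has no effect on fires. All numbers are made up.
p_hot = {True: 0.5, False: 0.5}               # p(H)
p_ice_given_hot = {True: 0.8, False: 0.2}     # p(I=1 | H)
p_fire_given_hot = {True: 0.3, False: 0.05}   # p(F=1 | H), independent of I given H

# Observational: p(F=1 | I=1) = sum_h p(F=1 | h) * p(h | I=1), using Bayes' theorem.
p_ice = sum(p_ice_given_hot[h] * p_hot[h] for h in p_hot)
p_fire_obs = sum(
    p_fire_given_hot[h] * p_ice_given_hot[h] * p_hot[h] / p_ice for h in p_hot
)

# Interventional: do(I=1) removes the arrow H -> I, so H keeps its prior p(H).
p_fire_do = sum(p_fire_given_hot[h] * p_hot[h] for h in p_hot)

print(f"p(F | I)     = {p_fire_obs:.3f}")   # 0.250
print(f"p(F | do(I)) = {p_fire_do:.3f}")    # 0.175
```

In this toy model, seeing high ice cream sales raises the probability of a fire because it hints at hot weather, while forcing ice cream sales leaves the fire probability at its baseline.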

Dealing with DAG and reality
May 7th 2021, 09:00-11:00

A system of two variables: How to establish if A causes B, or B causes A, or neither.
How to establish causality in a system of n variables, all known and measurable.
Virtual intervention on a DAG: the do-operator.
Simpson’s paradox again, this time solved.
A drug good for men, good for women, harmful for humans.
How can you test a goalkeeper?
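
The "good for men, good for women, harmful for humans" situation can be reproduced with a few lines of arithmetic. The sketch below uses illustrative counts (an assumption, not data from the course) and assumes that sex is a confounder influencing both who takes the drug and who recovers, which is the condition under which adjusting for it is appropriate.

```python
from fractions import Fraction

# Illustrative counts (an assumption of this sketch): (recovered, total).
data = {
    ("men", "drug"): (81, 87),
    ("men", "no drug"): (234, 270),
    ("women", "drug"): (192, 263),
    ("women", "no drug"): (55, 80),
}
groups = ("men", "women")
arms = ("drug", "no drug")

def rate(group, arm):
    """Recovery rate inside one stratum."""
    recovered, total = data[(group, arm)]
    return Fraction(recovered, total)

def pooled_rate(arm):
    """Recovery rate when the strata are naively merged."""
    recovered = sum(data[(g, arm)][0] for g in groups)
    total = sum(data[(g, arm)][1] for g in groups)
    return Fraction(recovered, total)

# Share of each group in the whole population, p(group).
population = sum(total for _, total in data.values())
share = {g: Fraction(sum(data[(g, a)][1] for a in arms), population) for g in groups}

def adjusted_rate(arm):
    """p(recovery | do(arm)) = sum_g p(recovery | arm, g) * p(g)."""
    return sum(rate(g, arm) * share[g] for g in groups)

for arm in arms:
    print(arm, float(pooled_rate(arm)), float(adjusted_rate(arm)))
```

With these counts the pooled recovery rate is lower with the drug, yet the rate is higher with the drug inside each group and also after adjustment, so the apparent harm comes from who chose to take the drug.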

What if I can’t measure a variable?
May 11th 2021, 09:00-11:00

Building a DAG with hidden variables:
Surgery on a DAG: blocking a pipe is the best way to find where the leak is...
Constraints in DAGs with hidden variables
How to establish if X causes Y in a system of n variables – some of them unknown.
First attempt: the Back-door criterion.
Second attempt: the Front-door criterion.
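
For reference, the two criteria lead to the following standard adjustment formulas. The symbols Z (an observed set satisfying the back-door criterion relative to X and Y) and M (an observed mediator set satisfying the front-door criterion) are notation assumed here and do not appear in the course outline.

```latex
% Back-door adjustment: Z blocks every back-door path from X to Y
% and contains no descendant of X.
\[
  p(y \mid do(x)) = \sum_{z} p(y \mid x, z)\, p(z)
\]

% Front-door adjustment: M intercepts all directed paths from X to Y,
% with no unblocked back-door path from X to M, and X blocking every
% back-door path from M to Y.
\[
  p(y \mid do(x)) = \sum_{m} p(m \mid x) \sum_{x'} p(y \mid m, x')\, p(x')
\]
```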

Let’s generalize
May 14th 2021, 09:00-11:00

General methods for operating on DAGs
A general criterion: if no child of X can be reached with a bidirectional trajectory, we can state whether X causes Y. That is, we can always express p(Y|do(X)) as a linear combination of conditional probabilities, and the procedure can be cast into an algorithm.

Causality in time series
May 21st 2021, 09:00-11:00

Granger causality
Transfer entropy
Where these two concepts fail (but they are nevertheless mandatory to learn)
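
To give a flavor of Granger causality, here is a minimal sketch of a lag-1 test on synthetic data. It compares a forecast of one series from its own past with a forecast that also uses the past of the other series, and summarizes the improvement with an F-statistic. The helper names (`ols_rss`, `granger_f`), the single lag, and the simulated series are all assumptions of this sketch; a real analysis would pick the lag order carefully and use a tested statistics library.

```python
import random

def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit via the normal equations."""
    k = len(rows[0])
    # Augmented system [X^T X | X^T y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * w for v, w in zip(a[row], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(cause, effect):
    """F-statistic: does lagged `cause` improve a lag-1 forecast of `effect`?"""
    targets = effect[1:]
    steps = range(1, len(effect))
    restricted = [[1.0, effect[t - 1]] for t in steps]
    full = [[1.0, effect[t - 1], cause[t - 1]] for t in steps]
    rss_restricted = ols_rss(restricted, targets)
    rss_full = ols_rss(full, targets)
    dof = len(targets) - 3  # observations minus parameters of the full model
    return (rss_restricted - rss_full) / (rss_full / dof)

# Synthetic data (assumption): x is white noise and y follows x with one step of delay.
rng = random.Random(0)
n = 500
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, n)]

f_xy = granger_f(x, y)  # expected to be large: past x helps predict y
f_yx = granger_f(y, x)  # expected to be small: past y does not help predict x
print(f"F(x -> y) = {f_xy:.1f}, F(y -> x) = {f_yx:.1f}")
```

A large statistic only says that one series helps forecast the other; it does not by itself rule out a hidden common driver.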

Hands on
May 25th 2021, 09:00-11:00

Asking questions to the data: how to use the do-operator to build hypothetical worlds.
Direct effects of X on Y. Is the Covid fatality rate higher in China or in Italy?
Indirect effects of X on Y. Had Italy made the political choices of Sweden, would its fatality rate be different?
Natural effects vs. induced effects. How would Belotti play at Milan coached by Pioli?
Questions: hopefully as many as possible.

You can find the recordings here:
Lecture 1/6:
Lecture 2/6:
Lecture 3/6:
Lecture 4/6:
Lecture 5/6:
Lecture 6/6:
