
Causal Networks with Python: Predicting Punching Power in Boxing


Causal Forecasting with CausalNex


Causal networks are having a huge impact in the world of Artificial Intelligence, and their importance is only going to grow. Since Judea Pearl and his colleagues created a mathematical language for causality, termed "do-calculus", we have a way to explicitly calculate causal effects, and therefore we can answer "what-if" questions, a.k.a. we can make predictions.

I've picked boxing for this article for two reasons. First, it's my favorite sport, but most importantly, it is a very explicit example of cause and effect, isn't it?

I believe even a 5-year-old kid knows pretty well that it's the punch that causes the pain and not the other way around, right?

Well, long gone are the days when statisticians and their mantra "correlation does not imply causation" reigned supreme. Welcome to the new era of causality: it's the year 2021, and we can at last say: hell, that was a strong punch, it caused a knockout.

Introducing Causal Networks

Causal networks are similar to Bayesian networks in that they are also made of nodes and connections. The nodes in a Bayesian network transfer information in terms of conditional probabilities through the network. In this way, the mechanism is similar to the super-popular neural networks that you use in your deep learning algorithms. The key difference is that causal networks are directed acyclic graphs (known as DAGs) in which every edge stands for a direct causal influence.

DAGs are made of vertices and edges, and the edges represent causal relationships between variables, not correlations; that's why the arrows actually have a direction.

In a DAG, if you see A → B it means that A causes B and not the other way around. Therefore, arrows in DAGs are known as causal paths and causal effects are measured by path coefficients.

The do-calculus was created to solve equations in causal diagrams by expressing probabilities as P(A | do(B)), the probability of A if we intervene and set B, instead of the ordinary conditional probability P(A | B).

And this is crucial.

Why is that, you ask? Well, this is a causality article, so I am happy you are asking :)

The key concept in causality is that the Bayesian conditional probability is based on correlations: it tells you how likely it is that A happens given that you know B has happened. The whole Bayesian calculus is based on observations of A and B happening together. No clue about why things happen is ingrained in old-school probabilistic reasoning, so until now, in the world of statistics, you could only say that people are knocked out when they take strong punches, but equally that strong punches are thrown when people are knocked out.

You couldn’t say what caused what.

So how have we been doing predictions so far?

Good question.

We have been using regression models for forecasting, which means using correlations, not causation, to make our predictions.

When you fit a regression model to your data, you are basically using correlations to make your predictions. So on the one hand, statisticians tell you to be careful and not assume that correlation implies causation, but when you ask them to make a forecast, they use correlations… funny stuff, isn't it?

The problem is that when you are forecasting, you are not observing. You are imagining.

You don't make predictions about the past, you make them about the future, so you are not observing, you are speculating. You imagine a "what-if" scenario in which an intervention happens, for example a boxer throws a punch, and you need a causal model to tell you the probability of that punch causing a knockout.

Debunking the Myth

The reason why correlations are not always causal effects is the spurious association created by variables that influence both the dependent variable and one of the independent variables. We call these variables confounders and their spurious effect confounding.

Now, to be perfectly clear: correlation CAN imply causation, actually very often it does. In the absence of confounders, correlation IS causation.

That's why human beings have been successfully using regression models to make predictions for a long time: because correlations very often do reflect causation, and therefore the models fit the real world well.

But what happens when correlation doesn’t imply causation?

Well, now you can use causal networks.

An Example of Confounding: Power in Boxing Punches

For example, in boxing, there are two clear variables causing the power of a punch: the mass of the boxer and the speed of the punch.

[Image: Muhammad Ali was a KO artist.]

It is clear that a heavyweight hits harder than a lightweight; however, the boxer's mass is also a confounder for the relationship between hand-speed and power. At equal mass, the faster the punch, the more powerful it is, but heavy boxers are slower, and small (less powerful) boxers are faster. This is a clear example of confounding.

The way science has dealt with confounding in the past has been by running Randomized Controlled Trials (RCTs), in which the confounding effect is controlled for.

For example, in boxing, if we plot hand-speed versus power we get a negative correlation, but if we control for weight by looking at only one weight class, we arrive at the opposite conclusion (which is the truth): punching power increases with hand-speed among boxers of the same weight class.

Now, thanks to do-calculus we can simulate RCTs by adjusting for confounders in our models.
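
To make this concrete with our boxing variables (written with discrete weight classes for simplicity), the standard back-door adjustment looks like this:

P(power | do(speed)) = Σ over mass of P(power | speed, mass) · P(mass)

In words: estimate the effect of speed within each weight class, then average over the weight distribution, which is exactly what an RCT achieves by holding weight constant.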

The key takeaway here is that we need to understand the process generating the data; once we have a model or equation for that process, we can answer "what-if" questions.

So that's cool: we can generate data based on the real model and then run both a regression and a causal analysis to see how well each one forecasts.

Let’s get hands-on!

Generating our data

In this case, we know from physics that the process of punching is governed by the kinetic energy equation:

power = 1/2 * mass * speed**2.

Therefore, we can generate data for hand-speed, mass, and punching power for 1000 boxers between 55 and 110 kg, like this:
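
The original code block did not survive here, so below is a minimal sketch of how such data could be generated with NumPy and pandas. The exact speed-mass relationship and noise level are assumptions, chosen so that heavyweights come out roughly 7% slower than lightweights, as described below.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
n_boxers = 1000

# Mass uniformly distributed between 55 and 110 kg
mass = rng.uniform(55, 110, n_boxers)

# Hand-speed (m/s) decreases slightly with mass, plus some noise:
# a 110 kg boxer ends up roughly 7% slower than a 55 kg boxer (assumed relationship)
speed = 10.0 - 0.013 * (mass - 55) + rng.normal(0, 0.15, n_boxers)

# Punching power from the kinetic energy equation
power = 0.5 * mass * speed ** 2

boxers = pd.DataFrame({"mass": mass, "speed": speed, "power": power})

# Scatter plot of power vs hand-speed for the whole population
plt.scatter(boxers["speed"], boxers["power"], s=5)
plt.xlabel("hand-speed (m/s)")
plt.ylabel("power (J)")
plt.show()
```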

The plot of power vs speed shows a non-linear, descending curve:

[Figure: power vs hand-speed]

This means that in our total population of 1000 boxers, the bigger the boxer, the slower he is. Makes sense. Although perhaps not as much as we might expect: the decrease in speed is not huge, heavyweights are just 7% slower on average than lightweights. This explains why size is by far the most important variable causing punching power, and why even a few extra pounds of difference between two boxers of the same weight class can give a serious advantage in the fight.

If you are a professional boxer and you want to hit harder, it is wiser to try to gain some muscle than to try to hit faster. You cannot increase your speed much once you are technically good, but you can gain a few kilograms of muscle. They won't make you noticeably slower, and you will hit harder.

Now let’s fit a regression model

I will create a DataFrame and sort boxers by weight, then I will train the model using only the 200 smallest boxers on purpose (from 55 to 65 kg).

I want the model to see only a small part of the population, and then I want to make an intervention and predict what would happen if we were talking about heavyweights instead of lightweights.

We can do this with this code:
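
The original snippet is also missing here, so the following is a sketch of the described workflow with scikit-learn: sort the boxers by mass, fit a plain linear regression on the 200 lightest ones only, then predict for the whole population. It reuses the hypothetical `boxers` DataFrame from the generation sketch above.

```python
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Sort boxers by weight and train only on the 200 lightest (roughly 55-65 kg)
boxers_sorted = boxers.sort_values("mass").reset_index(drop=True)
train = boxers_sorted.iloc[:200]

X_train, y_train = train[["mass", "speed"]], train["power"]
X_all, y_all = boxers_sorted[["mass", "speed"]], boxers_sorted["power"]

reg = LinearRegression().fit(X_train, y_train)
y_pred = reg.predict(X_all)

print("R2 on the full population:", r2_score(y_all, y_pred))

# Real power (blue) vs model predictions (orange), boxers sorted by weight
plt.plot(y_all.values, label="real target")
plt.plot(y_pred, label="model predictions")
plt.legend()
plt.show()
```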

This outputs the following graph for real power vs predicted power:

[Figure: blue: real target, orange: model predictions]

R2 = 0.91 is high because, after all, the effect of speed is very small.

Now let’s model a Causal Network using CausalNex

CausalNex is an excellent Python library for causality, especially because it provides a scikit-learn-style interface, so we can easily compare causal models with traditional sklearn models.

We can create a causal model in CausalNex as follows:
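
Again the original code is missing, so here is a minimal sketch assuming CausalNex's sklearn-style DAGRegressor (available in recent CausalNex versions), fit on the same lightweight training split and evaluated on the whole population as in the regression sketch above. Hyperparameters are left at their defaults; check the documentation of the version you install for the structure-learning options.

```python
from causalnex.structure import DAGRegressor
from sklearn.metrics import r2_score

# sklearn-compatible causal model: learns a DAG over the features and the target
# (default settings assumed; tune penalties/thresholds as needed)
dag_reg = DAGRegressor()

# Fit on the 200 lightest boxers only, then predict for everyone
dag_reg.fit(X_train, y_train)
y_pred_causal = dag_reg.predict(X_all)

print("R2 on the full population:", r2_score(y_all, y_pred_causal))
```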

Obtaining this output:

R2 = 0.96, which is quite good. The model extrapolates very well from small to big boxers.

Conclusion

Modeling causal relationships is not a problem anymore thanks to do-calculus and its Python implementations. There are a number of Python libraries implementing causal models, namely DoWhy (from Microsoft), PyMC3 (a popular statistical library), and CausalNex.

The reason I picked CausalNex for this article is that I think it is intuitive, easy to use and has a good interface to make it compatible with Scikit-Learn.

Now that it is easy to model causal relationships with Python, a huge new world of possibilities is open to the machine learning and AI community.

I am sure the next decade will bring a causal revolution to AI, because we will be able to leverage current technologies to discover the true size of causal effects. The implications of this are going to change fields like medicine, energy and weather forecasting, economics, epidemiology, and many more.

Currently, thanks to deep learning and neural networks, we are solving problems like telling a cat from a dog, creating chatbots, recognizing faces, faking faces, and telling who has a disease.

That’s cool.

Now imagine being able to say why cancer is lethal and then change its mortality rate, calculate how much impact CO2 has on global warming, or discover why the economy crashed.

Wouldn’t that be cooler?

Of course it would be. Answering why is cool.

Hope you enjoyed this.

Happy coding!

