
source link: https://pkghosh.wordpress.com/2020/10/28/causal-inference-with-deep-learning-using-manufacturing-supply-chain-optimization-as-an-example/

Causal Inference with Deep Learning using Manufacturing Supply Chain Optimization as an Example

Machine learning has been very successful at building predictive models from observational data, but it does not go far enough for causal inference. We humans use cause and effect to learn about the world. In causal inference, statistical tools are used to analyze cause and effect. The goal of causal analysis is to set a variable to a specific value and find the outcome in another variable, which aids decision making. This is traditionally done through a Randomized Control Trial, or A/B testing. However, in many real-life cases A/B testing is not feasible or is too expensive. In this post we will discuss a solution for causal inference with deep learning models. We will use a manufacturing supply chain as an example, where our goal will be to gain insight into how to reduce back order to optimize profit.

The causal inference analysis in this post is based on the causal graphical model and do-calculus. A PyTorch-based implementation is available in my open source project avenir on GitHub.

Causal Inference

There are various techniques for causal inference. We will use an approach based on a causal graph and back door conditioning. We will discuss the interventional distribution and how to estimate it from the observational distribution, using do-calculus and the back door criterion to block non-causal paths in the causal graph. For technical details of these concepts you can refer to the citations provided. Our focus in this post will be the application of causal inference analysis to a real world problem. Here is one more good post on the causal inference approach.

In machine learning we are interested in the conditional distribution P(Y|X) of a target variable Y given feature variables X. In causal inference we are interested in the interventional distribution P(Y|do(X)), using do-calculus notation. Manually intervening to set X to a specific value is represented as do(X). Our final goal is the expected value E(Y|do(X)), i.e. the average effect on Y as a result of intervening and setting X to x.

All we have is observational data for a problem. The following steps will show how the interventional expectation E(Y|do(X)) can be expressed in terms of the expectation on observational data E(Y|X=x, Z). We will find out shortly what Z is.

The first thing we need is a causal graph. In a causal graph there is a node for each variable and an edge from A to B if A causes B. Defining a correct causal graph is critical for the success of causal inference, and doing so requires domain knowledge.

Next we need to block all non-causal paths between X and Y. This is done through a process called back door conditioning. We define a set of nodes Z that breaks all non-causal paths between X and Y, as follows:

  • No descendant of X is included in Z. This prevents a causal path from X to Y from being blocked.
  • Nodes in Z block every path between X and Y that has an incoming edge to X. This is how confounders are handled.

Next let’s find out the effect of the do() operation on the causal graph. It breaks all the edges from the parents of X pointing to X, and the value of X is set to x. Because some paths are broken, the joint distribution over all variables is fundamentally changed after intervention. This impact is reflected in the expression for the observational expectation, which includes Z.

In the final step we translate interventional statistics to statistics based on observational data. It starts with Robins’ G formula, P(Y|do(X=x)) = ΣZ P(Y|X=x, Z) P(Z), where the sum is over Z. It expresses the distribution of Y under intervention on X as a function of the distribution of the observational data.

After taking the expectation of both sides of the G formula, we arrive at the expression E(Y|do(X=x)) = Σk E(Y|X=x, Zk) / N, where the sum runs over the N observed samples Zk. The quantity inside the sum on the right hand side is a prediction, for which we can use a machine learning model. We have connected all the dots between causal inference and a machine learning prediction model. This enables us to use a trained machine learning predictive model for causal inference.
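This empirical averaging is straightforward to sketch in code. Below is a minimal NumPy illustration, where a hand-written linear function stands in for a trained predictive model; the function and its coefficients are purely hypothetical:

```python
import numpy as np

def g_formula_expectation(predict, x_value, Z):
    """Estimate E[Y | do(X=x)] by averaging the model prediction
    E[Y | X=x, Z=z_k] over the N observed confounder samples z_k."""
    n = len(Z)
    x_col = np.full((n, 1), float(x_value))   # set X to the intervention value for every row
    features = np.hstack([x_col, Z])          # features = [X, Z]
    return predict(features).mean()

# toy predictor standing in for a trained regression model:
# y = 2*x + z1 - 0.5*z2  (made-up coefficients)
predict = lambda F: 2.0 * F[:, 0] + F[:, 1] - 0.5 * F[:, 2]

rng = np.random.default_rng(0)
Z = rng.normal(size=(1000, 2))                # observed confounder samples

e_do_0 = g_formula_expectation(predict, 0.0, Z)
e_do_1 = g_formula_expectation(predict, 1.0, Z)
effect = e_do_1 - e_do_0                      # average causal effect of moving X from 0 to 1
```

Because the Z terms cancel in the difference, the toy model’s estimated effect is exactly its X coefficient, which is what the G formula promises for a correctly specified model.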

Machine Learning Predictive Model for Causal Inference

Based on the findings from the previous section, we can use a trained predictive model based on observational data for causal inference. Here are the steps:

  • Train a predictive model with X and Z as feature variables and Y as the target variable
  • Make prediction with a test data set with intervention variable X set to the desired value
  • Take average of all predictions
  • If you are interested in changing the intervention value, repeat the previous 2 steps with a second intervention value and then take the difference of the 2 average values.
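The steps above can be sketched end to end on synthetic data. This is a minimal illustration, not the avenir implementation: a linear least squares fit stands in for the neural network, and all coefficients in the data-generating process are made up.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# synthetic observational data with a confounder Z (hypothetical coefficients)
Z = rng.normal(size=n)
X = 1.5 * Z + rng.normal(size=n)                 # Z confounds X
Y = -0.8 * X + 2.0 * Z + rng.normal(size=n)      # true causal effect of X on Y is -0.8

# Step 1: train a predictive model with X and Z as features, Y as target
F = np.column_stack([X, Z, np.ones(n)])
coef, *_ = np.linalg.lstsq(F, Y, rcond=None)

def avg_prediction(x_val):
    # Step 2: intervene, setting X to x_val for every row while keeping the observed Z
    Fi = np.column_stack([np.full(n, x_val), Z, np.ones(n)])
    # Step 3: take the average of all predictions
    return (Fi @ coef).mean()

# Step 4: repeat with a second intervention value and take the difference
effect = avg_prediction(1.0) - avg_prediction(0.0)   # should be close to -0.8
```

Note that a naive comparison of observed Y for high versus low X would be badly biased here, because Z drives both variables; conditioning on Z in the model is what removes that bias.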

Manufacturing Supply Chain Back Order Optimization

We are going to use all the theory discussed so far for a manufacturing supply chain optimization. To be more specific, we will analyze the impact of back order on per unit profit.

Consider the following scenario. A manufacturer makes a product which requires several parts supplied by third parties. Manufacturing plans are made based on weekly orders. The order for the previous week is used to make plans for the next week, because parts order fulfillment requires a lead time. The plan consists of ordering parts and deciding how many machines will be operating for the week.

Back order always eats into profit, primarily because of additional packaging and shipping costs. Back order can be triggered when there are not enough parts in stock or in transportation. It can also be caused when production capacity is reached. The manufacturer is interested in finding out the increase in per unit profit when back order is decreased from a certain level to a lower level.

Here are the different variables in the model, with an indication of the confounding variables Z, the intervention variable X and the target variable Y. The variables X and Z constitute the features.

  • Previous week demand (Z)
  • Current week demand (Z)
  • Downtime percentage (Z)
  • Extra margin for parts order (Z)
  • Back order quantity (X)
  • Profit per unit (Y)

We will be using a feed forward network with one hidden layer using PyTorch. It’s a regression model since the target is a real valued quantity.
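As a rough sketch of such a network — the class name, hidden layer size and training details here are illustrative, not taken from the avenir code:

```python
import torch
import torch.nn as nn

class BackOrderNet(nn.Module):
    """Regression network with a single hidden layer, as described in the text."""
    def __init__(self, num_features=5, hidden_size=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, 1),   # single real-valued output: profit per unit
        )

    def forward(self, x):
        return self.net(x)

model = BackOrderNet()
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# one illustrative training step on random stand-in data
x = torch.randn(32, 5)   # features: back order (X) plus the four confounders (Z)
y = torch.randn(32, 1)
loss = loss_fn(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```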

In the causal graph, there is a path from each variable in Z to both X and Y. The variable X has a path to Y.
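This graph structure can be written down explicitly, for example with networkx; the node names below are illustrative shorthand for the variables listed above:

```python
import networkx as nx

# confounders Z, each pointing to both the intervention variable X and the target Y
Z_vars = ["prev_week_demand", "cur_week_demand",
          "downtime_pct", "parts_order_margin"]

g = nx.DiGraph()
for z in Z_vars:
    g.add_edge(z, "back_order")          # Z -> X
    g.add_edge(z, "profit_per_unit")     # Z -> Y
g.add_edge("back_order", "profit_per_unit")  # X -> Y

# the Z nodes satisfy the back door criterion for this graph: none is a
# descendant of X (no directed path from X back to any Z), and each blocks
# the back door path that runs Z -> X and Z -> Y
assert not any(nx.has_path(g, "back_order", z) for z in Z_vars)
```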

Feed Forward Network for Causal Inference of Back Order and Profit

I am using a Python wrapper class around PyTorch with all network and other parameters externalized in a configuration file. It makes training a feed forward network easier and faster. The application code is in this Python script.

Please refer to the tutorial document about how to generate data, train the model and do causal inference with the trained model. I am just going to show some results here.

I estimated profit per unit for different back order quantities ranging from 0 to 2000. Here are the results. As expected per unit profit drops as back order increases.

[Chart: estimated profit per unit for back order quantities from 0 to 2000]

This chart shows the gain in profit as back order is reduced. Based on this result, the manufacturer can decide whether to reduce back order and by how much. There will be additional costs associated with reducing back order, for ordering larger quantities of parts and/or increasing the production capacity.

Wrapping Up

Although machine learning has been very successful in predictive modeling, it has made very little dent in causal inference. Causal inference is an important tool for business decision making. In this post we have seen how causal inference can leverage machine learning, with an example from supply chain.

I have provided 2 citations. Here is the classic survey paper on causal inference by Pearl. Please follow the tutorial for instructions on how to execute the various steps in the supply chain use case.
