Agent-based Modeling Will Unleash a New Paradigm of Machine Learning

source link: https://towardsdatascience.com/agent-based-modeling-will-unleash-a-new-paradigm-of-machine-learning-ff6d3b1ac940?gi=5ede8a1270bc

And address some issues in data science

May 8 · 8 min read

Photo by @marobian on unsplash.com

In 2014, I stared at the ticker screen of my trading application for five hours straight. I was grappling with the “unthinkable”: the idea that statistical models were insufficient to model our world. We assume we know the right questions to ask when it comes to data science. But do we really? Most of the time, our questions are littered with biases (such as confirmation bias), and our inspiration comes from the experiences and the data that we happen to have. The stock market, like the real world and our social world, is a giant interactive system, a mini-world where change is dynamic. In this kind of complex, dynamic system, the only way to know anything is through experimentation.

When you watch a baby learn, you will notice that this baby is acutely aware of his surroundings. This baby interacts with the world to learn. This baby is constantly creating experiments. This baby is gathering data from his experiments.

We all learn this way. We are all agents exploring the vast universe. We are not after the “right” answer. Most often, we are simply trying to understand our environment.

Last week, I had the pleasure of listening to the Covid-19 Summit organized by Ben Goertzel’s SingularityNET. Subsequently, I discovered Deborah Duong’s work in agent-based modeling for Rejuve.io. I interviewed her, and she gave me a crash course on agent-based modeling. This article is partially based on our conversation.

How Do Intelligent Machines Learn About Our World?

Truly intelligent machines don’t assume that the people developing them know which questions to ask. Instead, they experiment with the system to dynamically gather the data they need. Through these interactions, the machine discovers “a picture”. This picture is not the answer. It is merely a representation of the state that the dynamic system is in at a given moment in time.

Based on the time frame, the picture of the dynamic system can be drastically different.

Based on the events that shape the dynamic system, the picture can be drastically different.

To model our world, we need more generalized models that can adapt to it in this way. This is where agent-based modeling comes in. Coupled with machine learning, agent-based modeling can change the way we analyze many kinds of data, such as enterprise data, supply chain data, social data, and behavioral data.

Agent-based modeling allows us to move away from the need to predict a future event.

Rather, it allows us to focus on understanding the dynamic system as a whole: the interactions among the agents, and the interactions between the agents and the system they live in.

The results from agent-based modeling are not a set of predictions but a set of planning steps to mitigate unwanted effects in a dynamic system.
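
As a minimal sketch of what such a model can look like, the Python below simulates a toy population in which each agent is susceptible, infectious, or recovered, and infectious agents meet others at random. Everything in it (the `simulate` function, its parameters, and the numbers) is an illustrative assumption, not the model discussed in the interview. The output is not a forecast but a day-by-day picture of the system that can be compared across policies, here by lowering the hypothetical `contacts_per_day`.

```python
import random

SUSCEPTIBLE, RECOVERED = None, -1  # any positive int = days left infectious


def simulate(n_agents=200, contacts_per_day=4, p_transmit=0.05,
             days_infectious=7, days=60, n_seeded=5, seed=0):
    """Toy agent-based epidemic; returns the infectious count after each day."""
    rng = random.Random(seed)
    state = [SUSCEPTIBLE] * n_agents
    for i in range(n_seeded):
        state[i] = days_infectious

    history = []
    for _day in range(days):
        newly_infected = set()
        # Each infectious agent meets a few randomly chosen agents.
        for s in state:
            if s is SUSCEPTIBLE or s == RECOVERED:
                continue
            for _contact in range(contacts_per_day):
                other = rng.randrange(n_agents)
                if state[other] is SUSCEPTIBLE and rng.random() < p_transmit:
                    newly_infected.add(other)
        # Infectious agents move one day closer to recovery.
        for i, s in enumerate(state):
            if s is not SUSCEPTIBLE and s != RECOVERED:
                state[i] = s - 1 if s > 1 else RECOVERED
        # Agents infected today start their infectious period tomorrow.
        for i in newly_infected:
            state[i] = days_infectious
        history.append(sum(1 for s in state
                           if s is not SUSCEPTIBLE and s != RECOVERED))
    return history


baseline = simulate(contacts_per_day=4)
distancing = simulate(contacts_per_day=1)
print("peak infectious, baseline:  ", max(baseline))
print("peak infectious, distancing:", max(distancing))
```

A single run is only one possible history. In practice you would repeat each scenario over many seeds and compare the spread of outcomes before drawing any planning conclusion.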

Agent-based Modeling Can Give You More

Today, we are all about predictive models. In industries such as finance, pharmaceuticals, shipping, and retail, data science and machine learning have often been focused on narrow problem-solving. Specialized intelligence is deployed to improve, optimize, and predict more accurately how we should solve one particular problem.

There are a few problems associated with that mindset. Here are two notable ones:

  1. You are missing the big picture.
  2. You encounter too many exceptions to the rule.

When we encounter more complex problems, we realize that we need more than the ability to predict what happens when events occur. For example: 1) How should we set Covid-19 policy to reopen our city after social distancing? 2) How do we model stock market participants’ behavior around timed events? 3) How should we shift the business model in the event of an economic meltdown?

We need the bigger picture and a way to model interactions in our dynamic systems to understand how the agents may change their participation as conditions change, policies change, and rules change.

If you think of a predictive model as a specialized worker, you can think of an agent-based model as a manager who is concerned with “how” the workers are working rather than the direct results of individual projects.

Confounding Variables

Confounding variables can undermine the accuracy of a machine learning model. In statistics, a confounding variable is one that influences both the dependent and the independent variable. When such variables are present, it is difficult to assign causation. You cannot be sure whether X caused Y, whether Y caused X, or whether a third variable drove both. Likewise, with two entangled inputs, you can’t say for sure whether X1 or X2 caused Y.
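
To make this concrete, here is a small sketch on synthetic data (all variable names and numbers are assumptions for illustration). A hidden variable `z` drives both `x` and `y`, while `x` has no effect on `y` at all. The raw correlation still looks substantial, and it largely disappears once the linear effect of `z` is removed from both variables.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def residuals(values, z):
    """What is left of `values` after a least-squares fit on z (with intercept)."""
    n = len(values)
    mean_v, mean_z = sum(values) / n, sum(z) / n
    slope = (sum((zi - mean_z) * (vi - mean_v) for zi, vi in zip(z, values))
             / sum((zi - mean_z) ** 2 for zi in z))
    return [vi - mean_v - slope * (zi - mean_z) for vi, zi in zip(values, z)]


rng = random.Random(42)
z = [rng.gauss(0, 1) for _ in range(5000)]   # the confounder
x = [zi + rng.gauss(0, 1) for zi in z]       # depends on z only
y = [zi + rng.gauss(0, 1) for zi in z]       # depends on z only, not on x

raw_corr = pearson(x, y)
adjusted_corr = pearson(residuals(x, z), residuals(y, z))
print(f"correlation of x and y:            {raw_corr:.2f}")
print(f"correlation after adjusting for z: {adjusted_corr:.2f}")
```

The adjustment only works here because `z` was measured and its effect is linear by construction. With an unobserved confounder, no amount of regression on the available columns will reveal the problem.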

An example of this is seasonality. When evaluating time series data, it’s important to remove the seasonality so that you can isolate the variables that actually drive the outcome.

With Covid-19 policy for instance, as we enter a cycle of warmer weather due to seasonal changes, we may see the number of cases of Covid-19 decrease. But, in the winter, these cases may come back.

If you fail to remove the effects of these confounding variables in this type of problem, your understanding will be misleading and inaccurate.

You can easily fool yourself into thinking that the infection rate of Covid-19 has fallen for good over the summer, and then be completely surprised when infections start to climb again in the winter.
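
The sketch below illustrates the trap with made-up numbers: a synthetic monthly series whose underlying level never changes, plus a yearly cycle that peaks in winter. The `seasonal_adjust` helper is a hypothetical, simplified additive adjustment (it subtracts the average effect of each calendar month) and assumes several full years of data with no trend.

```python
import math


def seasonal_adjust(series, period=12):
    """Subtract the average seasonal effect for each position in the cycle."""
    overall = sum(series) / len(series)
    seasonal = []
    for k in range(period):
        same_month = series[k::period]          # e.g. every January
        seasonal.append(sum(same_month) / len(same_month) - overall)
    return [v - seasonal[i % period] for i, v in enumerate(series)]


# Synthetic data: constant level of 100 plus a yearly cycle (month 0 = midwinter).
months = 36
raw = [100 + 30 * math.cos(2 * math.pi * m / 12) for m in range(months)]
adjusted = seasonal_adjust(raw)

print("winter vs. summer, raw:     ", round(raw[0]), round(raw[6]))
print("winter vs. summer, adjusted:", round(adjusted[0]), round(adjusted[6]))
```

In the raw series the summer value looks like a dramatic improvement. After adjustment the series is flat, which is the truth of how this toy data was built: nothing about the underlying level changed.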

Normally, data scientists take great care when doing feature selection for machine learning. Confounding variables are one of the key reasons. Exploratory Data Analysis (EDA) can often allow data scientists to pinpoint the confounding so that feature selection can account for it.

Randomization is a way to control for confounding variables. Automatically selecting machine learning algorithms based on the characteristics of the dataset is a similar attempt to control for a data scientist’s bias toward particular algorithms.
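
Why randomization helps can be shown with a short simulation. This is a sketch on synthetic data, and the names `self_selected`, `randomized`, and `TRUE_EFFECT` are assumptions for illustration. A hidden trait `z` improves the outcome and also makes people more likely to opt in to a treatment. Comparing group averages overstates the effect when people self-select, and recovers it when a coin flip decides.

```python
import random

TRUE_EFFECT = 1.0  # the effect of treatment we would like to recover


def mean_difference(assign, n=20000, seed=1):
    """Average outcome of treated minus control under an assignment rule."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)                      # hidden confounder
        t = assign(z, rng)                       # 1 = treated, 0 = control
        outcome = TRUE_EFFECT * t + 2.0 * z + rng.gauss(0, 1)
        (treated if t else control).append(outcome)
    return sum(treated) / len(treated) - sum(control) / len(control)


def self_selected(z, rng):
    """Agents with high z are more likely to choose the treatment."""
    return 1 if z + rng.gauss(0, 1) > 0 else 0


def randomized(z, rng):
    """A coin flip ignores z, so z is balanced across both groups."""
    return 1 if rng.random() < 0.5 else 0


observational = mean_difference(self_selected)
experimental = mean_difference(randomized)
print(f"estimated effect, self-selected: {observational:.2f}")
print(f"estimated effect, randomized:    {experimental:.2f}")
```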

But, these methods ultimately are too “outcome” focused. You are trying to get the “right” answer. But, you are not trying to maximize understanding.

This is where Complex Adaptive Systems can help. They can help you to maximize understanding. By modeling the confounding variables’ interactions with the system and with the other variables as a whole, you can run simulations to uncover the “nuanced” patterns that may emerge. These patterns can lead to new paths or a new line of questions to explore.

Such a model may not be able to predict the future, but it will tell you what to focus on in your exploration.

Self-Organization and Unsupervised Learning

One of the fundamental ideas of agent-based modeling is self-organization. No one is making the rules. The agents act of their own accord as they try to adapt to the system.

If you watch an ant colony, this is exactly what you will find: an efficient and dynamic system where no one is really in charge. Because the agents (the ants) have mutual goals of food, shelter, and community, they work together to achieve them. Each ant knows what to do based on its interactions with its surroundings and with the other ants.
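
One way to see self-organization in code is a toy model of ants choosing between a short and a long path to food. This is a sketch under stated assumptions: the pheromone rule, the `forage` function, and all parameter values are hypothetical simplifications rather than a description of real ant behavior. Each ant follows only a local rule (prefer the path with more pheromone, then leave some behind), yet the colony as a whole tends to settle on the short path without anyone directing it.

```python
import random


def forage(n_steps=200, ants_per_step=50, evaporation=0.1,
           short_len=1.0, long_len=2.0, seed=0):
    """Return the share of ants that chose the short path at each step."""
    rng = random.Random(seed)
    pheromone = {"short": 1.0, "long": 1.0}      # no initial preference
    length = {"short": short_len, "long": long_len}
    shares = []
    for _step in range(n_steps):
        total = pheromone["short"] + pheromone["long"]
        p_short = pheromone["short"] / total
        # Local rule: each ant picks a path in proportion to its pheromone.
        choices = ["short" if rng.random() < p_short else "long"
                   for _ant in range(ants_per_step)]
        # Old trails fade a little every step.
        for path in pheromone:
            pheromone[path] *= 1.0 - evaporation
        # Shorter trips deposit more pheromone per unit of path.
        for path in choices:
            pheromone[path] += 1.0 / length[path]
        shares.append(choices.count("short") / ants_per_step)
    return shares


shares = forage()
print("share on short path, first step:", shares[0])
print("share on short path, last step: ", shares[-1])
```

No line of this code tells the colony which path is better. The preference emerges from repeated interaction between the ants and the trail they leave in their environment.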

