Analysing Honeypot Data using Kibana and Elasticsearch

 4 years ago
source link: https://towardsdatascience.com/analysing-honeypot-data-using-kibana-and-elasticsearch-5e3d61eb2098?gi=bf7affad3d9d

Honeypots are an interesting data source and should be used to enhance business cyber strategy. Some companies deploy honeypots that mirror their core services just to monitor what sort of attacks hackers are using, whilst others use honeypots to enrich other systems and build system rules (e.g. if an IP has appeared in our honeypot data, block it on our firewall). For students, however, using honeypot data to learn about real cyberattacks can be a f̵u̵n̵ good way to get into log analysis, machine learning and threat intelligence. I recently published a post covering how to deploy a honeypot using the Elastic Stack (Elasticsearch, Logstash & Kibana). That honeypot has now been live on GCP (Google Cloud Platform) in the EU region for two weeks, and in this post we will look at the attacks it has endured, who the threat actors have been and how we can utilise this data.

What’s underneath the honeypot?

  • Debian 10 (Buster): 2 vCPUs, 9 GB RAM & 40 GB storage.
  • Honeypot: T-Pot by Deutsche Telekom
  • Elastic Stack: Elasticsearch, Logstash & Kibana version 6.4 (Docker containers)

T-Pot was a good deployment to use for this analysis as it included 15 different honeypot systems ranging from email servers to RDP servers which ensured I could get a large enough data pool.

The complete list of the dockerized versions of the honeypots running in T-Pot:

As the data was collected in Elasticsearch, dashboards were already available, but it was also possible to build my own to get a good understanding of the data. If you have never set up an Elasticsearch cluster before, I recommend reading and trying out the tutorial by DigitalOcean on setting up the Elastic Stack.

Analysis of the Elastic Cluster 📉

As this was a single-node setup running version 6.8.2, we did not have access to Machine Learning (it requires a Gold/Platinum license, though there is a 30-day trial that lets you try out some of the features, or you can try a two-week trial of Elastic Cloud). We did, however, have access to the third-party tool Elasticsearch Head, which gives us an overview of how the cluster is performing.

The Elastic Cluster
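
If you would rather not install a plugin, much of the same overview is exposed by Elasticsearch's `_cluster/health` API. Below is a minimal sketch of reading such a response in Python. The helper name `summarise_health` and the sample payload are made up for illustration; a real payload would come from `GET /_cluster/health` on your own node.

```python
import json


def summarise_health(health: dict) -> str:
    """Turn a _cluster/health response into a one-line summary."""
    status = health["status"]
    nodes = health["number_of_nodes"]
    unassigned = health["unassigned_shards"]
    summary = f"{status}: {nodes} node(s), {unassigned} unassigned shard(s)"
    # With a single node, replica shards have nowhere to be allocated,
    # so indices that ask for replicas leave the cluster yellow.
    if status == "yellow" and nodes == 1:
        summary += " (replicas cannot be allocated on a single node)"
    return summary


# Illustrative payload only, not data from the honeypot described here.
sample = json.loads(
    '{"status": "yellow", "number_of_nodes": 1, "unassigned_shards": 5}'
)
print(summarise_health(sample))
```

On a single-node deployment like this one, a yellow status is usually expected rather than a sign of trouble.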

So what is the point of all this?

Threat intelligence is the short answer. Threat intelligence, or cyber threat intel for short (seems longer), is described as “information used by an organization to understand the threats that have, will or are currently targeting the organization.” Applying this standard definition to our honeypot data, we can see that we have a pool of data about ongoing attacks, who is carrying them out and which ports they are targeting. This data can be used to produce intel reports or be fed into SOAR systems. Knowing who the persistent attackers are will also help you defend against their attacks; the more you know about the enemy the better, right? And what’s a better data source for an EWS (early warning system)?

SOAR?

Security orchestration, automation, and response is becoming a large part of incident response and managing a SIEM platform. For example, users of Splunk will be familiar with Phantom. A good SOAR platform to try out which has a community edition is Demisto. It is possible to feed data from your honeypot Elastic cluster straight to a SOAR platform connected to your SIEM. You can use it as an enrichment tool for your incidents or hunt out zero-day attacks. What honeypots give you is a constant feed of ‘real’ attacks and malicious actors, and as you move towards automated decision making, having access to real threat data will help enrich your threat intelligence.
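
To make the enrichment idea a little more concrete, here is a rough sketch of tagging an incident when its source IP has already been seen by the honeypot. The field names (`src_ip`, `tags`), the tag value and the function `enrich_incident` are assumptions for this example and are not taken from any particular SOAR product. The addresses come from the reserved documentation ranges.

```python
from typing import Dict, Set


def enrich_incident(incident: Dict, honeypot_ips: Set[str]) -> Dict:
    """Return a copy of the incident, tagged if its source IP hit the honeypot."""
    enriched = dict(incident)
    tags = list(incident.get("tags", []))
    if incident.get("src_ip") in honeypot_ips:
        tags.append("seen_in_honeypot")
    enriched["tags"] = tags
    return enriched


# Hypothetical data: IPs exported from the honeypot cluster and one SIEM incident.
honeypot_ips = {"203.0.113.10", "198.51.100.7"}
incident = {"id": 1, "src_ip": "203.0.113.10", "tags": ["ssh_login_failure"]}
print(enrich_incident(incident, honeypot_ips))
```

In practice the set of IPs would be refreshed from Elasticsearch on a schedule, and the tag could feed a playbook that raises severity or proposes a firewall block.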

And last but not least, Data! 📊

Looking at building a data lake? The more data you have the better, right? Even if you’re a penetration tester who just wants to update your password lists, a honeypot collects the most up-to-date username and password combos, so if credentials from a data leak are being used in attacks, your honeypot will capture them.

Shout out to everyone still using admin/admin credentials!
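
If you export the Cowrie events rather than viewing them in Kibana, a password list like this takes only a few lines to build. The sketch below counts username and password pairs from Cowrie login events. It relies on Cowrie's JSON log fields `eventid`, `username` and `password`, while the sample events and the helper name `top_credentials` are invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def top_credentials(events: Iterable[Dict], n: int = 10) -> List[Tuple[Tuple[str, str], int]]:
    """Count (username, password) pairs across Cowrie login events."""
    pairs = Counter(
        (event["username"], event["password"])
        for event in events
        # Covers both cowrie.login.failed and cowrie.login.success.
        if event.get("eventid", "").startswith("cowrie.login.")
    )
    return pairs.most_common(n)


# Made-up sample events, not data from the honeypot in this post.
events = [
    {"eventid": "cowrie.login.failed", "username": "admin", "password": "admin"},
    {"eventid": "cowrie.login.success", "username": "admin", "password": "admin"},
    {"eventid": "cowrie.login.failed", "username": "root", "password": "123456"},
    {"eventid": "cowrie.command.input", "input": "uname -a"},
]
for (username, password), count in top_credentials(events):
    print(f"{username}/{password}: {count}")
```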

Let’s open up the honeypot! 🍯

As the deployed system has been running for a while now, I wanted to look at the attacks that have happened and also investigate some of the malicious threat actors. As the data is in Elasticsearch and viewable via Kibana, we won’t have to grep any log files, which is a win in itself.

What was the most common attack?

The Cowrie honeypot had over 100k attacks!

It is no surprise that the Telnet & SSH honeypot had the most attacks. Cowrie is a medium to high interaction honeypot designed to attract and log brute force attacks and any shell interaction by the attacker. Its main purpose is to interact with an attacker whilst monitoring how they behave when they think they’ve breached a system. You can read up more on Cowrie here.

It was no surprise that of the 102,787 attacks, 32,607 came from China, with the United States of America coming second with a modest 14,115 attacks.

Top 10 Attacking nations
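
A chart like this is a terms aggregation under the hood. The sketch below builds the request body for such a query and parses the buckets that come back. The field names `type.keyword` and `geoip.country_name.keyword` are assumptions about how the documents are mapped, so check your own index before using them. The two counts in the sample response are the figures quoted above, and the rest of the response is omitted.

```python
from typing import Dict, List, Tuple


def top_countries_query(size: int = 10, honeypot_type: str = "Cowrie") -> Dict:
    """Build a terms aggregation body for the top attacking countries."""
    return {
        "size": 0,  # only the aggregation is needed, not the matching documents
        "query": {"term": {"type.keyword": honeypot_type}},
        "aggs": {
            "top_countries": {
                "terms": {"field": "geoip.country_name.keyword", "size": size}
            }
        },
    }


def parse_buckets(response: Dict) -> List[Tuple[str, int]]:
    """Extract (country, attack count) pairs from the aggregation response."""
    buckets = response["aggregations"]["top_countries"]["buckets"]
    return [(bucket["key"], bucket["doc_count"]) for bucket in buckets]


# Trimmed sample response using the two counts quoted in this post.
sample_response = {
    "aggregations": {
        "top_countries": {
            "buckets": [
                {"key": "China", "doc_count": 32607},
                {"key": "United States", "doc_count": 14115},
            ]
        }
    }
}
total = 102787
for country, count in parse_buckets(sample_response):
    print(f"{country}: {count} ({count / total:.1%})")
```

The body returned by `top_countries_query` can be sent to the `_search` endpoint of whichever index pattern holds your honeypot data.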

Cowrie tags known attackers, mass scanners, bad-reputation attackers, spam bots and Tor exit nodes. Of the 102,787 attacks, 99.95% came from known attackers that had previously been reported for malicious activity. I decided to investigate the IP that attacked the honeypot the most (Hi, 128.199.235.18), and using AbuseIPDB we can see that this IP has been reported more than 928 times for various malicious activities.

This server is hosted on Digital Ocean (I notified them)
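
The same lookup can be scripted instead of done through the website. Here is a minimal sketch against the AbuseIPDB v2 `check` endpoint using only the standard library. You need your own API key, the 90-day window is an arbitrary choice, and the helper names `build_check_request` and `check_ip` are mine rather than part of any client library.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict

API_URL = "https://api.abuseipdb.com/api/v2/check"


def build_check_request(ip: str, api_key: str, max_age_days: int = 90) -> urllib.request.Request:
    """Prepare (but do not send) a request to the AbuseIPDB check endpoint."""
    query = urllib.parse.urlencode({"ipAddress": ip, "maxAgeInDays": max_age_days})
    return urllib.request.Request(
        f"{API_URL}?{query}",
        headers={"Key": api_key, "Accept": "application/json"},
    )


def check_ip(ip: str, api_key: str) -> Dict:
    """Send the request and keep only the fields of interest."""
    with urllib.request.urlopen(build_check_request(ip, api_key), timeout=10) as resp:
        data = json.load(resp)["data"]
    return {
        "ip": data["ipAddress"],
        "reports": data["totalReports"],
        "score": data["abuseConfidenceScore"],
    }


# Usage (needs network access and a valid key):
# print(check_ip("128.199.235.18", "YOUR_API_KEY"))
```

Looping this over the top attacking IPs from Elasticsearch gives a quick reputation table without leaving the terminal, as long as you stay within the API rate limits.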

But what do they do once they log in?

Top 10 attacker input

Understanding what attackers are doing once they gain access is key to building your defence. Not every attack is going to come from an unknown user from an unknown IP. With most SIEMs, it is possible to trigger alarms if a user uses any of the listed commands. This way, should a user on your network ever lose their credentials, if any of those commands are used, you can tag it as suspicious behaviour and set up rules to block the account till there’s confirmation it’s not a breach.

Note: Don’t block the ‘top’ command, we still need it! 😭
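
As a rough idea of what such a rule could look like outside a SIEM, the sketch below flags a shell command line if any command in it appears on a watchlist. The watchlist entries here are placeholders I picked for illustration, not the actual top 10 from the chart, and `top` sits on an allowlist for the reason given in the note. Splitting on `;`, `&` and `|` is deliberately naive and will mis-handle separators inside quotes.

```python
import re
import shlex

# Placeholder values; substitute the commands seen in your own honeypot data.
WATCHLIST = {"wget", "curl", "chmod", "uname"}
ALLOWLIST = {"top"}


def is_suspicious(command_line: str) -> bool:
    """Return True if any chained command starts with a watchlisted binary."""
    for segment in re.split(r"[;&|]+", command_line):
        try:
            tokens = shlex.split(segment)
        except ValueError:
            # Unbalanced quotes: fall back to plain whitespace splitting.
            tokens = segment.split()
        if not tokens:
            continue
        # Compare on the binary name so /usr/bin/wget matches wget.
        binary = tokens[0].rsplit("/", 1)[-1]
        if binary in WATCHLIST and binary not in ALLOWLIST:
            return True
    return False


print(is_suspicious("cd /tmp; wget http://example.invalid/x.sh"))
print(is_suspicious("top"))
```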

It’s not always the usual suspects!

A map where all the attacks have come from

Cyber attacks are more than just the Big Four (Russia, China, USA & Iran). For security teams, understanding this is key to building out your strategy. Regularly reviewing where your services are being accessed/attacked can also help determine if there’s any internal sabotage going on.

The main purpose of this post was to introduce cybersecurity enthusiasts to what’s happening behind the honeypot, the data it collects and what you can do with it. If you are interested in deploying your own honeypot, have a look at my tutorial on setting up the deployment I used. I did notice, however, that a few of the honeypot Docker containers kept falling over after a small brute-force attack, so I set up a separate Elastic cluster to monitor the server and see whether it was being attacked. It was. My next post will cover using the SIEM function in the latest version of Elastic to monitor some of your test environments. I also plan to declutter the honeypots and deploy them individually for better analysis, and to make use of machine learning from Elastic.

RDPY Attacks — Not all data has to be on Excel!

By Stephen Chapendama

