

Setup Elasticsearch Cluster on CentOS 8/7 | Ubuntu 20.04/18.04 With Ansible
source link: https://computingforgeeks.com/setup-elasticsearch-cluster-on-centos-ubuntu-with-ansible/

Elasticsearch is a powerful open-source, RESTful, distributed real-time search and analytics engine that provides full-text search capabilities. It is built on Apache Lucene and is freely available under the Apache 2 license. In this article, we will install an Elasticsearch cluster on CentOS 8/7 and Ubuntu 20.04/18.04 using the Ansible automation tool.
This tutorial will help you install and configure a highly available, multi-node Elasticsearch cluster on CentOS 8 / CentOS 7 and Ubuntu 20.04/18.04 Linux systems. Some key uses of Elasticsearch are log analytics, search engines, full-text search, business analytics, and security intelligence, among many others.
In this setup, we will install an Elasticsearch 7.x cluster with an Ansible role. The role we are using is the official Elastic project role, and it gives you plenty of flexibility in how the cluster is configured.
Elasticsearch Node Types
There are two common types of Elasticsearch nodes:
- Master nodes: Responsible for cluster-wide operations, such as managing indices and allocating data shards to the data nodes.
- Data nodes: They hold the actual shards of indexed data and handle all CRUD, search, and aggregation operations. They consume more CPU, memory, and I/O. A quick way to check which role each node ends up with is shown below.
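Once the cluster is running (verification is covered in Step 4), you can ask the standard _cat/nodes API for the name, IP, and node.role columns to see each node's role; the column names used here are part of the stock cat API, nothing custom:
# List every node together with its role (m = master-eligible, d = data)
curl -XGET 'http://localhost:9200/_cat/nodes?v&h=name,ip,node.role'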
Setup Requirements
Before you begin, you'll need at least three CentOS 8/7 or Ubuntu 20.04/18.04 servers, installed and updated. A user with sudo privileges, or the root user, is required for the actions that follow. My setup is based on the following node structure.
Server Name     Specs                 Server role
elk-master-01   16 GB RAM, 8 vCPUs    Master
elk-master-02   16 GB RAM, 8 vCPUs    Master
elk-master-03   16 GB RAM, 8 vCPUs    Master
elk-data01      32 GB RAM, 16 vCPUs   Data
elk-data02      32 GB RAM, 16 vCPUs   Data
elk-data03      32 GB RAM, 16 vCPUs   Data
NOTE:
- For small environments, you can use a node for both data and master operations.
Storage Considerations
For data nodes, it is recommended to configure storage properly, with scalability in mind. In my lab, each data node has a 500 GB disk mounted under /data. This was configured with the commands below.
WARNING: Don't copy and run these commands blindly; they are only a reference point. Adjust the device names to match your environment.
sudo parted -s -a optimal -- /dev/sdb mklabel gpt
sudo parted -s -a optimal -- /dev/sdb mkpart primary 0% 100%
sudo parted -s -- /dev/sdb align-check optimal 1
sudo pvcreate /dev/sdb1
sudo vgcreate vg0 /dev/sdb1
sudo lvcreate -n lv01 -l+100%FREE vg0
sudo mkfs.xfs /dev/mapper/vg0-lv01
echo "/dev/mapper/vg0-lv01 /data xfs defaults 0 0" | sudo tee -a /etc/fstab
sudo mount -a
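As a quick sanity check (assuming the same /dev/sdb disk and /data mount point used above), confirm that the logical volume exists and the filesystem is mounted:
# Confirm the partition/LVM layout on the data disk
lsblk /dev/sdb
# Confirm /data is mounted and has the expected capacity
df -h /data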
Step 1: Install Ansible on Workstation
We will be using Ansible to set up the Elasticsearch cluster on CentOS 8/7 and Ubuntu. Ensure Ansible is installed on your workstation for ease of administration.
On Fedora:
sudo dnf install ansible
On CentOS:
sudo yum -y install epel-release
sudo yum install ansible
RHEL 7 / RHEL 8:
--- RHEL 8 ---
sudo subscription-manager repos --enable ansible-2.9-for-rhel-8-x86_64-rpms
sudo yum install ansible
--- RHEL 7 ---
sudo subscription-manager repos --enable rhel-7-server-ansible-2.9-rpms
sudo yum install ansible
Ubuntu:
sudo apt update
sudo apt install software-properties-common
sudo apt-add-repository --yes --update ppa:ansible/ansible
sudo apt install ansible
For any other distribution, refer to official Ansible installation guide.
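If your distribution ships an older Ansible or none at all, installing it with pip is another option; this is a generic alternative and not part of the distribution-specific steps above:
# Install Ansible for the current user via pip
python3 -m pip install --user ansible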
Confirm the installation of Ansible on your machine by querying the version.
$ ansible --version
ansible 2.9.6
config file = /etc/ansible/ansible.cfg
configured module search path = ['/var/home/jkmutai/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python3.7/site-packages/ansible
executable location = /usr/bin/ansible
python version = 3.7.6 (default, Jan 30 2020, 09:44:41) [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)]
Step 2: Import Elasticsearch ansible role
With Ansible installed, you can now import the Elasticsearch Ansible role to your local system using Ansible Galaxy.
$ ansible-galaxy install elastic.elasticsearch,v7.13.3
Starting galaxy role install process
- downloading role 'elasticsearch', owned by elastic
- downloading role from https://github.com/elastic/ansible-elasticsearch/archive/v7.13.3.tar.gz
- extracting elastic.elasticsearch to /Users/jkmutai/.ansible/roles/elastic.elasticsearch
- elastic.elasticsearch (v7.13.3) was installed successfully
Where v7.13.3 is the release version of the Elasticsearch role to download. Check the project's releases page for a tag matching the Elasticsearch version you want to install.
The role will be added to the ~/.ansible/roles directory.
$ ls -lh ~/.ansible/roles
total 4.0K
drwx------. 15 jkmutai jkmutai 4.0K May 1 16:28 elastic.elasticsearch
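Optionally, you can pin the role and its version in a requirements file and install from that instead; the file name requirements.yml is just the usual convention, and src/version are standard ansible-galaxy fields:
# Pin the role version in a requirements file and install from it
cat > requirements.yml <<'EOF'
- src: elastic.elasticsearch
  version: v7.13.3
EOF
ansible-galaxy install -r requirements.yml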
Configure SSH access to the Elasticsearch cluster hosts.
$ vim ~/.ssh/config
This is how my additional configuration looks; update it to fit your environment.
# Elasticsearch master nodes
Host elk-master01
Hostname 192.168.10.2
User root
Host elk-master02
Hostname 192.168.10.3
User root
Host elk-master03
Hostname 192.168.10.4
User root
# Elasticsearch data nodes
Host elk-data01
Hostname 192.168.10.2
User root
Host elk-data02
Hostname 192.168.10.3
User root
Host elk-data03
Hostname 192.168.10.4
User root
Ensure you’ve copied ssh keys to all machines.
### Master nodes ###
for host in elk-master0{1..3}; do ssh-copy-id $host; done
### Data nodes ###
for host in elk-data0{1..3}; do ssh-copy-id $host; done
Confirm you can ssh without password authentication.
$ ssh elk-master01
Warning: Permanently added '95.216.167.173' (ECDSA) to the list of known hosts.
[root@elk-master-01 ~]#
If your private SSH key has a passphrase, add it to the SSH agent to avoid being prompted for each machine.
$ eval `ssh-agent -s` && ssh-add
Enter passphrase for /var/home/jkmutai/.ssh/id_rsa:
Identity added: /var/home/jkmutai/.ssh/id_rsa (/var/home/jkmutai/.ssh/id_rsa)
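You can confirm the key is now loaded in the agent:
# List identities currently held by ssh-agent
ssh-add -l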
Step 3: Create Elasticsearch Playbook & Run it
Now that all the prerequisites are configured, let's create a playbook file for the deployment.
$ vim elk.yml
Mine has the contents below.
- hosts: elk-master-nodes
  roles:
    - role: elastic.elasticsearch
  vars:
    es_enable_xpack: false
    es_data_dirs:
      - "/data/elasticsearch/data"
    es_log_dir: "/data/elasticsearch/logs"
    es_java_install: true
    es_heap_size: "1g"
    es_config:
      cluster.name: "elk-cluster"
      cluster.initial_master_nodes: "192.168.10.2:9300,192.168.10.3:9300,192.168.10.4:9300"
      discovery.seed_hosts: "192.168.10.2:9300,192.168.10.3:9300,192.168.10.4:9300"
      http.port: 9200
      node.data: false
      node.master: true
      bootstrap.memory_lock: false
      network.host: '0.0.0.0'
    es_plugins:
      - plugin: ingest-attachment

- hosts: elk-data-nodes
  roles:
    - role: elastic.elasticsearch
  vars:
    es_enable_xpack: false
    es_data_dirs:
      - "/data/elasticsearch/data"
    es_log_dir: "/data/elasticsearch/logs"
    es_java_install: true
    es_config:
      cluster.name: "elk-cluster"
      cluster.initial_master_nodes: "192.168.10.2:9300,192.168.10.3:9300,192.168.10.4:9300"
      discovery.seed_hosts: "192.168.10.2:9300,192.168.10.3:9300,192.168.10.4:9300"
      http.port: 9200
      node.data: true
      node.master: false
      bootstrap.memory_lock: false
      network.host: '0.0.0.0'
    es_plugins:
      - plugin: ingest-attachment
Key notes:
- Master nodes have node.master set to true and node.data set to false.
- Data nodes have node.data set to true and node.master set to false.
- The es_enable_xpack variable is set to false to install the open-source edition of Elasticsearch.
- cluster.initial_master_nodes & discovery.seed_hosts point to the master nodes.
- /data/elasticsearch/data is where the Elasticsearch data shards will be stored. It is recommended to keep this on a partition separate from the OS installation, for performance and scalability reasons.
- /data/elasticsearch/logs is where Elasticsearch logs will be stored.
- The directories will be created automatically by the Ansible tasks. You only need to ensure /data is the mount point of the desired data store for Elasticsearch.
For more customization options, check the project's GitHub documentation.
Create inventory file
Create a new inventory file.
$ vim hosts
[elk-master-nodes]
elk-master01
elk-master02
elk-master03
[elk-data-nodes]
elk-data01
elk-data02
elk-data03
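Before running the full deployment, it helps to confirm that Ansible can reach every host in the inventory and that the playbook parses cleanly; both commands below are standard Ansible options:
# Test connectivity to every host in the inventory
ansible -i hosts all -m ping
# Validate the playbook syntax without making any changes
ansible-playbook -i hosts elk.yml --syntax-check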
When all is set, run the playbook.
$ ansible-playbook -i hosts elk.yml
The execution should start. Just be patient, as this could take several minutes.
PLAY [elk-master-nodes] ********************************************************************************************************************************
TASK [Gathering Facts] *********************************************************************************************************************************
ok: [elk-master02]
ok: [elk-master01]
ok: [elk-master03]
TASK [elastic.elasticsearch : set_fact] ****************************************************************************************************************
ok: [elk-master02]
ok: [elk-master01]
ok: [elk-master03]
TASK [elastic.elasticsearch : os-specific vars] ********************************************************************************************************
ok: [elk-master01]
ok: [elk-master02]
ok: [elk-master03]
.......
A successful ansible execution will have output similar to below.
PLAY RECAP *********************************************************************************************************************************************
elk-data01 : ok=38 changed=10 unreachable=0 failed=0 skipped=119 rescued=0 ignored=0
elk-data02 : ok=38 changed=10 unreachable=0 failed=0 skipped=118 rescued=0 ignored=0
elk-data03 : ok=38 changed=10 unreachable=0 failed=0 skipped=118 rescued=0 ignored=0
elk-master01 : ok=38 changed=10 unreachable=0 failed=0 skipped=119 rescued=0 ignored=0
elk-master02 : ok=38 changed=10 unreachable=0 failed=0 skipped=118 rescued=0 ignored=0
elk-master03 : ok=38 changed=10 unreachable=0 failed=0 skipped=118 rescued=0 ignored=0
Step 4: Confirm Elasticsearch Cluster installation on Ubuntu / CentOS
Login to one of the master nodes.
$ ssh elk-master01
Check cluster health status.
$ curl http://localhost:9200/_cluster/health?pretty
{
"cluster_name" : "elk-cluster",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 6,
"number_of_data_nodes" : 3,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
Check the elected master node.
$ curl -XGET 'http://localhost:9200/_cat/master'
G9X__pPXScqACWO6YzGx3Q 95.216.167.173 95.216.167.173 elk-master01
List all cluster nodes and their roles:
$ curl -XGET 'http://localhost:9200/_cat/nodes'
192.168.10.4 7 47 1 0.02 0.03 0.02 di - elk-data03
192.168.10.2 10 34 1 0.00 0.02 0.02 im * elk-master01
192.168.10.4 13 33 1 0.00 0.01 0.02 im - elk-master03
192.168.10.3 14 33 1 0.00 0.01 0.02 im - elk-master02
192.168.10.3 7 47 1 0.00 0.03 0.03 di - elk-data02
192.168.10.2 6 47 1 0.00 0.02 0.02 di - elk-data01
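As a further smoke test, you can create a small throwaway index and confirm its shards get allocated across the data nodes; test-index is just a hypothetical name used for illustration:
# Create a test index with 3 primary shards and 1 replica each
curl -X PUT 'http://localhost:9200/test-index?pretty' \
  -H 'Content-Type: application/json' \
  -d '{"settings": {"number_of_shards": 3, "number_of_replicas": 1}}'
# Confirm the index is green and its shards are spread across the data nodes
curl 'http://localhost:9200/_cat/shards/test-index?v'
# Clean up the test index
curl -X DELETE 'http://localhost:9200/test-index?pretty'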
This confirms you now have a working Elasticsearch cluster on CentOS 8/7 or Ubuntu 20.04/18.04 Linux systems.
Similar guides:
Install Graylog 3 with Elasticsearch 6.x on CentOS 8 / RHEL 8 Linux