Using the Mutate Filter in Logstash
source link: https://www.tuicool.com/articles/FzmMfiB
One of the benefits of using Logstash in your data pipelines is the ability to transform the data into the desired format according to the needs of your system and organization. There are many ways of transforming data in Logstash; one of them is the mutate filter plugin.
This Logstash filter plugin allows you to force fields into specific data types and to add, copy, and update fields to make them compatible across the environment. Here’s a simple example of using the filter to rename an IP field to HOST_IP.
...
mutate {
  rename => { "IP" => "HOST_IP" }
}
...
In this article, I’m going to explain how to set up and use the mutate filter using three examples that illustrate the types of field changes that can be executed with it.
The basics
The mutate filter plugin is bundled with Logstash by default, so it does not need to be installed separately. You can verify that with the following commands:
cd /usr/share/logstash/bin
./logstash-plugin list | grep -i mutate
The output will be:
logstash-filter-mutate
The mutate filter and its different configuration options are defined in the filter section of the Logstash configuration file. The available configuration options are described later in this article. Before diving into those, however, let’s take a brief look at the layout of the Logstash configuration file.
Generally, there are three main sections of a Logstash configuration file:
- Input – this is where the source of data to be processed is identified.
- Filter – this is where the fields of the incoming event logs can be transformed and processed.
- Output – this is where parsed data will be forwarded to.
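More information about the structure of the configuration file can be found in the Logstash documentation.
In the example below, we’re adding a tag (Apache Web Server) to incoming Apache access logs on the condition that the source path contains the term “apache”. The condition assumes that the event carries the file path in a field named source; depending on the shipper and the Logstash version, the path may instead be found in a field such as path or [log][file][path], so adjust the field name to match your events. Note the mutate filter added in the filter section of the Logstash configuration file: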
input {
  file {
    path => "/var/log/apache/apache_access.log"
    start_position => "beginning"
    # "/dev/null" discards the read position, so the file is re-read on every run
    # (use "NUL" on Windows)
    sincedb_path => "/dev/null"
  }
}
filter {
  if [source] =~ /apache/ {
    mutate {
      add_tag => [ "Apache Web Server" ]
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
Mutate Filter Configuration Options
There are a number of configuration options which can be used with the mutate filter, such as copy, rename, replace, join, uppercase, and lowercase.
They are outlined in the table below:
Configuration Option    Usage
add_field               add a new field to the event
remove_field            remove an arbitrary field from the event
add_tag                 add an arbitrary tag to the event
remove_tag              remove a tag from the event, if present
convert                 convert a field value to another data type
id                      assign a unique ID to this plugin instance (useful for monitoring)
lowercase               convert a string field to its lowercase equivalent
replace                 replace the value of a field with a new value
strip                   remove leading and trailing white space from a field
uppercase               convert a string field to its uppercase equivalent
update                  update an existing field with a new value
rename                  rename a field in the event
gsub                    find and replace substrings in a string field using a regular expression
merge                   merge two fields that hold arrays or hashes
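To make the table more concrete, here is a minimal sketch that combines a few of these options in one filter. The field names are taken from the CSV examples later in this article, while the added field name (processed_by) and its value are assumptions chosen purely for illustration.

```
filter {
  mutate {
    # Turn the string "Salary" into a number
    convert   => { "Salary" => "integer" }
    # Normalize the text fields
    strip     => [ "City", "State" ]
    lowercase => [ "City" ]
    uppercase => [ "State" ]
    # Hypothetical marker field, not part of the original data
    add_field => { "processed_by" => "logstash" }
  }
}
```

Note that the order in which the options are written inside a single mutate block is not necessarily the order in which they are applied, so operations that depend on each other are better placed in separate mutate blocks.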
Simple and Conditional Removals
Performing a Simple Removal
In this example, we want to remove the “Password” field from a small CSV file with 10-15 records. This type of removal can be very helpful when shipping log event data that includes sensitive information. Payroll management systems, online shopping systems, and mobile apps handling transactions are just a few of the applications for which this action is necessary.
The configuration below will remove the field “Password”:
input {
  file {
    path => "/Users/put/Downloads/Mutate_plugin.CSV"
    start_position => "beginning"
    # "/dev/null" discards the read position, so the file is re-read on every run
    sincedb_path => "/dev/null"
  }
}
filter {
  # Use the first line of the CSV file as the column (field) names
  csv {
    autodetect_column_names => true
  }
  mutate {
    remove_field => [ "Password" ]
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
After saving the configuration, run Logstash with it. In this case, the final outcome will be shown in the terminal since we are printing the output on stdout.
In the printed events, the “Password” field no longer appears.
Performing a Conditional Removal
In this example, the field “Password” is again being removed from the events. This time, however, the removal is conditional on the salary: if [Salary] == "154216".
The code below will remove the field “Password” using the condition specified above:
input {
  file {
    path => "/Users/put/Downloads/Mutate_plugin.CSV"
    start_position => "beginning"
    # "/dev/null" discards the read position, so the file is re-read on every run
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    autodetect_column_names => true
  }
  # Only events with this exact salary lose the "Password" field
  if [Salary] == "154216" {
    mutate {
      remove_field => [ "Password" ]
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
Now, run Logstash with this configuration. The field is removed only from the events whose Salary equals 154216; all other events keep it.
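The condition above compares Salary with a string, because the csv filter emits column values as strings unless told otherwise. If a numeric comparison is needed, one option is to convert the field first. The sketch below assumes the same CSV columns; the threshold of 150000 is a made-up value for illustration.

```
filter {
  csv {
    autodetect_column_names => true
  }
  # Convert the string value into a number before comparing it
  mutate {
    convert => { "Salary" => "integer" }
  }
  # Hypothetical threshold: strip the password from all high-salary records
  if [Salary] > 150000 {
    mutate {
      remove_field => [ "Password" ]
    }
  }
}
```

The conversion sits in its own mutate block ahead of the conditional so that Salary is already numeric when the comparison is evaluated.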
Merging Fields
In this example, we’re going to use the mutate filter to merge two fields, “State” and “City”, using the merge option.
After merging the two, the “State” field will hold the merged data as an array. In addition, to make things clear, we will also rename the field using the rename option, as shown in the code below:
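input {
  file {
    path => "/Users/put/Downloads/Mutate_plugin.CSV"
    start_position => "beginning"
    # "/dev/null" discards the read position, so the file is re-read on every run
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    autodetect_column_names => true
  }
  # Append the value of "City" to "State"; "State" becomes an array
  mutate {
    merge => { "State" => "City" }
  }
  # Rename the merged field so its name reflects its content
  mutate {
    rename => { "State" => "State-City" }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
Run Logstash with the configuration shown above. In the output, the “State-City” field holds the merged values as an array.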
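The configuration above uses two separate mutate blocks on purpose. According to the plugin documentation, the operations inside a single mutate block run in a fixed internal order in which rename comes before merge, regardless of how they are written, so splitting them is the simplest way to control the sequence. The sketch below repeats that pattern and adds an optional third step, which is my own assumption and not part of the original example: it uses the join option to collapse the array into a single string.

```
filter {
  # Step 1: merge "City" into "State" (two strings become a two-entry array)
  mutate {
    merge => { "State" => "City" }
  }
  # Step 2: runs only after step 1 has finished
  mutate {
    rename => { "State" => "State-City" }
  }
  # Step 3 (optional, illustrative): turn the array into one string
  mutate {
    join => { "State-City" => ", " }
  }
}
```

Filters are applied in the order in which they appear in the filter section, which is what makes this block-per-step layout predictable.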
Adding White Spaces
Next, we’re going to use the mutate filter to add white space to the “message” field of incoming events. Currently, there is no space after the commas in the values of the “message” field. We will use the mutate filter’s gsub option as shown in the code below:
input {
  file {
    path => "/Users/put/Downloads/Mutate_plugin.CSV"
    start_position => "beginning"
    # "/dev/null" discards the read position, so the file is re-read on every run
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    autodetect_column_names => true
  }
  # gsub takes triplets: field name, pattern, replacement
  mutate {
    gsub => [ "message", ",", ", " ]
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
Run Logstash with this configuration. In the output, every comma in the “message” field is now followed by a space.
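The gsub option accepts more than one substitution at a time: the array is read in groups of three (field name, regular expression, replacement). As a minimal sketch, the first triplet below is the one used in this article, and the second one is an assumption added for illustration that collapses runs of two or more spaces into a single space.

```
filter {
  mutate {
    # Triplet 1: put a space after every comma
    # Triplet 2 (illustrative): squeeze repeated spaces into one
    gsub => [
      "message", ",", ", ",
      "message", " {2,}", " "
    ]
  }
}
```

Because the pattern is a regular expression, characters with a special meaning in regular expressions have to be escaped if they are meant literally.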
Endnotes
This article has demonstrated how the mutate filter can remove fields (unconditionally or based on a condition), merge and rename existing fields, and rewrite field values with gsub. There are many other important filter plugins in Logstash that can also be useful when parsing data or preparing it for visualizations.
Some of these include:
- json – used to parse JSON events.
- kv – used to parse key-value pairs.
- http – used to integrate external APIs.
- alter – used to alter fields in ways that are not handled by the mutate filter.