Nugget Post: Finding Average length of field in ElasticSearch – David Vassallo's...
source link: https://blog.davidvassallo.me/2020/10/07/nugget-post-finding-average-length-of-field-in-elasticsearch/
Nugget Post: Finding Average length of field in ElasticSearch
It’s sometimes useful to find the average length or size of a given field in Elasticsearch. This helps indicate how much of a document's size that field accounts for, or helps decide on the maximum length to allow for it.
Unfortunately, this is not a visualization built into Kibana. However, it is possible to define an aggregation which calculates statistics for a given field’s size.
For example, the request below queries winlogbeat for events from roughly the last hour, groups them by the 25 most frequent Event Codes, and then calculates statistics for the size of the “message” field within each group:
GET /winlogbeat-*/_search
{
  "size": 0,
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-1h/h",
        "lte": "now"
      }
    }
  },
  "aggs": {
    "eventIDs": {
      "terms": {
        "field": "event.code",
        "size": 25
      },
      "aggs": {
        "length": {
          "stats": {
            "script": "if (doc['message'].size()>0) { return doc['message'].value.length()}"
          }
        }
      }
    }
  }
}
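Note how we use the “terms” aggregation to find the top 25 event IDs, and then nest a “stats” aggregation with a “script” inside it to calculate the min, max, sum, count and average of the message length for each event ID. Because the top-level “size” is 0, the response contains only the aggregation results and no individual documents. The script is straightforward: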
- First, check that the document actually has a value for the “message” field:
doc['message'].size()>0
- If that check passes, return the length of that value:
doc['message'].value.length()
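To make the script's logic concrete, here is a minimal Python sketch of the same calculation run over a handful of in-memory documents. It is only an illustration of what the “stats” aggregation computes, not how Elasticsearch implements it. The function name `length_stats` and the sample documents are made up for this example. One caveat: Python's `len()` counts Unicode code points while the script's `length()` counts UTF-16 code units, so the two can differ for characters outside the Basic Multilingual Plane.

```python
def length_stats(docs, field="message"):
    """Return count/min/max/avg/sum of len(doc[field]), skipping docs without the field."""
    # Same guard as the script: only documents that have a value are measured.
    lengths = [len(doc[field]) for doc in docs if doc.get(field) is not None]
    count = len(lengths)
    if count == 0:
        # No values to measure: min, max and avg are undefined.
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0}
    total = sum(lengths)
    return {
        "count": count,
        "min": min(lengths),
        "max": max(lengths),
        "avg": total / count,
        "sum": total,
    }


# Hypothetical documents; the last one has no "message" and is skipped.
sample_docs = [
    {"message": "An account was successfully logged on."},
    {"message": "Logon"},
    {"other_field": "no message here"},
]
print(length_stats(sample_docs))
```

Documents without the field do not contribute to the count, which is why the “count” in the statistics can be lower than the number of documents in a group.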
The results can be used to build your own visuals in something like Excel:
Average size of message field, split by Windows Event ID
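One possible way to get the results into a spreadsheet is to flatten the aggregation response into CSV. The sketch below assumes the response has already been parsed into a Python dictionary and uses the aggregation names from the request above (“eventIDs” and “length”). The helper name `buckets_to_csv`, the column names and the sample numbers are invented for illustration.

```python
import csv
import io


def buckets_to_csv(response, agg_name="eventIDs", stats_name="length"):
    """Flatten a terms + stats aggregation response into CSV text, one row per bucket."""
    buckets = response["aggregations"][agg_name]["buckets"]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["event_code", "doc_count", "count", "min", "max", "avg", "sum"])
    for bucket in buckets:
        stats = bucket[stats_name]
        writer.writerow([
            bucket["key"],
            bucket["doc_count"],
            stats["count"],
            stats["min"],
            stats["max"],
            stats["avg"],
            stats["sum"],
        ])
    return out.getvalue()


# Made-up response fragment, shaped like a terms bucket with a nested stats result.
sample_response = {
    "aggregations": {
        "eventIDs": {
            "buckets": [
                {
                    "key": "4624",
                    "doc_count": 3,
                    "length": {"count": 3, "min": 10.0, "max": 30.0, "avg": 20.0, "sum": 60.0},
                }
            ]
        }
    }
}
print(buckets_to_csv(sample_response))
```

The resulting text can be saved to a .csv file and opened in Excel, where the “avg” column can be charted against the event code.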