Chef Server Logging with the ELK Stack – Part 1


How many recipes do you have in your cookbook? This isn’t off topic: today I’ll be exploring the ins and outs of Chef logging to help you maintain the state of your kitchen. (Last of the bad puns.)

For those unfamiliar, Chef is a configuration management and platform automation tool that sits in the same space as Puppet (which we wrote about previously). The secret sauce of these tools (apparently I wasn’t done with the puns after all) is that they help you manage your infrastructure as code, allowing you to create and maintain automated, consistent, testable, and version-controlled infrastructure configurations.

The 411 on Chef Microservices

For this tutorial, I’ll be showing how to handle the log messages produced by some of the microservices that make up the Chef server itself. That said, it’s important to have a basic understanding of the top-level concepts in the Chef-verse. When deploying Chef for regular use, please adhere to best practices for Chef deployments. For more information about Chef best practices, take a look at Chef’s blog post on the topic.

Now into the fray.

Chef has a lot of services running under the hood that are responsible for “making it tick”, all of which write their logs to the Chef logging directory, /var/log/opscode.

For the servers, which are the focus of this piece, the key services include:

  • bookshelf : An S3 compatible service for storing cookbooks and all their related files.
  • nginx : Manages traffic to the Chef server.
  • oc-id : The OAuth 2 authentication service (used by tools such as Supermarket).
  • oc_bifrost : Authorization API server. It replaced the Authz service, whose name lingers on in the log naming conventions.
  • opscode-erchef : Handles internal server API requests.

The status of these services can be checked using the following (omit the service name to list all of them):

sudo chef-server-ctl status ${SERVICENAME}
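
For example, checking a single service returns runit-style status output (the PID and uptime values below are illustrative):

$ sudo chef-server-ctl status nginx
run: nginx: (pid 1234) 86400s; run: log: (pid 1235) 86400s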

Taking a quick look at these services in the Chef logging directory:

$ sudo tree -L 1 /var/log/opscode/
/var/log/opscode/
├── bookshelf
├── logrotate.status
├── nginx
├── oc_bifrost
├── oc-chef-pedant
├── oc_id
├── opscode-chef-mover
├── opscode-erchef
├── opscode-expander
├── opscode-solr4
├── postgresql
├── rabbitmq
└── redis_lb

Phew! That’s a fair few log types to get through. For brevity, in this post I’ll be focusing primarily on the opscode-* and nginx logs. The opscode-* services correspond to the components responsible for handling internal and external API requests, unpacking data for insertion into either Apache Solr (single server) or Elasticsearch (high-availability configuration), and handling Chef management tasks. The nginx server hosts the management UI and related requests. When logging production Chef, it is important to import and parse all the logs in the parent directory, which you can do by applying the same principles I cover here.

Demo Environment Options

Prerequisites

In order to get started, ensure that you have:

  • An existing Chef environment with Filebeat installed.
  • An existing ELK environment – either self-hosted ELK or our platform will work.

If you do not have an existing ELK or Chef environment, please feel free to take a look at my GitHub repo for building a demo environment to follow along with.

All Aboard: Shipping Logs!

Shipping to Logz.io

Getting started with Nginx

There are two ways to ship the nginx logs from the Chef server to us. Since we auto-parse nginx logs, as well as other common log types including Apache, there is the option to ship the nginx logs directly to our platform via Filebeat. But since we’ll also need to ship logs for other components of the Chef server, I’ll be shipping via Logstash. To tail all of the nginx logs for your Chef server, run:

sudo chef-server-ctl tail nginx


Using a grok debugger, a simple grok pattern for these logs is:

"^%{IPORHOST:clientip} (%{USER:ident}|-) (%{USER:auth}|-) 
\[%{NOTSPACE:timestamp}\]%{SPACE} 
(?:%{WORD:verb}%{SPACE}%{NOTSPACE:request}(?: 
HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})%{NUMBER:response} 
%{QS:field_1:FLOAT} (?:%{NUMBER:bytes}|-)( %{QS:referrer} %{QS:agent}

 %{QS:field_2} %{QS:field_3} %{QS:field_4} %{QS:field_5} 
%{QS:field_6} %{QS:field_7} %{QS:field_8} %{QS:field_9} %{NUMBER:field_10} 

%{QS:field_11})?"

In filebeat.yml, give the nginx logs a type (fields.log_type) to make filtering in the Logstash configuration files easier, which will in turn make it easier to add additional logic for the remaining log types later on:

- type: log
  fields:
    log_type: chef-access
  paths:
    - /var/log/opscode/nginx/access.log
  fields_under_root: true
  encoding: utf-8
  ignore_older: 3h

If you do not have a filebeat.yml file yet and do not want to build one from scratch, you can use our Filebeat Configuration Wizard to generate a base file. In the wizard, provide both the path and type, then click Create.

In the downloaded file, replace the generated input block (lines 5-14 in my copy) with the block I indicated earlier. You’ll also need to replace the output section (lines 20-24) to ship to your Logstash instance:

output:
  logstash:
    hosts: ["localhost:5044"]

In my case, I have a single instance running the Chef server and Logstash. If you are running Logstash on a different instance, you’ll need to change localhost to that instance’s IP. For reference, the final filebeat.yml file should look like:

#=========================== Filebeat inputs =============================
filebeat.inputs:
 
############# CHEF NGINX ###############
- type: log
  fields:
    log_type: chef-access
  paths:
    - /var/log/opscode/nginx/access.log
  fields_under_root: true
  encoding: utf-8
  ignore_older: 3h
 
registry_file: /var/lib/filebeat/registry
 
#============================= Filebeat modules ===============================
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
 
#==================== Elasticsearch template setting ==========================
setup.template.settings:
  index.number_of_shards: 3
 
#================================ Outputs =====================================
 
output:
 
############### LOGSTASH ###############
  logstash:
    hosts: ["localhost:5044"]

To apply the changes, you will need to restart Filebeat:

sudo service filebeat restart
sudo service filebeat status
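
If the restart fails, you can sanity check the configuration first (the test subcommand assumes Filebeat 6.x or later, and the -c path assumes a standard package install):

sudo filebeat test config -c /etc/filebeat/filebeat.yml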

Now to configure Logstash. The logstash.yml file can stay as-is, so all I need to do is create a configuration file for Chef’s nginx (and other) logs. Since I added a log_type , I can check the value of that field and apply the grok statement. Remember to use mutate to add a field for your account token, as below:

filter {
  if [log_type] == "chef-access" {
    grok {
      match => { "message" => ["^%{IPORHOST:clientip} (%{USER:ident}|-) (%{USER:auth}|-) \[%{NOTSPACE:timestamp}\]%{SPACE} (?:%{WORD:verb}%{SPACE}%{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest}) %{NUMBER:response} %{QS:field_1:FLOAT} (?:%{NUMBER:bytes}|-)( %{QS:referrer} %{QS:agent} %{QS:field_2} %{QS:field_3} %{QS:field_4} %{QS:field_5} %{QS:field_6} %{QS:field_7} %{QS:field_8} %{QS:field_9} %{NUMBER:field_10} %{QS:field_11})?"] }
    }
    mutate {
      add_field => { "read_timestamp" => "%{@timestamp}" }
    }
  }
  mutate {
    add_field => { "token" => "REPLACEWITHYOURTOKEN" }
  }
}

I also need to add the following output block to ship the logs via TCP:

tcp {
    host => "listener.logz.io"
    port => 5050
    codec => json_lines
}

(This information, as well as the syntax for the mutate line, can be found on our Logstash Log Shipping page.)

So you can compare, your final configuration file should be similar to:

input {
  beats {
    port => 5044
    host => "0.0.0.0"
  }
}
filter {
  if [log_type] == "chef-access" {
    grok {
      match => { "message" => ["^%{IPORHOST:clientip} (%{USER:ident}|-) (%{USER:auth}|-) \[%{NOTSPACE:timestamp}\]%{SPACE} (?:%{WORD:verb}%{SPACE}%{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest}) %{NUMBER:response} %{QS:field_1:FLOAT} (?:%{NUMBER:bytes}|-)( %{QS:referrer} %{QS:agent} %{QS:field_2} %{QS:field_3} %{QS:field_4} %{QS:field_5} %{QS:field_6} %{QS:field_7} %{QS:field_8} %{QS:field_9} %{NUMBER:field_10} %{QS:field_11})?"] }
    }
    mutate {
      add_field => { "read_timestamp" => "%{@timestamp}" }
    }
  }
  mutate {
    add_field => { "token" => "REPLACEWITHYOURTOKEN" }
  }
}
output {
  tcp {
      host => "listener.logz.io"
      port => 5050
      codec => json_lines
  }
}

Similar to filebeat.yml, you may need to adjust input.beats.host to match your host IP if running on a separate instance. Another similarity: you’ll need to restart Logstash to apply the changes.
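
Before restarting, it can help to validate the pipeline syntax (the binary path below assumes a standard deb/rpm install, and the filename assumes the configuration was saved as chef-logstash.conf under /etc/logstash/conf.d/):

sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/chef-logstash.conf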

sudo service logstash restart
sudo service logstash status

To quickly generate some access logs, I refreshed the Chef UI a few times in the browser and also logged in and out a couple of times. In my account, the logs started to appear, and opening one of them confirmed it had been parsed into the expected fields.

Right now I’m just parsing the access logs, since those are the ones being generated with this configuration, but in a normal environment you will need to ship and parse the error logs as well. To ship all of Chef’s nginx access logs, including internal-chef and internal-authz, simply change the filename specified in the path in filebeat.yml from access.log to *access.log, as shown below. I’d also recommend sticking with the Chef-specific log_type of chef-access, rather than a generic access, to separate these from any other access logs in your environment.
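
For reference, the adjusted input block would look like this (identical to the earlier one except for the wildcarded path):

- type: log
  fields:
    log_type: chef-access
  paths:
    - /var/log/opscode/nginx/*access.log
  fields_under_root: true
  encoding: utf-8
  ignore_older: 3h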

A Quick Note About Filebeat and Auto Parsing

As a quick addendum: if you choose to ship the nginx logs directly to our platform via Filebeat, you will need to make the following changes to match our configuration:

  • The time and date should be formatted as: dd/MMM/yyyy:HH:mm:ss Z
  • Change the double spaces before the verb (GET) to a single space.
  • There is also a quoted number between the response code and the bytes count that needs to be removed.

If you do not make these changes and just ship with Filebeat, you’ll simply see unparsed nginx logs with a _grokparsefailure tag.

All Star Log Shipping for opscode-*

Now to ship the opscode-* logs. Most of the heavy lifting was done when shipping the nginx logs, so I’ll only need to make some adjustments to ship and parse the opscode-* logs. There are actually quite a few of these:

$ sudo tree -L 1 /var/log/opscode/opscode-*
/var/log/opscode/opscode-chef-mover
├── config
├── current
├── lock
└── sasl
/var/log/opscode/opscode-erchef
├── config
├── crash.log.0
├── current
├── erchef.log.0
├── lock
├── requests.log.1
├── requests.log.idx
├── requests.log.siz
├── sasl
└── sasl-error.log
/var/log/opscode/opscode-expander
├── config
├── current
└── lock
/var/log/opscode/opscode-solr4
├── config
├── current
├── gclog.log.0.current
└── lock
 
2 directories, 19 files

For the sake of brevity, I’ll only be covering a subset of these – the opscode-*/current logs. You will be able to use the same procedure to parse the remaining logs, as needed. The opscode-*/current logs mainly come in two or three parts: a timestamp, in some cases an error level, and then the rest of the message.

Getting started with a grok pattern for the opscode-solr4/current logs:

^%{NOTSPACE:timestamp} %{NUMBER:field_1} \[%{NOTSPACE:field_2}\] 
%{LOGLEVEL:level} %{GREEDYDATA:message}
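
For reference, here is a hypothetical line in the shape this pattern expects (illustrative, not captured from a real server):

2018-11-06_17:23:10.97689 1234 [qtp1873653341-22] INFO org.apache.solr.core.SolrCore [chef] webapp=/solr path=/update params={} status=0 QTime=12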

To add these logs to those being shipped to Logstash, update filebeat.yml similar to what was done for the nginx logs:

- type: log
  fields:
    log_type: chef-solr4
  paths:
    - /var/log/opscode/opscode-solr4/current
  fields_under_root: true
  encoding: utf-8
  ignore_older: 3h

Next, update the filter section to include the grok pattern. The completed section looks roughly like:

filter {
  if [log_type] == "chef-access" {
    grok {
      match => { "message" => ["^%{IPORHOST:clientip} (%{USER:ident}|-) (%{USER:auth}|-) \[%{NOTSPACE:timestamp}\]%{SPACE} (?:%{WORD:verb}%{SPACE}%{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest}) %{NUMBER:response} %{QS:field_1:FLOAT} (?:%{NUMBER:bytes}|-)( %{QS:referrer} %{QS:agent} %{QS:field_2} %{QS:field_3} %{QS:field_4} %{QS:field_5} %{QS:field_6} %{QS:field_7} %{QS:field_8} %{QS:field_9} %{NUMBER:field_10} %{QS:field_11})?"] }
    }
  }
  if [log_type] == "chef-solr4" {
    grok {
      match => { "message" => ["^%{NOTSPACE:timestamp} %{NUMBER:field_1} \[%{NOTSPACE:field_2}\] %{LOGLEVEL:level} %{GREEDYDATA:message}"] }
    }
  }
  mutate {
    add_field => { "token" => "REPLACEWITHYOURTOKEN" }
    add_field => { "read_timestamp" => "%{@timestamp}" }
  }
}

Important: the token field must be outside the conditional block(s) as the token needs to be added to all shipped logs.

Restart both Filebeat and Logstash for the changes to take effect:

sudo service filebeat restart
sudo service filebeat status
 
sudo service logstash restart
sudo service logstash status

Log into our application and verify that the parsed logs are arriving. If you need to generate some new logs, try refreshing the Chef UI and/or restarting the Chef server from the command line.

Rinse and repeat for the remaining logs. Starting with Filebeat:

- type: log
  fields:
    log_type: chef-mover
  paths:
    - /var/log/opscode/opscode-chef-mover/current
  fields_under_root: true
  encoding: utf-8
  ignore_older: 3h
- type: log
  fields:
    log_type: chef-erchef
  paths:
    - /var/log/opscode/opscode-erchef/current
  fields_under_root: true
  encoding: utf-8
  ignore_older: 3h
- type: log
  fields:
    log_type: chef-expander
  paths:
    - /var/log/opscode/opscode-expander/current
  fields_under_root: true
  encoding: utf-8
  ignore_older: 3h
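
Each log gets its own input block because Filebeat attaches the fields entries per input; the distinct log_type values are what let the Logstash filter below route each log to its own grok pattern.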

And updating the filter block in Logstash:

filter {
  if [log_type] == "chef-access" {
    grok {
      match => { "message" => ["^%{IPORHOST:clientip} (%{USER:ident}|-) (%{USER:auth}|-) \[%{NOTSPACE:timestamp}\]%{SPACE} (?:%{WORD:verb}%{SPACE}%{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest}) %{NUMBER:response} %{QS:field_1:FLOAT} (?:%{NUMBER:bytes}|-)( %{QS:referrer} %{QS:agent} %{QS:field_2} %{QS:field_3} %{QS:field_4} %{QS:field_5} %{QS:field_6} %{QS:field_7} %{QS:field_8} %{QS:field_9} %{NUMBER:field_10} %{QS:field_11})?"] }
    }
  }
  if [log_type] == "chef-solr4" {
    grok {
      match => { "message" => ["^%{NOTSPACE:timestamp} %{NUMBER:field_1} \[%{NOTSPACE:field_2}\] %{LOGLEVEL:level} %{GREEDYDATA:message}"] }
    }
  }
  if [log_type] == "chef-mover" {
    grok {
      match => { "message" => ["^%{NOTSPACE:timestamp} %{GREEDYDATA:message}"] }
    }
  }
  if [log_type] == "chef-erchef" {
    grok {
      match => { "message" => ["^%{NOTSPACE:timestamp} (\[%{NOTSPACE:level}\]) %{GREEDYDATA:message}"] }
    }
  }
  if [log_type] == "chef-expander" {
    grok {
      match => { "message" => ["^%{NOTSPACE:timestamp} (\[%{GREEDYDATA:field_1}\])? (%{NOTSPACE:level})? %{GREEDYDATA:message}"] }
    }
  }
  mutate {
    add_field => { "token" => "iDTtKytGgcmJKcbEgIZMigyAfiFKECSO" }
    add_field => { "read_timestamp" => "%{@timestamp}" }
  }
}

Just like before, restart both Filebeat and Logstash:

sudo service filebeat restart
sudo service filebeat status
 
sudo service logstash restart
sudo service logstash status

And that’s it! Now all the opscode-*/current logs are shipping to Logz.io.

But what about self-hosted ELK?

Shipping to OSS ELK

Since I’m already using Logstash to ship, I only need to modify the Logstash configuration to ship to a self-hosted Elasticsearch rather than our product. Since I’m running both ELK and Chef on the same server for the purposes of this blog post, the host for Elasticsearch is simply localhost. To send your output to Elasticsearch, change the output block from:

output {
  tcp {
      host => "listener.logz.io"
      port => 5050
      codec => json_lines
  }
}

To:

output {
  elasticsearch {
      hosts => "localhost:9200"
  }
}
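
By default, the elasticsearch output writes to daily logstash-%{+YYYY.MM.dd} indices. If you would rather keep the Chef logs separate from other pipelines, you can also set an explicit index name (an optional tweak, not something used in this walkthrough):

output {
  elasticsearch {
      hosts => "localhost:9200"
      index => "chef-%{+YYYY.MM.dd}"
  }
}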

You should also remove, or at least comment out, the line that adds your Logz.io token to the shipped logs.

Once the logs start arriving, if you are using a fresh ELK install you may need to create an index pattern in Kibana.
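
Before creating the pattern, a quick sanity check with the _cat API confirms that Elasticsearch is receiving data (this assumes the default index naming):

curl -X GET "localhost:9200/_cat/indices?v"

You should see one or more logstash-* indices in the listing.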

When you refresh, Kibana will detect the new indices; I’ll match them with logstash-*.

Then indicate the correct timestamp field (@timestamp) and create the index pattern.

Success! The parsed Chef logs now show up in Kibana.

Summary

And that’s it! Now we have a pipeline for shipping Chef logs into the ELK Stack. If you are encountering any difficulties with your Chef and/or ELK configurations, please feel free to compare them to my GitHub repo, which documents the MVP configuration I used for this post. It also includes the final form of the filebeat.yml and chef-logstash.conf files for comparison.

In part 2, I will explore how to analyze the data with Kibana dashboards and relevant alerts to help you keep tabs on Chef itself.

