Brisbane Devops Meetup

!!! Logstash !!!

30th May 2013

Paul Czarkowski / paul@paulcz.net / @pczarkowski

Who am I ?

More Ops than Dev

  • Started out as an SA at several ISPs
  • Moved into Games ... IT Manager @ Pandemic (Brisbane)
  • Moved to BioWare to launch Star Wars: The Old Republic
  • Transferred to EA, manage the Cloud Team for Infrastructure Operations
  • Through M&As I have 9 years' tenure at EA.

Who am I ?

Stuff I like to do ...

  • Automation ( and I don't just mean Puppet and/or Chef )
  • Agile ( Scrum / Kanban )
  • Cloud ( OpenStack, Amazon )
  • Monitoring / Logging
  • Solve problems
  • Cook ( I would be a chef if they didn't work so hard for so little pay )

What is a log?

A log is a human readable, machine parsable representation of an event.
LOG = TIMESTAMP + DATA

Jan 19 13:01:13 paulcz-laptop anacron[7712]: Normal exit (0 jobs run)

120607 14:07:00 InnoDB: Starting an apply batch of log records to the database...

[1225306053] SERVICE ALERT: FTPSERVER;FTP SERVICE;OK;SOFT;2;FTP OK - 0.029 second response time on port 21 [220 ProFTPD 1.3.1 Server ready.]

[Sat Jan 19 01:04:25 2013] [error] [client 78.30.200.81] File does not exist: /opt/www/vhosts/crappywebsite/html/robots.txt

There's an ISO for that!

ISO 8601

A log is human readable...


208.115.111.74 - - [13/Jan/2013:04:28:55 -0500] "GET /robots.txt HTTP/1.1" 
    301 303 "-" "Mozilla/5.0 (compatible; Ezooms/1.0; ezooms.bot@gmail.com)"
            


A human readable, machine parsable representation of an event.

...But logs are NOT !

But they're machine parsable, right?

Logs are machine parsable


208.115.111.74 - - [13/Jan/2013:04:28:55 -0500] "GET /robots.txt HTTP/1.1" 
    301 303 "-" "Mozilla/5.0 (compatible; Ezooms/1.0; ezooms.bot@gmail.com)"
            

Logs are machine parsable

Actual regex to parse Apache logs:

( image: the full regex - far too long to fit on one slide )

Logs are machine parsable

  • Users will now call the Perl Ninja to solve every problem they have - Hero Syndrome.
  • Does it work for every possible log line ?
  • Who's going to maintain that shit ?
  • Is it even useful without being surrounded by [bad] sysadmin scripts ?

So we agree ... This is Bad.


208.115.111.74 - - [13/Jan/2013:04:28:55 -0500] "GET /robots.txt HTTP/1.1" 
    301 303 "-" "Mozilla/5.0 (compatible; Ezooms/1.0; ezooms.bot@gmail.com)"
            

Logstash turns this


208.115.111.74 - - [13/Jan/2013:04:28:55 -0500] "GET /robots.txt HTTP/1.1" 
    301 303 "-" "Mozilla/5.0 (compatible; Ezooms/1.0; ezooms.bot@gmail.com)"
            

into that


{ 
  "client address": "208.115.111.74",
  "user": null,
  "timestamp": "2013-01-13T04:28:55-0500",
  "verb": "GET",
  "path": "/robots.txt",
  "query": null,
  "http version": 1.1,
  "response code": 301,
  "bytes": 303,
  "referrer": null
  "user agent": "Mozilla/5.0 (compatible; Ezooms/1.0; ezooms.bot@gmail.com)"
} 
            

Logstash Plugins...

cat chain.plugins | grep together | sed 's/like/unix/' > pipeline
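
To make the pipe analogy concrete, here is about the smallest pipeline you can run - a sketch using only the stock stdin and stdout plugins:


input  { stdin  { } }                  # read raw events from standard input
output { stdout { debug => true } }    # print each parsed event back out


Type a line on stdin and logstash hands the structured event straight back on stdout.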

Logstash Plugins...

~ 25+ input plugins

  • file
  • tcp,udp
  • amqp, zeromq, redis, sqs
  • twitter, irc
  • lumberjack
  • ...

Logstash Plugins...

~ 20+ filter plugins

  • date
  • grok
  • geoip
  • multiline ( sketch after this list )
  • metrics
  • split
  • ...
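
As flagged above, the multiline filter deserves a quick sketch: it folds continuation lines - think Java stack traces - back into the event that started them. The "java-app" type here is just a hypothetical example:


filter {
  multiline {
    type    => "java-app"   # hypothetical event type for an app log
    pattern => "^\s"        # lines starting with whitespace...
    what    => "previous"   # ...get glued onto the previous event
  }
}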

Logstash Plugins...

~ 40+ output plugins

  • Elasticsearch
  • StatsD, Graphite, Nagios, PagerDuty
  • amqp, zeromq, redis, sqs
  • boundary, cloudwatch, zabbix
  • lumberjack
  • ...

Two Very Important Filters

Let's talk briefly about two filters that are very important for making our logs useful.

Filter - Date

Takes a timestamp and makes it ISO 8601 compliant.

Turns this:

13/Jan/2013:04:28:55 -0500

Into this:

2013-01-13T04:28:55-0500
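
In config terms that looks something like the following - a sketch assuming the match syntax with a Joda-Time format string, where "timestamp" is whatever field your grok pattern captured:


filter {
  date {
    type  => "apache"
    # parse the captured field and use it as the event's timestamp
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}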

Filter - grok

Grok parses arbitrary text and structures it.

Makes complex regex patterns simple.


USERNAME [a-zA-Z0-9_-]+
USER %{USERNAME}
INT (?:[+-]?(?:[0-9]+))
MONTH \b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\b
DAY (?:Mon(?:day)?|Tue(?:sday)?|Wed(?:nesday)?|Thu(?:rsday)?|Fri(?:day)?|Sat(?:urday)?|Sun(?:day)?)
              

COMBINEDAPACHELOG %{IPORHOST:clientip} %{USER:ident} %{USER:auth} 
  \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}
  (?: HTTP/%{NUMBER:httpversion})?|-)" %{NUMBER:response} 
  (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}
              

Remember our apache log from earlier?

Define Inputs and Filters.


input {
  file {
    type => "apache"
    path => ["/var/log/httpd/httpd.log"]
  }
}
filter {
  grok {
    type    => "apache"
    pattern => "%{COMBINEDAPACHELOG}"
  }
  date {
    type  => "apache"
    # without a match the timestamp isn't parsed - give it the apache format
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  geoip {
    type => "apache"
    # adds location data based on the client IP captured by grok
  }
}
              

Define some outputs.


output {
  statsd {
    type      => "apache"
    increment => "apache.response.%{response}"
    # Count one hit every event by response
  }
  elasticsearch {
    type => "apache"
  }
}
              

Instant Graphification !

Kibana


Analyze Twitter Streams

  • Marketing
  • Customer feedback
  • good for load testing - 'bieber'

Logstash - Twitter Input


input {
  twitter {
    type      => "twitter"
    keywords  => ["bieber","beiber"]
    user      => "username"
    password  => "*******"
  }
}
output {
  elasticsearch {
    type => "twitter"
  }
}
            
  • 4% of Bieber fans can't spell his name.
  • 10% of tweets came from BlackBerry ( 50-ish business execs? )
  • ~ 200 Bieber tweets per minute.

Already have a central rsyslog / syslog-ng server?


input { 
  file {
    type => "syslog"
    path => ["/data/rsyslog/**/*.log"]
  }
}
filter {
  ### a bunch of groks, a date, and other filters
}

output {
  elasticsearch { }
}

Act as a Central Syslog Server

Good for Appliances / Switches


input { 
  tcp {
    type => "syslog"
    port => "514"
  }
  udp {
    type => "syslog"
    port => "514"
  }
}
filter {
  ### a bunch of groks, a date, and other filters
}

output {
  elasticsearch { }
}

In Cloud? Don't own network?

Use an encrypted Transport

Logstash Agent


input  {   file { ... }    }
output {
  lumberjack {
    hosts           => ["logstash-indexer1", "logstash-indexer2"]
    port            => 5043   # example port - must match the indexer's lumberjack input
    ssl_certificate => "/etc/ssl/logstash.crt"
  }
}

Logstash Indexer


input {
  lumberjack {
    port            => 5043   # any agreed-upon port; must match the agents
    ssl_certificate => "/etc/ssl/logstash.crt"
    ssl_key         => "/etc/ssl/logstash.key"
  }
}
output { elasticsearch {} }

System Metrics ?

input {
  exec {
    type => "system-loadavg"
    command => "cat /proc/loadavg | awk '{print $1,$2,$3}'"
    interval => 30
  }
}
filter {
  grok {
    type    => "system-loadavg"
    pattern => "%{NUMBER:load_avg_1m} %{NUMBER:load_avg_5m} %{NUMBER:load_avg_15m}"
    named_captures_only => true
  }
}
output {
  graphite {
    host => "10.10.10.10"
    port => 2003
    type => "system-loadavg"
    metrics => [ "hosts.%{@source_host}.load_avg.1m", "%{load_avg_1m}",
"hosts.%{@source_host}.load_avg.5m", "%{load_avg_5m}",
"hosts.%{@source_host}.load_avg.15m", "%{load_avg_15m}" ]
  }
}
          

Unique problem to solve ?

Write a logstash plugin!

  • Input - snmptrap
  • Filter - translate

You can do powerful things with [ boilerplate + ] a few lines of Ruby.

Scaling logstash

Use queues ( RabbitMQ, Redis ) to help scale horizontally - see the shipper / indexer sketch below.

Local log dir on clients = cheap queue
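
A hedged sketch of that pattern using the redis input and output plugins ( the host name and key below are placeholders ): shippers tail logs and push raw events onto a Redis list, while one or more indexers pop events off, do the heavy grok work, and send the results to Elasticsearch.


# shipper - runs on every client, does no filtering
input  { file  { type => "apache" path => ["/var/log/httpd/*.log"] } }
output { redis { host => "redis.example.com" data_type => "list" key => "logstash" } }

# indexer - add more of these as log volume grows
input  { redis { host => "redis.example.com" data_type => "list" key => "logstash" } }
filter { grok  { type => "apache" pattern => "%{COMBINEDAPACHELOG}" } }
output { elasticsearch { } }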

performance... events/sec

  • HP BL460, 48GB RAM, 24 cores.
  • Several filters - groks, mutates, etc.
  • Jump in performance from setting 8 filter workers ( -w 8 )

Misc considerations

  • File limits ( lol 1024 )
  • Java ( use JRE 7 for performance )
  • Limit memory usage
    • Elasticsearch - ~50% of RAM for the Java heap
    • Redis / Rabbit - 1GB
    • Logstash - 256MB -> 512MB Java heap
    • Be safe - avoid the deadly OOM killer.
  • Kibana / Elasticsearch security - put them behind a reverse proxy!

ElasticSearch

  • Primary Storage/Search engine for logstash
  • Clusterable/Scalable Search Engine
  • RESTful API
  • Lucene query syntax
  • Very large topic on its own...

Further Reading

  • http://www.logstash.net/
  • http://www.logstashbook.com/ [James Turnbull]
  • https://github.com/paulczar/vagrant-logstash
  • http://jujucharms.com/charms/precise/logstash-indexer
  • Logstash Puppet (github/electrical)
  • Logstash Chef (github/lusis)

Other tools you should all be using...

  • Vagrant
  • Chef / Puppet ( obviously! )
  • FPM
  • Omnibus
  • LXC containers for lightweight VMs
  • OpenStack ( run a cloud locally for dev )

Questions?

Paul Czarkowski / paul@paulcz.net / @pczarkowski