Logstash - transport and process your logs, events, or other data

Overview

Logstash

Logstash is part of the Elastic Stack along with Beats, Elasticsearch, and Kibana. Logstash is a server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite "stash" (ours is Elasticsearch, naturally). Logstash has over 200 plugins, and you can easily write your own as well.

For more info, see https://www.elastic.co/products/logstash

Documentation and Getting Started

You can find the documentation and getting started guides for Logstash on the elastic.co site.

For information about building the documentation, see the README in https://github.com/elastic/docs

Downloads

You can download officially released Logstash binaries, as well as debian/rpm packages for the supported platforms, from the downloads page.

Need Help?

Logstash Plugins

Logstash plugins are hosted in separate repositories under the logstash-plugins github organization. Each plugin is a self-contained Ruby gem which gets published to RubyGems.org.

Writing your own Plugin

Logstash is known for its extensibility. There are hundreds of plugins for Logstash, and you can write your own very easily! For more info on developing and testing these plugins, please see the working with plugins section.

Plugin Issues and Pull Requests

Please open new issues and pull requests for a plugin under that plugin's own repository.

For example, if you need to report an issue or enhancement for the Elasticsearch output, please do so in the logstash-output-elasticsearch repository.

Logstash core will continue to exist under this repository, and all related issues and pull requests can be submitted here.

Developing Logstash Core

Prerequisites

  • Install JDK version 8 or 11. Make sure to set the JAVA_HOME environment variable to the path of your JDK installation directory, for example: set JAVA_HOME=<JDK_PATH>
  • Install JRuby 9.2.x. It is recommended to use a Ruby version manager such as RVM or rbenv.
  • Install the rake and bundler tools using gem install rake and gem install bundler, respectively. A combined setup sketch follows this list.
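
A minimal setup sketch, assuming a Unix-like shell and rbenv with the ruby-build plugin (the JDK path is illustrative; rvm works too):

export JAVA_HOME=/usr/lib/jvm/jdk-11   # illustrative path; point this at your JDK 8 or 11 install
rbenv install $(cat .ruby-version)     # run from the Logstash checkout to match the pinned JRuby
gem install rake
gem install bundler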

RVM install (optional)

If you prefer to use rvm (ruby version manager) to manage Ruby versions on your machine, follow these directions. In the Logstash folder:

gpg --keyserver hkp://keys.gnupg.net --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3
\curl -sSL https://get.rvm.io | bash -s stable --ruby=$(cat .ruby-version)

Check Ruby version

Before you proceed, please check your Ruby version by running:

$ ruby -v

The printed version should be the same as in the .ruby-version file.

Building Logstash

The Logstash project includes the source code for all of Logstash, including the Elastic-Licensed X-Pack features and functions; to run Logstash from source using only the OSS-licensed code, export the OSS environment variable with a value of true:

export OSS=true
  • Set up the location of the source code to build
export LOGSTASH_SOURCE=1
export LOGSTASH_PATH=/YOUR/LOGSTASH/DIRECTORY
  • To run Logstash from the repo you must first bootstrap the environment:
rake bootstrap
  • You can then use bin/logstash to start Logstash, but there are no plugins installed. To install default plugins, you can run:
rake plugin:install-default

This will install the 80+ default plugins, which makes Logstash ready to connect to multiple data sources, perform transformations, and send the results to Elasticsearch and other destinations.

To verify your environment, run the following to send your first event:

bin/logstash -e 'input { stdin { } } output { stdout {} }'

This should start Logstash with a stdin input, waiting for you to enter an event:

hello world
2016-11-11T01:22:14.405+0000 0.0.0.0 hello world

Advanced: Drip Launcher

Drip is a tool that solves the slow JVM startup problem while developing Logstash. The drip script is intended to be a drop-in replacement for the java command. We recommend using drip during development, in particular for running tests. With drip, the first invocation of a command will not be faster, but subsequent commands will be swift.

To tell logstash to use drip, set the environment variable JAVACMD=`which drip`.

Example (but see the Testing section below before running rspec for the first time):

JAVACMD=`which drip` bin/rspec

Caveats

Drip does not work with STDIN. You cannot use drip to run configs that use the stdin plugin.

Building Logstash Documentation

To build the Logstash Reference (open source content only) on your local machine, clone the following repos:

logstash - contains main docs about core features

logstash-docs - contains generated plugin docs

docs - contains doc build files

Make sure you have the same branch checked out in logstash and logstash-docs. Check out master in the docs repo.
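
If you are starting from scratch, a layout like this works (a sketch; the branch name is illustrative):

git clone https://github.com/elastic/logstash.git
git clone https://github.com/elastic/logstash-docs.git
git clone https://github.com/elastic/docs.git
(cd logstash && git checkout 7.17)        # same branch in logstash ...
(cd logstash-docs && git checkout 7.17)   # ... and logstash-docs
(cd docs && git checkout master)          # the docs repo stays on master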

Run the doc build script from within the docs repo. For example:

./build_docs.pl --doc ../logstash/docs/index.asciidoc --chunk=1 -open

Testing

Most of the unit tests in Logstash are written using RSpec for the Ruby parts; for the Java parts, we use JUnit. For testing, you can use the test rake tasks and the bin/rspec command; see the instructions below:

Core tests

1- To run the core tests you can use the Gradle task:

./gradlew test

or use the rspec tool to run all tests or run a specific test:

bin/rspec
bin/rspec spec/foo/bar_spec.rb

Note that before running the rspec command for the first time you need to set up the RSpec test dependencies by running:

./gradlew bootstrap

2- To run the subset of tests covering the Java codebase only, run:

./gradlew javaTests

3- To execute the complete test-suite including the integration tests, run:

./gradlew check

4- To execute a single Ruby test, run:

SPEC_OPTS="-fd -P logstash-core/spec/logstash/api/commands/default_metadata_spec.rb" ./gradlew :logstash-core:rubyTests --tests org.logstash.RSpecTests    

5- To execute a single spec of the integration tests, run:

./gradlew integrationTests -PrubyIntegrationSpecs=specs/slowlog_spec.rb

Sometimes you might find a change to a piece of Logstash code causes a test to hang. These can be hard to debug.

If you set LS_JAVA_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005", you can connect to a running Logstash with your IDE's debugger, which can be a great way of finding the issue.
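
For example (a sketch; the pipeline path is illustrative):

export LS_JAVA_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"
bin/logstash -f /path/to/pipeline.conf

Then attach your IDE's remote debugger to port 5005.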

Plugins tests

To run the tests of all currently installed plugins:

rake test:plugins

You can install the default set of plugins included in the logstash package:

rake test:install-default

Note that if a plugin is installed using the plugin manager (bin/logstash-plugin install ...), do not forget to also install the plugin's development dependencies using the following command after the plugin installation:

bin/logstash-plugin install --development
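
A typical sequence might then look like this (the plugin name is only an example):

bin/logstash-plugin install logstash-filter-json
bin/logstash-plugin install --development
rake test:plugins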

Building Artifacts

Built artifacts will be placed in the LS_HOME/build directory; the build will create that directory if it is not already present.

You can build a Logstash snapshot package as a tarball or a zip file:

./gradlew assembleTarDistribution
./gradlew assembleZipDistribution

OSS-only artifacts can similarly be built with their own gradle tasks:

./gradlew assembleOssTarDistribution
./gradlew assembleOssZipDistribution

You can also build .rpm and .deb packages, but the fpm tool is required.
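
fpm is distributed as a Ruby gem, so assuming a working Ruby toolchain it can typically be installed with:

gem install fpm

With fpm available, build the packages: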

rake artifact:rpm
rake artifact:deb

and:

rake artifact:rpm_oss
rake artifact:deb_oss

Using a Custom JRuby Distribution

If you want the build to use a custom JRuby, you can do so by setting the path to a custom JRuby distribution's source root via the custom.jruby.path Gradle property.

E.g.

./gradlew clean test -Pcustom.jruby.path="/path/to/jruby"

Project Principles

  • Community: If a newbie has a bad time, it's a bug.
  • Software: Make it work, then make it right, then make it fast.
  • Technology: If it doesn't do a thing today, we can make it do it tomorrow.

Contributing

All contributions are welcome: ideas, patches, documentation, bug reports, complaints, and even something you drew up on a napkin.

Programming is not a required skill. Whatever you've seen about open source and maintainers or community members saying "send patches or die" - you will not see that here.

It is more important to me that you are able to contribute.

For more information about contributing, see the CONTRIBUTING file.

Comments
  • Manage plugins to be installed offline

    The motivation of this PR is to enable users to dump a set of plugins, to later be managed on machines that have no internet connection.

    This introduces a set of new commands that enable offline plugin management.

    • bin/plugin pack: lets you create a bundle with all plugins currently installed in a LS instance plus their dependencies. By default this package of plugins is created as a zip file on Windows and a tar.gz on Unix machines.
    • bin/plugin unpack: from a previously created package, this command lets you install it in another Logstash instance.
    • Add the --local flag option to the bin/plugin install command. When the new flag is passed, the plugin manager will fetch the plugins from a local file system directory (created when doing bin/plugin unpack).
    • Add the --local flag option to the bin/plugin update command. This command has the same behaviour and motivation as the one for bin/plugin install.

    Expected workflow

    The expected workflow would be:

    • a user creates a package with the necessary plugins from a LS that has internet access.
    • with the generated bundle (zip/tgz), the user unpacks this file into a LS that has no internet access.
    • then, using the --local flag when doing install or update, they can install plugins from the local filesystem (see the sketch after this list).
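
    A minimal sketch of that workflow (the bundle file name and plugin name are illustrative):

    # On the machine with internet access:
    bin/plugin pack
    # Copy the generated bundle (zip/tar.gz) to the offline machine, then:
    bin/plugin unpack plugins-package.tar.gz
    bin/plugin install --local logstash-output-elasticsearch
    bin/plugin update --local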

    Other options added

    bin/plugin pack:

    option "--tgz", :flag, "compress package as a tar.gz file", :default => !LogStash::Environment.windows?   option "--zip", :flag, "compress package as a zip file", :default => LogStash::Environment.windows?
    option "--[no-]clean", :flag, "clean up the generated dump of plugins", :default => true
    option "--overwride", :flag, "Overwrite a previously generated package file", :default => false
    

    bin/plugin unpack

    +  option "--tgz", :flag, "unpack a packaged tar.gz file", :default => !LogStash::Environment.windows?
    +  option "--zip", :flag, "unpack a packaged  zip file", :default => LogStash::Environment.windows?
    

    Relates to / fixes #2376

    enhancement bundler v2.1.0 
    opened by purbon 131
  • Logstash 1:1.5.0-1 dies after some time

    Hi, today I've upgraded to logstash version 1:1.5.0-1 on Ubuntu 14.04, x86_64, and it seems to hang after some time without any notification in the logs whatsoever. When I try to stop the process by issuing

    service logstash stop
    

    or

    /etc/init.d/logstash stop
    

    I get the following:

    killing logstash (pid 26022) with SIGTERM
    Waiting logstash (pid 26022) to die...
    Waiting logstash (pid 26022) to die...
    Waiting logstash (pid 26022) to die...
    Waiting logstash (pid 26022) to die...
    Waiting logstash (pid 26022) to die...
    logstash stop failed; still running.
    logstash started.
    

    and then I have to do a

    kill -9 
    

    to kill the process, and it goes on and on. I started logstash manually with the debug option and after some time it died again, but I got the following lines at the end

    @metadata_accessors=#<LogStash::Util::Accessors:0x2513a1f1 @store={"retry_count"=>0}, @lut={}>, @cancelled=false>]]}, :batch_timeout=>1, :force=>true, :final=>nil, :level=>:debug, :file=>"stud/buffer.rb", :line=>"207", :method=>"buffer_flush"}
    Sending bulk of actions to client[0]: localhost {:level=>:debug, :file=>"logstash/outputs/elasticsearch.rb", :line=>"461", :method=>"flush"}
    Shifting current elasticsearch client {:level=>:debug, :file=>"logstash/outputs/elasticsearch.rb", :line=>"468", :method=>"flush"}
    Switched current elasticsearch client to #0 at localhost {:level=>:debug, :file=>"logstash/outputs/elasticsearch.rb", :line=>"518", :method=>"shift_client"}
    

    The debug output stops at this point and the process hangs.

    I have a lot of logstash "senders" which just use lumberjack to send encrypted logs to the main node; they are upgraded to 1.5 and work fine. It seems that the "master" logstash with the elasticsearch output has a problem somewhere which is causing it to hang. Here's the output config

    elasticsearch {
          host => "localhost"
          protocol => "http"
        }
    

    At the moment I've reverted back to logstash 1.4.2. Thanks and regards.

    bug crashes 
    opened by parabolic 71
  • Fix plugin_manager install and update commands to work properly with no internet env private gem repos

    Add all defined sources to rubygems so verification can talk to all the repositories, even the private ones.

    Fix the workflow for using a private gem repo and fix issues with #3576.

    Fixes https://github.com/elastic/logstash/issues/3576

    bug v2.1.0 
    opened by purbon 67
  • NotImplementedError: block device detection unsupported or native support failed to load

    Issue

    I am encountering the following error when trying to use the file input to watch /var/log/syslog and /var/log/auth.log.

    I have run the following on the log files:

    setfacl -m u:logstash:r /var/log/{syslog,auth.log}
    

    The following exception stack trace is from /var/log/logstash/logstash.err

    NotImplementedError: block device detection unsupported or native support failed to load
           blockdev? at org/jruby/RubyFileTest.java:67
             device? at /opt/logstash/vendor/bundle/jruby/1.9/gems/filewatch-0.6.2/lib/filewatch/helper.rb:67
      _sincedb_write at /opt/logstash/vendor/bundle/jruby/1.9/gems/filewatch-0.6.2/lib/filewatch/tail.rb:230
       sincedb_write at /opt/logstash/vendor/bundle/jruby/1.9/gems/filewatch-0.6.2/lib/filewatch/tail.rb:203
            teardown at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-file-0.1.9/lib/logstash/inputs/file.rb:151
         inputworker at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.0.rc3-java/lib/logstash/pipeline.rb:203
         synchronize at org/jruby/ext/thread/Mutex.java:149
         inputworker at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.0.rc3-java/lib/logstash/pipeline.rb:203
         start_input at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.0.rc3-java/lib/logstash/pipeline.rb:171
    

    If I run interactively using sudo -u logstash /opt/logstash/bin/logstash agent -f /etc/logstash/conf.d/ --verbose

    I get the same exception stack trace plus this:

    The error reported is:
      Bad file descriptor - Bad file descriptor
    

    I have a number of file inputs in the pipeline, but it seems that only this file input causes a problem. Removing this file input allows it to start fine.

    Environment Details

    root@logstash1:/# lsb_release -d
    Description:    Ubuntu 14.04.1 LTS
    
    root@logstash1:/# /opt/logstash/bin/logstash -V
    logstash 1.5.0-rc3
    
    root@logstash1:/# java -version
    java version "1.8.0_45"
    Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
    Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
    
    bug 
    opened by mattstibbs 60
  • Possible event corruption

    We've been seeing a lot of strange errors in the logstash logs complaining about normal errors but dumping the event in what appears to be this weird nested format. Here's an example:

    {:timestamp=>"2015-12-11T10:01:53.058000-0500", :message=>"Invalid IP address, skipping", :address=>"%{remote_addr}", :event=>#<LogStash::Event:0x4987eee7 @metadata={"beat"=>"filebeat", "type"=>"nginx_dragonet"}, @accessors=#<LogStash::Util::Accessors:0x33de4c73 @store={"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, @lut={"@timestamp"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "@timestamp"], "beat"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "beat"], "count"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "count"], "input_type"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "input_type"], "offset"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "offset"], "source"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, 
"source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "source"], "type"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "type"], "host"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "host"], "[type]"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "type"], "query_strings"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "query_strings"], "[upstream_status]"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "upstream_status"], "[upstream_response_time]"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "upstream_response_time"], "[upstream_addr]"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", 
"host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "upstream_addr"], "[upstream_port]"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "upstream_port"], "upstream_status"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "upstream_status"], "upstream_response_time"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "upstream_response_time"], "upstream_port"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "upstream_port"], "remote_addr"=>[{"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, "remote_addr"]}>, @data={"message"=>"2015/12/11 10:01:49 [info] 31068#0: *413768 client 127.0.0.1 closed keepalive connection", "tags"=>["_jsonparsefailure"], "@version"=>"1", "@timestamp"=>"2015-12-11T15:01:52.260Z", "beat"=>{"hostname"=>"dragonetapi01.chartbeat.net", "name"=>"dragonetapi01.chartbeat.net"}, "count"=>1, "input_type"=>"log", "offset"=>223212, "source"=>"/mnt/logs/nginx/error.log", "type"=>"nginx_dragonet", "host"=>"dragonetapi01.chartbeat.net", "query_strings"=>nil}, @metadata_accessors=#<LogStash::Util::Accessors:0x6e48c56 @store={"beat"=>"filebeat", "type"=>"nginx_dragonet"}, @lut={}>, @cancelled=false>, :level=>:warn}
    

    You can see the event seems to repeat itself. I'm not sure if this is due to my configuration or something going wrong somewhere else. Running logstash with stdin and rubydebug stdout shows parsing working without issues if I just copy and paste the same log line. Here is the relevant logstash configuration for Nginx error log parsing from our config:

        # Format Nginx Error logs
        if [type] =~ /nginx_.*_error/ {
            grok {
                match => {
                    "message" => [
                        "%{DATESTAMP:timestamp} \[%{DATA:severity}\] (%{NUMBER:pid:int}#%{NUMBER}: \*%{NUMBER}|\*%{NUMBER}) %{GREEDYDATA:message}",
                        "%{DATESTAMP:timestamp} \[%{DATA:severity}\] %{GREEDYDATA:message}",
                        "%{DATESTAMP:timestamp} %{GREEDYDATA:message}"
                            ]
                }
                overwrite => [ "message" ]
            }
    
            grok {
                match =>  { "message" => [ "%{DATA:nginx_error}, %{GREEDYDATA:message}"] }
                overwrite => [ "message" ]
            }
            kv {
                field_split  => ","
                value_split  => ":"
                trimkey      => " "
                trim         => "\""
                include_keys => [ "client", "request", "server", "referrer", "upstream"]
            }
            mutate {
                strip => ["client", "server", "request", "upstream", "referrer"]
            }
            grok {
                match       => { "upstream" => [ "%{URIPROTO}://%{IPORHOST:upstream_ip}(?::%{POSINT:upstream_port})?%{URIPATH:upstream_request}%{URIPARAM:upstream_qs}" ] }
                remove_field => [ "message", "upstream", "port"]
            }
    
    
            if "/ping/ad" in [request] {
                mutate {
                    add_tag => [ "ad_ping_drop" ]
                }
            }
            grok {
                match   => { "request" => [ "GET /ping/?(ad)?\?h=%{DATA:qs_host}&" ] }
            }
    
            if [upstream_ip] {
                cidr {
                  add_tag => ["ec2_ip"]
                  address => ["%{upstream_ip}"]
                  network => [
                    "10.0.0.0/8"
                  ]
                }
    
                mutate {
                    add_field => { "upstream_host" => "%{upstream_ip}" }
                }
    
                if "ec2_ip" in [tags] {
                    dns {
                        reverse => ["upstream_host"]
                        action  => "replace"
                    }
                }
    
                if [upstream_port] {
                    mutate {
                        add_field => { "upstream_addr" => "%{upstream_host}:%{upstream_port}" }
                    }
                } 
                else {
                    mutate {
                        add_field => { "upstream_addr" => "%{upstream_host}" }
                    }
                }
            }
    
            date {
                match => [ "timestamp", "yy/MM/dd HH:mm:ss" ]
                remove_field => [ "timestamp" ]
            }
        }
    

    We're inputting events via Filebeat and the beats input with the JSON codec.

    Versions: Filebeat 1.0 Logstash 2.1.1 logstash-codec-collectd (2.0.2) logstash-codec-dots (2.0.2) logstash-codec-edn (2.0.2) logstash-codec-edn_lines (2.0.2) logstash-codec-es_bulk (2.0.2) logstash-codec-fluent (2.0.2) logstash-codec-graphite (2.0.2) logstash-codec-json (2.0.4) logstash-codec-json_lines (2.0.2) logstash-codec-line (2.0.2) logstash-codec-msgpack (2.0.2) logstash-codec-multiline (2.0.4) logstash-codec-netflow (2.0.2) logstash-codec-oldlogstashjson (2.0.2) logstash-codec-plain (2.0.2) logstash-codec-rubydebug (2.0.4) logstash-filter-anonymize (2.0.2) logstash-filter-checksum (2.0.2) logstash-filter-cidr (2.0.2) logstash-filter-clone (2.0.4) logstash-filter-csv (2.1.0) logstash-filter-date (2.0.2) logstash-filter-dns (2.0.2) logstash-filter-drop (2.0.2) logstash-filter-fingerprint (2.0.2) logstash-filter-geoip (2.0.4) logstash-filter-grok (2.0.2) logstash-filter-json (2.0.2) logstash-filter-kv (2.0.2) logstash-filter-metrics (3.0.0) logstash-filter-multiline (2.0.3) logstash-filter-mutate (2.0.2) logstash-filter-ruby (2.0.2) logstash-filter-sleep (2.0.2) logstash-filter-split (2.0.2) logstash-filter-syslog_pri (2.0.2) logstash-filter-throttle (2.0.2) logstash-filter-urldecode (2.0.2) logstash-filter-useragent (2.0.3) logstash-filter-uuid (2.0.3) logstash-filter-xml (2.0.2) logstash-input-beats (2.0.3) logstash-input-couchdb_changes (2.0.2) logstash-input-elasticsearch (2.0.2) logstash-input-eventlog (3.0.1) logstash-input-exec (2.0.4) logstash-input-file (2.0.3) logstash-input-ganglia (2.0.4) logstash-input-gelf (2.0.2) logstash-input-generator (2.0.2) logstash-input-graphite (2.0.4) logstash-input-heartbeat (2.0.2) logstash-input-http (2.0.2) logstash-input-imap (2.0.2) logstash-input-irc (2.0.3) logstash-input-jdbc (2.1.0) logstash-input-kafka (2.0.2) logstash-input-log4j (2.0.4) logstash-input-lumberjack (2.0.5) logstash-input-pipe (2.0.2) logstash-input-rabbitmq (3.1.1) logstash-input-redis (2.0.2) logstash-input-s3 (2.0.3) logstash-input-snmptrap (2.0.2) logstash-input-sqs (2.0.3) logstash-input-stdin (2.0.2) logstash-input-syslog (2.0.2) logstash-input-tcp (3.0.0) logstash-input-twitter (2.2.0) logstash-input-udp (2.0.3) logstash-input-unix (2.0.4) logstash-input-xmpp (2.0.3) logstash-input-zeromq (2.0.2) logstash-output-cloudwatch (2.0.2) logstash-output-csv (2.0.2) logstash-output-elasticsearch (2.2.0) logstash-output-email (3.0.2) logstash-output-exec (2.0.2) logstash-output-file (2.2.0) logstash-output-ganglia (2.0.2) logstash-output-gelf (2.0.2) logstash-output-graphite (2.0.2) logstash-output-hipchat (3.0.2) logstash-output-http (2.0.5) logstash-output-irc (2.0.2) logstash-output-juggernaut (2.0.2) logstash-output-kafka (2.0.1) logstash-output-lumberjack (2.0.4) logstash-output-nagios (2.0.2) logstash-output-nagios_nsca (2.0.3) logstash-output-null (2.0.2) logstash-output-opentsdb (2.0.2) logstash-output-pagerduty (2.0.2) logstash-output-pipe (2.0.2) logstash-output-rabbitmq (3.0.6) logstash-output-redis (2.0.2) logstash-output-s3 (2.0.3) logstash-output-sns (3.0.2) logstash-output-sqs (2.0.2) logstash-output-statsd (2.0.4) logstash-output-stdout (2.0.3) logstash-output-tcp (2.0.2) logstash-output-udp (2.0.2) logstash-output-xmpp (2.0.2) logstash-output-zeromq (2.0.2) logstash-patterns-core (2.0.2)

    opened by jlintz 53
  • logstash will not open a listening port

    Error message:

    {:timestamp=>"2014-07-31T17:07:57.613000-0700", :message=>"syslog udp listener died", :address=>"0.0.0.0:514", :exception=>#<SocketError: bind: name or service not known>, :backtrace=>["org/jruby/ext/socket/RubyUDPSocket.java:160:in bind'", "/opt/logstash/lib/logstash/inputs/syslog.rb:116:inudp_listener'", "/opt/logstash/lib/logstash/inputs/syslog.rb:76:in `run'"], :level=>:warn}

    config:

    input {
      redis { host => "" data_type => "list" key => "logstash" }
      file { type => "syslog" path => ["/var/log/*.log"] sincedb_path => "/opt/logstash/sincedb-access" }
      syslog { type => "syslog" port => "5514" tags => ["______"] }
      syslog { type => "syslog" port => "514" tags => ["_____"] }
    }

    Have referenced the following: https://github.com/elasticsearch/logstash/pull/1398 and https://github.com/elasticsearch/logstash/blob/74dfd2b05eb2a5369370e5756963bdc0221b8449/pkg/logstash.sysv

    Issue persists.

    Logstash user groups:

    root@myServer:/etc/logstash/conf.d# id logstash
    uid=999(logstash) gid=999(logstash) groups=999(logstash),0(root),4(adm),104(syslog)

    Issue also exists for UDP input listeners.

    Error also seen here: https://groups.google.com/forum/#!topic/logstash-users/1skCn079h5g

    Not sure where else to look. If logstash is unable to listen on 514, I am unable to send anything to logstash, making logstash useless for me.

    opened by heytchap 51
  • Java Filter Execution

    Status

    • [x] Removed duplicate grammar
    • [x] All tests (including RATs) are green
    • [x] All types of conditionals are implemented
    • [x] All possible permutations of input types are implemented
    • [x] All possible permutations of input types have tests
    • [x] If statements are evaluated twice in some cases and need smarter caching
    • [x] Ordinal comparisons are not in line with Ruby in all cases
    • [x] Some equalities are still based on string comparison where inappropriate
    • [x] Output Conditionals are enabled
    • [x] Metrics counters for output, input and filter are not correctly separated (specs still pass though ...)
    • [x] we have a lot fewer checks for cancelled events now; there must be some cases missing handling (seems we have a lot of specs for this and they still pass)
    • [x] Don't call shutdown only flushers periodically
    • [x] Joni for regexes, java.util.regex may not be compatible so we gotta use that instead
    • [x] Add nand and xor operators

    Performance

    • [x] GC overhead is lower in every case compared to the old implementation
    • [x] Conditionals evaluate a lot faster
    • [x] The case of no filters or conditionals is about 5% slower than before (I don't know why yet, this needs to be investigated/fixed before merging, there must be some redundant action in the execution, that only shows in this case and is outweighed by general improvements for all other cases) ... Fixed, this was a measurement error, we're faster in all cases now I think :)

    Possible Improvements

    • Janino (obviously, but also kind of easily doable from this implementation since I kept the translation from PipelineIR very explicit, not sure how much we will gain though since thanks to the way FieldReference lookups now work, there is very little that doesn't get inlined)
    • Less complicated: Smarter chaining of conditionals (currently none of them are getting optimized except for constants, there is a lot of room here by reordering etc.)

    Benchmarks

    Baseline:

    ➜  logstash git:(lir-if-else) ✗ bin/benchmark.sh --local-path=${PWD} -repeat-data=10
    Logstash Benchmark
    ------------------------------------------
    Benchmarking Version: /Users/brownbear/src/logstash
    Running Test Case: baseline (x10)
    ------------------------------------------
    Start Time: Sun 8 13 18:50:23 2017 CEST
    Statistical Summary:
    
    Elapsed Time: 51s
    Num Events: 10000000
    Throughput Min: 0.00
    Throughput Max: 315750.00
    Throughput Mean: 285714.29
    Throughput StdDev: 71738.67
    Throughput Variance: 5146436229.27
    Mean CPU Usage: 28.34%
    ➜  logstash git:(lir-if-else) git checkout master
    Switched to branch 'master'
    Your branch is up-to-date with 'elastic/master'.
    ➜  logstash git:(master) rake compile:all &> /dev/null
    ➜  logstash git:(master) bin/benchmark.sh --local-path=${PWD} -repeat-data=10
    Logstash Benchmark
    ------------------------------------------
    Benchmarking Version: /Users/brownbear/src/logstash
    Running Test Case: baseline (x10)
    ------------------------------------------
    Start Time: Sun 8 13 18:51:59 2017 CEST
    Statistical Summary:
    
    Elapsed Time: 53s
    Num Events: 9798106
    Throughput Min: 20033.00
    Throughput Max: 301875.00
    Throughput Mean: 279945.89
    Throughput StdDev: 55609.24
    Throughput Variance: 3092387983.10
    Mean CPU Usage: 36.91%
    

    => Much lighter on the GC now!

    Also kind of cool is the Apache benchmark:

    Master:

    ➜  logstash git:(master) time cat ~/Downloads/apache_access_logs ~/Downloads/apache_access_logs | bin/logstash -b 128 -w 4 -f ~/tmp/logstash.cfg | pv | wc -c
    WARNING: Default JAVA_OPTS will be overridden by the JAVA_OPTS defined in the environment. Environment JAVA_OPTS are -Xms4g -Xmx4g -Djava.net.preferIPv4Stack=true
    13.2MiB 0:07:21 [30.5KiB/s] [                                                                                                 <=>                                                                                                          ]
     13801522
    cat ~/Downloads/apache_access_logs ~/Downloads/apache_access_logs  0.16s user 2.72s system 0% cpu 7:20.90 total
    bin/logstash -b 128 -w 4 -f ~/tmp/logstash.cfg  1833.99s user 30.28s system 422% cpu 7:21.71 total
    pv  3.18s user 20.04s system 5% cpu 7:21.71 total
    wc -c  0.41s user 3.21s system 0% cpu 7:21.71 total
    

    This:

    ➜  logstash git:(lir-if-else) time cat ~/Downloads/apache_access_logs ~/Downloads/apache_access_logs | bin/logstash -b 128 -w 4 -f ~/tmp/logstash.cfg | pv | wc -c
    WARNING: Default JAVA_OPTS will be overridden by the JAVA_OPTS defined in the environment. Environment JAVA_OPTS are -Xms4g -Xmx4g -Djava.net.preferIPv4Stack=true
    13.2MiB 0:05:14 [42.9KiB/s] [    <=>                                                                                                                                                                                                       ]
     13801710
    cat ~/Downloads/apache_access_logs ~/Downloads/apache_access_logs  0.14s user 2.80s system 0% cpu 5:13.88 total
    bin/logstash -b 128 -w 4 -f ~/tmp/logstash.cfg  1315.97s user 31.09s system 428% cpu 5:14.42 total
    pv  3.07s user 19.74s system 7% cpu 5:14.42 total
    wc -c  0.37s user 2.89s system 1% cpu 5:14.42 total
    

    Much better throughput by batching for each filter (hence making good use of the CPU cache) :)

    Note:

    • There are a million performance improvements we could add here, but pretty much all of them also involve adding more (or more complicated) code => I tried to be as short as possible here for round 1
    • "filtered" and "output" metrics are always the same size in master as well (just in case you wonder why both increment calls now happen back to back)
    • filter(event, &block) on the pipeline should really go away as it way complicates the Java logic, requiring interpreting all leafs as outputs when there are no outputs in the config, and also forcing us to have the debug terminal dataset
    • Obviously passing the flush "state" up/down the Dataset topology is kind of dirty by just using params, but I feld this was the easiest approach to clean up when/if we drop flush + didn't add a performance it to the general case
    • This contains the fix to #8110 since I needed typed-deserialization for RubyString here to get good performance on String comparisons (otherwise I have to convert String to RubyString after deserializing String which is a loss in every scenario ... either way this is a big speedup for serializing RubyString)
    • Also contains https://github.com/elastic/logstash/pull/8205 as a dependency (still waiting for a review there ;))
    enhancement performance improvements v6.1.0 v7.0.0-alpha1 
    opened by original-brownbear 49
  • Persisted Queue Performance Design Issue (Writing Data)

    The current implementation in master uses mmap to write to fixed-size pages that are created and written to sequentially. This introduces a lot of overhead for the VM system, requires hacking around the GC's Cleaner behavior for the mmapped buffers, and hence comes with unpredictable and suboptimal performance.

    https://github.com/elastic/logstash/pull/7316 contains an extremely trivial implementation illustrating the problem.

    It simply writes to the channel directly and avoids mmap for writes. The durability features of the approach are exactly the same as those in the master implementation since MappedBuffer.force() is simply replaced by calling Channel.force() on the underlying FileChannel.

    The branch passes all tests and the performance difference is about a factor of 3 (plus a lot more consistent throughput).

    Before (master):

    # Run progress: 0.00% complete, ETA 00:00:01
    # Fork: 1 of 1
    # Warmup Iteration   1: ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
    78.110 ops/ms
    # Warmup Iteration   2: 72.397 ops/ms
    # Warmup Iteration   3: 54.313 ops/ms
    Iteration   1: 42.278 ops/ms
    Iteration   2: 71.426 ops/ms
    Iteration   3: 66.277 ops/ms
    Iteration   4: 55.886 ops/ms
    Iteration   5: 45.107 ops/ms
    Iteration   6: 57.468 ops/ms
    Iteration   7: 59.576 ops/ms
    Iteration   8: 54.282 ops/ms
    Iteration   9: 45.083 ops/ms
    Iteration  10: 50.365 ops/ms
    
    
    Result "org.logstash.benchmark.QueueBenchmark.pushToPersistedQueue":
      54.775 ±(99.9%) 14.293 ops/ms [Average]
      (min, avg, max) = (42.278, 54.775, 71.426), stdev = 9.454
      CI (99.9%): [40.482, 69.068] (assumes normal distribution)
    
    
    # Run complete. Total time: 00:01:57
    
    Benchmark                             Mode  Cnt   Score    Error   Units
    QueueBenchmark.pushToPersistedQueue  thrpt   10  54.775 ± 14.293  ops/ms
    
    BUILD SUCCESSFUL
    
    Total time: 2 mins 1.626 secs
    

    After:

    # Run progress: 0.00% complete, ETA 00:00:01
    # Fork: 1 of 1
    # Warmup Iteration   1: ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
    156.287 ops/ms
    # Warmup Iteration   2: 179.250 ops/ms
    # Warmup Iteration   3: 177.683 ops/ms
    Iteration   1: 188.793 ops/ms
    Iteration   2: 166.776 ops/ms
    Iteration   3: 187.501 ops/ms
    Iteration   4: 189.186 ops/ms
    Iteration   5: 188.416 ops/ms
    Iteration   6: 188.693 ops/ms
    Iteration   7: 186.178 ops/ms
    Iteration   8: 188.649 ops/ms
    Iteration   9: 187.341 ops/ms
    Iteration  10: 186.805 ops/ms
    
    
    Result "org.logstash.benchmark.QueueBenchmark.pushToPersistedQueue":
      185.834 ±(99.9%) 10.230 ops/ms [Average]
      (min, avg, max) = (166.776, 185.834, 189.186), stdev = 6.767
      CI (99.9%): [175.604, 196.064] (assumes normal distribution)
    
    
    # Run complete. Total time: 00:00:36
    
    Benchmark                             Mode  Cnt    Score    Error   Units
    QueueBenchmark.pushToPersistedQueue  thrpt   10  185.834 ± 10.230  ops/ms
    
    BUILD SUCCESSFUL
    
    Total time: 40.806 secs
    

    There's probably a bit of a speedup (50%) achievable with almost no effort by buffering writes in line with durability settings instead of flushing the buffer straight on every write, but I didn't want to invest more than a few minutes just yet before we had a chance to talk about things :) Also note that the current implementation already maxes out at about twice the above throughput if no physical writes are made to the channel (just comment out the actual write), so we need to optimize the surroundings before we want more than this imo; we're far from I/O bound here even with the linked PR.

    So for me, the approach here seems straightforward:

    • Move to standard channel IO for the page (in a sense done ... imo)
    • Optimize the approach a little by at least buffering optimally
    • Design issue checkpointing
    • Better checkpointing
    • Done?

    What do you guys think?

    design performance improvements persistent queues 
    opened by original-brownbear 47
  • Mass update plugins for 5.0.0 release

    A mass update and publish of all plugins is needed because of the constraints in the gemspec for plugins.

    See for example: https://github.com/logstash-plugins/logstash-output-elasticsearch/blob/master/logstash-output-elasticsearch.gemspec#L27

    s.add_runtime_dependency "logstash-core", ">= 2.0.0", "< 3.0.0"
    

    ain't gonna cut it for 5.0

    blocker v5.0.0 v5.0.0-alpha1 
    opened by suyograo 47
  • logstash-core* version constraints problem

    currently our gemspecs use constraints on logstash-core (and soon logstash-core-event), let's call them the core gems, using a pattern similar to:

    s.add_runtime_dependency "logstash-core", '>= 1.4.0', '< 2.0.0'
    

    the problem with this notation is that a logstash-core gem release version 2.0.0.rc1 is considered prior to 2.0.0, because it is a "pre-release" version, so a gem version 2.0.0.rc1 would satisfy the < 2.0.0 constraint.

    Obviously the intent in specifying < 2.0.0 is to exclude anything 2.0-ish, but I guess that semantically the current behaviour is arguable.
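
    A quick way to demonstrate this with RubyGems' own requirement matching (a sketch, runnable from any shell with Ruby installed):

    ruby -rrubygems -e 'req = Gem::Requirement.new(">= 1.4.0", "< 2.0.0")
    puts req.satisfied_by?(Gem::Version.new("2.0.0.rc1"))  # prints true: the rc slips under < 2.0.0
    puts req.satisfied_by?(Gem::Version.new("2.0.0"))      # prints false'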

    One possible solution is to specify ~> 1.4, but then this notation does not allow any pre-release versions to satisfy the constraint, which is bad since we need to release pre-release beta/rc packages that use pre-release gem versions.

    I am not sure what our options are. A solution is required for current and future releases and should ideally also be applied to the 1.5 branch.

    This relates to #4235, logstash-plugins/logstash-output-file#22, #4188 and #4231.

    bug discuss 
    opened by colinsurprenant 47
  • Fix for chroot not getting supplemental groups

    Linux chroot doesn't get the supplemental groups before dropping privileges. As such, we need to pull them from /etc/group and tweak them so that we can send them to chroot.

    The original report came from @spuder on IRC; here is his GIST report:

    https://gist.github.com/spuder/a1c3c7d10ce129507858

    Spuder tested this and reported it working. It should work on other systems that use sysv init.

    bug packaging v1.5.3 
    opened by coolacid 47
  • Provide supported path for Input and Output plugins to not have codecs

    While the base classes for Input and Output plugins include a to-be-inherited codec declaration, not all plugins use the resulting @codec that is instantiated for the plugin. On these plugins, it is still possible to specify codec => xyz in a pipeline configuration, but how that specification is handled is entirely up to the plugin (in most cases it is simply silently ignored). See :no_codec!: for a non-exhaustive list of affected plugins.

    While it would be tempting for these plugins to mark the codec that they inherit as obsolete to rely on Logstash's plugin DSL rejecting an explicit codec, plugins defined in this way cannot be instantiated; obsolete causes the @codec ivar to not get set (even if the config also provides a default value), and the Logstash internals that convert the LIR into a CompiledPipeline will throw NoMethodErrors trying to set up a nil codec after the input/output has been Plugin#initialize-d and before it has been Plugin#register-ed.

    1. Logstash internals should be tolerant of a nil (unset) codec during and after pipeline initialization
    2. Plugins should be able to define in-code that they do not have a codec in a way that makes them continue to work on older Logstashes where the above issue exists. One possible solution would be a support adapter that makes use of obsolete, but hooks into Plugin#register to ensure that the @codec ivar is set when run on Logstashes where it is required.
      module LogStash::PluginMixins::NoCodecSupport
      
        def self.included(base_class)
          fail(ArgumentError, "`#{base_class}` must inherit LogStash::Plugin") unless base_class < LogStash::Plugin
      
          base_class.instance_exec do
            config :codec, obsolete: 'this plugin does not use a user-configurable codec'
          end
        end
      
        if true # TODO: conditionally on LS not being broken for nil-`@codec`
      
          NOOP_CODEC = Class.new(LogStash::Codecs::Base) do
            config_name 'noop'
            # TODO: should error if used to decode/encode/multi_encode/encode_sync
          end
      
          def initialize(*a)
            super
      
            @codec = NOOP_CODEC.new("id" => "#{id}/NOOP_CODEC")
          end
        end
      end
      
    enhancement status:needs-triage 
    opened by yaauie 0
  • Allow if/else inside pipelines.yaml

    Hello everyone,

    I know I can use env variables in pipelines.yml, but I cannot find anything about enabling pipelines based on the existence of an env variable...

    This would be fantastic in container environments where I could enable one or more pipelines based on the env variables passed.

    What I'm after is the following:

     If ${ENABLE_CISCO} == true
     - pipeline.id: cisco
       path.config: "/usr/share/logstash/pipeline/cisco.conf"
     endif
     If ${ENABLE_JUNIPER} == true
     - pipeline.id: Juniper
       path.config: "/usr/share/logstash/pipeline/Juniper.conf"
     endif
    
    enhancement status:needs-triage 
    opened by anubisg1 0
  • Add setting to disable the GeoIP database auto-update

    Added a config xpack.geoip.db.auto_update to explicitly disable the GeoIP database auto-update.

    Release notes

    What does this PR do?

    This PR adds a config xpack.geoip.db.auto_update to globally disable the database auto-update, applying the following rules:

    • auto_update: true, database: nil - same as current behavior, trying to update the database.
    • auto_update: true, database: path - same as current behavior, using the user-provided database.
    • auto_update: false, database: nil - use the CC database, and delete all EULA databases.
    • auto_update: false, database: path - same as current behavior, using the user-provided database.

    Why is it important/What is the impact to the user?

    If users want to disable the GeoIP database auto-update, they need to set the geoip filter's database => "/PATH/TO/DB" option to a file path. As this method is not very intuitive, providing a specific configuration to globally disable the feature improves the user experience. It also allows users to stick with the CC databases.
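
    With this change, the feature can instead be disabled globally in logstash.yml (a sketch based on the setting introduced by this PR):

    # logstash.yml
    xpack.geoip.db.auto_update: false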

    Checklist

    • [x] My code follows the style guidelines of this project
    • ~[ ] I have commented my code, particularly in hard-to-understand areas~
    • [x] I have made corresponding changes to the documentation
    • [x] I have made corresponding change to the default configuration files (and/or docker env variables)
    • [x] I have added tests that prove my fix is effective or that my feature works

    Author's Checklist

    • [x] Documentation updates PR

    How to test this PR locally

    Pipeline config for testing:

    input { 
      generator {
        count => 1
        message => '{ "ip": "8.8.8.8" }'
        codec => json
      } 
    }
    
    filter {
      geoip {
        source => "ip"
        target => "geoip"
      }
    }
    
    output {
      stdout {
      }
    }
    

    Test 1:

    • Start Logstash with the test pipeline.
    • The plugin should work the same way it does today, downloading the EULA databases and using them.

    Test 2:

    • Change the logstash.yml file and set the config xpack.geoip.db.auto_update to false.
    • Start Logstash with the test pipeline.
    • Logstash should display an info message about deleting EULA database copies, removing them, and using the CC databases instead.
    • Repeat the same test setting the xpack.geoip.db.auto_update config to true. It should download the EULA databases and work as it does today.

    Test 3:

    • Change the logstash.yml file and set the config xpack.geoip.db.auto_update to false.
    • Change the test pipeline by adding the database => "path/to/db" option on the geoip filter.
    • Start Logstash, it should use the user-provided database.

    Related issues

    Closes #14724

    opened by edmocosta 3
  • Guard reserved tags field against incorrect use

    Fixed: #14711

    Release notes

    The reserved top-level tags field has required a certain shape since v7.9: it is supposed to be a string or an array of strings. Assigning a map to the tags field could crash Logstash unexpectedly. Pull request #14822 guards the tags field against incorrect use: assigning a map will result in a _tagsparsefailure in the tags field, and the map will be written to the _tags field.

    A new setting --event_api.tags.illegal is added for backward compatibility. Since 8.7, the default value is rename, which moves illegal values to the _tags field. warn maintains the old behavior of allowing illegal value assignment to the tags field.

    What does this PR do?

    This PR prevents the reserved tags field from being assigned a key/value map. The top-level tags field should only accept a string or a list of strings. When tags gets a map value, LogStash::Event will rewrite the value to _tags and add _tagsparsefailure to tags.

    After the change, tags no longer holds map values, while this used to be allowed with ruby { code => "event.set('[tags][k]', 'v')" }. This means that unconsumed events in the PQ and DLQ with a map value in tags will have a different event structure: the map value will be written to _tags instead of tags.

    To make this change backward compatible, this PR adds a flag --event_api.tags.illegal to allow falling back to the old logic. There are two options: warn, the old flow that allows illegal value assignment to the tags field, and rename, the new flow that assigns illegal values to the _tags field and adds _tagsparsefailure to tags. rename is the default value in 8.7.
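
    A sketch of opting back into the old behaviour via logstash.yml (assuming the --event_api.tags.illegal flag maps onto a settings key in the usual way; this mapping is an assumption, not confirmed by the PR description):

    # logstash.yml
    event_api.tags.illegal: warn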

    Why is it important/What is the impact to the user?

    Prior to this change, when a JSON payload contained tags with a map value, the value was set successfully, but if any plugin then needed to add a failure tag to tags, such as _jsonparsefailure or _timestampparsefailure, the pipeline crashed, as a tag cannot be appended to a map value. This is bad because users lose visibility into what went wrong.

    Checklist

    • [x] My code follows the style guidelines of this project
    • [x] I have commented my code, particularly in hard-to-understand areas
    • [ ] I have made corresponding changes to the documentation
    • [ ] I have made corresponding change to the default configuration files (and/or docker env variables)
    • [ ] I have added tests that prove my fix is effective or that my feature works


    How to test this PR locally

    Follow the reproducer of #14711

    Extra test case

    input {
      generator {
        message => '{"super": "ball"}'
        codec => json
        count => 1
      }
    }
    filter {
      mutate { "add_field" => {"[tags][k]" => "v" } }
      ruby {  code => 'fail "intentional"' }
    }
    output {
     stdout {}
    }
    

    Expected output: tags => [_tagsparsefailure, _rubyexception], _tags => { k => v }


    opened by kaisecheng 2
Releases (latest: v8.5.3)

  • v8.5.3 (Dec 8, 2022)
  • v7.17.8 (Dec 8, 2022)
  • v8.5.2 (Nov 22, 2022)
  • v8.5.1 (Nov 15, 2022)
  • v8.5.0 (Nov 1, 2022)
  • v7.17.7 (Oct 25, 2022)
  • v8.4.3 (Oct 5, 2022)
  • v8.4.2 (Sep 20, 2022)
  • v8.4.1 (Aug 30, 2022)
  • v8.4.0 (Aug 24, 2022)
  • v7.17.6 (Aug 24, 2022)
  • v8.3.3 (Jul 28, 2022)
  • v8.3.2 (Jul 7, 2022)
  • v8.3.1 (Jun 30, 2022)
  • v8.3.0 (Jun 28, 2022)
  • v7.17.5 (Jun 28, 2022)
  • v8.2.3 (Jun 14, 2022)
  • v8.2.2 (May 26, 2022)
  • v8.2.1 (May 24, 2022)
  • v7.17.4 (May 24, 2022)
  • v8.2.0 (May 3, 2022)
  • v8.1.3 (Apr 20, 2022)
  • v7.17.3 (Apr 20, 2022)
  • v8.1.2 (Mar 31, 2022)
  • v7.17.2 (Mar 31, 2022)
  • v8.1.1 (Mar 22, 2022)
  • v8.1.0 (Mar 8, 2022)
  • v8.0.1 (Mar 1, 2022)
  • v7.17.1 (Feb 28, 2022)
  • v8.0.0 (Feb 10, 2022)
