Time Series Metrics Engine based on Cassandra

Overview

Hawkular Metrics, a storage engine for metric data


About

Hawkular Metrics is the metric data storage engine of the Hawkular community.

It relies on Apache Cassandra as a backend and is comprised of:

  • a core library

  • a REST/HTTP interface

Important
Cassandra 3.0.12 or later is required.

The core library

A Java library, built with RxJava on top of the Cassandra Java driver.

This is for advanced users only, for embedding the core functionality in another project.

REST/HTTP interface

Most users will work with the web application. It exposes a REST/HTTP interface based on the core library. It is implemented with the JAX-RS 2 asynchronous API and runs on a WildFly 10 server. The data format is JSON.

Goals

Simple, easy to use REST interface

The REST API should be easy to use. Users should be able to send data with the simplest tools: shell scripts and curl.
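As an illustration, a single data point can be pushed with nothing more than curl. This is a minimal sketch: the host, port, tenant name, and metric id below are example values, and the endpoint path and Hawkular-Tenant header follow the Metrics REST API.

```shell
#!/bin/sh
# Sketch: push one gauge data point to a local Hawkular Metrics server.
# Host, port, tenant, and metric id are example values for illustration.
METRICS_URL="http://localhost:8080/hawkular/metrics"
TENANT="my-tenant"

# Timestamps are expressed in milliseconds since the epoch.
NOW=$(($(date +%s) * 1000))
PAYLOAD="[{\"timestamp\": ${NOW}, \"value\": 42.1}]"

curl -s -X POST "${METRICS_URL}/gauges/request.size/raw" \
     -H "Content-Type: application/json" \
     -H "Hawkular-Tenant: ${TENANT}" \
     -d "${PAYLOAD}" \
  || echo "Server not reachable (is Hawkular Metrics running?)"
```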

Getting started with a release build

There are a couple of options for running Hawkular Metrics:

  • WildFly distribution

  • EAR distribution

Important
Earlier versions of Hawkular Metrics could be run by deploying hawkular-metrics-api-jaxrs-X.Y.Z.war. This is no longer supported.

The WildFly distribution is a pre-configured WildFly 10 server that includes Hawkular Alerts in addition to Hawkular Metrics. Check out the Metrics releases page and download the latest version of hawkular-metrics-wildfly-standalone-X.Y.Z.Final.tar.gz.

The EAR distribution includes both Hawkular Metrics and Hawkular Alerts. Check out the Metrics releases page and download the latest version of hawkular-metrics-standalone-dist-X.Y.Z.Final.ear. Copy the EAR file to the standalone/deployments directory. You will have to manually configure WildFly.
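Deploying the EAR comes down to copying it into the deployments directory. A minimal sketch, where WILDFLY_HOME and the version number are placeholders for your own setup:

```shell
# Sketch: deploy the Hawkular Metrics EAR to a local WildFly server.
# WILDFLY_HOME and VERSION are placeholders; substitute your own values.
WILDFLY_HOME="${WILDFLY_HOME:-/opt/wildfly-10}"
VERSION="X.Y.Z"

cp "hawkular-metrics-standalone-dist-${VERSION}.Final.ear" \
   "${WILDFLY_HOME}/standalone/deployments/" \
  || echo "Copy failed; check WILDFLY_HOME and the downloaded file name"
```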

The following cache container declarations are needed in the infinispan subsystem:

<cache-container name="hawkular-alerts" default-cache="triggers" statistics-enabled="true">
      <local-cache name="partition"/>
      <local-cache name="triggers"/>
      <local-cache name="data"/>
      <local-cache name="publish"/>
      <local-cache name="schema"/>
      <local-cache name="dataIds"/>
      <local-cache name="globalActions"/>
</cache-container>
<cache-container name="hawkular-metrics" default-cache="triggers" statistics-enabled="true">
        <local-cache name="locks"/>
</cache-container>

If you are running a cluster of WildFly servers in HA mode, then you will instead want to edit standalone-ha.xml, adding the following cache containers:

<cache-container name="hawkular-alerts" default-cache="triggers" statistics-enabled="true">
        <transport lock-timeout="60000"/>
        <replicated-cache name="partition" mode="SYNC">
               <transaction mode="BATCH"/>
        </replicated-cache>
        <replicated-cache name="triggers" mode="ASYNC">
               <transaction mode="BATCH"/>
        </replicated-cache>
        <replicated-cache name="data" mode="ASYNC">
               <transaction mode="BATCH"/>
        </replicated-cache>
        <replicated-cache name="publish" mode="ASYNC">
               <transaction mode="BATCH"/>
        </replicated-cache>
        <replicated-cache name="schema" mode="SYNC">
               <transaction mode="NON_XA"/>
        </replicated-cache>
        <replicated-cache name="dataIds" mode="ASYNC">
               <transaction mode="BATCH"/>
        </replicated-cache>
        <replicated-cache name="globalActions" mode="ASYNC">
               <transaction mode="BATCH"/>
        </replicated-cache>
</cache-container>
<cache-container name="hawkular-metrics" default-cache="triggers" statistics-enabled="true">
        <transport lock-timeout="60000"/>
        <replicated-cache name="locks" mode="SYNC">
               <transaction mode="NON_XA" locking="PESSIMISTIC"/>
        </replicated-cache>
</cache-container>

By default, Metrics will try to connect to a Cassandra node on localhost. If you want to start a Cassandra server embedded in WildFly for testing, add the hawkular-metrics-embedded-cassandra-ear-X.Y.Z.ear archive to the standalone/deployments directory.
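Once the server is up, a quick way to verify the deployment is to hit the status endpoint of the Metrics REST API (adjust host and port for your setup):

```shell
# Sketch: check that Hawkular Metrics is up. The host and port are
# examples; the fallback message keeps this safe when no server runs.
curl -s http://localhost:8080/hawkular/metrics/status \
  || echo "Metrics not reachable yet (server may still be starting)"
```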

Build Instructions

Important
A running Cassandra cluster, which can be a single node, is required for unit and integration tests.

git clone git@github.com:hawkular/hawkular-metrics.git
cd hawkular-metrics
./mvnw install

Tip
If you only want to build the sources without a running Cassandra cluster, run ./mvnw install -DskipTests.

Setting up Cassandra for development or testing

For development or testing, the easiest way to set up Cassandra is to use the Cassandra Cluster Manager, CCM.

ccm create -v 3.0.12 -n 1 -s hawkular

This command creates and starts a single-node Cassandra cluster. Note that while ccm is recommended, it is not required; you just need to make sure you have a running 3.0.12 cluster.
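To double-check the cluster, ccm can report node status. A small sketch, guarded so it is a no-op on machines where ccm is not installed:

```shell
# Sketch: verify the CCM-managed Cassandra cluster is running.
if command -v ccm >/dev/null 2>&1; then
    ccm status    # should list node1 as UP
else
    echo "ccm is not installed; see the CCM project for install steps"
fi
```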

Client tools

If you want to send or fetch metrics from your own application, there are client libraries available to assist.

Working with monitoring tools

You can send data collected with your usual monitoring tools to Hawkular Metrics:

  • collectd

  • ganglia

  • jmxtrans

  • statsd

In order to do so, you must start our network protocol adapter, ptrans.
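For instance, if ptrans is configured to accept Graphite's plaintext protocol, a data point could be sent with nc. Port 2003 is only the conventional Graphite port and an assumption here; check your ptrans configuration for the actual listeners:

```shell
# Sketch: send one metric in Graphite plaintext format to ptrans.
# Host and port are examples; guarded so a missing listener is not fatal.
echo "my.app.request.count 42 $(date +%s)" \
  | nc -w 1 localhost 2003 \
  || echo "ptrans not reachable on localhost:2003"
```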

Contributing & Community

If you are a user of Hawkular Metrics please ask your question in the Hawkular user forum. To contribute or participate in design discussion, please use the Hawkular developer mailing list.

We love contributions and pull-requests :-)

To file an issue, please use the Hawkular-Metrics JIRA.

To chat, join us on Freenode IRC in channel #hawkular. If you cannot use the IRC protocol, you can also use a web-to-IRC gateway such as Freenode's web chat.

Hawkular-Metrics is @hawkular_org on Twitter.

Comments
  • ETL compression process based on Gorilla algorithms

    Adds ETL-based compression methods in the core for gauge data (for now; support for counters and availabilities will be added later in this PR).

    After this PR is merged, I recommend replacing Deflate with LZ4 again (since we need speed due to the ETL processing)

    This PR for now is missing the job scheduler stuff, that is coming later to this same PR.

    opened by burmanm 44
  • [HWKMETRICS-614] Temporary tables for writes

    This is WIP. At the time of writing this, gauge inserts / fetches work correctly, but for example metrics_idx queries rely on the partition keys and that needs to be fetched from the temp tables also (otherwise our unit tests fail because they test two things and not just insert / delete like they claim to)

    opened by burmanm 31
  • HWKMETRICS-83 Create glue code component to integrate with hawkular-bus

    This pull request is comprised of three commits.

    • Implement an InsertedDataPoint event stream

    The MetricsService interface now exposes an #insertedDataEvents method. The method returns a hot Observable emitting Metric objects when their data has been successfully inserted.

    Clients can subscribe to this Observable (preferably on a dedicated scheduler!) and they will get notified.

    • Fire @ServiceReady event when MetricsService is initialized

    The Metrics JAX-RS module now fires a CDI @ServiceReady event. This event can be observed by a "super"-module in order to know when the MetricsService instance is initialized.

    The event is fired just before the MetricsServiceLifecycle status changes to "STARTED", so that clients can subscribe to the #insertedDataEvents observable before new data points are accepted.

    • Introduce Hawkular Metrics Component module

    There is a new "hawkular-metrics-component" WAR module. It is meant to replace "hawkular-metrics-api-jaxrs" module in Hawkular deployment.

    This new WAR is a "super"-module: it uses Maven WAR plugin overlay mechanism to add extra classes to the standalone metrics web application.

    The extra classes allow to subscribe to the #insertedDataEvents observable and publish messages on the bus.

    In order to make the migration easy, I have shamelessly copied Alerts' AvailDataMessage and MetricDataMessage classes. These messages are posted to the "HawkularAvailData" and "HawkularMetricData" topics, respectively.

    After this PR is merged we only need to change the agent code to remove the "double-push" part. And change Hawkular POM to switch to the new Metrics component, that's it.

    You may notice that the integration test class borrows a lot from the RESTTest class in rest-tests module. I think we need to introduce a new 'test-utils' module, in a DRY approach. But I wouldn't want to block this PR and would rather do this as a separate PR.

    opened by tsegismont 27
  • Filter published metric on bus

    This is an experimental PoC to study the on-demand publishing pattern inside hawkular-services/metrics. I have opened the PR with the intention to record the comments, but this is just a WIP draft to continue the discussion of the HAWKULAR-1102 issue; it could lead to something formal to contribute or be definitely discarded, but making a technical proposal may help in any case. First, @FilipB, would it be possible for you to build a hawkular-services distribution with these changes and run the performance tests? Thanks.

    DO NOT MERGE 
    opened by lucasponce 23
  • [HWKMETRICS-424] Fetch stats from multiple metrics in a single request

    This PR adds an endpoint for fetching stats (i.e., bucket data points) from multiple metrics in a single request. Right now it only supports gauges and counters. I will update the endpoint to also support gauge and counter rate stats. I wanted to go ahead and submit the PR for review though, because adding support for rates is mostly boilerplate along with additional integration tests.

    @lucasponce would it make sense to include availability stats as well?

    opened by jsanda 22
  • [HWKMETRICS-126] Implement timestamp validation for all metrics

    A missing timestamp will result in rejecting the entire request before processing it.

    Note: this is implemented in handler code rather than using JAX-RS framework features because of divergence between the two JAX-RS standards. If/when one of the standards is dropped this code could be revisited, although this implementation is simpler and easier to extend.

    opened by stefannegrea 22
  • HWKMETRICS-330 Update filter to be async

    A bit of context first. Hawkular Metrics is a reactive application and calls to a blocking API should be avoided as much as possible.

    The OpenShift integration library had a servlet filter mechanism to authenticate users. The filter code used the JDK's HTTP client to call the Kubernetes master server. While this was alright as a first implementation (in order to provide the feature as quickly as possible), it is a serious limitation when it comes to reaching our performance goals.

    There were two problems to tackle:

    1. choose an async HTTP client API
    2. make the filter code asynchronous

    For the HTTP client, I considered a few options: Netty, Vert.x, Undertow. I picked Undertow because it does not add a new dependency and we can reuse the I/O threads (instead of creating yet another thread pool). The downside is that the Undertow client API is low-level (no pool implementation) and not well documented.

    For the filter code, my first try was to use the servlet async API from the filter. But RESTEasy throws an exception because it wants to start the async exchange itself. I wrote to the resteasy-user list but got no reply. And even if we had a fix, we would have to wait for new RESTEasy and WildFly releases. So I wrote an Undertow extension which is executed in I/O threads before the servlet handler is involved.

    The implementation consists of pooling Undertow HTTP client connections and filtering the Metrics client requests (only dispatching to the servlet handler if the user is authenticated).

    While working on the implementation I had to find a way to share the MetricsRegistry between the webapp code and the authenticator code. So there's a MetricsRegistry provider class in core-util now.

    Also, I enhanced the Gatling scenario file to support multiple authentication mechanisms (none, Hawkular integration, Openshift htpasswd file, Openshift token). I took the opportunity to document the scenario options in the project README.

    Note that performance enhancements will be more visible in environments where the Kubernetes response time is minimal.

    opened by tsegismont 21
  • [HWKMETRICS-692] Remove unnecessary calls to findMetric

    This PR requires PR #818 to be merged first. This removes the calls to metricsService.findMetric() in cases where it's not needed. This reduces the number of Cassandra calls in read requests that use tags (other than fetching the full metric definition).

    opened by burmanm 19
  • HWKMETRICS-291 Provide different strategies for C* statement grouping

    This PR introduces a contract for batching strategies.

    So far, two strategies are implemented:

    • use batch statements
    • use single statement

    We can add variations of the first strategy if needed.

    The user can choose the batching strategy with configuration. If batch statement strategy is chosen, batch statement size can be configured.

    opened by tsegismont 19
  • [HWKMETRICS-465] New POST endpoints gauges/stats/query and counters/stats/query

    It's a kind of duplication of the GET endpoints "gauges/stats" and "counters/stats". The reason is the same as the one explained here: https://issues.jboss.org/browse/HWKMETRICS-410 and which ended up in duplicating GET "gauges/raw" endpoint to POST "gauges/raw/query" in a similar way.

    Also added integration tests

    opened by jotak 17
  • [HWKMETRICS-180] findMetricsWithTags supports now MatchType (ANY / ALL)

    Implements logical AND for filtering with tags. Also adds a MatchType enum with the following meaning:

    • ANY = logical OR
    • ALL = logical AND
    opened by burmanm 17