I think in any situation the user must be willing to do the science and confirm their ideas about performance, that is, write little scripts to test them. Someone has already asked whether something like that exists, or whether anyone else has done it.
While the user has some responsibility, the tool must also have some performance intentions of its own, otherwise it would be a bit pointless. I'd imagine there must be tests somewhere.
That aside, the first performance document that comes up in a search isn't helpful even when it comes to covering basic usage. It's quite cryptic and potentially counterproductive. It doesn't really explain things well, and in some areas it makes no sense at all.
There are cases where it jumps into low-level aspects where it might be better to just say what kind of schema it establishes in, for example, Cassandra, allowing users to consult the Cassandra documentation for the specifics.
It helps to narrow down some things:
Data:
- number of unique metric names
- number of unique tag names
- number of unique tag values
- number of unique times
- overlap and other complexities
Access patterns:
Are tags ANDed or ORed? I assume OR, but the documentation is a little tight-lipped about it:
It is possible to filter the data returned by specifying a tag. The data returned will only contain data points associated with the specified tag. Filtering is done using the “tags” property.
It talks a little too much in the singular about a plural; however, we find out they're ORs here:
Tags narrow down the search. Only metrics that include the tag and matches one of the values are returned. Tags is optional.
This should probably say that only the metrics matching at least one of the tag key/value combinations will be returned, but even then I'm left unsure. An array of values is obviously an OR, but what about two key names? It would be a bit dysfunctional for those to be ANDed, since most people would want (a = 1 AND b = 2) OR (a = 2 AND b = 1),
not (a = 1 OR a = 2) AND (b = 2 OR b = 1).
Failing that, just a IN (1, 2) OR b IN (1, 2)
would make more sense. The example with customer as well as host keys implies they're ANDed between multiple key names. Will people flip key and value to get AND to work? Are people not asking the questions I am getting incorrect metrics without knowing? (Often that's worse, as excess tends to appear more valid than deficit, or vice versa, in most cases.)
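My reading of the docs, sketched as code (the matcher function is mine, not anything KairosDB exposes): values within one tag key are ORed, separate keys are ANDed, which lets through combinations people probably didn't want:

```javascript
// Assumed semantics: within one tag key the listed values are ORed,
// and separate tag keys are ANDed together.
function matchesTags(point, filter) {
  return Object.keys(filter).every(key =>
    filter[key].includes(point.tags[key]));
}

const points = [
  { tags: { a: "1", b: "2" } },
  { tags: { a: "2", b: "1" } },
  // Unwanted under the (a=1 AND b=2) OR (a=2 AND b=1) reading:
  { tags: { a: "1", b: "1" } },
];

const filter = { a: ["1", "2"], b: ["1", "2"] };
const matched = points.filter(p => matchesTags(p, filter));
// All three points match, including the cross combination.
```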
For when people want AND, the only solution is to DIY, that is, build a composite tag:
const tags = { hotel: 123, room: 321, person: 666 };
const keys = Object.keys(tags).sort();
const values = keys.map(key => tags[key]);
// Add one extra tag whose key is the sorted key list and whose value is
// the matching value list, so an exact AND match becomes one tag lookup.
tags[JSON.stringify(keys)] = JSON.stringify(values);
const query = JSON.stringify({ metrics: [{ tags }] });
This would allow searches like "for how long during this period did person ? stay in room ? of hotel ?".
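The lookup side of the DIY scheme above would rebuild the same composite tag; a minimal sketch (the helper name is mine), showing that sorting makes key order at write and read time irrelevant:

```javascript
// Hypothetical query-side counterpart to the composite tag: rebuild the
// same sorted-key tag at lookup time so an exact AND match becomes a
// single-tag filter. Not part of any KairosDB API.
function compositeTag(wanted) {
  const keys = Object.keys(wanted).sort();
  const values = keys.map(k => wanted[k]);
  return { key: JSON.stringify(keys), value: JSON.stringify(values) };
}

const stored = compositeTag({ hotel: 123, room: 321, person: 666 });
const lookup = compositeTag({ person: 666, hotel: 123, room: 321 });
// Same key and value either way, despite the different insertion order.
```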
Concerns like this depend on actual usage and access patterns. In this case it's quite common to want:
- How many stays (a day each) were there for hotel ? during the period ?.
- During the period ?, how many stays were there in room ? of hotel ?.
Many people might do something simpler than the above and just have room: [hotel, room].join(delim). It's quite common to have a usage pattern where your lookups consist of a list of possible ANDs (like a AND b AND c) but, out of those, you only ever want a, or a AND b, or a AND b AND c, never just b, or a AND c.
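A minimal sketch of that delimiter approach; the delimiter choice here is mine, and it must never occur inside the values themselves or lookups become ambiguous:

```javascript
// Join the AND-ed parts into one tag value with a delimiter that cannot
// appear in the data (an assumption that must hold for this to be safe).
const delim = "\u0001";
const hotel = "123", room = "321";
const roomTag = [hotel, room].join(delim);

// A query for room 321 of hotel 123 is then a single-tag exact match:
const wanted = ["123", "321"].join(delim);
// roomTag === wanted
```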
Metric versus tags: Fight
Starting out you must define both a metric and a tag for a datapoint. A problem here is that all basic use cases involving AND/OR can be managed with either metrics or tags. When starting out with a single first use case it's very much neither here nor there which one to lean on; it's not until you start piling on use cases that things become apparent.
You might say it's surely obvious when your queries are ten times bigger from using metrics (unless it turns out the metric field secretly takes an array for multiple items, like tag values do), or when it spams rows, but it's not immediately obvious whether that will be the case, and the line between when to use metrics and when to use tags is blurred, especially where performance is concerned.
Yes, I have seen people using metrics in place of tags. It happens, probably quite often. Then the moment you need to add another access pattern it quickly becomes insufficient. Tags can easily be added or left unpopulated with little impact, but changing metrics has a lot of impact. Any design should consider what will happen with tags versus metrics when different access patterns are needed.
My view on the matter is to keep metrics quite shallow and use tags by default. By shallow I mean whichever first set of ANDs you'll always want. In (application = hotels AND type = stays) AND (a = 1 OR a = 2 AND b = 1),
the initial static part in brackets might make a good metric, but the part that's dynamic per use case should be tags.
Another (obvious) rule of thumb: data that's always isolated, as in never included together in the same query, should use a separate metric. If it's separate data, use a separate metric.
As I see it, by default more metrics is bad, but tags can build up until that's bad as well, so there's probably a kind of to and fro, splitting things up a bit with metrics rather than tags, but generally the preference should be tags, not metrics. I think you'll always have cases where it turns out part of a metric should have been a tag, or a tag should have been a metric.
It's technically possible to make an abstraction that can switch between the two approaches for performance testing. It's also technically possible to make a profiling mode which, given the appropriate usage patterns, indicates whether a tag should be a metric or vice versa (usually by identifying that, when reduced to metrics, there's no overlap, or that two metrics were needed for what is otherwise the same query).
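A toy sketch of that profiling idea (the function and data shapes are mine, and this is not an existing feature): a tag key that's always filtered to a single value across recorded queries behaves like a static metric suffix and is a candidate for folding into the metric name:

```javascript
// Scan recorded queries and flag tag keys that only ever see one value.
function metricCandidates(queries) {
  const seen = {}; // tag key -> set of values requested across all queries
  for (const q of queries) {
    for (const [key, values] of Object.entries(q.tags)) {
      seen[key] = seen[key] || new Set();
      values.forEach(v => seen[key].add(v));
    }
  }
  // A key with exactly one observed value never discriminates anything:
  // it could live in the metric name instead.
  return Object.keys(seen).filter(key => seen[key].size === 1);
}

const recorded = [
  { metric: "stays", tags: { application: ["hotels"], room: ["321"] } },
  { metric: "stays", tags: { application: ["hotels"], room: ["7"] } },
];
// "application" never varies; "room" does.
```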
I have never seen a database system where you can insert your use cases, as in, rather than just insert a = 1, b = 2, you insert a = 1 AND b = 2; a = 1; a IN (1),
though that's probably out of the scope of this.
The battle becomes even more epic when you pit group by against the other alternatives.
Schema
What is a row key? What do the indexes actually look like? Are there composite indexes? Are they ordered? Are they hashes? Is there a plan to expose left to right indexes?
Rather than trying to explain it, it might be easier to just give the definitions used at the most concise level, i.e. in Cassandra's own syntax, for example.
Buckets you say?
This immediately stands out because it gives the impression that's all there is, as in, you always end up fetching at least three weeks of data for a given metric and time range. I assume that's not actually the case, but if I were looking at this database at a glance and saw that, I'd quickly walk away if, for example, my use case consisted of a lot of small range lookups across a busy (heavily populated) metric within a retention period of a fortnight.
I would assume that in reality Cassandra provides a sparse array implementation and lets you ask for everything from this column to that column. That raises a question, because sparse columns are just an abstraction: usually they're backed by either a hash map or a tree structure (though in some cases simpler or more complex solutions). If it's a map, that tends to preclude the possibility of a range lookup.
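To illustrate why the backing structure matters, a toy comparison (this says nothing about Cassandra's actual internals): an unordered store can only answer a column range by scanning everything, while an ordered one can find the bounds and slice:

```javascript
const columns = [5, 1, 9, 3, 7]; // column timestamps, arrival order

// Hash-map style: no ordering, so a range query is a full scan.
const viaScan = columns.filter(c => c >= 3 && c <= 7);

// Tree style: keep columns ordered, then locate the bounds and slice
// (a real tree would find the bounds in O(log n) instead of findIndex).
const sorted = [...columns].sort((a, b) => a - b);
const lo = sorted.findIndex(c => c >= 3); // first column in range
const hi = sorted.findIndex(c => c > 7);  // first column past the range
const viaSlice = sorted.slice(lo, hi === -1 ? sorted.length : hi);
// Both yield columns 3, 5, 7; only the ordered one avoids touching every column.
```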
WHAT?
Similar to a query but only returns the tags (no data points returned). This can potentially return more tags than a query because it is optimized for speed and does not query all rows to narrow down the time range. This queries only the Row Key Index and thus the time range is the starting time range. Since the Cassandra row is set to 3 weeks, this can return tags for up to a 3 week period. See Cassandra Schema.
{"start_absolute": 1357023600000, "end_relative": {"value": "5", "unit": "days"}}
Kick the bucket, bad bucket.