Firehose is an extensible, no-code, and cloud-native service to load real-time streaming data from Kafka to data stores, data lakes, and analytical storage systems.

Overview

Firehose

Firehose is a cloud-native service for delivering real-time streaming data to destinations such as service endpoints (HTTP or GRPC) and managed databases (Postgres, InfluxDB, Redis, Elasticsearch, Prometheus and MongoDB). With Firehose, you don't need to write applications or manage resources. It can be scaled up to match the throughput of your data. If your data is present in Kafka, Firehose delivers it to the destination (sink) you specify.

Key Features

Discover why users choose Firehose as their main Kafka consumer.

  • Sinks: Firehose supports sinking stream data to:

    • log console
    • MongoDB
    • Prometheus
    • HTTP
    • GRPC
    • PostgresDB (JDBC)
    • InfluxDB
    • Elasticsearch
    • Redis
    • Bigquery
    • Blob Storage/Object Storage :
      • Google Cloud Storage
  • Scale: Firehose scales in an instant, both vertically and horizontally, for high-performance streaming with zero data drops.

  • Extensibility: Add your own sink to Firehose with a clearly defined interface, or choose from the ones already provided.

  • Runtime: Firehose can run inside VMs or containers in a fully managed runtime environment like Kubernetes.

  • Metrics: Always know what’s going on with your deployment with built-in monitoring of throughput, response times, errors and more.

To know more, follow the detailed documentation.

Usage

Explore the following resources to get started with Firehose:

  • Guides provides examples of creating a Firehose with different sinks.
  • Concepts describes all the important Firehose concepts.
  • Reference contains details about configurations, metrics, and other aspects of Firehose.
  • Contribute contains resources for anyone who wants to contribute to Firehose.

Run with Docker

Download the Firehose Docker image from Docker Hub. You need Docker installed on your system.

# Download the docker image from Docker Hub
$ docker pull odpf/firehose

# Run the following docker command for a simple log sink
$ docker run \
  -e SOURCE_KAFKA_BROKERS=127.0.0.1:6667 \
  -e SOURCE_KAFKA_CONSUMER_GROUP_ID=kafka-consumer-group-id \
  -e SOURCE_KAFKA_TOPIC=sample-topic \
  -e SINK_TYPE=log \
  -e SOURCE_KAFKA_CONSUMER_CONFIG_AUTO_OFFSET_RESET=latest \
  -e INPUT_SCHEMA_PROTO_CLASS=com.github.firehose.sampleLogProto.SampleLogMessage \
  -e SCHEMA_REGISTRY_STENCIL_ENABLE=true \
  -e SCHEMA_REGISTRY_STENCIL_URLS=http://localhost:9000/artifactory/proto-descriptors/latest \
  odpf/firehose:latest

Note: Make sure your protos (.jar file) are located in work-dir; this is required for the filter functionality to work.
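
One way to do that with Docker is a bind mount. A minimal sketch, assuming the container resolves work-dir at /work-dir and that my-protos.jar is your protos jar (both paths are placeholders, not verified against the image):

# Hypothetical: mount the protos jar into the container's work directory,
# alongside the -e environment flags from the example above.
$ docker run -v "$(pwd)/my-protos.jar:/work-dir/my-protos.jar" \
  -e SINK_TYPE=log \
  odpf/firehose:latest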

Run with Kubernetes

  • Create a Firehose deployment using the Helm chart available here (a hypothetical invocation is sketched below).
  • The deployment also includes a Telegraf container which pushes stats metrics.
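
A hypothetical invocation, assuming the chart is published in the odpf Helm repository (the repository URL and release name below are assumptions; consult the chart's README for the actual values):

# Assumed chart location; adjust to the repository the chart actually lives in.
$ helm repo add odpf https://odpf.github.io/charts
$ helm repo update
$ helm install firehose odpf/firehose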

Running locally

# Clone the repo
$ git clone https://github.com/odpf/firehose.git

# Build the jar
$ ./gradlew clean build

# Configure the environment variables
$ cat env/local.properties

# Run Firehose
$ ./gradlew runConsumer

Note: Sample configurations for other sinks, along with some advanced configurations, can be found here.
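
For orientation, a minimal env/local.properties for a log sink might look like the sketch below, reusing the variables from the Docker example above (broker, topic, and proto class are placeholders for your own setup):

# env/local.properties — minimal log-sink sketch; values are examples only
SOURCE_KAFKA_BROKERS=localhost:9092
SOURCE_KAFKA_TOPIC=sample-topic
SOURCE_KAFKA_CONSUMER_GROUP_ID=sample-consumer-group-id
SOURCE_KAFKA_CONSUMER_CONFIG_AUTO_OFFSET_RESET=latest
SINK_TYPE=log
INPUT_SCHEMA_PROTO_CLASS=com.github.firehose.sampleLogProto.SampleLogMessage
SCHEMA_REGISTRY_STENCIL_ENABLE=false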

Running tests

# Run unit tests
$ ./gradlew test

# Run code quality checks
$ ./gradlew checkstyleMain checkstyleTest

# Clean the build
$ ./gradlew clean

Contribute

Development of Firehose happens in the open on GitHub, and we are grateful to the community for contributing bugfixes and improvements. Read below to learn how you can take part in improving Firehose.

Read our contributing guide to learn about our development process, how to propose bugfixes and improvements, and how to build and test your changes to Firehose.

To help you get your feet wet and get you familiar with our contribution process, we have a list of good first issues that contain bugs which have a relatively limited scope. This is a great place to get started.

Credits

This project exists thanks to all the contributors.

License

Firehose is Apache 2.0 licensed.

Comments
  • chore: Update gradle version to 7.2

    Gradle version update

    References to other issues or PRs

    Fixes: #20

    Brief description of what is fixed or changed

    Updated the Gradle version from 5.0 to 6.8.3 to support the latest OpenJDK version.

    opened by vaish3496 15
  • feat: blob s3 support

    Adding S3 as the Blob storage type for sink and DLQ.

    • Implementation of BlobStorage
    • Added S3 type in BlobStorageType
    • Added S3 in blob sink and DLQ factory methods
    • Added documentation
    opened by akhildv 8
  • Add instrumentation and logging to GCS sink.

    Acceptance criteria:

    1. Analysis of metrics for GCS sink.
    2. Implementation.
    3. Tracing.

    Metrics:

    1. record_write_count, tags: filename (partition + uuid)
    2. file_open_total
    3. file_closed_total, tags: success (true/false)
    4. file_closing_time_milliseconds
    5. file_size_bytes_total
    6. file_upload_total, tags: success (true/false)
    7. file_upload_time_milliseconds
    8. file_upload_bytes
    Discussion:
    1. Distribution of the file size.
    2. Upload time.
    3. Success/failures of uploads.
    4. How many open files there are.
    5. Time taken to close Parquet files.
    6. Messages read / messages per Parquet file.
    7. Number of error messages or DLQ'd messages.
    8. Think about completeness/freshness/deduplication.
    
    opened by lavkesh 4
  • feat: log http response body in debug mode

    • Adding functionality to log the HTTP response body when log_level=debug
    • Classes affected: AbstractHttpSink.java, SerializableHttpResponse.java, SerializableHttpResponseTest.java, HttpSinkTest.java, PromSinkTest.java
    opened by MayurGubrele 3
  • Need Clarification on Protobuf, Kafka and Prometheus interaction

    Hello,

    Apologies in advance; I was unable to access the Slack linked in the bug report menu. I am trying to connect Kafka to a Prometheus instance, and I am confused about how index mapping works with Protobuf.

    From the guide:

    SINK_PROM_METRIC_NAME_PROTO_INDEX_MAPPING

    The mapping of fields and the corresponding proto index which will be set as the metric name on Cortex. This is a JSON field.

    Example value: {"2":"tip_amount","1":"feedback_ratings"}. The proto field value with index 2 will be stored as a metric named tip_amount in Cortex, and so on. Type: required
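
    Reading that example concretely (an illustrative sketch; the proto below is hypothetical):

    # Given a proto along these lines (the field numbers are the "proto index"):
    #   message FeedbackLog {
    #     float feedback_ratings = 1;
    #     float tip_amount       = 2;
    #   }
    # the mapping publishes field 2's value as the Cortex metric "tip_amount"
    # and field 1's value as "feedback_ratings".
    SINK_PROM_METRIC_NAME_PROTO_INDEX_MAPPING={"2":"tip_amount","1":"feedback_ratings"}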

    Would "tip_amount" be a record header? Can firehose handle Kafka messages with variable record list lengths?

    Thank you, Liam

    question 
    opened by liam-hogan 2
  • feat: add S3 sink support

    Hey, I just made a Pull Request!

    Please describe what you added, and add a screenshot if possible. That makes it easier to understand the change so we can :shipit: faster.

    :heavy_check_mark: Checklist

    • [ ] A changelog describing the changes and affected packages.
    • [ ] Added or updated documentation
    • [ ] Tests for new functionality and regression tests for bug fixes
    • [ ] Screenshots attached (for UI changes)
    opened by akhildv 2
  • NullPointerException on using repeated field for JSON body template in HTTP sink

    🐛 Bug Report

    For the HTTP sink's templatised JSON body, if a repeated field is used for JSON body creation, Firehose silently fails with a NullPointerException while reporting metrics:

    java.lang.NullPointerException: null
    	at java.time.Duration.between(Duration.java:473)
    	at io.odpf.firehose.metrics.StatsDReporter.captureDurationSince(StatsDReporter.java:53)
    	at io.odpf.firehose.metrics.Instrumentation.captureSinkExecutionTelemetry(Instrumentation.java:165)
    	at io.odpf.firehose.sink.AbstractSink.pushMessage(AbstractSink.java:56)
    	at io.odpf.firehose.sinkdecorator.SinkDecorator.pushMessage(SinkDecorator.java:28)
    

    Expected Behavior

    Firehose should throw proper errors for non-supported field types

    Steps to Reproduce

    Steps to reproduce the behavior.

    1. Use an input proto having a repeated field
    2. Use that repeated field in the JSON body template
    3. Run Firehose
    bug 
    opened by prakharmathur82 2
  • Add instrumentation and logging to BQ sink.

    Acceptance criteria:

    • Analysis of metrics for BQ sink.
    • Implementation.

    Discussion:

    Insert time. Counter for success/failures of inserted messages. Number of error messages (deserialisation and response errors from BigQuery) or DLQ'd messages. Log the offset info when an error happens.
    Table/dataset creation logging and metrics. Stencil proto update logging and metrics (log exceptions, etc.). Think about completeness/freshness/deduplication.

    opened by lavkesh 2
  • Firehose to gRPC sink job failure with error `Uncaught exception in the SynchronizationContext. Panic! java.lang.IllegalStateException: Could not find policy 'pick_first'. `

    Firehose to gRPC sink job failure with error Uncaught exception in the SynchronizationContext. Panic! java.lang.IllegalStateException: Could not find policy 'pick_first'.

    Nov 14, 2022 5:57:37 PM io.grpc.internal.ManagedChannelImpl$2 uncaughtException
    SEVERE: [Channel<1>: (127.0.0.1:6565)] Uncaught exception in the SynchronizationContext. Panic!
    java.lang.IllegalStateException: Could not find policy 'pick_first'. Make sure its implementation is either registered to LoadBalancerRegistry 
            or included in META-INF/services/io.grpc.LoadBalancerProvider from your jar files.
            at io.grpc.internal.AutoConfiguredLoadBalancerFactory$AutoConfiguredLoadBalancer.<init>(AutoConfiguredLoadBalancerFactory.java:92)
            at io.grpc.internal.AutoConfiguredLoadBalancerFactory.newLoadBalancer(AutoConfiguredLoadBalancerFactory.java:63)
            at io.grpc.internal.ManagedChannelImpl.exitIdleMode(ManagedChannelImpl.java:406)
            at io.grpc.internal.ManagedChannelImpl$RealChannel$2.run(ManagedChannelImpl.java:972)
            at io.grpc.SynchronizationContext.drain(SynchronizationContext.java:95)
            at io.grpc.SynchronizationContext.execute(SynchronizationContext.java:127)
            at io.grpc.internal.ManagedChannelImpl$RealChannel.newCall(ManagedChannelImpl.java:969)
            at io.grpc.internal.ManagedChannelImpl.newCall(ManagedChannelImpl.java:911)
            at io.grpc.internal.ForwardingManagedChannel.newCall(ForwardingManagedChannel.java:63)
            at io.grpc.stub.MetadataUtils$HeaderAttachingClientInterceptor.interceptCall(MetadataUtils.java:74)
            at io.grpc.ClientInterceptors$InterceptorChannel.newCall(ClientInterceptors.java:156)
            at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:142)
            at io.odpf.firehose.sink.grpc.client.GrpcClient.execute(GrpcClient.java:59)
            at io.odpf.firehose.sink.grpc.GrpcSink.execute(GrpcSink.java:38)
            at io.odpf.firehose.sink.AbstractSink.pushMessage(AbstractSink.java:46)
            at io.odpf.firehose.sinkdecorator.SinkDecorator.pushMessage(SinkDecorator.java:28)
            at io.odpf.firehose.sinkdecorator.SinkWithFailHandler.pushMessage(SinkWithFailHandler.java:34)
            at io.odpf.firehose.sinkdecorator.SinkDecorator.pushMessage(SinkDecorator.java:28)
            at io.odpf.firehose.sinkdecorator.SinkWithRetry.pushMessage(SinkWithRetry.java:54)
            at io.odpf.firehose.sinkdecorator.SinkDecorator.pushMessage(SinkDecorator.java:28)
            at io.odpf.firehose.sinkdecorator.SinkFinal.pushMessage(SinkFinal.java:28)
            at io.odpf.firehose.consumer.FirehoseSyncConsumer.process(FirehoseSyncConsumer.java:43)
            at io.odpf.firehose.launch.Main.lambda$multiThreadedConsumers$0(Main.java:65)
            at io.odpf.firehose.launch.Task.lambda$run$0(Task.java:49)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:750)
    

    Expected Behavior

    The Firehose job should be able to interact with the gRPC API and receive the response.

    Steps to Reproduce

    1. Proto used to reproduce the scenario:
    syntax = "proto3";
    package io.odpf.dagger.consumer;
    option java_multiple_files = true;
    option java_package = "io.odpf.dagger.consumer";
    option java_outer_classname = "SampleGrpcServerProto";
    
    service TestServer {
      rpc TestRpcMethod (TestGrpcRequest) returns (TestGrpcResponse) {}
    }
    message TestGrpcRequest {
      string field1 = 1;
      string field2 = 2;
    }
    message TestGrpcResponse {
      bool success = 1;
      repeated Error error = 2;
      string field3 = 3;
      string field4 = 4;
    }
    message Error {
      string code = 1;
      string entity = 2;
    }
    
    2. Write a simple gRPC API which expects two fields and returns the response as is.

    3. Run a Firehose job locally with the properties below, consuming data from local Kafka and using gRPC as the sink.

    java -jar build/libs/firehose-0.4.2.jar

    KAFKA_RECORD_PARSER_MODE=message
    SINK_TYPE=grpc
    INPUT_SCHEMA_PROTO_CLASS=io.odpf.dagger.consumer.TestGrpcRequest
    SCHEMA_REGISTRY_STENCIL_ENABLE=false
    SOURCE_KAFKA_BROKERS=127.0.0.1:9092
    SOURCE_KAFKA_TOPIC=test-grpc-request
    SOURCE_KAFKA_CONSUMER_GROUP_ID=sample-grpc-group-id2
    SINK_GRPC_SERVICE_HOST=127.0.0.1
    SINK_GRPC_SERVICE_PORT=6565
    SINK_GRPC_METHOD_URL=io.odpf.dagger.consumer.TestServer/TestRpcMethod
    SINK_GRPC_RESPONSE_SCHEMA_PROTO_CLASS=io.odpf.dagger.consumer.TestGrpcResponse
    

    The job fails with the above-mentioned error.

    Analysis: In the current implementation, the gRPC client chooses the default LoadBalancerProvider ('pick_first') and the default NameResolverProvider (DNS). The implementation classes PickFirstLoadBalancerProvider and DnsNameResolverProvider, respectively, are missing.

    We were able to solve the issue by including the implementation classes through the service provider mechanism: creating a META-INF/services folder, adding a file named io.grpc.LoadBalancerProvider with the value io.grpc.internal.PickFirstLoadBalancerProvider, and adding another file named io.grpc.NameResolverProvider with the value io.grpc.internal.DnsNameResolverProvider under it.
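
    A sketch of that workaround in shell form (the resource path assumes a standard Gradle/Maven project layout):

    # Register gRPC's default providers via ServiceLoader files, as described above.
    $ mkdir -p src/main/resources/META-INF/services
    $ echo "io.grpc.internal.PickFirstLoadBalancerProvider" \
        > src/main/resources/META-INF/services/io.grpc.LoadBalancerProvider
    $ echo "io.grpc.internal.DnsNameResolverProvider" \
        > src/main/resources/META-INF/services/io.grpc.NameResolverProvider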

    Also, if we provide only the io.grpc.LoadBalancerProvider service provider file and miss the other, we get the error below.

    Failed to resolve name. status=Status{code=UNAVAILABLE, description=Failed to initialize xDS, 
    cause=io.grpc.xds.XdsInitializationException: Cannot find bootstrap configuration
    Environment variables searched:
    - GRPC_XDS_BOOTSTRAP
    - GRPC_XDS_BOOTSTRAP_CONFIG
    
    Java System Properties searched:
    - io.grpc.xds.bootstrap
    - io.grpc.xds.bootstrapConfig
            at io.grpc.xds.BootstrapperImpl.bootstrap(BootstrapperImpl.java:101)
            at io.grpc.xds.SharedXdsClientPoolProvider.getOrCreate(SharedXdsClientPoolProvider.java:90)
            at io.grpc.xds.XdsNameResolver.start(XdsNameResolver.java:155)
            at io.grpc.internal.ManagedChannelImpl.exitIdleMode(ManagedChannelImpl.java:412)
            at io.grpc.internal.ManagedChannelImpl$RealChannel$2.run(ManagedChannelImpl.java:972)
            at io.grpc.SynchronizationContext.drain(SynchronizationContext.java:95)
            at io.grpc.SynchronizationContext.execute(SynchronizationContext.java:127)
            at io.grpc.internal.ManagedChannelImpl$RealChannel.newCall(ManagedChannelImpl.java:969)
            at io.grpc.internal.ManagedChannelImpl.newCall(ManagedChannelImpl.java:911)
            at io.grpc.internal.ForwardingManagedChannel.newCall(ForwardingManagedChannel.java:63)
            at io.grpc.stub.MetadataUtils$HeaderAttachingClientInterceptor.interceptCall(MetadataUtils.java:74)
            at io.grpc.ClientInterceptors$InterceptorChannel.newCall(ClientInterceptors.java:156)
            at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:142)
            at io.odpf.firehose.sink.grpc.client.GrpcClient.execute(GrpcClient.java:59)
            at io.odpf.firehose.sink.grpc.GrpcSink.execute(GrpcSink.java:38)
            at io.odpf.firehose.sink.AbstractSink.pushMessage(AbstractSink.java:46)
            at io.odpf.firehose.sinkdecorator.SinkDecorator.pushMessage(SinkDecorator.java:28)
            at io.odpf.firehose.sinkdecorator.SinkWithFailHandler.pushMessage(SinkWithFailHandler.java:34)
            at io.odpf.firehose.sinkdecorator.SinkDecorator.pushMessage(SinkDecorator.java:28)
            at io.odpf.firehose.sinkdecorator.SinkWithRetry.pushMessage(SinkWithRetry.java:54)
            at io.odpf.firehose.sinkdecorator.SinkDecorator.pushMessage(SinkDecorator.java:28)
            at io.odpf.firehose.sinkdecorator.SinkFinal.pushMessage(SinkFinal.java:28)
            at io.odpf.firehose.consumer.FirehoseSyncConsumer.process(FirehoseSyncConsumer.java:43)
            at io.odpf.firehose.launch.Main.lambda$multiThreadedConsumers$0(Main.java:65)
            at io.odpf.firehose.launch.Task.lambda$run$0(Task.java:49)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:750)
    }
    
    opened by Shreyansh228 1
  • fix: prom sink changed to PoolingHttpClientConnectionManager for idle connection eviction

    We have recently started using the Prom sink, and we are seeing NoHttpResponseException in the logs:

    2022-09-28T12:20:59,632Z [pool-2-thread-1] WARN i.o.f.sink.prometheus.PromSink - caught class org.apache.http.NoHttpResponseException p-gopaydata-id-radar-router.gtfdata.io:80 failed to respond
    2022-09-28T12:20:59,632Z [pool-2-thread-1] WARN i.o.f.sink.prometheus.PromSink - xxxxxxxxxxxxxxx.gtfdata.io:80 failed to respond
    org.apache.http.NoHttpResponseException: xxxxxxxxxxxxxxxxxx.gtfdata.io:80 failed to respond
            at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:141)
            at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
            at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
            at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
            at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
            at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)

    Fix: PoolingHttpClientConnectionManager evicts partially closed connections by default.

    opened by mayankrai09 1
  • feat: redis sink using depot

    We are extracting the Redis sink code from Firehose and using the Redis sink connector from the ODPF Depot library.

    Here are the changed configs (see the sketch after the list):

    • We are replacing the INPUT_SCHEMA_PROTO_TO_COLUMN_MAPPING config with SINK_REDIS_HASHSET_FIELD_TO_COLUMN_MAPPING. The proto field index in the mapping must be replaced by the proto field name.
    • We are replacing the SINK_REDIS_LIST_DATA_PROTO_INDEX config with SINK_REDIS_LIST_DATA_FIELD_NAME. The proto field index must be replaced by the proto field name.
    • SINK_REDIS_KEY_TEMPLATE will now accept proto field names as the argument instead of the proto index.
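
    A before/after sketch of the migration (the index value and field name are hypothetical):

    # Before: the Redis list sink addressed the proto field by its index
    SINK_REDIS_LIST_DATA_PROTO_INDEX=4
    # After: the same field is addressed by its proto field name
    SINK_REDIS_LIST_DATA_FIELD_NAME=order_number
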
    opened by sumitaich1998 1
  • Add Troubleshooting section in docs

    Add a Troubleshooting section in Firehose documentation containing common runtime problems and their solutions.

    Sink-specific troubleshooting as well as generic issues (related to the Stencil client, Kafka consumer, etc.) need to be covered in this section.

    documentation 
    opened by sumitaich1998 0
  • Deprecate jaeger tracing in Firehose

    WHAT?

    Remove the jaeger-client dependency from Firehose, along with its usages anywhere in the code.

    WHY?

    As mentioned in their GitHub repositories, Jaeger clients are being deprecated and users are recommended to move to the OpenTelemetry APIs and SDKs.

    • Announcement on GitHub
    • Announcement in their documentation
    • Issue on GitHub for the deprecation

    Firehose, as of release 0.2, has a dependency on jaeger-client. However, tracing is not actively used in Firehose in production, and hence this dependency can be removed safely.

    Is there a deadline?

    As per the notice on jaeger-tracing:

    We plan to continue accepting pull requests and making new releases of Jaeger clients through the end of 2021. 
    In January 2022 we will enter a code freeze period for 6 months, during which we will no longer accept pull requests with 
    new features, with the exception of security-related fixes. After that we will archive the client library repositories and will 
    no longer accept new changes.
    
    opened by Meghajit 0
  • Deprecate config SINK_HTTP_PARAMETER_SCHEMA_PROTO_CLASS

    WHAT?

    Deprecate config SINK_HTTP_PARAMETER_SCHEMA_PROTO_CLASS

    WHY?

    For a Firehose HTTP sink configured with a header or query parameter source (that is, SINK_HTTP_PARAMETER_SOURCE != disabled), the proto class used for parsing the incoming Kafka message during request creation is configured using SINK_HTTP_PARAMETER_SCHEMA_PROTO_CLASS.

    This is confusing, as there is already a config, INPUT_SCHEMA_PROTO_CLASS, which specifies the proto class to be used for parsing the incoming Kafka message.

    Ideally, we would like to keep a single variable which denotes this.
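
    A sketch of the redundancy (the proto class name is a placeholder):

    # With a header/query parameter source enabled (any value other than
    # "disabled"; "key" is an assumed example), both configs below must point
    # at the same proto class; the proposal is to rely on
    # INPUT_SCHEMA_PROTO_CLASS alone.
    SINK_HTTP_PARAMETER_SOURCE=key
    INPUT_SCHEMA_PROTO_CLASS=com.example.SampleMessage
    SINK_HTTP_PARAMETER_SCHEMA_PROTO_CLASS=com.example.SampleMessage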

    opened by Meghajit 0
Releases (v0.7.0)
  • v0.7.0(Dec 6, 2022)

    What's Changed

    • feat: redis sink using depot library by @sumitaich1998 in https://github.com/odpf/firehose/pull/206

    Full Changelog: https://github.com/odpf/firehose/compare/v0.6.1...v0.7.0

  • v0.6.1(Nov 29, 2022)

    What's Changed

    • chore : version bump by @sumitaich1998 in https://github.com/odpf/firehose/pull/207

    Full Changelog: https://github.com/odpf/firehose/compare/v0.6.0...v0.6.1

  • v0.6.0(Nov 29, 2022)

    What's Changed

    • feat: bigtable sink using depot by @sumitaich1998 in https://github.com/odpf/firehose/pull/198

    Full Changelog: https://github.com/odpf/firehose/compare/v0.5.1...v0.6.0

  • v0.5.1(Nov 23, 2022)

    What's Changed

    • fix: Adding service provider configuration files to fix the grpc sink… by @Shreyansh228 in https://github.com/odpf/firehose/pull/203
    • Revert: "feat: redis sink using depot (#193)" by @sumitaich1998 in https://github.com/odpf/firehose/pull/204

    New Contributors

    • @Shreyansh228 made their first contribution in https://github.com/odpf/firehose/pull/203

    Full Changelog: https://github.com/odpf/firehose/compare/v0.5.0...v0.5.1

  • v0.4.3-patch(Nov 22, 2022)

  • v0.5.0(Nov 9, 2022)

    What's Changed

    • fix: prom sink changed to PoolingHttpClientConnectionManager for idle connection eviction by @mayankrai09 in https://github.com/odpf/firehose/pull/195
    • feat: redis sink using depot by @sumitaich1998 in https://github.com/odpf/firehose/pull/193
    • chore: version bump for redis sink using depot by @sumitaich1998 in https://github.com/odpf/firehose/pull/201

    New Contributors

    • @mayankrai09 made their first contribution in https://github.com/odpf/firehose/pull/195

    Full Changelog: https://github.com/odpf/firehose/compare/v0.4.2...v0.5.0

  • v0.4.2(Aug 29, 2022)

    What's Changed

    • docs: docs for support bq sink clustering by @jesrypandawa in https://github.com/odpf/firehose/pull/189
    • feat: add http patch method by @lavkesh in https://github.com/odpf/firehose/pull/191
    • docs: fix heading and contribution link by @punit-kulal in https://github.com/odpf/firehose/pull/192

    New Contributors

    • @punit-kulal made their first contribution in https://github.com/odpf/firehose/pull/192

    Full Changelog: https://github.com/odpf/firehose/compare/v0.4.1...v0.4.2

  • v0.4.1(Aug 10, 2022)

    What's Changed

    • chore :version bump of depot by @lavkesh in https://github.com/odpf/firehose/pull/186
    • Chore version by @lavkesh in https://github.com/odpf/firehose/pull/187

    Bug fixes in depot:

    • Create bq columns with mode set as nullable for primitive columns for Json datatype.

    Full Changelog: https://github.com/odpf/firehose/compare/v0.4.0...v0.4.1

  • v0.4.0(Aug 5, 2022)

    What's Changed

    • doc: add documentation for firehose local setup by @eyeofvinay in https://github.com/odpf/firehose/pull/181
    • fix: fix checkstyle error on github workflow by @lavkesh in https://github.com/odpf/firehose/pull/182
    • feat: bigquery Sink using depot by @lavkesh in https://github.com/odpf/firehose/pull/179

    New Contributors

    • @eyeofvinay made their first contribution in https://github.com/odpf/firehose/pull/181

    Full Changelog: https://github.com/odpf/firehose/compare/v0.3.3...v0.4.0

  • v0.3.3(Jul 5, 2022)

    What's Changed

    • docs: move docs to docusaurus by @ravisuhag in https://github.com/odpf/firehose/pull/172
    • docs: update bigquery sink configs by @ravisuhag in https://github.com/odpf/firehose/pull/175
    • feat: delay the commits with configuration by @lavkesh in https://github.com/odpf/firehose/pull/171
    • docs: upgrade theme and restructure configurations by @ravisuhag in https://github.com/odpf/firehose/pull/178

    Full Changelog: https://github.com/odpf/firehose/compare/v0.3.2...v0.3.3

  • v0.3.2(May 13, 2022)

    What's Changed

    • chore: enables blank issue template and adds slack channels links by @gauravsinghania in https://github.com/odpf/firehose/pull/163
    • feat: upgrade stencil client version by @harikrishnakanchi in https://github.com/odpf/firehose/pull/167
    • Stencil upgrade 0.2.1 by @harikrishnakanchi in https://github.com/odpf/firehose/pull/169

    Full Changelog: https://github.com/odpf/firehose/compare/v0.3.1...v0.3.2

  • v0.3.1(Mar 29, 2022)

    What's Changed

    • fix: firehose prevented from exiting when out of memory error happened by @fzrvic in https://github.com/odpf/firehose/pull/159
    • feat: add consumer group id tag as global metric tag by @fzrvic in https://github.com/odpf/firehose/pull/161
    • refactor: handle null value of startExecutionTime inside abstract sink by @MayurGubrele in https://github.com/odpf/firehose/pull/162

    Full Changelog: https://github.com/odpf/firehose/compare/v0.3.0...v0.3.1

  • v0.3.0(Mar 7, 2022)

    What's Changed

    • feat: blob s3 support by @akhildv in https://github.com/odpf/firehose/pull/158

    New Contributors

    • @akhildv made their first contribution in https://github.com/odpf/firehose/pull/158

    Full Changelog: https://github.com/odpf/firehose/compare/v0.2.3...v0.3.0

  • v0.2.3(Feb 14, 2022)

    What's Changed

    • feature:add key value support for redis sink by @kevinbheda in https://github.com/odpf/firehose/pull/151
    • fix: ttl not being set hasset redis by @kevinbheda in https://github.com/odpf/firehose/pull/154
    • fix: ttl not being set for keyvalue redis by @kevinbheda in https://github.com/odpf/firehose/pull/153
    • chore: logging fix with bigquery optimisations by @lavkesh in https://github.com/odpf/firehose/pull/155

    New Contributors

    • @kevinbheda made their first contribution in https://github.com/odpf/firehose/pull/151

    Full Changelog: https://github.com/odpf/firehose/compare/v0.2.2...v0.2.3

  • v0.2.2(Jan 24, 2022)

    What's Changed

    • fix: log4j cve fix by @fzrvic in https://github.com/odpf/firehose/pull/148
    • feat: upgrade stencil version to 0.1.6 by @harikrishnakanchi in https://github.com/odpf/firehose/pull/149

    New Contributors

    • @harikrishnakanchi made their first contribution in https://github.com/odpf/firehose/pull/149

    Full Changelog: https://github.com/odpf/firehose/compare/v0.2.1...v0.2.2

  • v0.2.0(Dec 6, 2021)

    Features

    • Added Async Consumer.
    • Added Json Filter.
    • Added new offset management.
    • Added DLQ writer with GCS support.
    • Added Error handling of messages returned from sink.
    • Added Blob sink with GCS support.
    • Added Bigquery Sink.
  • v0.1.4(Oct 13, 2021)

  • v0.1.3(Oct 1, 2021)

  • v0.1.2(Sep 23, 2021)

    Features:

    • Log http response body when log_level = debug
    • Upgrade to gradle version 7.2
    • The x- prefix from the header is removed

    Fixes:

    • Input Stream of httpbody to be read only once
    • Handle Http Response Code Zero for retry
    • Use runtimeclasspath instead of runtime to include all dependencies into the jar
  • v0.1.1(Aug 16, 2021)

  • v0.1.0(Jun 21, 2021)

    Features

    • Add log sink
    • Add HTTP sink
    • Add JDBC sink
    • Add InfluxDB sink
    • Add Redis sink
    • Add ElasticSearch sink
    • Add GRPC sink
    • Add Prometheus sink