Distributed, masterless, high performance, fault tolerant data processing

Overview

Logo Onyx

Join the chat at https://gitter.im/onyx-platform/onyx

What is it?

  • a masterless, cloud scale, fault tolerant, high performance distributed computation system
  • batch and stream hybrid processing model
  • exposes an information model for the description and construction of distributed workflows
  • Competes against Storm, Flink, Cascading, Cascalog, Spark, Map/Reduce, Sqoop, etc
  • written in pure Clojure

What would I use this for?

  • Realtime event stream processing
  • CQRS
  • Continuous computation
  • Extract, transform, load
  • Data transformation à la map-reduce
  • Data ingestion and storage medium transfer
  • Data cleaning

Installation

Available on Clojars:

[org.onyxplatform/onyx "0.14.6"]

Changelog

Changelog can be found at changes.md.

Quick Lookup Doc

A searchable set of documentation for the Onyx data model is available.

Project Template

A project template can be found at onyx-template.

Plugins and Libraries

Plugin Template

We provide a plugin template for use in building new plugins. This can be found at onyx-plugin.

Plugin Use

To use the supported plugins, please use version coordinates such as [org.onyxplatform/onyx-amazon-sqs "0.14.6.SNAPSHOT.0"], and read the READMEs on the 0.14.x branches linked above.

Build Status

Component release unstable
onyx core Circle CI Circle CI
onyx-local-rt Circle CI Circle CI
onyx-kafka Circle CI Circle CI
onyx-datomic Circle CI Circle CI
onyx-redis Circle CI Circle CI
onyx-sql Circle CI Circle CI
onyx-bookkeeper Circle CI Circle CI
onyx-amazon-sqs Circle CI Circle CI
onyx-amazon-s3 Circle CI Circle CI
onyx-http Circle CI Circle CI
learn-onyx Circle CI -
onyx-examples Circle CI Circle CI
onyx-peer-http-query Circle CI Circle CI
lib-onyx Circle CI Circle CI
onyx-plugin Circle CI Circle CI
onyx-template Circle CI Circle CI
  • release: stable, released content
  • unstable: unreleased content

Unsupported plugins

Some plugins are currently unsupported in onyx 0.14.x. These are:

Companies Running Onyx in Production

LockedOn                                                                                               

Quick Start Guide

Feeling impatient? Hit the ground running ASAP with the onyx-starter repo and walkthrough. You can also boot into preloaded a Leiningen application template.

User Guide 0.14.6

Developer's Guide 0.14.6

API Docs 0.14.6

Code level API documentation can be found here.

Official plugin listing

Official plugins are vetted by Michael Drogalis. Ensure in your project that plugin versions directly correspond to the same Onyx version (e.g. onyx-kafka version 0.14.6.0-SNAPSHOT goes with onyx version 0.14.6). Fixes to plugins can be applied using a 4th versioning identifier (e.g. 0.14.6.1-SNAPSHOT).

Generate plugin templates through Leiningen with onyx-plugin.

3rd Party plugin listing

Unofficial plugins have not been vetted.

Need help?

Check out the Onyx Google Group.

Want the logo?

Feel free to use it anywhere. You can find a few different versions here.

Running the tests

A simple lein test will run the full suite for Onyx core.

Contributor list

Acknowledgements

Some code has been incorporated from the following projects:

License

Copyright © 2017 Michael Drogalis

Distributed under the Eclipse Public License, the same as Clojure.

Comments
  • Generatively test flow condition validation logic

    Generatively test flow condition validation logic

    We could use some help with this one if anyone has a bit of time to spare.

    We'd like to have a test.check suite to tighten the semantic validation of flow conditions, seen here. I already see at least one bug. It would be great to get better coverage of this entire section.

    enhancement newbie 
    opened by MichaelDrogalis 15
  • onyx log problem for exception

    onyx log problem for exception

    when i use onyx to create some realtime job, it seems it hang up before submit job when encounter some exception. i hope i can display the exception, but i don’t know what is the point.

    for example, when i put the (/ 3 0) after start-env and start-peer-group, it hang up. i can catch the exception, but i think it’s not convenient.

    thanks for your great works.

    (def env-config
      {:zookeeper/address "127.0.0.1:4188"
       :zookeeper/server? true
       :zookeeper.server/port 4188
       :onyx.bookeeper/server? true
       :onyx.bookeeper/local-quorum? true
       :onyx.bookeeper/local-quorum-ports [4196 4197 4198]
       :onyx/tenancy-id id
       :onyx.log/config {}})
    
    (def peer-config
      {:zookeeper/address "127.0.0.1:4188"
       :onyx/tenancy-id id
       :onyx.peer/job-scheduler :onyx.job-scheduler/balanced
       :onyx.messaging/impl :aeron
       :onyx.messaging/peer-port 40200
       :onyx.messaging/bind-addr "localhost"
       :onyx.log/config {}})
    
    (def env (onyx.api/start-env env-config))
    (def peer-group (onyx.api/start-peer-group peer-config))
    (timbre/set-config! {})
    (println "hello1")
    (/ 3 0)
    (println "hello2")
    
    bug severe 
    opened by ghost 13
  • Bookkeeper fails to start if cookie already exists in zookeeperBookkeeper fails to start if cookie already exists in zookeeper:

    Bookkeeper fails to start if cookie already exists in zookeeperBookkeeper fails to start if cookie already exists in zookeeper:

    Bookkeeper fails to start if cookie already exists in zookeeper:

    Caused by: org.apache.bookkeeper.bookie.BookieException$InvalidCookieException: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /onyx/test/ledgers/cookies/xxx.xxx.xxx.xxx:3196
        at org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:366)
        at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:466)
        at org.apache.bookkeeper.proto.BookieServer.newBookie(BookieServer.java:107)
        at org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:95)
        at org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:87)
        at onyx.state.bookkeeper.Bookie.start(bookkeeper.clj:35)
        at onyx.state.bookkeeper.BookieMonitor.start(bookkeeper.clj:71)
        at onyx.state.bookkeeper.BookieServers$fn__28021.invoke(bookkeeper.clj:105)
        at clojure.core$mapv$fn__6727.invoke(core.clj:6616)
        at clojure.lang.PersistentVector.reduce(PersistentVector.java:333)
        at clojure.core$reduce.invoke(core.clj:6518)
        at clojure.core$mapv.invoke(core.clj:6616)
        at onyx.state.bookkeeper.BookieServers.start(bookkeeper.clj:104)
        at com.stuartsierra.component$fn__16299$G__16293__16301.invoke(component.clj:4)
        at com.stuartsierra.component$fn__16299$G__16292__16304.invoke(component.clj:4)
        at clojure.lang.Var.invoke(Var.java:379)
        at clojure.lang.AFn.applyToHelper(AFn.java:154)
        at clojure.lang.Var.applyTo(Var.java:700)
        at clojure.core$apply.invoke(core.clj:632)
        at com.stuartsierra.component$try_action.invoke(component.clj:116)
        at com.stuartsierra.component$update_system$fn__16350.invoke(component.clj:138)
        at clojure.lang.ArraySeq.reduce(ArraySeq.java:114)
        at clojure.core$reduce.invoke(core.clj:6518)
        at com.stuartsierra.component$update_system.doInvoke(component.clj:134)
        at clojure.lang.RestFn.invoke(RestFn.java:445)
        at com.stuartsierra.component$start_system.invoke(component.clj:162)
        at onyx.system.OnyxDevelopmentEnv$fn__30707.invoke(system.clj:64)
        at onyx.system$rethrow_component.invoke(system.clj:53)
        at onyx.system.OnyxDevelopmentEnv.start(system.clj:63)
        at onyx.api$start_env.invoke(api.clj:327)
        at onyx.api$start_env.invoke(api.clj:324)
    

    :onyx.bookkeeper/delete-server-data? only deletes :onyx.bookkeeper/base-journal-dir and :onyx.bookkeeper/base-ledger-dir see https://github.com/onyx-platform/onyx/blob/5a967332e4b2707b585ac464b673eb6918b10307/src/onyx/state/bookkeeper.clj and doesn't clear zookeeper data.

    we'll also have this problem in case of unclean shutdown

    opened by stefanbacon 12
  • Validations offer suggestions

    Validations offer suggestions

    I've added suggestions for workflow keys without matching catalog entries based on existing catalog entries and values of task maps based on the information model.

    opened by oleschoenburg 12
  • Clojure 1.10 fixes

    Clojure 1.10 fixes

    With Clojure 1.10 being released, there were two additional namespace issues that this PR resolves. This makes Onyx fully compatible with Clojure 1.10.

    opened by solatis 11
  • Add timeout to take-segments! and await-job-completion

    Add timeout to take-segments! and await-job-completion

    fixes #219 For await-job-completion I'm not sure the argument ordering. It would be nice to be able to specify quasi-builder-style (await-job-completion peer-config job-id :monitoring-config {:monitoring :no-op} :timeout 100)

    But for now I decided to keep it consistent with the rest of onyx.

    P.S. if anyone knows how to stop GIT from highlighting whitespace changes, I can amend this. Emacs is not loving whatever is going on here. Cant save the file without the lines getting funky.

    Thanks!

    opened by gardnervickers 11
  • Monitoring dashboard

    Monitoring dashboard

    This issue serves as a placeholder for the creation of another repository - onyx-dashboard. This dashboard will serve as a point of monitoring the status of what's happening inside Onyx by querying ZooKeeper. The data in ZooKeeper is immutable, and compressed with Fressian.

    medium feature newbie in-progress 
    opened by MichaelDrogalis 11
  • Peer group http endpoint

    Peer group http endpoint

    I wanted to get initial feedback on the direction and approach I'm taking in implementing #621. So far I have copied the http endpoints for replica queries from lib-onyx and made the peer-group-manager update a replica atom on :apply-log-entry

    opened by oleschoenburg 10
  • Allow Double values as window attribute

    Allow Double values as window attribute

    Using Doubles as window attribute caused an exception:

    java.lang.IllegalArgumentException: No implementation of method: :coerce-key of protocol: #'onyx.windowing.units/ICoerceKey found for class: java.lang.Double
    
    opened by jocrau 10
  • Run without Zookeeper?

    Run without Zookeeper?

    Is it possible to run Onyx without zookeeper? If yes, than how? (even the samples and the templates are using it)

    For many usage scenarios (I would say most that I've encountered so far), a single instance should be more than enough for all the data, so there's no need to distribute the processing.

    Thank you.

    question 
    opened by hansgru 10
  • handle-exception lifecycle doesn't appear to apply to write-batch

    handle-exception lifecycle doesn't appear to apply to write-batch

    Found by jepsen. I switched out :onyx/restart-pred-fn for a restart lifecycle and the job still seems to be being killed under certain scenarios. The main one I noticed is in write-batch. I'll look into this.

    bug 
    opened by lbradstreet 9
  • IndexOutOfBoundsException from aeron

    IndexOutOfBoundsException from aeron

    I tried upgrading our Onyx system to 0.14.6 this morning and I'm getting errors on startup, it looks to be every task blowing up. There are two versions of the error:

    19-10-22 20:16:14 robert-downey-jr-master-5c869697c5-dkm7w WARN [onyx.messaging.aeron.status-publisher:40] - Aeron status channel error
                                             java.lang.Thread.run                      Thread.java:  748
                            org.agrona.concurrent.AgentRunner.run                 AgentRunner.java:  164
                    org.agrona.concurrent.AgentRunner.doDutyCycle                 AgentRunner.java:  283
                                  io.aeron.ClientConductor.doWork             ClientConductor.java:  191
                                 io.aeron.ClientConductor.service             ClientConductor.java:  896
                             io.aeron.DriverEventsAdapter.receive         DriverEventsAdapter.java:   63
    org.agrona.concurrent.broadcast.CopyBroadcastReceiver.receive       CopyBroadcastReceiver.java:  116
                           io.aeron.DriverEventsAdapter.onMessage         DriverEventsAdapter.java:  123
       io.aeron.command.ImageBuffersReadyFlyweight.sourceIdentity  ImageBuffersReadyFlyweight.java:  239
                org.agrona.concurrent.UnsafeBuffer.getStringAscii                UnsafeBuffer.java: 1085
                org.agrona.concurrent.UnsafeBuffer.getStringAscii                UnsafeBuffer.java: 1134
                  org.agrona.concurrent.UnsafeBuffer.boundsCheck0                UnsafeBuffer.java: 1716
    java.lang.IndexOutOfBoundsException: index=124 length=822083584 capacity=4096
    

    and specific task versions in poll-recover:

    19-10-22 20:16:14 robert-downey-jr-master-5c869697c5-dkm7w WARN [onyx.peer.task-lifecycle:177] -
                                               java.lang.Thread.run                      Thread.java:  748
                 java.util.concurrent.ThreadPoolExecutor$Worker.run          ThreadPoolExecutor.java:  624
                  java.util.concurrent.ThreadPoolExecutor.runWorker          ThreadPoolExecutor.java: 1149
                                                                ...
                                  clojure.core.async/thread-call/fn                        async.clj:  434
                  onyx.peer.task-lifecycle/start-task-lifecycle!/fn               task_lifecycle.clj: 1155
                       onyx.peer.task-lifecycle/run-task-lifecycle!               task_lifecycle.clj:  551
            onyx.peer.task-lifecycle.TaskStateMachine/next-replica!               task_lifecycle.clj:  961
               onyx.messaging.messenger-state/next-messenger-state!              messenger_state.clj:   92
                onyx.messaging.messenger-state/transition-messenger              messenger_state.clj:   83
    onyx.messaging.aeron.messenger.AeronMessenger/update-publishers                    messenger.clj:  112
               onyx.messaging.aeron.messenger/transition-publishers                    messenger.clj:   51
                                              clojure.core/group-by                         core.clj: 7146
                                                clojure.core/reduce                         core.clj: 6828
                                        clojure.core.protocols/fn/G                    protocols.clj:   13
                                          clojure.core.protocols/fn                    protocols.clj:   75
                                  clojure.core.protocols/seq-reduce                    protocols.clj:   24
                                                   clojure.core/seq                         core.clj:  137
                                                                ...
                                               clojure.core/keep/fn                         core.clj: 7341
            onyx.messaging.aeron.messenger/transition-publishers/fn                    messenger.clj:   50
                       onyx.messaging.aeron.publisher/reconcile-pub                    publisher.clj:  291
                     onyx.messaging.aeron.publisher.Publisher/start                    publisher.clj:  198
          onyx.messaging.aeron.endpoint-status.EndpointStatus/start              endpoint_status.clj:   79
                                     io.aeron.Aeron.addSubscription                       Aeron.java:  263
                           io.aeron.ClientConductor.addSubscription             ClientConductor.java:  495
                           io.aeron.ClientConductor.addSubscription             ClientConductor.java:  521
                             io.aeron.ClientConductor.awaitResponse             ClientConductor.java:  945
                                   io.aeron.ClientConductor.service             ClientConductor.java:  896
                               io.aeron.DriverEventsAdapter.receive         DriverEventsAdapter.java:   63
      org.agrona.concurrent.broadcast.CopyBroadcastReceiver.receive       CopyBroadcastReceiver.java:  116
                             io.aeron.DriverEventsAdapter.onMessage         DriverEventsAdapter.java:  123
         io.aeron.command.ImageBuffersReadyFlyweight.sourceIdentity  ImageBuffersReadyFlyweight.java:  239
                  org.agrona.concurrent.UnsafeBuffer.getStringAscii                UnsafeBuffer.java: 1085
                  org.agrona.concurrent.UnsafeBuffer.getStringAscii                UnsafeBuffer.java: 1134
                    org.agrona.concurrent.UnsafeBuffer.boundsCheck0                UnsafeBuffer.java: 1716
    java.lang.IndexOutOfBoundsException: index=124 length=808517632 capacity=4096
             clojure.lang.ExceptionInfo: Handling uncaught exception thrown inside task lifecycle :lifecycle/poll-recover. Killing the job. -> Exception type: java.lang.IndexOutOfBoundsException. Exception message: index=124 length=808517632 capacity=4096
           job-id: #uuid "00000000-0000-0000-0000-000000000003"
         metadata: {:job-id #uuid "00000000-0000-0000-0000-000000000003", :job-hash "7ba27abbd73fa66ec2351c328b997173d84067333d334ca41584c39e0669f"}
          peer-id: #uuid "3236c6dd-c980-caac-12e3-d339f7c564ad"
        task-name: :prepare-pending-state-tx
    
    
    

    I'm not even quite sure how to begin figuring out what's going wrong here. I started poking around but haven't made much headway.

    The problem doesn't occur in 0.14.5, and I see aeron was upgraded in 0.14.6. We do set a large term buffer length (-Daeron.term.buffer.length=8388608) which is less than the length in the thrown exception. I've tried without that setting as well, but the errors still occur.

    Does anyone have any advice?

    opened by jgerman 3
  • Feature request: Do not try to recover output checkpoints for plugins that don't use it

    Feature request: Do not try to recover output checkpoints for plugins that don't use it

    Many output plugins do not use checkpoint state. It would be nice if the system did not even bother to read/write these checkpoint files if they are just going to be empty anyways. This should give us a couple benefits:

    1. Anti-fragility in resuming jobs (especially when the output/structure of the job changes but the input stays the same)
    2. Reduced load and costs of S3

    I imagine to keep the interface flexible, we could add to the plugin protocol to check at runtime for various features of the plugin, such as if output checkpointing is supported.

    Thoughts?

    (Typical stack trace when I resume a job and output checkpointing fails b/c I changed the job definition around)

    ERROR 2019-05-23 09:23:29,085 service.data.job.core: {:message Onyx lifecycle exception,  :phase :lifecycle/recover-output}
    com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: A07B7892DD6BE81F; S3 Extended Request ID: AOGX9hab+QAnQIuIk9C1gVVVSOPTZRUeBNkzRIbUE/vxnk7wlDS/OWqqquH/M9GNnWNUr4DWyF8=), S3 Extended Request ID: AOGX9hab+QAnQIuIk9C1gVVVSOPTZRUeBNkzRIbUE/vxnk7wlDS/OWqqquH/M9GNnWNUr4DWyF8=
            at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639)
            at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304)
            at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
            at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
            at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
            at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
            at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
            at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
            at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
            at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325)
            at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272)
            at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1409)
            at onyx.storage.s3$read_checkpointed_bytes.invokeStatic(s3.clj:102)
            at onyx.storage.s3$read_checkpointed_bytes.invoke(s3.clj:100)
            at onyx.storage.s3$eval49560$fn__49562$fn__49564.invoke(s3.clj:241)
            at onyx.storage.s3$eval49560$fn__49562.invoke(s3.clj:239)
            at clojure.lang.MultiFn.invoke(MultiFn.java:284)
            at onyx.peer.resume_point$read_checkpoint.invokeStatic(resume_point.clj:56)
            at onyx.peer.resume_point$read_checkpoint.invoke(resume_point.clj:51)
            at onyx.peer.resume_point$recover_output.invokeStatic(resume_point.clj:112)
            at onyx.peer.resume_point$recover_output.invoke(resume_point.clj:106)
            at onyx.peer.task_lifecycle$recover_output.invokeStatic(task_lifecycle.clj:486)
            at onyx.peer.task_lifecycle$recover_output.invoke(task_lifecycle.clj:479)
            at onyx.peer.task_lifecycle.TaskStateMachine.exec(task_lifecycle.clj:1070)
            at onyx.peer.task_lifecycle$run_task_lifecycle_BANG_.invokeStatic(task_lifecycle.clj:550)
            at onyx.peer.task_lifecycle$run_task_lifecycle_BANG_.invoke(task_lifecycle.clj:540)
            at onyx.peer.task_lifecycle$start_task_lifecycle_BANG_$fn__43880.invoke(task_lifecycle.clj:1155)
            at clojure.core.async$thread_call$fn__11217.invoke(async.clj:442)
            at clojure.lang.AFn.run(AFn.java:22)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)
    
    opened by sundbry 3
  • Fix running an output plugin with :onyx.core/params longer than 0

    Fix running an output plugin with :onyx.core/params longer than 0

    When :onyx.core/params is set, the (identity) function in operations.cljc throws because it can't handle more than one argument. Instead of usinging identity, we return the last argument (the segment).

    This is sort of an unusual thing to do, to attach params to a lifecycle affecting an output task, but it is for a 'generic' set of lifecycle calls in my use case in the mongodb plugin I am developing. We provide the :mongo connection as a parameter so you can do queries in normal :function tasks, and use the same lifecycles for the :output tasks to open/close the connection.

    #error {
     :cause "Segment threw exception"
     :data {:exception #error {
     :cause "Wrong number of args (2) passed to: clojure.core/identity"
     :via
     [{:type clojure.lang.ArityException
       :message "Wrong number of args (2) passed to: clojure.core/identity"
       :at [clojure.lang.AFn throwArity "AFn.java" 429]}]
     :trace
     [[clojure.lang.AFn throwArity "AFn.java" 429]
      [clojure.lang.AFn invoke "AFn.java" 36]
      [clojure.core$partial$fn__5824 invoke "core.clj" 2624]
      [onyx.peer.transform$collect_next_segments$fn__38085 invoke "transform.clj" 8]
      [onyx.peer.transform$collect_next_segments invokeStatic "transform.clj" 8]
      [onyx.peer.transform$collect_next_segments invoke "transform.clj" 7]
      [onyx.peer.transform$apply_fn_single$fn__38090 invoke "transform.clj" 17]
      [clojure.core$map$fn__5851 invoke "core.clj" 2753]
      [clojure.lang.LazySeq sval "LazySeq.java" 42]
      [clojure.lang.LazySeq seq "LazySeq.java" 51]
      [clojure.lang.RT seq "RT.java" 531]
      [clojure.core$seq__5387 invokeStatic "core.clj" 137]
      [clojure.core$dorun invokeStatic "core.clj" 3133]
      [clojure.core$doall invokeStatic "core.clj" 3148]
      [clojure.core$doall invoke "core.clj" 3148]
      [onyx.peer.transform$apply_fn_single invokeStatic "transform.clj" 17]
      [onyx.peer.transform$apply_fn_single invoke "transform.clj" 14]
      [onyx.peer.transform$apply_fn invokeStatic "transform.clj" 54]
      [onyx.peer.transform$apply_fn invoke "transform.clj" 48]
      [onyx.peer.task_lifecycle$build_apply_fn$fn__64006 invoke "task_lifecycle.clj" 620]
      [onyx.peer.task_lifecycle$wrap_lifecycle_metrics$fn__64159 invoke "task_lifecycle.clj" 1098]
      [onyx.peer.task_lifecycle.TaskStateMachine exec "task_lifecycle.clj" 1071]
      [onyx.peer.task_lifecycle$run_task_lifecycle_BANG_ invokeStatic "task_lifecycle.clj" 550]
      [onyx.peer.task_lifecycle$run_task_lifecycle_BANG_ invoke "task_lifecycle.clj" 540]
      [onyx.peer.task_lifecycle$start_task_lifecycle_BANG_$fn__64180 invoke "task_lifecycle.clj" 1156]
      [clojure.core.async$thread_call$fn__11217 invoke "async.clj" 442]
      [clojure.lang.AFn run "AFn.java" 22]
      [java.util.concurrent.ThreadPoolExecutor runWorker "ThreadPoolExecutor.java" 1149]
      [java.util.concurrent.ThreadPoolExecutor$Worker run "ThreadPoolExecutor.java" 624]
      [java.lang.Thread run "Thread.java" 748]]
    
    opened by sundbry 4
  • Aeron Reliability

    Aeron Reliability

    We're having trouble with aeron exceptions in the onyx client. They are most often Client Conductor Timeouts though occasionally we see other aeron related exceptions. These exceptions kill the job between 1-4x per day (it's a long running job).

    We can't seem to make these exceptions go away. GC does not appear to be an issue, nor do we see CPU usage spikes (our systems are running in GKE). Increasing CPU limits doesn't appear to help. The threads just seem to not be woken up in time to conduct their checks.

    I'm pretty much stuck at this point trying various fixes while planning backup plans not involving Onyx. Any help to point me in the right direction would be appreciated.

    opened by jgerman 4
  • Project maintanence going forward

    Project maintanence going forward

    Hey,

    It seems with the recent acquihire of the main developers of Distributed Masonry the project development has slowed down. With the advent of better maintenance options for open-source projects from the Clojure community (Clojurists Together/Patreon) it would make sense to apply to those programs to fund development of Onyx.

    We just integrated Onyx in a production system and it would be nice to keep up dev efforts as I think this project is still the de-facto Clojure data processing platform with it's unique approach.

    I would like to hear some thoughts from the devs what the future holds for this project realistically and what we can do together to keep it going forward.

    @lbradstreet @MichaelDrogalis

    opened by thenonameguy 28
  • Onyx hangs when provide

    Onyx hangs when provide ":onyx.messaging.aeron/media-driver-dir" setting in peer-config

    I tried to run two onyx instances in the same machine. So I needed to change Aeron media driver directory on both instances. After digging through source code, I found out I could pass the directory via :onyx.messaging.aeron/media-driver-dir setting. It was working fine with two onyx instances.

    If I run one onyx instance, Onyx hanged with the following message: Aeron media driver has not started up. Waiting for media driver before starting peers, and backing off for 500ms.

    I think the problem lying in the following code: https://github.com/onyx-platform/onyx/blob/0.14.4/src/onyx/peer/peer_group_manager.clj#L264 https://github.com/onyx-platform/onyx/blob/0.14.4/src/onyx/messaging/aeron/messaging_group.clj#L30 The function media-driver-healthy? initializes CommonContext object with default media driver directory. It fails to consider that the directory might be overridden by :onyx.messaging.aeron/media-driver-dir setting.

    I could submit a PR. However, I'm new to onyx. I'm not sure this is the right direction. Could anyone help guiding me with this issue?

    opened by darongmean 1
Owner
Onyx
Distributed, masterless, high performance, fault tolerant data processing
Onyx
High-level contextual steps in your tests for any reporting tool

Xteps High-level contextual steps in your tests for any reporting tool. License Maven Central Javadoc Xteps Xteps Allure Xteps ReportPortal How to use

Evgenii Plugatar 8 Dec 11, 2022
Puppeteer/Playwright in Java. High-Level headless browser.

HBrowser Another headless browser for Java with Puppeteer and Playwright implemented. Add this to your project with Maven/Gradle/Sbt/Leinigen (Java 8

Osiris-Team 99 Dec 18, 2022
Apache JMeter - An Open Source Java application designed to measure performance and load test applications

An Open Source Java application designed to measure performance and load test applications. By The Apache Software Foundation What Is It? Apache JMete

The Apache Software Foundation 6.7k Jan 1, 2023
Check performance metrics by running Selenium 4 tests with JUnit on LambdaTest cloud.

Run Selenium 4 Tests With JUnit On LambdaTest Blog ⋅ Docs ⋅ Learning Hub ⋅ Newsletter ⋅ Certifications ⋅ YouTube       Learn how to use JUnit framewor

null 11 Jul 11, 2022
A sample repo to help you capture performance logs in Java-TestNG using CDP on LambdaTest. Run Selenium tests with TestNG on LambdaTest platform.

How to capture performance logs in Java-TestNG using CDP on LambdaTest Environment Setup Global Dependencies Install Maven Or Install Maven with Homeb

null 12 Jul 13, 2022
A library for setting up Java objects as test data.

Beanmother Beanmother helps to create various objects, simple and complex, super easily with fixtures for testing. It encourages developers to write m

Jaehyun Shin 113 Nov 7, 2022
Java fake data generator

jFairy by Devskiller Java fake data generator. Based on Wikipedia: Fairyland, in folklore, is the fabulous land or abode of fairies or fays. Try jFair

DevSkiller 718 Dec 10, 2022
Arbitrary test data generator for parameterized tests in Java inspired by AutoFixture.

AutoParams AutoParams is an arbitrary test data generator for parameterized tests in Java inspired by AutoFixture. Sometimes setting all the test data

null 260 Jan 2, 2023
JUnit 5 Parameterized Test Yaml Test Data Source

Yamaledt — JUnit 5 Parameterized Tests Using Yaml and Jamal Introduction and usage Note This is the latest development documentation. This is a SNAPSH

Peter Verhas 4 Mar 23, 2022
Simplified PDF Data Extraction

PDF Mantis Simplified PDF Data Extraction Table of Contents What is PDF Mantis Why was PDF Mantis created and who is it for Requirements Installation

null 5 Dec 1, 2021
Distributed and fault-tolerant realtime computation: stream processing, continuous computation, distributed RPC, and more

IMPORTANT NOTE!!! Storm has Moved to Apache. The official Storm git repository is now hosted by Apache, and is mirrored on github here: https://github

Nathan Marz 8.9k Dec 26, 2022
Apache Heron (Incubating) is a realtime, distributed, fault-tolerant stream processing engine from Twitter

Heron is a realtime analytics platform developed by Twitter. It has a wide array of architectural improvements over it's predecessor. Heron in Apache

The Apache Software Foundation 3.6k Dec 28, 2022
A reactive Java framework for building fault-tolerant distributed systems

Atomix Website | Javadoc | Slack | Google Group A reactive Java framework for building fault-tolerant distributed systems Please see the website for f

Atomix 2.3k Dec 29, 2022
A fault tolerant, protocol-agnostic RPC system

Finagle Status This project is used in production at Twitter (and many other organizations), and is being actively developed and maintained. Releases

Twitter 8.5k Jan 4, 2023
Table-Computing (Simplified as TC) is a distributed light weighted, high performance and low latency stream processing and data analysis framework. Milliseconds latency and 10+ times faster than Flink for complicated use cases.

Table-Computing Welcome to the Table-Computing GitHub. Table-Computing (Simplified as TC) is a distributed light weighted, high performance and low la

Alibaba 34 Oct 14, 2022
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).

SeaTunnel SeaTunnel was formerly named Waterdrop , and renamed SeaTunnel since October 12, 2021. SeaTunnel is a very easy-to-use ultra-high-performanc

The Apache Software Foundation 4.4k Jan 2, 2023
A high available,high performance distributed messaging system.

#新闻 MetaQ 1.4.6.2发布。更新日志 MetaQ 1.4.6.1发布。更新日志 MetaQ 1.4.5.1发布。更新日志 MetaQ 1.4.5发布。更新日志 Meta-ruby 0.1 released: a ruby client for metaq. SOURCE #介绍 Meta

dennis zhuang 1.3k Dec 12, 2022
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.

Apache Gobblin Apache Gobblin is a highly scalable data management solution for structured and byte-oriented data in heterogeneous data ecosystems. Ca

The Apache Software Foundation 2.1k Jan 4, 2023
Netflix, Inc. 23.1k Jan 5, 2023