Oryx 2: Lambda architecture on Apache Spark and Apache Kafka for real-time, large-scale machine learning

Overview

Oryx 2 is a realization of the lambda architecture built on Apache Spark and Apache Kafka, specialized for real-time, large-scale machine learning. It is a framework for building applications, but also includes packaged, end-to-end applications for collaborative filtering, classification, regression, and clustering.

Proceed to the Oryx 2 site for full documentation.

Just looking to deploy a ready-made, end-to-end application for collaborative filtering, clustering, or classification? Easy. Proceed directly to the end-to-end application documentation on the Oryx 2 site.

Developers can also consume Oryx 2 as a framework for building custom applications. After the architecture overview below, proceed to Making an Oryx App to learn how to create a new application. You can also review a module diagram to understand the project structure.



Comments
  • Spark 2.3 and java.lang.NoClassDefFoundError: kafka/admin/RackAwareMode

    It looks like running Oryx 2.6 against Spark 2.3 yields errors like "java.lang.NoClassDefFoundError: kafka/admin/RackAwareMode".

    Spark's Kafka integration used to bring in org.apache.kafka:kafka as a dependency. Oryx needs this artifact, but that need was never correctly expressed as a direct dependency, so when Spark changed its own dependencies, the artifact was no longer available at runtime.

    The fix is easy enough: declare a direct compile-time dependency on this artifact, and hope that doesn't cause any other issues.

    This may necessitate a new 2.7.x branch for Spark 2.3, in order to carefully avoid disturbing what has been working for Spark 2.2 in 2.6.x.
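
    A minimal sketch of the proposed fix, assuming a Maven build (the Scala suffix and Kafka version below are illustrative, and should match the cluster's Kafka):

    <!-- Declare Kafka's broker/admin classes as a direct dependency rather
         than relying on Spark's Kafka integration to provide them. -->
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.11</artifactId>
      <version>0.10.2.2</version>
    </dependency>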

    bug BatchLayer SpeedLayer LambdaTier 
    opened by srowen 64
  • /knownItems/uid doesn't return all items

    I log every item that is sent to Oryx to a text file:

    [image: full list of logged item IDs]

    But if I submit this URL to Oryx, I get a shorter list of IDs:

    [image: shorter list returned by /knownItems]

    How is it possible that some items are missing?

    opened by stiv-yakovenko 50
  • Kafka 0.11.x support

    Oryx does not currently support the Kafka 0.11.x message format.

    Details: https://community.cloudera.com/t5/Data-Science-and-Machine/NoClassDefFoundError-in-Oryx-batch-and-speed-layers/m-p/59235#M765

    Any suggestions on how this can be fixed or support added, @srowen? I will be happy to help with this.

    enhancement BatchLayer SpeedLayer LambdaTier 
    opened by cimox 27
  • Adjust Jersey deps

    @kkasravi @smarthi I'd like to examine the Jersey dependencies in more detail.

    Overall, my concern is that Jersey has become a compile-time dependency. I think it is best to avoid this, and only use JAX-RS APIs, unless there is a strong reason not to.

    Right now it looks like we force users to use the package-style application definition, instead of the normal JAX-RS Application class style. I can understand the convenience, but is it only possible to support both by hard-coding a dependency on Jersey? If so, I don't think it's worth it.

    I'm not concerned about using Jersey in test classes if that helps write tests.

    If you guys agree, I'll update this further to take out the compile-time Jersey deps.
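
    For reference, a minimal sketch of the plain JAX-RS Application style in question, compiling against only the javax.ws.rs API (the resource class is hypothetical, purely for illustration):

    import java.util.Collections;
    import java.util.Set;
    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.core.Application;

    // Hypothetical endpoint standing in for a real serving-layer resource.
    @Path("/ping")
    class PingResource {
      @GET
      public String ping() {
        return "pong";
      }
    }

    // Standard JAX-RS style: subclass Application and enumerate resource
    // classes explicitly, with no Jersey imports at compile time.
    public class ExampleApplication extends Application {
      @Override
      public Set<Class<?>> getClasses() {
        return Collections.<Class<?>>singleton(PingResource.class);
      }
    }

    Since javax.ws.rs.core.Application is part of the JAX-RS API itself, supporting this style alongside the package-scanning one shouldn't by itself force a compile-time Jersey dependency.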

    opened by srowen 23
  • java.lang.ClassCastException: java.lang.Object cannot be cast to java.lang.String

    Sometimes I observe crashes like this:

    Sep 18 19:41:29 htm-psycho-401.zxz.su oryx-run.sh[13885]: java.lang.ClassCastException
    Sep 18 19:41:30 htm-psycho-401.zxz.su oryx-run.sh[13885]: 2018-09-18 19:41:30,872 WARN  AbstractServingModelManager:71 Error while processing message; continuing
    Sep 18 19:41:30 htm-psycho-401.zxz.su oryx-run.sh[13885]: java.lang.ClassCastException: java.lang.Object cannot be cast to java.lang.String
    Sep 18 19:41:30 htm-psycho-401.zxz.su oryx-run.sh[13885]: at com.koloboke.collect.impl.hash.MutableLHashParallelKVObjObjMapGO.removeIf(MutableLHashParallelKVObjObjMapGO.java:1163)
    Sep 18 19:41:30 htm-psycho-401.zxz.su oryx-run.sh[13885]: at com.cloudera.oryx.app.als.FeatureVectorsPartition.retainRecentAndIDs(FeatureVectorsPartition.java:104)
    Sep 18 19:41:30 htm-psycho-401.zxz.su oryx-run.sh[13885]: at com.cloudera.oryx.app.als.PartitionedFeatureVectors.retainRecentAndIDs(PartitionedFeatureVectors.java:205)
    Sep 18 19:41:30 htm-psycho-401.zxz.su oryx-run.sh[13885]: at com.cloudera.oryx.app.serving.als.model.ALSServingModel.retainRecentAndItemIDs(ALSServingModel.java:331)
    Sep 18 19:41:30 htm-psycho-401.zxz.su oryx-run.sh[13885]: at com.cloudera.oryx.app.serving.als.model.ALSServingModelManager.consumeKeyMessage(ALSServingModelManager.java:128)
    Sep 18 19:41:30 htm-psycho-401.zxz.su oryx-run.sh[13885]: at com.cloudera.oryx.app.serving.als.model.ALSServingModelManager.consumeKeyMessage(ALSServingModelManager.java:45)
    Sep 18 19:41:30 htm-psycho-401.zxz.su oryx-run.sh[13885]: at com.cloudera.oryx.api.serving.AbstractServingModelManager.consume(AbstractServingModelManager.java:69)
    Sep 18 19:41:30 htm-psycho-401.zxz.su oryx-run.sh[13885]: at com.cloudera.oryx.lambda.serving.ModelManagerListener.lambda$contextInitialized$0(ModelManagerListener.java:141)
    Sep 18 19:41:30 htm-psycho-401.zxz.su oryx-run.sh[13885]: at com.cloudera.oryx.common.lang.LoggingCallable.lambda$log$0(LoggingCallable.java:48)
    Sep 18 19:41:30 htm-psycho-401.zxz.su oryx-run.sh[13885]: at com.cloudera.oryx.common.lang.LoggingCallable.lambda$asRunnable$1(LoggingCallable.java:66)
    Sep 18 19:41:30 htm-psycho-401.zxz.su oryx-run.sh[13885]: at java.lang.Thread.run(Thread.java:748)
    
    
    opened by stiv-yakovenko 22
  • KryoException: Buffer overflow. Available: 1, required: 4

    Feeding a huge amount of data (4 GB) through the serving layer crashes the speed layer:

    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: acks = 1
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: batch.size = 16384
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: bootstrap.servers = [127.0.0.1:9092]
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: buffer.memory = 33554432
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: client.id =
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: compression.type = gzip
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: connections.max.idle.ms = 540000
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: enable.idempotence = false
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: interceptor.classes = null
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: key.serializer = class org.apache.kafka.common.serialization.StringSerializer
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: linger.ms = 1000
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: max.block.ms = 60000
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: max.in.flight.requests.per.connection = 5
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: max.request.size = 67108864
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: metadata.max.age.ms = 300000
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: metric.reporters = []
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: metrics.num.samples = 2
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: metrics.recording.level = INFO
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: metrics.sample.window.ms = 30000
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: receive.buffer.bytes = 32768
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: reconnect.backoff.max.ms = 1000
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: reconnect.backoff.ms = 50
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: request.timeout.ms = 30000
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: retries = 0
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: retry.backoff.ms = 100
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: sasl.jaas.config = null
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: sasl.kerberos.kinit.cmd = /usr/bin/kinit
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: sasl.kerberos.min.time.before.relogin = 60000
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: sasl.kerberos.service.name = null
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: sasl.kerberos.ticket.renew.jitter = 0.05
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: sasl.kerberos.ticket.renew.window.factor = 0.8
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: sasl.mechanism = GSSAPI
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: security.protocol = PLAINTEXT
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: send.buffer.bytes = 131072
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.cipher.suites = null
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.endpoint.identification.algorithm = null
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.key.password = null
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.keymanager.algorithm = SunX509
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.keystore.location = null
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.keystore.password = null
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.keystore.type = JKS
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.protocol = TLS
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.provider = null
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.secure.random.implementation = null
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.trustmanager.algorithm = PKIX
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.truststore.location = null
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.truststore.password = null
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: ssl.truststore.type = JKS
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: transaction.timeout.ms = 60000
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: transactional.id = null
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: value.serializer = class org.apache.kafka.common.serialization.StringSerializer
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:42:59 INFO  AppInfoParser:83 - Kafka version : 0.11.0.1
    Sep 03 09:42:59 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:42:59 INFO  AppInfoParser:84 - Kafka commitId : c2a0d5f9b1f45bf5
    Sep 03 09:43:03 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:03 INFO  ContextCleaner:54 - Cleaned shuffle 9
    Sep 03 09:43:03 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:03 INFO  ContextCleaner:54 - Cleaned shuffle 8
    Sep 03 09:43:03 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:03 INFO  KafkaProducer:1017 - Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  JobScheduler:54 - Finished job streaming job 1535956800000 ms.0 from job set of time 1535956800000 ms
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  JobScheduler:54 - Starting job streaming job 1535956800000 ms.1 from job set of time 1535956800000 ms
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  UpdateOffsetsFn:60 - Updating offsets: {OryxInput,0=2477630, OryxInput,1=2478826, OryxInput,2=2477739, OryxInput,3=2478777}
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  ZooKeeper:438 - Initiating client connection, connectString=127.0.0.1:2181 sessionTimeout=30000 watcher=org.I0Itec.zkclient.ZkClient@401052bb
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  ZkEventThread:65 - Starting ZkClient event thread.
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  ZkClient:936 - Waiting for keeper state SyncConnected
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  ClientCnxn:975 - Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  ClientCnxn:852 - Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  ClientCnxn:1235 - Session establishment complete on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x10160d98a790136, negotiated timeout = 30000
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  ZkClient:713 - zookeeper state changed (SyncConnected)
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  ZkEventThread:83 - Terminate ZkClient event thread.
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  ZooKeeper:684 - Session: 0x10160d98a790136 closed
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  ClientCnxn:512 - EventThread shut down
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  JobScheduler:54 - Finished job streaming job 1535956800000 ms.1 from job set of time 1535956800000 ms
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  JobScheduler:54 - Total delay: 184.087 s for time 1535956800000 ms (execution: 161.298 s)
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  MapPartitionsRDD:54 - Removing RDD 293 from persistence list
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  BlockManager:54 - Removing RDD 293
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  KafkaRDD:54 - Removing RDD 292 from persistence list
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  BlockManager:54 - Removing RDD 292
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  ReceivedBlockTracker:54 - Deleting batches:
    Sep 03 09:43:04 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:43:04 INFO  InputInfoTracker:54 - remove old batch metadata: 1535956200000 ms
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  JobScheduler:54 - Added jobs for time 1535957100000 ms
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  JobScheduler:54 - Starting job streaming job 1535957100000 ms.0 from job set of time 1535957100000 ms
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  SparkContext:54 - Starting job: isEmpty at SpeedLayerUpdate.java:53
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - Got job 264 (isEmpty at SpeedLayerUpdate.java:53) with 1 output partitions
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - Final stage: ResultStage 274 (isEmpty at SpeedLayerUpdate.java:53)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - Parents of final stage: List()
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - Missing parents: List()
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - Submitting ResultStage 274 (MapPartitionsRDD[321] at mapToPair at SpeedLayer.java:97), which has no missing parents
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  MemoryStore:54 - Block broadcast_274 stored as values in memory (estimated size 3.8 KB, free 912.3 MB)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  MemoryStore:54 - Block broadcast_274_piece0 stored as bytes in memory (estimated size 2.4 KB, free 912.3 MB)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  BlockManagerInfo:54 - Added broadcast_274_piece0 in memory on htm-psycho-401.zxz.su:45401 (size: 2.4 KB, free: 912.3 MB)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  SparkContext:54 - Created broadcast 274 from broadcast at DAGScheduler.scala:1039
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - Submitting 1 missing tasks from ResultStage 274 (MapPartitionsRDD[321] at mapToPair at SpeedLayer.java:97) (first 15 tasks are for partitions Vector(0))
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  YarnScheduler:54 - Adding task set 274.0 with 1 tasks
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  TaskSetManager:54 - Starting task 0.0 in stage 274.0 (TID 586, htm-psycho-401.zxz.su, executor 1, partition 0, PROCESS_LOCAL, 7746 bytes)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  BlockManagerInfo:54 - Added broadcast_274_piece0 in memory on htm-psycho-401.zxz.su:45269 (size: 2.4 KB, free: 912.3 MB)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  TaskSetManager:54 - Finished task 0.0 in stage 274.0 (TID 586) in 38 ms on htm-psycho-401.zxz.su (executor 1) (1/1)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  YarnScheduler:54 - Removed TaskSet 274.0, whose tasks have all completed, from pool
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - ResultStage 274 (isEmpty at SpeedLayerUpdate.java:53) finished in 0.049 s
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - Job 264 finished: isEmpty at SpeedLayerUpdate.java:53, took 0.052823 s
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  SparkContext:54 - Starting job: isEmpty at SpeedLayerUpdate.java:53
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - Got job 265 (isEmpty at SpeedLayerUpdate.java:53) with 3 output partitions
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - Final stage: ResultStage 275 (isEmpty at SpeedLayerUpdate.java:53)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - Parents of final stage: List()
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - Missing parents: List()
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - Submitting ResultStage 275 (MapPartitionsRDD[321] at mapToPair at SpeedLayer.java:97), which has no missing parents
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  MemoryStore:54 - Block broadcast_275 stored as values in memory (estimated size 3.8 KB, free 912.3 MB)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  MemoryStore:54 - Block broadcast_275_piece0 stored as bytes in memory (estimated size 2.4 KB, free 912.3 MB)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  BlockManagerInfo:54 - Added broadcast_275_piece0 in memory on htm-psycho-401.zxz.su:45401 (size: 2.4 KB, free: 912.3 MB)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  SparkContext:54 - Created broadcast 275 from broadcast at DAGScheduler.scala:1039
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - Submitting 3 missing tasks from ResultStage 275 (MapPartitionsRDD[321] at mapToPair at SpeedLayer.java:97) (first 15 tasks are for partitions Vector(1, 2, 3))
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  YarnScheduler:54 - Adding task set 275.0 with 3 tasks
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  TaskSetManager:54 - Starting task 0.0 in stage 275.0 (TID 587, htm-psycho-401.zxz.su, executor 1, partition 1, PROCESS_LOCAL, 7746 bytes)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  BlockManagerInfo:54 - Added broadcast_275_piece0 in memory on htm-psycho-401.zxz.su:45269 (size: 2.4 KB, free: 912.3 MB)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  TaskSetManager:54 - Starting task 1.0 in stage 275.0 (TID 588, htm-psycho-401.zxz.su, executor 1, partition 2, PROCESS_LOCAL, 7746 bytes)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  TaskSetManager:54 - Finished task 0.0 in stage 275.0 (TID 587) in 17 ms on htm-psycho-401.zxz.su (executor 1) (1/3)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  TaskSetManager:54 - Starting task 2.0 in stage 275.0 (TID 589, htm-psycho-401.zxz.su, executor 1, partition 3, PROCESS_LOCAL, 7746 bytes)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  TaskSetManager:54 - Finished task 1.0 in stage 275.0 (TID 588) in 8 ms on htm-psycho-401.zxz.su (executor 1) (2/3)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  TaskSetManager:54 - Finished task 2.0 in stage 275.0 (TID 589) in 7 ms on htm-psycho-401.zxz.su (executor 1) (3/3)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  YarnScheduler:54 - Removed TaskSet 275.0, whose tasks have all completed, from pool
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - ResultStage 275 (isEmpty at SpeedLayerUpdate.java:53) finished in 0.047 s
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  DAGScheduler:54 - Job 265 finished: isEmpty at SpeedLayerUpdate.java:53, took 0.050518 s
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  JobScheduler:54 - Finished job streaming job 1535957100000 ms.0 from job set of time 1535957100000 ms
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  JobScheduler:54 - Starting job streaming job 1535957100000 ms.1 from job set of time 1535957100000 ms
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  UpdateOffsetsFn:60 - Updating offsets: {OryxInput,0=2477630, OryxInput,1=2478826, OryxInput,2=2477739, OryxInput,3=2478777}
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  ZooKeeper:438 - Initiating client connection, connectString=127.0.0.1:2181 sessionTimeout=30000 watcher=org.I0Itec.zkclient.ZkClient@382c4378
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  ZkEventThread:65 - Starting ZkClient event thread.
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  ZkClient:936 - Waiting for keeper state SyncConnected
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  ClientCnxn:975 - Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  ClientCnxn:852 - Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  ClientCnxn:1235 - Session establishment complete on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x10160d98a790137, negotiated timeout = 30000
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  ZkClient:713 - zookeeper state changed (SyncConnected)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  ZkEventThread:83 - Terminate ZkClient event thread.
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  ZooKeeper:684 - Session: 0x10160d98a790137 closed
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  ClientCnxn:512 - EventThread shut down
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  JobScheduler:54 - Finished job streaming job 1535957100000 ms.1 from job set of time 1535957100000 ms
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  JobScheduler:54 - Total delay: 0.695 s for time 1535957100000 ms (execution: 0.189 s)
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  MapPartitionsRDD:54 - Removing RDD 307 from persistence list
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  KafkaRDD:54 - Removing RDD 306 from persistence list
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  BlockManager:54 - Removing RDD 307
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  ReceivedBlockTracker:54 - Deleting batches:
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  InputInfoTracker:54 - remove old batch metadata: 1535956500000 ms
    Sep 03 09:45:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:45:00 INFO  BlockManager:54 - Removing RDD 306
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  RecurringTimer:54 - Stopped timer for JobGenerator after time 1535957400000
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  JobScheduler:54 - Added jobs for time 1535957400000 ms
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  JobScheduler:54 - Starting job streaming job 1535957400000 ms.0 from job set of time 1535957400000 ms
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  SparkContext:54 - Starting job: isEmpty at SpeedLayerUpdate.java:53
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - Got job 266 (isEmpty at SpeedLayerUpdate.java:53) with 1 output partitions
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - Final stage: ResultStage 276 (isEmpty at SpeedLayerUpdate.java:53)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - Parents of final stage: List()
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - Missing parents: List()
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - Submitting ResultStage 276 (MapPartitionsRDD[323] at mapToPair at SpeedLayer.java:97), which has no missing parents
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  MemoryStore:54 - Block broadcast_276 stored as values in memory (estimated size 3.8 KB, free 912.3 MB)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  JobGenerator:54 - Stopped generation timer
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  JobGenerator:54 - Waiting for jobs to be processed and checkpoints to be written
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 WARN  JobGenerator:66 - Timed out while stopping the job generator (timeout = 300000)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  JobGenerator:54 - Waited for jobs to be processed and checkpoints to be written
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  JobGenerator:54 - Stopped JobGenerator
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  MemoryStore:54 - Block broadcast_276_piece0 stored as bytes in memory (estimated size 2.4 KB, free 912.3 MB)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  BlockManagerInfo:54 - Added broadcast_276_piece0 in memory on htm-psycho-401.zxz.su:45401 (size: 2.4 KB, free: 912.3 MB)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  SparkContext:54 - Created broadcast 276 from broadcast at DAGScheduler.scala:1039
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - Submitting 1 missing tasks from ResultStage 276 (MapPartitionsRDD[323] at mapToPair at SpeedLayer.java:97) (first 15 tasks are for partitions Vector(0))
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  YarnScheduler:54 - Adding task set 276.0 with 1 tasks
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  TaskSetManager:54 - Starting task 0.0 in stage 276.0 (TID 590, htm-psycho-401.zxz.su, executor 1, partition 0, PROCESS_LOCAL, 7746 bytes)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  BlockManagerInfo:54 - Added broadcast_276_piece0 in memory on htm-psycho-401.zxz.su:45269 (size: 2.4 KB, free: 912.3 MB)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  TaskSetManager:54 - Finished task 0.0 in stage 276.0 (TID 590) in 26 ms on htm-psycho-401.zxz.su (executor 1) (1/1)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  YarnScheduler:54 - Removed TaskSet 276.0, whose tasks have all completed, from pool
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - ResultStage 276 (isEmpty at SpeedLayerUpdate.java:53) finished in 0.036 s
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - Job 266 finished: isEmpty at SpeedLayerUpdate.java:53, took 0.040172 s
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  SparkContext:54 - Starting job: isEmpty at SpeedLayerUpdate.java:53
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - Got job 267 (isEmpty at SpeedLayerUpdate.java:53) with 3 output partitions
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - Final stage: ResultStage 277 (isEmpty at SpeedLayerUpdate.java:53)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - Parents of final stage: List()
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - Missing parents: List()
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - Submitting ResultStage 277 (MapPartitionsRDD[323] at mapToPair at SpeedLayer.java:97), which has no missing parents
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  MemoryStore:54 - Block broadcast_277 stored as values in memory (estimated size 3.8 KB, free 912.3 MB)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  MemoryStore:54 - Block broadcast_277_piece0 stored as bytes in memory (estimated size 2.4 KB, free 912.3 MB)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  BlockManagerInfo:54 - Added broadcast_277_piece0 in memory on htm-psycho-401.zxz.su:45401 (size: 2.4 KB, free: 912.3 MB)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  SparkContext:54 - Created broadcast 277 from broadcast at DAGScheduler.scala:1039
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - Submitting 3 missing tasks from ResultStage 277 (MapPartitionsRDD[323] at mapToPair at SpeedLayer.java:97) (first 15 tasks are for partitions Vector(1, 2, 3))
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  YarnScheduler:54 - Adding task set 277.0 with 3 tasks
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  TaskSetManager:54 - Starting task 0.0 in stage 277.0 (TID 591, htm-psycho-401.zxz.su, executor 1, partition 1, PROCESS_LOCAL, 7746 bytes)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  BlockManagerInfo:54 - Added broadcast_277_piece0 in memory on htm-psycho-401.zxz.su:45269 (size: 2.4 KB, free: 912.3 MB)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  TaskSetManager:54 - Starting task 1.0 in stage 277.0 (TID 592, htm-psycho-401.zxz.su, executor 1, partition 2, PROCESS_LOCAL, 7746 bytes)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  TaskSetManager:54 - Finished task 0.0 in stage 277.0 (TID 591) in 13 ms on htm-psycho-401.zxz.su (executor 1) (1/3)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  TaskSetManager:54 - Starting task 2.0 in stage 277.0 (TID 593, htm-psycho-401.zxz.su, executor 1, partition 3, PROCESS_LOCAL, 7746 bytes)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  TaskSetManager:54 - Finished task 1.0 in stage 277.0 (TID 592) in 6 ms on htm-psycho-401.zxz.su (executor 1) (2/3)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  TaskSetManager:54 - Finished task 2.0 in stage 277.0 (TID 593) in 8 ms on htm-psycho-401.zxz.su (executor 1) (3/3)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  YarnScheduler:54 - Removed TaskSet 277.0, whose tasks have all completed, from pool
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - ResultStage 277 (isEmpty at SpeedLayerUpdate.java:53) finished in 0.033 s
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  DAGScheduler:54 - Job 267 finished: isEmpty at SpeedLayerUpdate.java:53, took 0.035632 s
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  JobScheduler:54 - Finished job streaming job 1535957400000 ms.0 from job set of time 1535957400000 ms
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  JobScheduler:54 - Starting job streaming job 1535957400000 ms.1 from job set of time 1535957400000 ms
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  UpdateOffsetsFn:60 - Updating offsets: {OryxInput,0=2477630, OryxInput,1=2478826, OryxInput,2=2477739, OryxInput,3=2478777}
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ZooKeeper:438 - Initiating client connection, connectString=127.0.0.1:2181 sessionTimeout=30000 watcher=org.I0Itec.zkclient.ZkClient@30c667e9
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ZkEventThread:65 - Starting ZkClient event thread.
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ZkClient:936 - Waiting for keeper state SyncConnected
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ClientCnxn:975 - Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ClientCnxn:852 - Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ClientCnxn:1235 - Session establishment complete on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x10160d98a790139, negotiated timeout = 30000
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ZkClient:713 - zookeeper state changed (SyncConnected)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ZkEventThread:83 - Terminate ZkClient event thread.
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ZooKeeper:684 - Session: 0x10160d98a790139 closed
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  JobScheduler:54 - Finished job streaming job 1535957400000 ms.1 from job set of time 1535957400000 ms
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ClientCnxn:512 - EventThread shut down
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  JobScheduler:54 - Total delay: 0.605 s for time 1535957400000 ms (execution: 0.099 s)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  JobScheduler:54 - Stopped JobScheduler
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ContextHandler:910 - Stopped o.s.j.s.ServletContextHandler@5a06904{/streaming,null,UNAVAILABLE,@Spark}
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ContextHandler:910 - Stopped o.s.j.s.ServletContextHandler@286866cb{/streaming/batch,null,UNAVAILABLE,@Spark}
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ContextHandler:910 - Stopped o.s.j.s.ServletContextHandler@14b95942{/static/streaming,null,UNAVAILABLE,@Spark}
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  StreamingContext:54 - StreamingContext stopped successfully
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  AbstractConnector:318 - Stopped Spark@18324f97{HTTP/1.1,[http/1.1]}{0.0.0.0:4041}
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  SparkUI:54 - Stopped Spark web UI at http://htm-psycho-401.zxz.su:4041
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  YarnClientSchedulerBackend:54 - Interrupting monitor thread
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  YarnClientSchedulerBackend:54 - Shutting down all executors
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  YarnSchedulerBackend$YarnDriverEndpoint:54 - Asking each executor to shut down
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  SchedulerExtensionServices:54 - Stopping SchedulerExtensionServices
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: (serviceOption=None,
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: services=List(),
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: started=false)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  YarnClientSchedulerBackend:54 - Stopped
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  MemoryStore:54 - MemoryStore cleared
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  BlockManager:54 - BlockManager stopped
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  SparkContext:54 - Successfully stopped SparkContext
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 268.0 failed 4 times, most recent failure: Lost task 0.3 in stage 268.0 (TID 567, htm-psycho-401.zxz.su, executor 1): org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 1, required: 4. To avoid this, increase spark.kryoserializer.buffer.max value.
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:350)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:393)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at java.lang.Thread.run(Thread.java:748)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: Caused by: com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 1, required: 4
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.io.Output.require(Output.java:163)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.io.Output.writeInt(Output.java:254)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.io.Output.writeFloat(Output.java:496)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.serializers.UnsafeCacheFields$UnsafeFloatField.write(UnsafeCacheFields.java:80)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.write(DefaultArraySerializers.java:366)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.write(DefaultArraySerializers.java:307)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:347)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: ... 4 more
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: Driver stacktrace:
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1599)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1587)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1586)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1586)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at scala.Option.foreach(Option.scala:257)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1820)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1769)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1758)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.SparkContext.runJob(SparkContext.scala:2027)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.SparkContext.runJob(SparkContext.scala:2048)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.SparkContext.runJob(SparkContext.scala:2067)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.SparkContext.runJob(SparkContext.scala:2092)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:939)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.rdd.RDD.collect(RDD.scala:938)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.api.java.JavaRDDLike$class.collect(JavaRDDLike.scala:361)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:45)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.cloudera.oryx.app.speed.als.ALSSpeedModelManager.buildUpdates(ALSSpeedModelManager.java:182)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.cloudera.oryx.lambda.speed.SpeedLayerUpdate.call(SpeedLayerUpdate.java:56)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.cloudera.oryx.lambda.speed.SpeedLayerUpdate.call(SpeedLayerUpdate.java:37)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$1.apply(JavaDStreamLike.scala:272)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$1.apply(JavaDStreamLike.scala:272)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:628)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:628)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at scala.util.Try$.apply(Try.scala:192)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at java.lang.Thread.run(Thread.java:748)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: Caused by: org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 1, required: 4. To avoid this, increase spark.kryoserializer.buffer.max value.
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:350)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:393)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: ... 3 more
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: Caused by: com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 1, required: 4
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.io.Output.require(Output.java:163)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.io.Output.writeInt(Output.java:254)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.io.Output.writeFloat(Output.java:496)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.serializers.UnsafeCacheFields$UnsafeFloatField.write(UnsafeCacheFields.java:80)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.write(DefaultArraySerializers.java:366)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.write(DefaultArraySerializers.java:307)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:347)
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: ... 4 more
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ShutdownHookManager:54 - Shutdown hook called
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-4d5c56ac-ccfc-4c15-ba08-6411e35a1e5e
    Sep 03 09:50:00 htm-psycho-401.zxz.su bash[31144]: 2018-09-03 09:50:00 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-fab670a2-b75d-4494-a3c1-d49046e1c765
    Sep 03 09:50:01 htm-psycho-401.zxz.su systemd[1]: oryx-speed.service: main process exited, code=exited, status=1/FAILURE
    Sep 03 09:50:01 htm-psycho-401.zxz.su systemd[1]: Unit oryx-speed.service entered failed state.
    Sep 03 09:50:01 htm-psycho-401.zxz.su systemd[1]: oryx-speed.service failed.
    
    

    How can I fix it?
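
    The error message itself suggests the remedy: increase spark.kryoserializer.buffer.max. A minimal sketch of doing so, assuming the speed layer is launched via spark-submit and picks up spark-defaults.conf (the property names are standard Spark; the values are illustrative):

    # spark-defaults.conf: give Kryo more room when serializing large
    # task results / model updates. Spark's default for the max is 64m.
    spark.kryoserializer.buffer.max=512m
    # Optional: a larger initial buffer reduces repeated resizing.
    spark.kryoserializer.buffer=1m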

    opened by stiv-yakovenko 19
  • Oryx serving layer fails to start

    Issue: The oryx-serving layer from release 2.1.2 fails to start with an exception (using the ALS example config). A fresh build of the latest version does not have this classpath issue, but throws a different exception related to Kafka connectivity. I can only assume there is something I am missing here?

    Context: CDH 5.7, Kafka 0.8.2.0, Scala library 2.10.6 running inside docker container.

    Exception on startup with oryx-serving 2.1.2:

    SEVERE: Exception sending context initialized event to listener instance of class com.cloudera.oryx.lambda.serving.ModelManagerListener
    java.lang.NoSuchMethodError: kafka.admin.AdminUtils.topicExists(Lorg/I0Itec/zkclient/ZkClient;Ljava/lang/String;)Z
        at com.cloudera.oryx.kafka.util.KafkaUtils.topicExists(KafkaUtils.java:93)
        at com.cloudera.oryx.lambda.serving.ModelManagerListener.contextInitialized(ModelManagerListener.java:113)
        at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4812)
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5255)
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
        at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1408)
        at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1398)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    
    Apr 20, 2016 2:14:43 PM org.apache.catalina.core.StandardContext startInternal
    SEVERE: One or more listeners failed to start. Full details will be found in the appropriate container log file
    Apr 20, 2016 2:14:43 PM org.apache.catalina.core.StandardContext startInternal
    SEVERE: Context [Oryx] startup failed due to previous errors
    

    I have placed the necessary jars in /opt/cloudera/parcels/CDH/jars and added anything causing an exception to the compute-classpath script (e.g. the Scala library):

    commons-cli-1.2.jar
    commons-collections-3.2.2.jar
    commons-configuration-1.6.jar
    hadoop-auth.jar
    hadoop-common.jar
    hadoop-hdfs.jar
    hadoop-yarn-applications-distributedshell-2.6.0-cdh5.7.0.jar
    htrace-core4-4.0.1-incubating.jar
    protobuf-java-2.5.0.jar
    scala-library-2.10.6.jar
    snappy-java-1.0.4.1.jar
    spark-examples-1.6.0-cdh5.7.0-hadoop2.6.0-cdh5.7.0.jar
    zookeeper-copy.jar
    

    Exception with the latest build of oryx-serving: when a new build is run, we instead get an exception connecting to Kafka before any data is ingested (1), and a new exception after an attempt to ingest data is made (2):

    (1):

    2016-04-20 14:19:44,104 WARN ConsumerFetcherThread:83 [ConsumerFetcherThread-OryxGroup-ServingLayer-1461161982826_quickstart.cloudera-1461161982997-dfd4c392-0-0], Error in fetch kafka.consumer.ConsumerFetcherThread$FetchRequest@24a09ae2. Possible cause: java.nio.BufferUnderflowException

    (2):

    2016-04-20 14:23:29,941 WARN  ConsumerFetcherThread:83 [ConsumerFetcherThread-OryxGroup-ServingLayer-1461162155849_quickstart.cloudera-1461162155987-293080da-0-0], Error in fetch kafka.consumer.ConsumerFetcherThread$FetchRequest@1ff07ee6. Possible cause: java.nio.BufferUnderflowException
    2016-04-20 14:23:29,949 WARN  DefaultEventHandler:89 Failed to send producer request with correlation id 101 to broker 0 with data for partitions [OryxInput,3],[OryxInput,1],[OryxInput,2],[OryxInput,0]
    java.nio.BufferUnderflowException
        at java.nio.Buffer.nextGetIndex(Buffer.java:506)
        at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:361)
        at kafka.api.ProducerResponse$.readFrom(ProducerResponse.scala:43)
        at kafka.producer.SyncProducer.send(SyncProducer.scala:110)
        at kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:259)
        at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:110)
        at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:102)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        at kafka.producer.async.DefaultEventHandler.dispatchSerializedData(DefaultEventHandler.scala:102)
        at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:75)
        at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105)
        at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88)
        at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67)
        at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45)
    2016-04-20 14:23:29,962 INFO  DefaultEventHandler:68 Back off for 100 ms before retrying send. Remaining retries = 3
    

    Command used to run oryx-serving:

    ./oryx-run.sh serving --conf als-example.conf --app-jar oryx-serving-2.x.0.jar

    Config used

    kafka-brokers = "localhost:9092"
    zk-servers = "localhost:2181"
    hdfs-base = "hdfs:///user/example/Oryx"

    oryx {
      id = "ALSExample"
      als {
        rescorer-provider-class = null
      }
      input-topic {
        broker = ${kafka-brokers}
        lock = {
          master = ${zk-servers}
        }
      }
      update-topic {
        broker = ${kafka-brokers}
        lock = {
          master = ${zk-servers}
        }
      }
      batch {
        streaming {
          generation-interval-sec = 300
          num-executors = 4
          executor-cores = 8
          executor-memory = "4g"
        }
        update-class = "com.cloudera.oryx.app.batch.mllib.als.ALSUpdate"
        storage {
          data-dir  = ${hdfs-base}"/data/"
          model-dir = ${hdfs-base}"/model/"
        }
        ui {
          port = 4040
        }
      }
      speed {
        model-manager-class = "com.cloudera.oryx.app.speed.als.ALSSpeedModelManager"
        ui {
          port = 4041
        }
      }
      serving {
        model-manager-class = "com.cloudera.oryx.app.serving.als.model.ALSServingModelManager"
        application-resources = "com.cloudera.oryx.app.serving,com.cloudera.oryx.app.serving.als"
        api {
          port = 8080
        }
      }
    }
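
    Substitution mistakes in a HOCON file like this one are easy to make and easy to catch up front: resolving the config fails fast on any undefined ${...} reference (for example, a misspelled ${kafka-broker}). A minimal sketch using the Typesafe Config library that Oryx's configuration is based on; ConfCheck and the hard-coded file name are just illustrative:

        import java.io.File;

        import com.typesafe.config.Config;
        import com.typesafe.config.ConfigFactory;

        public class ConfCheck {
            public static void main(String[] args) {
                // Parse the application config and resolve ${...} substitutions;
                // an undefined reference throws a ConfigException here, before deploy.
                Config config = ConfigFactory.parseFile(new File("als-example.conf")).resolve();
                System.out.println(config.getString("oryx.input-topic.broker"));
                System.out.println(config.getString("oryx.update-topic.broker"));
            }
        }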
    
    bug LambdaTier DataTransport 
    opened by msumner91 19
  • ALS app: java.lang.ClassCastException: java.lang.Object cannot be cast to java.lang.String

    ALS app: java.lang.ClassCastException: java.lang.Object cannot be cast to java.lang.String

    Reports of a strange ClassCastException in ALS in master / 2.3:

    2016-08-18 17:17:18,768 INFO  ALSServingModelManager:97 ALSServingModel[features:30, implicit:true, X:(7640666 users), Y:(1613282 items, partitions: [0:104911, 1:10022, 2:44695, 3:26323, 4:50937, 5:36393, 6:99643, 7:54777, 8:17366, 9:28681, 10:131557, 11:31438, 12:33617, 13:24153, 14:111447, 15:43643, ...]...), fractionLoaded:0.99938]
    2016-08-18 17:17:18,867 INFO  ALSServingModelManager:104 Loading new model
    2016-08-18 17:17:24,141 INFO  AbstractOryxResource:86 Model loaded fraction: 0.9996865
    2016-08-18 17:17:24,460 INFO  ALSServingModelManager:115 Updating model
    2016-08-18 17:17:49,084 ERROR ModelManagerListener:144 Error while consuming updates
    java.lang.ClassCastException: java.lang.Object cannot be cast to java.lang.String
            at net.openhft.koloboke.collect.impl.hash.MutableSeparateKVObjLHashGO.removeIf(MutableSeparateKVObjLHashGO.java:275)
            at com.cloudera.oryx.app.serving.als.model.ALSServingModel.lambda$retainRecentAndKnownItems$7(ALSServingModel.java:437)
            at net.openhft.koloboke.collect.impl.hash.MutableLHashParallelKVObjObjMapGO$ValueView.forEach(MutableLHashParallelKVObjObjMapGO.java:2228)
            at com.cloudera.oryx.app.serving.als.model.ALSServingModel.retainRecentAndKnownItems(ALSServingModel.java:435)
            at com.cloudera.oryx.app.serving.als.model.ALSServingModelManager.consume(ALSServingModelManager.java:119)
            at com.cloudera.oryx.lambda.serving.ModelManagerListener.lambda$contextInitialized$1(ModelManagerListener.java:142)
            at com.cloudera.oryx.common.lang.LoggingCallable.lambda$log$0(LoggingCallable.java:48)
            at com.cloudera.oryx.common.lang.LoggingCallable.lambda$asRunnable$1(LoggingCallable.java:66)
            at java.lang.Thread.run(Thread.java:745)
    2016-08-18 17:17:49,086 INFO  ModelManagerListener:177 ModelManagerListener closing
    2016-08-18 17:17:49,086 INFO  ModelManagerListener:179 Shutting down model manager
    2016-08-18 17:17:49,086 INFO  ModelManagerListener:184 Shutting down input producer
    2016-08-18 17:17:49,086 INFO  Producer:68 Shutting down producer
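
    Per the stack trace, the cast that fails is applied inside a predicate over the model's item keys during removeIf. A minimal self-contained sketch of the same failure mode (not Oryx's actual code): a collection that holds a non-String sentinel makes a Predicate over Strings blow up as soon as the unchecked cast reaches it:

        import java.util.ArrayList;
        import java.util.Collection;
        import java.util.function.Predicate;

        public class CceDemo {
            public static void main(String[] args) {
                Collection<Object> keys = new ArrayList<>();
                keys.add("item1");
                keys.add(new Object()); // internal sentinel, not a String

                Predicate<String> retain = id -> id.startsWith("item");
                // Mirrors what a specialized map's removeIf does internally when
                // it assumes every key is a String: the cast throws
                // ClassCastException on the sentinel entry.
                keys.removeIf(k -> !retain.test((String) k));
            }
        }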
    
    bug ServingLayer Apps 
    opened by srowen 11
  • Oryx 2.1.2:

    Oryx 2.1.2: "org.apache.hadoop.fs.FSDataInputStream" not found on CDH 5.5.1 Cluster

    After reading the release notes and bug fixes, I think 2.1.2 is a pretty good version; thanks to @srowen for his contributions to it.

    Since I have to keep a CDH 5.4.x cluster with Oryx 2.0.x running on it, I deployed a new cluster for the 2.1.x branch. Unfortunately, this turned out to be more difficult than I imagined. Some dependency libraries have changed on CDH 5.5.1, while compute-classpath.sh and oryx-run.sh in 2.1.2 differ a little from those in 2.0.x.

    Everything was still OK until I started to run Oryx with the old als.conf from 2.0.0. Exceptions were thrown by the batch and speed layers:

    Batch

    spark-submit --master yarn-client --name OryxBatchLayer-ALSTest --class com.cloudera.oryx.batch.Main      --jars /opt/cloudera/parcels/CDH/jars/spark-examples-1.5.0-cdh5.5.1-hadoop2.6.0-cdh5.5.1.jar --files test.conf --driver-memory 512m      --driver-java-options "-Dconfig.file=test.conf" --executor-memory 1g --executor-cores 1      --conf spark.executor.extraJavaOptions="-Dconfig.file=test.conf" --conf spark.ui.port=20000 --conf spark.io.compression.codec=lzf --conf spark.logConf=true --conf spark.serializer=org.apache.spark.serializer.KryoSerializer --conf spark.speculation=true --conf spark.ui.showConsoleProgress=false --num-executors=3 oryx-batch-2.jar
    
    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
        at org.apache.spark.deploy.SparkSubmitArguments.handle(SparkSubmitArguments.scala:394)
        at org.apache.spark.launcher.SparkSubmitOptionParser.parse(SparkSubmitOptionParser.java:163)
        at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:97)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:113)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        ... 5 more
    
    

    Speed

    spark-submit --master yarn-client --name OryxSpeedLayer-ALSTest --class com.cloudera.oryx.speed.Main      --jars /opt/cloudera/parcels/CDH/jars/spark-examples-1.5.0-cdh5.5.1-hadoop2.6.0-cdh5.5.1.jar --files test.conf --driver-memory 256m      --driver-java-options "-Dconfig.file=test.conf" --executor-memory 512m --executor-cores 1      --conf spark.executor.extraJavaOptions="-Dconfig.file=test.conf" --conf spark.ui.port=20001 --conf spark.io.compression.codec=lzf --conf spark.logConf=true --conf spark.serializer=org.apache.spark.serializer.KryoSerializer --conf spark.speculation=true --conf spark.ui.showConsoleProgress=false --num-executors=2 oryx-speed-2.jar
    
    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
        at org.apache.spark.deploy.SparkSubmitArguments.handle(SparkSubmitArguments.scala:394)
        at org.apache.spark.launcher.SparkSubmitOptionParser.parse(SparkSubmitOptionParser.java:163)
        at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:97)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:113)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        ... 5 more
    

    I guess Oryx could not find some of Hadoop's libraries, but I have no idea how to solve this problem. I found something similar on Stack Overflow; is something wrong with my Spark configuration?

    NoClassDefFoundError com.apache.hadoop.fs.FSDataInputStream when execute spark-shell
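
    One way to narrow this down is to ask the JVM where, if anywhere, it loads the missing class from, using the same classpath the launcher builds. A minimal sketch (WhereIsClass is just a throwaway name), with the class name taken from the stack trace above:

        import java.security.CodeSource;

        public class WhereIsClass {
            public static void main(String[] args) throws Exception {
                // Throws ClassNotFoundException if the class is absent from the classpath.
                Class<?> c = Class.forName("org.apache.hadoop.fs.FSDataInputStream");
                // Otherwise, print the jar that actually serves the class.
                CodeSource src = c.getProtectionDomain().getCodeSource();
                System.out.println(src == null ? "bootstrap classpath" : src.getLocation().toString());
            }
        }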

    opened by bash-horatio 11
  • Use reflow for maven site.

    Use reflow for maven site.

    I've switched the skin Oryx uses to reflow. Reflow allows us to easily render Markdown into HTML and steal nice-looking themes from Bootswatch. We can use any of the themes available from Bootswatch and configure the site using the site.xml file. It is kind of plain and has rough edges, but I thought I would throw it out as a start to build on.
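
    For reference, a minimal site.xml sketch of what this kind of reflow setup typically looks like; the skin coordinates, version, and Bootswatch theme name below are assumptions for illustration, not necessarily the exact values in this change:

        <project xmlns="http://maven.apache.org/DECORATION/1.3.0">
          <skin>
            <!-- Assumed coordinates/version for the reflow skin -->
            <groupId>lt.velykis.maven.skins</groupId>
            <artifactId>reflow-maven-skin</artifactId>
            <version>1.1.1</version>
          </skin>
          <custom>
            <reflowSkin>
              <!-- Any Bootswatch theme name can go here -->
              <theme>bootswatch-spacelab</theme>
            </reflowSkin>
          </custom>
        </project>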

    enhancement 
    opened by hougs 11
  • Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

    Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

    Hello everybody, I am trying to run Oryx 2 on Cloudera Express 5.3.0 with a single node, but when I execute the batch layer, it returns the following warning:

    "WARN YarnClient Cluster Scheduler:71 Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory"

    I looked at my YARN cluster; it has sufficient memory (8 GB) and 1 vcore.

    I changed my conf file accordingly:

    # Number of executors to start
    num-executors = 1
    
    # Cores per executor
    executor-cores = 1
    
    # Memory per executor
    executor-memory = "512m"
    
    # Heap size for the Speed driver process
    driver-memory = "512m"
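
    For reference, the YARN memory footprint of this configuration can be estimated. A minimal sketch, assuming the commonly cited Spark default container overhead of max(384 MB, 10% of the requested memory) and that the driver/application master also occupies a container:

        public class YarnBudget {
            public static void main(String[] args) {
                int executors = 1;
                int executorMemMb = 512;
                int driverMemMb = 512;
                // Assumed default overhead: max(384 MB, 10% of requested memory)
                int overheadMb = Math.max(384, executorMemMb / 10);
                int totalMb = executors * (executorMemMb + overheadMb)
                            + (driverMemMb + overheadMb);
                System.out.println(totalMb + " MB"); // 1792 MB, well under 8 GB
            }
        }

    If the memory clearly fits, the scheduler may instead be starved for vcores: with a single vcore, YARN cannot run the application master and an executor container at the same time unless its resource calculator ignores vcores.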
    

    How can i solve this problem?

    Thanks, Maruthi.

    opened by MaruthiD 9
  • Shading of PMML breaks apps that extend MLUpdate

    Shading of PMML breaks apps that extend MLUpdate

    Quoting from the mailing list:

        I met a problem when running the batch layer.
        I wrote a batch layer class, LRScalaUpdate, in Scala that extends MLUpdate and overrides the buildModel() and evaluate() methods; then I get an exception when running the batch layer.
        I'm wondering why it calls MLUpdate.buildModel instead of my LRScalaUpdate.buildModel.
        Can you give me some suggestions? Thank you.
    
    17/07/12 14:55:06 INFO cluster.YarnClusterScheduler: Removed TaskSet 7.0, whose tasks have all completed, from pool 
    17/07/12 14:55:06 INFO scheduler.DAGScheduler: ResultStage 7 (isEmpty at MLUpdate.java:360) finished in 0.093 s
    17/07/12 14:55:06 INFO scheduler.DAGScheduler: Job 7 finished: isEmpty at MLUpdate.java:360, took 0.109474 s
    Exception in thread "streaming-job-executor-0" java.lang.AbstractMethodError: com.cloudera.oryx.ml.MLUpdate.buildModel(Lorg/apache/spark/api/java/JavaSparkContext;
    Lorg/apache/spark/api/java/JavaRDD;Ljava/util/List;Lorg/apache/hadoop/fs/Path;)Loryx/org/dmg/pmml/PMML;
    	at com.cloudera.oryx.ml.MLUpdate.buildAndEval(MLUpdate.java:314)
    	at com.cloudera.oryx.ml.MLUpdate.lambda$findBestCandidatePath$0(MLUpdate.java:259)
    	at java.util.stream.IntPipeline$4$1.accept(IntPipeline.java:250)
    	at java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:110)
    ...
    

    Oryx shades its use of PMML classes to avoid classpath conflict with Spark. That's fine as it's internal to Oryx.

    Except, one key thing I overlooked: MLUpdate actually forms a sort of API outside of the api package, and it does use one PMML class in its signature.
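
    A small reflection probe makes the mismatch visible. A minimal sketch (ShadingProbe is just a throwaway name), assuming the shaded Oryx jar is on the classpath, that prints the declared buildModel signature so the relocated return type shows up directly:

        import java.lang.reflect.Method;

        public class ShadingProbe {
            public static void main(String[] args) throws Exception {
                Class<?> mlUpdate = Class.forName("com.cloudera.oryx.ml.MLUpdate");
                for (Method m : mlUpdate.getDeclaredMethods()) {
                    if (m.getName().equals("buildModel")) {
                        // If shading relocated PMML, the return type printed here is
                        // oryx.org.dmg.pmml.PMML, which no subclass compiled against
                        // org.dmg.pmml.PMML can override -> AbstractMethodError.
                        System.out.println(m);
                    }
                }
            }
        }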

    bug BatchLayer MLTier 
    opened by srowen 2
Releases: oryx-2.8.0
Owner: Oryx Project
Apache Spark - A unified analytics engine for large-scale data processing

Apache Spark Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an op

The Apache Software Foundation 34.7k Jan 2, 2023
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.

Datumbox Machine Learning Framework The Datumbox Machine Learning Framework is an open-source framework written in Java which allows the rapid develop

Vasilis Vryniotis 1.1k Dec 9, 2022
Java time series machine learning tools in a Weka compatible toolkit

UEA Time Series Classification A Weka-compatible Java toolbox for time series classification, clustering and transformation. For the python sklearn-co

Machine Learning and Time Series Tools and Datasets 140 Nov 7, 2022
Archinsight project tends to implement architecture-as-code definition of a standard c4 architecture model

Archinsight project tends to implement architecture-as-code definition of a standard c4 architecture model. This project offers a new Insight language designed in such way that an Architect can focus on architecture definition, not visualization. Compared to UML, the Insight language is more specific and is unable to describe an arbitrary entity, but shorter and probably easier to use.

null 25 Nov 24, 2022
Java Statistical Analysis Tool, a Java library for Machine Learning

Java Statistical Analysis Tool JSAT is a library for quickly getting started with Machine Learning problems. It is developed in my free time, and made

null 752 Dec 20, 2022
Statistical Machine Intelligence & Learning Engine

Smile Smile (Statistical Machine Intelligence and Learning Engine) is a fast and comprehensive machine learning, NLP, linear algebra, graph, interpola

Haifeng Li 5.7k Jan 1, 2023
A machine learning package built for humans.

aerosolve Machine learning for humans. What is it? A machine learning library designed from the ground up to be human friendly. It is different from o

Airbnb 4.8k Dec 30, 2022
statistics, data mining and machine learning toolbox

Disambiguation (Italian dictionary) Field of turnips. It is also a place where there is confusion, where tricks and sims are plotted. (Computer scienc

Aurelian Tutuianu 63 Jun 11, 2022
Tribuo - A Java machine learning library

Tribuo - A Java prediction library (v4.2) Tribuo is a machine learning library in Java that provides multi-class classification, regression, clusterin

Oracle 1.1k Dec 28, 2022
MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

null 900 Jan 2, 2023
Model import deployment framework for retraining models (pytorch, tensorflow,keras) deploying in JVM Micro service environments, mobile devices, iot, and Apache Spark

The Eclipse Deeplearning4J (DL4J) ecosystem is a set of projects intended to support all the needs of a JVM based deep learning application. This mean

Eclipse Foundation 12.7k Dec 30, 2022
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark applications to store shuffle data on remote servers

What is Firestorm Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark applications to store shuffle data on remote ser

Tencent 246 Nov 29, 2022
Flink/Spark Connectors for Apache Doris(Incubating)

Apache Doris (incubating) Connectors The repository contains connectors for Apache Doris (incubating) Flink Doris Connector More information about com

The Apache Software Foundation 30 Dec 7, 2022
Word Count in Apache Spark using Java

Word Count in Apache Spark using Java

Arjun Gautam 2 Feb 24, 2022
A scale demo of Neo4j Fabric spanning up to 1129 machines/shards running a 100TB (LDBC) dataset with 1.2tn nodes and relationships.

Demo application instructions Overview This repository contains the code necessary to reproduce the results for the Trillion Entity demonstration that

Neo4j 84 Nov 23, 2022
Sparkling Water provides H2O functionality inside Spark cluster

Sparkling Water Sparkling Water integrates H2O's fast scalable machine learning engine with Spark. It provides: Utilities to publish Spark data struct

H2O.ai 939 Jan 2, 2023