27 Repositories
Java Spark Libraries
Example project that uses the Spark Mongo connector
mongo-spark-connector-springboot Example project that uses the Spark Mongo connector to read/aggregate and convert data into Spark Datasets/Java RDDs Connects to
REST API for Apache Spark on K8S
Lighter Lighter is an open-source application for interacting with Apache Spark on Kubernetes or Apache Hadoop YARN. It is heavily inspired by Apache Li
Flink/Spark Connectors for Apache Doris (Incubating)
Apache Doris (incubating) Connectors The repository contains connectors for Apache Doris (incubating) Flink Doris Connector More information about com
Spark Client is a utility mod for anarchy servers made by dvd, geza3d and me
Spark-Client Spark Client is a utility mod for anarchy servers made by dvd, geza3d and me.
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Sparkler A web crawler is a bot program that fetches resources from the web for the sake of building applications like search engines, knowledge bases
Running compute-intensive parts of BigStitcher in a distributed fashion
BigStitcher-Spark Running compute-intensive parts of BigStitcher in a distributed fashion. For now we support fusion with affine transformation models (including tra
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark applications to store shuffle data on remote servers
What is Firestorm Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark applications to store shuffle data on remote ser
Spark interface for Drsti
Drsti for Spark (ai.jgp.drsti-spark) Spark interface for Drsti Resources Bringing vision to Apache Spark (2021-09-21) introduces Drsti and explains ho
SparkFE is an LLVM-based, high-performance native execution engine for Spark, designed for feature engineering.
Spark has rapidly emerged as the de facto standard for big data processing. However, it was not designed for machine learning and shows growing limitations in AI scenarios. SparkFE rewrites the execution engine in C++ and achieves more than 6x performance improvement for feature extraction. It guarantees online-offline consistency, which makes putting AI into production much easier. For further details, please refer to the SparkFE Documentation.
Example code from Learning Spark book
Examples for Learning Spark Examples for the Learning Spark book. These examples require a number of libraries and as such have long build files. We h
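:herb: Quick-learning examples based on Spring Boot, integrating the open-source frameworks I have come across, such as RabbitMQ (delayed queues), Kafka, JPA, Redis, OAuth2, Swagger, JSP, Docker, Spring Batch, exception handling, log output, multi-module development, multi-environment packaging, caching, web crawlers, JWT, GraphQL, Dubbo, ZooKeeper, Async, and more :pushpin:
Comments and PRs are welcome~ Tip: technology changes quickly, so this repository is for reference only; think carefully about which version to use in your own project (the newest version is not recommended; a version one or two releases behind the newest tends to be more stable) spring-boot-quick Preface I have long wanted to set up one overall repository that brings together what I run into and learn day to day, to make it convenient later
A collection of commonly used technology frameworks and open-source middleware in the Java ecosystem, plus knowledge on system architecture, databases, architecture case studies from large companies, common third-party libraries, project management, troubleshooting production issues, personal growth, and reflections
Technical Insights Original: https://github.com/aalansehaiyang/technology-talk WeChat official account: I have started a new WeChat official account, 微观技术, which shares excellent architecture designs, technical insights, and personal journeys from various industries; tech experts are welcome to follow it and exchange experience Preface Some people think programming is a skilled craft that requires a certain talent, and that those who are not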
Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning
Oryx 2 is a realization of the lambda architecture built on Apache Spark and Apache Kafka, but with specialization for real-time large scale machine l
Machine Learning Platform and Recommendation Engine built on Kubernetes
Update January 2018 Seldon Core open sourced. Seldon Core focuses purely on deploying a wide range of ML models on Kubernetes, allowing complex runtim
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Apache Zeppelin Documentation: User Guide Mailing Lists: User and Dev mailing list Continuous Integration: Contributing: Contribution Guide Issue Trac
Serverless proxy for Spark clusters
Hydrosphere Mist Hydrosphere Mist is a serverless proxy for Spark cluster. Mist provides a new functional programming framework and deployment model f
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
H2O H2O is an in-memory platform for distributed, scalable machine learning. H2O uses familiar interfaces like R, Python, Scala, Java, JSON and the Fl
A simple, expressive web framework for Java. Spark has a Kotlin DSL: https://github.com/perwendel/spark-kotlin
Spark - a tiny web framework for Java 8 Spark 2.9.3 is out!! Changeset <dependency> <groupId>com.sparkjava</groupId> <artifactId>spark-core</a
A better compressed bitset in Java
RoaringBitmap Bitsets, also called bitmaps, are commonly used as fast data structures. Unfortunately, they can use too much memory. To compensate, we
Sparkling Water provides H2O functionality inside a Spark cluster
Sparkling Water Sparkling Water integrates H2O's fast scalable machine learning engine with Spark. It provides: Utilities to publish Spark data struct
Model import and deployment framework for retraining models (PyTorch, TensorFlow, Keras) and deploying them in JVM microservice environments, mobile devices, IoT, and Apache Spark
The Eclipse Deeplearning4J (DL4J) ecosystem is a set of projects intended to support all the needs of a JVM based deep learning application. This mean
Apache Spark - A unified analytics engine for large-scale data processing
Apache Spark Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an op