47 Repositories
Java big-tech-interviews Libraries
Splits data into blocks. This format enables efficient reading and avoids unnecessary data reading operations.
dataTear knowledge base dataTear Split into data fragments for data management. In this format, efficient reading can be achieved to avoid un
Modded Minecraft client for 1.7.10 tech server.
Tomrum – 1.7.10 ForgeMod for Nullspace Adds some useful features for mainly creative dev: Simple implementation of Protocol4 which lets you connect to
A curated collection of low-level design (LLD) questions and implementations asked at major tech companies. Prepare for the LLD round and ace the interview.
Low Level Design / Machine Coding Question Collections What is the Machine Coding Round? The Machine Coding Round has become a very popular interview round in
F5 BIG-IP iControl REST vulnerability RCE exploit with Java including a testing LAB
CVE-2022-1388 F5 BIG-IP iControl REST vulnerability RCE exploit with Java and ELF. Included Scan a single target Scan many targets Exploit with a shel
This repository contains solutions to bit manipulation problems and coding challenges.
This repository contains solutions to bit manipulation problems and coding challenges. I have also written a course on how to solve problems using bit manipulation. You can visit it here: https://www.educative.io/courses/bit-manipulation (Grokking Bit Manipulation For Coding Interviews)
Jarm is a small Tech Minecraft Mod I'm coding
jarm Jarm is a small Tech Minecraft Mod I'm coding. This is currently WIP and may get discontinued. Also, I'm not a professional coder. #Installation
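An HTTP and HTTPS proxy written in Java, with no dependencies on other technologies.
Disclaimer: This project only records and shares the author's notes on learning Java network programming. Do not use it for illegal purposes; otherwise you bear the consequences yourself. Blog posts explaining how it works: https://blog.csdn.net/wang382758656/article/details/123098032 https://juejin.cn/post/706921880022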
Some recent questions asked in interviews of companies like Google, TCS, Amazon etc.
Tech Elevator Bootcamp Exercises and Projects
TechElevatorExercises A folder for all my Tech Elevator endeavors as I learn full-stack development bootcamp-style.
Parquet-MR contains the Java implementation of the Parquet format
Parquet MR Parquet-MR contains the Java implementation of the Parquet format. Parquet is a columnar storage format for Hadoop; it provides efficient s
Apache Drill is a distributed MPP query layer for self describing data
Apache Drill Apache Drill is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage sys
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Sparkler A web crawler is a bot program that fetches resources from the web for the sake of building applications like search engines, knowledge bases
MockNeat - the modern faker lib.
Mockneat is an arbitrary data-generator open-source library written in Java. It provides a simple but powerful (fluent) API that enables developers to
Generate and read big Excel files quickly
fastexcel fastexcel-writer There are not many alternatives when you have to generate xlsx Excel workbooks in Java. The most popular one (Apache POI) i
IoTDB (Internet of Things Database) is a data management system for time series data
English | 中文 IoTDB Overview IoTDB (Internet of Things Database) is a data management system for time series data, which can provide users specific ser
Dremio - the missing link in modern data
Dremio Dremio enables organizations to unlock the value of their data. Documentation Documentation is available at https://docs.dremio.com. Quickstart
Repository with LeetCode Solutions and Dedicated Index to prepare for your FAANGM Interviews.
Repository with LeetCode Solutions and Dedicated Index to prepare for your FAANGM Interviews. Feel free to share and Contribute to this repository.
Repository for FIRST Tech Challenge team 3916 Apex Robotics for the 2021-2022 game year (Freight Frenzy)
FTC Team 3916 - Apex Robotics This is our repo for the 2020-2021 game year - Ultimate Goal Installation Clone this repo. You can do this through the w
Table-Computing (abbreviated as TC) is a distributed, lightweight, high-performance, low-latency stream processing and data analysis framework. Millisecond latency and 10+ times faster than Flink for complicated use cases.
Table-Computing Welcome to the Table-Computing GitHub. Table-Computing (Simplified as TC) is a distributed light weighted, high performance and low la
Welcome to the EHS robotics club's GitHub repository. It will also serve as our primary community center and means of communication. Be sure to join our Remind for on-the-go updates: @EHSFTC21
NOTICE This repository contains the public FTC SDK for the Ultimate Goal (2020-2021) competition season. Formerly this software project was hosted her
Jornada Big Tech: I will have 3 months to study and prepare myself for the Big Tech interviews. Repository containing all my study material.
Jornada Big Tech (Big Tech Journey) Jornada Big Tech: I will have 3 months to study and prepare myself for the Big Tech interviews. Repository contain
RU-collab, expands endgame content
Prometheus Technologies Java mod for Mindustry, biggest RU megacollab Compiling JDK 8. Task dexify requires d8 from Android build-tools 28.0.1. Plai
Preparation and practice for coding interviews
Coding Interviews Preparation and practice for coding interviews. Hope you enjoy it; help is more than welcome :) Problems by Difficulty A1 1D problems,
The official home of the Presto distributed SQL query engine for big data
Presto Presto is a distributed SQL query engine for big data. See the User Manual for deployment instructions and end user documentation. Requirements
Apache Hive
Apache Hive (TM) The Apache Hive (TM) data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storag
SAMOA (Scalable Advanced Massive Online Analysis) is an open-source platform for mining big data streams.
SAMOA: Scalable Advanced Massive Online Analysis. This repository is discontinued. The development of SAMOA has moved over to the Apache Software Foun
Apache Flink
Apache Flink Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities. Learn more about Flin
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.
Apache Gobblin Apache Gobblin is a highly scalable data management solution for structured and byte-oriented data in heterogeneous data ecosystems. Ca
Mirror of Apache Storm
Master Branch: Storm is a distributed realtime computation system. Similar to how Hadoop provides a set of general primitives for doing batch processi
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Apache Zeppelin Documentation: User Guide Mailing Lists: User and Dev mailing list Continuous Integration: Contributing: Contribution Guide Issue Trac
Serverless proxy for Spark cluster
Hydrosphere Mist Hydrosphere Mist is a serverless proxy for Spark cluster. Mist provides a new functional programming framework and deployment model f
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
H2O H2O is an in-memory platform for distributed, scalable machine learning. H2O uses familiar interfaces like R, Python, Scala, Java, JSON and the Fl
CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of machine data in real-time.
About CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of machine data in real-time. CrateDB offers the
Apache Hive
Apache Hive (TM) The Apache Hive (TM) data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storag
The official home of the Presto distributed SQL query engine for big data
Presto Presto is a distributed SQL query engine for big data. See the User Manual for deployment instructions and end user documentation. Requirements
Distributed Stream and Batch Processing
What is Jet Jet is an open-source, in-memory, distributed batch and stream processing engine. You can use it to process large volumes of real-time eve
Mirror of Apache Storm
Master Branch: Storm is a distributed realtime computation system. Similar to how Hadoop provides a set of general primitives for doing batch processi
CogComp's Natural Language Processing libraries and Demos:
CogCompNLP This project collects a number of core libraries for Natural Language Processing (NLP) developed by Cognitive Computation Group. How to use
An extensible Java framework for building XML and non-XML streaming applications
Smooks Framework This is the Git source code repository for the Smooks Project. Build Status Building Pre-requisites JDK 8 Apache Maven 3.2.x Maven gi
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.
Datumbox Machine Learning Framework The Datumbox Machine Learning Framework is an open-source framework written in Java which allows the rapid develop
Apache Flink
Apache Flink Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities. Learn more about Flin
Apache Spark - A unified analytics engine for large-scale data processing
Apache Spark Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an op
Open Source In-Memory Data Grid
Hazelcast Hazelcast is an open-source distributed in-memory data store and computation platform. It provides a wide variety of distributed data struct
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Trino is a fast distributed SQL query engine for big data analytics. See the User Manual for deployment instructions and end user documentation. Devel
Apache Calcite
Apache Calcite Apache Calcite is a dynamic data management framework. It contains many of the pieces that comprise a typical database management syste
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Apache ORC ORC is a self-describing type-aware columnar file format designed for Hadoop workloads. It is optimized for large streaming reads, but with
A big, fast and persistent queue based on memory mapped file.
Big Queue A big, fast and persistent queue based on memory mapped file. Notice, bigqueue is just a standalone library, for a high-throughput, persiste