Sleeper

A cloud-native, serverless, scalable, cheap key-value store

Introduction

Sleeper is a serverless, cloud-native, log-structured merge tree based, scalable key-value store. It is designed to allow the ingest of very large volumes of data at low cost. Data is stored in rows in tables. Each row has a key field, an optional sort field, and some value fields. A query for rows where the key takes a given value completes in around 1-2 seconds, but many thousands of queries can be run in parallel, and each individual query has negligible cost.

Sleeper can be thought of as a cloud-native reimagining of systems such as HBase and Accumulo, although its architecture is very different from those systems. Sleeper has no long-running servers. This means that if there is no work to be done, i.e. no data is being ingested and no background operations such as compactions are in progress, then the only cost is the cost of the storage. There are no wasted compute cycles, i.e. it is "serverless".

The current codebase can only be deployed to AWS, but there is nothing in the design that limits it to AWS. In time we would like to be able to deploy Sleeper to other public cloud environments such as Microsoft Azure or to a Kubernetes cluster.

Note that Sleeper is currently a prototype. Further development and testing is needed before it can be considered to be ready for production use.

Functionality

Sleeper stores records in tables. A table is a collection of records that conform to a schema. A record is a map from field name to value. For example, a schema might have a row key field called 'id' of type string, a sort field called 'timestamp' of type long, and a value field called 'name' of type string. Each record in a table with that schema is a map with keys of id, timestamp and name. Data in the table is range-partitioned by the key field. Within partitions, records are stored in Parquet files in S3. These files contain records in sorted order (sorted by the key field and then by the sort field).
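
As an illustration, the example schema above might be defined via Sleeper's Java API. This is a minimal sketch assuming a builder-style Schema class as found in Sleeper's core module; exact class and package names may differ between versions:

    import sleeper.core.schema.Field;
    import sleeper.core.schema.Schema;
    import sleeper.core.schema.type.LongType;
    import sleeper.core.schema.type.StringType;

    public class ExampleSchema {
        // A string row key 'id', a long sort key 'timestamp',
        // and a string value field 'name'.
        public static Schema create() {
            return Schema.builder()
                    .rowKeyFields(new Field("id", new StringType()))
                    .sortKeyFields(new Field("timestamp", new LongType()))
                    .valueFields(new Field("name", new StringType()))
                    .build();
        }
    }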

Sleeper is deployed using the AWS CDK. Each piece of functionality is deployed as a separate CDK substack of one main stack.

  • State store stacks: Each table has a state store that stores metadata about the table such as the files that are in the table, the partitions they are in and information about the partitions themselves.
  • Compaction stack: As the number of files in a partition increases, their contents must be merged ("compacted") into a single sorted file. The compaction stack performs this task using a lambda and SQS for job creation and queueing, and Fargate for execution of the tasks.
  • Garbage collector stack: After compaction jobs have completed, the input files are deleted (after a user configurable delay to ensure the files are not in use by queries). The garbage collector stack performs this task using a lambda function.
  • Partition splitting stack: Partitions need to be split once they reach a user configurable size. This only affects the metadata about partitions in the state store. This task is performed using a lambda.
  • Ingest stack: This allows the ingestion of data from Parquet files. These files need to consist of data that matches the schema of the table. They do not need to be sorted or partitioned in any way. Ingest job requests are submitted to an SQS queue. They are then executed by tasks running on an ECS cluster. These tasks are scaled up using a lambda that periodically monitors the number of items on the queue. The tasks scale down naturally: if there are no jobs on the queue, a task will terminate.
  • Query stack: This stack allows queries to be executed in lambdas. Queries are submitted to an SQS queue (see the sketch after this list). This triggers a lambda that executes the queries. It can then write the results to files in an S3 bucket, or send the results to an SQS queue. A query can also be executed from a websocket client, in which case the results are returned directly to the client. Note that queries can also be submitted directly from a Java client, and this does not require the query stack.
  • Topic stack: Notifications of errors and failures will appear on an SNS topic.
  • EMR bulk import stack: This allows large volumes of data in Parquet files to be ingested. A bulk import job request is sent to an SQS queue. This triggers a lambda that creates an EMR cluster. This cluster runs a Spark job to import the data. Once the job is finished the EMR cluster shuts down.
  • Persistent EMR bulk import stack: Similar to the above stack, but the EMR cluster is persistent, i.e. it never shuts down. This is appropriate if there is a steady stream of import jobs. The cluster can either be of fixed size or it can use EMR managed scaling.
  • Dashboard stack: This displays properties of the system in a Cloudwatch dashboard.
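
As a sketch of the SQS-based query path mentioned in the query stack bullet above, the following submits a query message using the AWS SDK for Java v2. The queue URL and the JSON field names here are illustrative assumptions, not Sleeper's actual message format:

    import software.amazon.awssdk.services.sqs.SqsClient;
    import software.amazon.awssdk.services.sqs.model.SendMessageRequest;

    public class SubmitQuery {
        public static void main(String[] args) {
            // Illustrative JSON only; consult the query documentation for
            // the exact format Sleeper expects on the query queue.
            String queryJson = "{"
                    + "\"queryId\": \"example-query-1\","
                    + "\"tableName\": \"example-table\","
                    + "\"keys\": [{\"id\": \"some-key\"}]"
                    + "}";
            try (SqsClient sqs = SqsClient.create()) {
                sqs.sendMessage(SendMessageRequest.builder()
                        // Hypothetical URL of the query stack's queue.
                        .queueUrl("https://sqs.eu-west-2.amazonaws.com/123456789012/sleeper-query-queue")
                        .messageBody(queryJson)
                        .build());
            }
        }
    }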

The following functionality is experimental:

  • Athena: There is experimental integration with Athena to allow SQL analytics to be run against a table.
  • Bulk import using Spark running on EKS: This stack allows data to be ingested by running Spark on EKS. Currently this runs the Spark executors in Fargate. Further work is required to enable the executors to be run on an ECS cluster consisting of EC2 instances.

Sleeper provides the tools to implement fine-grained security on the data, although further work is needed to make these easier to use. Briefly, the following steps are required:

  • Decide how to store the security information in the table, e.g. there might be one security label per row, or two per row, or one per cell. These fields must be added to the schema.
  • Write an iterator that will run on the results of every query to filter out records that a user is not permitted to see (see the sketch after this list). This takes as input a user's authorisations and uses those to decide whether the user can see the data.
  • Ensure the Sleeper instance is deployed such that the boundary of the system is protected.
  • Ensure that queries are submitted to the query queue via a service that authenticates users, and passes their authorisations into the query time iterator configuration.
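
To make the iterator step concrete, here is a minimal sketch of the kind of query-time filtering such an iterator performs. It assumes one security label per record in a field called 'securityLabel', and that Record is Sleeper's record type from the core module; the actual iterator interface will differ in its details:

    import java.util.Iterator;
    import java.util.NoSuchElementException;
    import java.util.Set;

    import sleeper.core.record.Record;

    public class SecurityFilteringIterator implements Iterator<Record> {
        private final Iterator<Record> source;
        private final Set<String> userAuthorisations;
        private Record next;

        public SecurityFilteringIterator(Iterator<Record> source, Set<String> userAuthorisations) {
            this.source = source;
            this.userAuthorisations = userAuthorisations;
            advance();
        }

        // Skip forward to the next record the user is authorised to see.
        private void advance() {
            next = null;
            while (source.hasNext()) {
                Record candidate = source.next();
                String label = (String) candidate.get("securityLabel");
                if (userAuthorisations.contains(label)) {
                    next = candidate;
                    return;
                }
            }
        }

        @Override
        public boolean hasNext() {
            return next != null;
        }

        @Override
        public Record next() {
            if (next == null) {
                throw new NoSuchElementException();
            }
            Record result = next;
            advance();
            return result;
        }
    }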

License

Sleeper is licensed under the Apache 2 license.

Documentation

See the documentation contained in the docs folder:

  1. Getting started
  2. Deployment guide
  3. Creating a schema
  4. Tables
  5. Ingesting data
  6. Checking the status of the system
  7. Retrieving data
  8. Python API
  9. Developer guide
  10. Design
  11. Common problems and their solutions
  12. Performance test

Comments
  • First run through of documentation to deploy

    First run through of code and small documentation tweaks.

    General

    • Small fixes to run on MacOS (/usr/bin/env, avoid use of sed -i where possible or give a temp file extension, and document the versions of tools required)

    01-getting-started

    • Fix broken link to 02-deployment-guide
    • Add note on bash version (MacOS comes with v3 and declare -A needs v4)
    • Add note on cdk bootstrap. Although instructions are in 02-deployment-guide, I hadn't read that yet.
    • Add note on needed S3 VPC - stopped cdk deploy from working on my first attempt.

    02-deployment-guide

    • Fix typo
    opened by ultrablob01 6
  • Hitting rate limits in GitHub Actions

    Here are some builds that failed because we hit a rate limit:

    https://github.com/gchq/sleeper/actions/runs/3235165786/jobs/5299217395
    https://github.com/gchq/sleeper/actions/runs/3235165799/jobs/5299217705
    https://github.com/gchq/sleeper/actions/runs/3235165823/jobs/5299217643

    It's not yet clear what exactly caused this. It looks like the failure has only shown up when publishing check results (test report, checkstyle report, spotbugs report). It's possible we've hit a limit just with those, or it might be related to our queries in the Check Build Status workflow.

    The checks we've used haven't logged out the full response from GitHub. We could try making a call explicitly to see the full output.

    If we move to a single workflow with modules on different jobs instead of workflows, we could maybe aggregate the checks together into a single report at the end. That would be covered by another issue: https://github.com/gchq/sleeper/issues/186

    build-pipeline 
    opened by patchwork01 3
  • Cache Maven plugin dependencies in GitHub Actions

    We've had a number of builds fail to pull Maven dependencies, and it seems to always be for plugins.

    https://github.com/gchq/sleeper/actions/runs/3053630151/jobs/4924497172

    Looking at recent builds, it always seems to re-download the plugin dependencies. We can make sure these are cached.

    bug build-pipeline 
    opened by patchwork01 3
  • Remove comment about needing bash version >=4 from 01-getting-started.md

    01-getting-started.md says that bash version >=4 is required ("At least version 4: use bash --version. The default version on a Mac is 3.2, which is too old"). This should no longer be the case. Verify that this deploys on a Mac which has not had its version of bash updated, and then remove this comment.

    Scripts to check

    /scripts/build

    • [x] build.sh

    /scripts/deploy

    • [x] buildAndDeploy.sh
    • [x] deploy.sh (called by buildAndDeploy.sh)
    • [x] pre-deployment.sh (called by deploy.sh, buildDeployTest.sh)
    • [x] removeUploads.sh (called by tearDown.sh)
    • [x] tearDown.sh
    • [x] uploadDockerImages.sh (called by pre-deployment.sh)
    • [x] uploadJars.sh (called by pre-deployment.sh)

    /scripts/dev

    • [x] updateVersionNumber.sh

    /scripts/test

    • [x] buildDeployTest.sh
    • [x] systemTestStatusReport.sh
    • [x] tearDown.sh

    /scripts/utility

    • [x] adminClient.sh
    • [x] deadLettersStatusReport.sh
    • [x] ecsTasksStatusReport.sh
    • [x] filesStatusReport.sh
    • [x] fullStatusReport.sh
    • [x] jobsStatusReport.sh
    • [x] lambdaQuery.sh
    • [x] partitionsStatusReport.sh
    • [x] pauseSystem.sh
    • [x] query.sh
    • [x] reinitialiseTable.sh
    • [x] restartSystem.sh
    • [x] webSocketQuery.sh
    enhancement docs 
    opened by gaffer01 3
  • Conditional check around ecr create repo CLI command

    In response to #29

    Modify the checking (now in the form of an "if") around the "aws ecr describe-repositories" command and the conditional Docker repository creation, and hide the confusing error message from the user.

    opened by ultrablob01 3
  • whitespace breaks sed command

    Apologies, the whitespace in the sed command's -i "" option broke the pre-deploy.sh script. This pull request fixes that issue.

    JAR_DIR:/home/ec2-user/environment/sleeper/scripts/jars
    VERSION:0.11.1-SNAPSHOT
    Generating properties
    sed: can't read : No such file or directory
    
    opened by ultrablob01 3
  • Handle compaction jobs running more than once in status update store

    Compaction jobs are run as the result of an SQS message. SQS can deliver the message more than once. A job can fail in the middle if it's terminated or some other failure happens.

    It would be nice to report when a job was run more than once via the client.

    It will help to tell which status updates relate to the same execution of a job when we have the compaction task ID held against the job, as described in #163.

    compactions-module 
    opened by patchwork01 2
  • Split S3 state store into smaller classes

    In preparation for putting compaction job status into the state store (issue https://github.com/gchq/sleeper/issues/9), split the S3 state store into smaller classes that we can add to more easily.

    enhancement statestore-module 
    opened by patchwork01 2
  • Issue 6 - Check that field names are unique in a schema

    Note that this PR is stacked on top of #123, please merge that one first

    Used this IntelliJ plugin to autogenerate the builder before adding validation: https://plugins.jetbrains.com/plugin/7354-innerbuilder

    Resolves #6

    pr-base-for-stacking 
    opened by patchwork01 2
  • Make properties on S3AsyncClient configurable

    IngestRecordsUsingPropertiesSpecifiedMethod calls S3AsyncClient.create(). We should allow properties to be set on this client, specifically to control the chunk size used for uploads. It probably defaults to 8MB, which is too small; suggest a default value of 128MB instead.
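
    For illustration, a recent AWS SDK v2 lets the part size be raised on the CRT-based async client; whether that is the right approach here is an open question, so treat this as a sketch:

        import software.amazon.awssdk.services.s3.S3AsyncClient;

        public class ConfiguredS3AsyncClient {
            public static S3AsyncClient create() {
                // Raise the multipart upload part size from the small default
                // to the suggested 128MB. Requires the CRT-based client, which
                // is available in newer versions of the AWS SDK for Java v2.
                return S3AsyncClient.crtBuilder()
                        .minimumPartSizeInBytes(128L * 1024 * 1024)
                        .build();
            }
        }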

    enhancement ingest-module 
    opened by gaffer01 2
  • Improve observability of compaction jobs

    There is no single place to go to find out all the information about the lifecycle of a compaction job (i.e. when it was created, when it was pulled off the queue, how long it took to run, whether it succeeded, etc). This information is scattered around various logs in Cloudwatch. We should record this information in a Dynamo table.

    Suggested design:

    • Have one DynamoDB table that will be used to record information about the lifecycle of all compaction jobs (for all Sleeper tables, i.e. not one Dynamo table per Sleeper table).
    • Key design: hash key of compaction job id, sort key of timestamp of update.
    • Record the following stages of the lifecycle of a compaction job:
      • Job creation
      • Job pulled off the queue
      • Job finish time
      • Job finish status - success or failure, total number of records read, number written (these two are not necessarily the same, as an iterator may filter out records), and the rate at which records were written.

    Note that it is possible that a compaction job may be pulled off the queue twice as SQS does not guarantee that a message will only be delivered once, so maybe each compaction task should get a unique id so that we can separate the updates from the different tasks?

    We could also record the lifecycle of compaction ECS tasks - creation time, total run time, total number of records processed, etc.

    We will need a Java class that can report the status of a particular compaction job (by querying Dynamo for the relevant information), and a script in scripts/utility to make that class easy to use.

    We will also need to update the documentation to explain how to use that script.
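
    A sketch of recording one lifecycle update under the key design suggested above, using the AWS SDK for Java v2. The table and attribute names are hypothetical:

        import java.time.Instant;
        import java.util.Map;

        import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
        import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
        import software.amazon.awssdk.services.dynamodb.model.PutItemRequest;

        public class CompactionJobStatusRecorder {
            // Hypothetical table name; one table covers all Sleeper tables.
            private static final String STATUS_TABLE = "sleeper-compaction-job-status";

            private final DynamoDbClient dynamo = DynamoDbClient.create();

            // Hash key is the job id, sort key is the update timestamp. The
            // task id distinguishes repeated runs of the same job caused by
            // SQS delivering a message more than once.
            public void recordUpdate(String jobId, String taskId, String updateType) {
                dynamo.putItem(PutItemRequest.builder()
                        .tableName(STATUS_TABLE)
                        .item(Map.of(
                                "JobId", AttributeValue.builder().s(jobId).build(),
                                "UpdateTime", AttributeValue.builder()
                                        .n(Long.toString(Instant.now().toEpochMilli())).build(),
                                "TaskId", AttributeValue.builder().s(taskId).build(),
                                "UpdateType", AttributeValue.builder().s(updateType).build()))
                        .build());
            }
        }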

    enhancement compactions-module 
    opened by gaffer01 2
  • Use human-readable duration format in report tables

    There may be a good way to do this with the Java Duration class. Its toString method returns an ISO-8601 duration format that just formats hours, minutes and fractional seconds. There might be a way to specify a more human-readable format.
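
    For example, one possible approach using only the standard library (format choices are illustrative):

        import java.time.Duration;

        public class DurationFormat {
            // Format e.g. PT2H3M4.5S as "2h 3m 4.5s" rather than the
            // ISO-8601 string that Duration.toString produces.
            public static String humanReadable(Duration duration) {
                long hours = duration.toHours();
                int minutes = duration.toMinutesPart();
                double seconds = duration.toSecondsPart() + duration.toMillisPart() / 1000.0;
                StringBuilder out = new StringBuilder();
                if (hours > 0) {
                    out.append(hours).append("h ");
                }
                if (minutes > 0 || hours > 0) {
                    out.append(minutes).append("m ");
                }
                return out.append(seconds).append("s").toString();
            }
        }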

    clients-module 
    opened by patchwork01 0
  • Upgrade AWS EMR, Spark and Hadoop

    Split from https://github.com/gchq/sleeper/issues/467

    Hadoop 3.3 and onwards supports Java 11: https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Java+Versions

    AWS EMR 6.9.0 includes Hadoop 3.3.3 and Spark 3.3.0:
    https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-690-release.html
    https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hadoop.html
    https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark.html

    We need to update the Maven dependencies, and the Docker image for bulk import.

    opened by patchwork01 3
  • Issue 438 - Build on JDK 17 targeting Java 11

    Resolves #438

    Hadoop also needs upgrading (along with AWS EMR & Spark), because Hadoop only supports Java 11 starting in version 3.3:

    https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Java+Versions

    There's a separate issue for that, although we could combine it back in: https://github.com/gchq/sleeper/issues/472

    Tested with system test, so it doesn't seem like there are any immediate issues with the old Hadoop version.

    opened by kr565370 1
  • Update Maven dependencies after Java 11 upgrade

    Follow-up from #438

    Hopefully we can also remove the configuration to allow access to private classes for GSON.

    The version of Spark has to be compatible with EMR. Hadoop also has to be compatible with those. We should add a comment in the pom explaining this.

    opened by patchwork01 1
  • Deploy system test paused, wait for ingest to finish

    For performance tests like in #340, we need a way to wait for the ingest to finish. We'd like to be able to watch the status as the ingest happens. The system needs to be paused so that the test can take control over what happens next.

    system-test-module 
    opened by patchwork01 0