Aggregation query proxy is a scalable sidecar application that sits between a customer application and Amazon Keyspaces/DynamoDB

Overview

Aggregation-query-proxy

Some use cases require aggregating bounded result sets over a short time period from DynamoDB or Keyspaces, for example an hourly or daily report that includes all sales for that hour or day. However, Amazon DynamoDB and Keyspaces do not support the common SQL aggregation constructs such as COUNT, SUM, MIN, MAX, and GROUP BY, because aggregation queries over an unbounded number of partitions might take an unpredictable time to execute. Because of this constraint, it is better to preprocess the operational data to do the aggregation and store the processed data in Amazon DynamoDB/Keyspaces. This pattern provides a solution by placing a scalable aggregation proxy (sidecar) between your application and DynamoDB/Keyspaces.

The aggregation-query-proxy (AQP) consists of a scalable proxy layer that sits between your application and Amazon Keyspaces/DynamoDB.

It provides intermediate aggregation logic that allows existing applications to execute aggregation queries against Amazon DynamoDB/Keyspaces.

The AQP converts the provided aggregation query (SQL-92) into a plain request (CQL or DynamoDB PartiQL). After the plain response (JSON) has been received, the AQP uses IonEngine to aggregate the plain response into the final result set in JSON format.
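The two-step flow described above (fetch plain rows, then aggregate them inside the proxy) can be sketched as follows. This is only an illustration in Python, not the AQP's implementation, which relies on IonEngine; the function `aggregate_by_award` and the sample rows are hypothetical.

```python
from collections import defaultdict


def aggregate_by_award(rows):
    """Group plain rows by 'award', then compute COUNT(book_title) and AVG(rank)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["award"]].append(row)

    result = []
    for award, items in groups.items():
        ranks = [item["rank"] for item in items]
        result.append({
            # COUNT(col) counts only rows where the column is not null.
            "books": sum(1 for item in items if item.get("book_title") is not None),
            "award": award,
            "avg_rank": sum(ranks) / len(ranks),
        })
    return result


# Hypothetical plain rows, as a non-aggregating SELECT might return them.
rows = [
    {"book_title": "Book A", "award": "Wolf", "rank": 1},
    {"book_title": "Book B", "award": "Wolf", "rank": 2},
    {"book_title": "Book C", "award": "Wolf", "rank": 3},
    {"book_title": "Book D", "award": "Richard Roe", "rank": 2},
]

print(aggregate_by_award(rows))
```

Because every fetched row is held in memory before grouping, the size of the plain response drives the memory use of a step like this.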


Create your YAML configuration file based on the template:

cp conf/keyspaces-aggregation-query-proxy.yaml.template conf/keyspaces-aggregation-query-proxy.yaml

Configure and build the app with Amazon Keyspaces

Set dataBaseName to KEYSPACES

Set pathToKeyspacesConfigFile to /usr/app

Configure DataStax conf file

Prepare DataStax java driver conf file

Build the project

mvn install

build.sh

Start a docker container with the app

docker run -it -p 8080:8080 simple-aggregation-query-app

Limitations

As a best practice, we recommend executing only bounded Amazon Keyspaces (CQL) or DynamoDB (PartiQL) requests against the Aggregation Query Proxy. In all cases, avoid unbounded aggregation queries (queries without a WHERE clause). Unbounded aggregation queries might lead to unpredictable execution time, high JVM memory pressure on AQP nodes (OOM), or high Amazon DynamoDB/Keyspaces RCU consumption.
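A client can apply this recommendation before sending a query. The sketch below is a plain text check written for illustration; `is_bounded` is a hypothetical helper and not part of the AQP. It does not parse SQL, and a WHERE clause alone does not guarantee a small result set.

```python
import re

# Matches the WHERE keyword as a whole word, case-insensitively.
_WHERE = re.compile(r"\bwhere\b", re.IGNORECASE)


def is_bounded(query: str) -> bool:
    """Return True if the query text contains a WHERE keyword.

    This is a text check only: a WHERE inside a string literal would
    also count, and the predicate itself is not inspected.
    """
    return _WHERE.search(query) is not None


unbounded = "select count(book_title) as books, award from ks.tbl GROUP BY award"
bounded = "select sum(amount) as total from \"your_table\" where pk in ('A', 'B') group by pk"

print(is_bounded(unbounded))
print(is_bounded(bounded))
```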

Configure and build the app with Amazon DynamoDB

Set dataBaseName to DYNAMODB

Set dynamoRegion to us-east-1

Build the project

mvn install

build.sh

Start a docker container with the app

docker run -it -p 8080:8080 --env AWS_REGION="us-east-1" --env AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID --env AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY --env AWS_SESSION_TOKEN=$AWS_SESSION_TOKEN simple-aggregation-query-app

Let's execute a simple aggregation query against Amazon Keyspaces

AUTH_BASIC=$(echo -n large-query-app:your_secret | base64)

http --follow --timeout 3600 GET 'http://0.0.0.0:8080/query-aggregation/select count(book_title) as books, award, avg(rank) as avg_rang from keyspaces_sample.keyspaces_sample_table GROUP BY award' Authorization:"Basic ${AUTH_BASIC}"

HTTP/1.1 200 OK
Cache-Control: no-transform, max-age=60
Content-Encoding: gzip
Content-Length: 195
Content-Type: application/json
Date: Wed, 04 May 2022 01:59:16 GMT
Vary: Accept-Encoding

{
    "response": [{"resultSet":[{"books":3,"award":"Kwesi Manu Prize","avg_rang":2e0},
                                 {"books":3,"award":"Richard Roe","avg_rang":2e0},
                                 {"books":3,"award":"Wolf","avg_rang":2e0}]}],
    "stats": {
        "elapsedTimeToAggregateDataInMs": 361,
        "elapsedTimeToRetrieveDataInMs": 120,
        "payloadSizeBytes": 626
    }
}
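The response shape shown above (a `response` list of objects holding a `resultSet`, plus a `stats` object) can be read with a few lines of Python. This is a sketch for illustration; `read_response` is a hypothetical helper, and the body is a shortened copy of the sample output.

```python
import json

# Shortened copy of the sample response body above.
body = """
{
    "response": [{"resultSet": [{"books": 3, "award": "Wolf", "avg_rang": 2e0}]}],
    "stats": {
        "elapsedTimeToAggregateDataInMs": 361,
        "elapsedTimeToRetrieveDataInMs": 120,
        "payloadSizeBytes": 626
    }
}
"""


def read_response(body_text):
    """Return (rows, stats): rows flattened from every resultSet, stats as a dict."""
    doc = json.loads(body_text)
    rows = [row for part in doc["response"] for row in part["resultSet"]]
    return rows, doc["stats"]


rows, stats = read_response(body)
for row in rows:
    print(row["award"], row["books"], row["avg_rang"])
print(stats["elapsedTimeToRetrieveDataInMs"], stats["elapsedTimeToAggregateDataInMs"])
```

Note that `2e0` is a valid JSON number; Python's `json` module parses it as the float `2.0`.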

Let's execute a simple aggregation query against Amazon DynamoDB

http --auth-type basic -a large-query-app:secretEXAMPLE --follow --timeout 3600 GET 'http://0.0.0.0:8080/query-aggregation/select zipcode,pk,sum(amount) as total from "your_table" where pk in (%27ACCOUNT%23ACCOUNT40%23CUSTOMER%23CUSTOMER33%27, %27ACCOUNT%23ACCOUNT1%23CUSTOMER%23CUSTOMER61%27) group by zipcode, pk'

HTTP/1.1 200 OK
Cache-Control: no-transform, max-age=60
Content-Encoding: gzip
Content-Length: 229
Content-Type: application/json
Date: Tue, 14 Jun 2022 14:57:15 GMT
Vary: Accept-Encoding

{
    "response": [
        {
            "resultSet": [
                {
                    "pk": "ACCOUNT#ACCOUNT1#CUSTOMER#CUSTOMER61",
                    "total": 133.9894140356888,
                    "zipcode": 74545
                },
                {
                    "pk": "ACCOUNT#ACCOUNT40#CUSTOMER#CUSTOMER33",
                    "total": 4321.055836431855,
                    "zipcode": 56624
                }
            ]
        }
    ],
    "stats": {
        "elapsedTimeToAggregateDataInMs": 1210,
        "elapsedTimeToRetrieveDataInMs": 1346,
        "payloadSizeBytes": 194
    }
}
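In the DynamoDB request above, the single quotes and `#` characters in the partition keys are percent-encoded (`%27` and `%23`); an unencoded `#` would otherwise start the URL fragment and cut the query short. A minimal sketch of building such a URL in Python follows. `build_url` is a hypothetical helper, and it encodes every reserved character (spaces, commas, parentheses), which is stricter than the hand-written example and assumes the proxy applies standard URL decoding to the path.

```python
from urllib.parse import quote

# Endpoint taken from the examples above.
BASE_URL = "http://0.0.0.0:8080/query-aggregation/"


def build_url(query: str) -> str:
    """Percent-encode the whole query so it is safe as a single path segment."""
    # safe="" also encodes "/", which quote() leaves untouched by default.
    return BASE_URL + quote(query, safe="")


query = (
    'select zipcode,pk,sum(amount) as total from "your_table" '
    "where pk in ('ACCOUNT#ACCOUNT40#CUSTOMER#CUSTOMER33') group by zipcode, pk"
)
print(build_url(query))
```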

Local DynamoDB for JUnit test

Set localDDB to true to run all JUnit tests against DynamoDB

License

This project is licensed under the MIT-0 License.
