A Hybrid Serving & Analytical Processing Database.

Overview

DingoDB

DingoDB is a real-time Hybrid Serving & Analytical Processing (HSAP) database. It can execute high-frequency queries and upserts, interactive analysis, and multi-dimensional analysis with extremely low latency.

Key Features

  1. MySQL compatibility
    Based on the popular Apache Calcite SQL engine, DingoDB can parse, optimize and execute standard SQL statements, and is capable of running part of the TPC-H and TPC-DS queries (see TPC). DingoDB is also JDBC-compliant and can be seamlessly integrated with web services, BI tools, etc.
  2. Support for high-frequency write operations
    By using the log-structured key-value store RocksDB, DingoDB supports high-frequency write operations like INSERT, UPDATE and DELETE.
  3. Support for point queries and multi-dimensional analysis simultaneously
    DingoDB can store table data in both row-oriented and column-oriented formats, providing both fast point queries and fast multi-dimensional analysis at low latency.
  4. Easy integration with streaming data and other DBMSs
    By providing dedicated APIs for popular streaming data processing engines, e.g. Apache Flink, DingoDB can easily accept data from them, supporting further analysis or web serving that is not suitable to be done in the stream. DingoDB can also access databases of many types, using pluggable connectors for each of them.
  5. A distributed architecture with flexible and elastic scalability
    DingoDB stores and processes data in a distributed manner with strong cluster and resource management functionality, which makes it easy to expand capacity.
  6. High availability
    Automatic failover when a minority of replicas fail, transparent to applications.
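Because DingoDB is JDBC-compliant, any plain JDBC client can talk to it. A minimal sketch follows; the connection URL scheme, host and port here are assumptions for illustration, not documented values, so check the driver documentation for the exact format:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class DingoJdbcExample {
    public static void main(String[] args) {
        // Hypothetical URL; the actual scheme/host/port depend on your deployment.
        String url = "jdbc:dingo:thin:url=http://127.0.0.1:8765";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, name FROM example_table")) {
            while (rs.next()) {
                System.out.println(rs.getInt("id") + "\t" + rs.getString("name"));
            }
        } catch (SQLException e) {
            // Without a running cluster (or the driver jar on the classpath) this fails.
            System.out.println("connection failed: " + e.getMessage());
        }
    }
}
```

The same pattern works from any JDBC-aware tool or BI frontend; only the URL changes.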

Documentation

The documentation of DingoDB is located on the website: https://dingodb.readthedocs.io or in the docs/ directory of the source code.

Developing DingoDB

We recommend IntelliJ IDEA to develop the DingoDB codebase. Minimal requirements for an IDE are:

  • Support for Java
  • Support for Gradle

How to make a clean pull request

  • Create a personal fork of dingo on GitHub.
  • Clone the fork on your local machine. Your remote repo on GitHub is called origin.
  • Add the original repository as a remote called upstream.
  • If you created your fork a while ago, be sure to pull upstream changes into your local repository.
  • Create a new branch to work on. Branch from develop.
  • Implement/fix your feature, comment your code.
  • Follow the Google code style, including indentation.
  • If the project has tests, run them!
  • Add unit tests that test your new code.
  • In general, avoid changing existing tests, as they also make sure the existing public API is unchanged.
  • Add or change the documentation as needed.
  • Squash your commits into a single commit with git's interactive rebase.
  • Push your branch to your fork on GitHub, the remote origin.
  • From your fork, open a pull request against the correct branch: target Dingo's develop branch.
  • Once the pull request is approved and merged you can pull the changes from upstream to your local repo and delete your branch.
  • Last but not least: always write your commit messages in the present tense. Your commit message should describe what the commit, when applied, does to the code, not what you did to the code.
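The steps above can be sketched as a shell session (the fork URL and branch name are placeholders):

```shell
# Clone your fork and wire up the original repository as "upstream".
git clone https://github.com/<your-user>/dingo.git
cd dingo
git remote add upstream https://github.com/dingodb/dingo.git
git fetch upstream

# Branch from develop, do the work, then squash before pushing.
git checkout -b my-feature upstream/develop
# ... edit, commit, repeat ...
git rebase -i upstream/develop      # mark all but the first commit as "squash"
git push origin my-feature          # then open the PR targeting develop
```

`git rebase -i` opens an editor with the commit list; changing `pick` to `squash` on every line but the first collapses the branch into a single commit.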

IntelliJ IDEA

The IntelliJ IDE supports Java and Gradle out of the box. Download it at IntelliJ IDEA website.

Special Thanks

DataCanvas

DingoDB is sponsored by DataCanvas, a new platform for real-time data science and data processing.

Java Profiler tools: YourKit

YourKit

I highly recommend YourKit Java Profiler for any performance-critical application you make.

Check it out at https://www.yourkit.com/

DingoDB is an open-source project licensed under the Apache License Version 2.0, and we welcome any feedback from the community. For support or suggestions, please contact us.

Comments
  • [dingo-server-coordinator] Fix drop table cause the column query for the

    Fix drop table causing the column query for the first table to return null.

    This bug only affects the online cache of the coordinator; there is no data loss.

    Signed-off-by: Ketor [email protected]

    opened by ketor 1
  • [dingo-store-mpu, dingo-net-netty] Use RocksDB checkpoint to implement backup mechanism and fix a FileReceiver bug.

    1. Add checkpoint code in RocksStorage, and add unit tests for the new checkpoint code.
    2. Remove deprecated backup code.
    3. Remove all backup calls in the RocksDB event listener.
    4. Fix a FileReceiver bug: registerTagMessageListener is now called in FileSender.

    Best wishes to the dingo maintainers!

    And Dingo is an amazing database project!!!

    opened by ketor 1
  • [dingo-calcite,dingo-common,dingo-ddl,dingo-exec,dingo-mirror-processing-unit,dingo-server,dingo-store-api,dingo-store-mpu,dingo-store-raft] create table with partition

    opened by guojn1 1
  • [dingo-ddl] Extend calcite SQL syntax

    Apache Calcite supports only limited SQL syntax; statements such as TRUNCATE, CREATE TABLE ... WITH (), and so on are missing. We need to enrich the SQL syntax to support the functionality of a qualified database.

    opened by guojn1 1
  • [DIP-0] DingoDB Improvement Proposals

    DingoDB Improvement Proposals

    Purpose

    The purpose of a DingoDB Improvement Proposal (DIP) is to introduce any major change into DingoDB. This is required in order to balance the need to support new features and use cases while avoiding accidentally introducing half-thought-out interfaces that cause needless problems when changed.

    What is considered a major change that needs a DIP?

    Any of the following should be considered a major change:

    • Any major new feature, subsystem, or piece of functionality
    • Any change that impacts the public interfaces of the project

    What are the "public interfaces" of the project? All of the following are public interfaces that people build around:

    • Data types
    • SQL
    • REST endpoints
    • Data passed between backend and frontend
    • Configuration
    • Command line tools and arguments

    What should be included in a DIP?

    A DIP should contain the following sections:

    • Motivation: describe the problem to be solved.
    • Proposed Change: describe the new thing you want to do. This may be fairly extensive and have large subsections of its own. Or it may be a few sentences, depending on the scope of the change.
    • New or Changed Public Interfaces: impact to any of the "compatibility commitments" described above. We want to call these out in particular so everyone thinks about them.
    • New dependencies: describe any third-party libraries that the feature will require. In particular, make sure their license is compatible with the [Apache License v2.0](https://www.apache.org/licenses/LICENSE-2.0).
    • Migration Plan and Compatibility: if this feature requires additional support for seamless upgrades describe how that will work. In particular, it’s important to mention if:
      • The feature requires a database migration;
      • The feature will coexist with similar functionality for some period of time, allowing for a deprecation period.
    • Rejected Alternatives: What are the other alternatives you considered and why are they worse? The goal of this section is to help people understand why this is the best solution now, and also to prevent churn in the future when old alternatives are reconsidered.

    Who should initiate the DIP?

    Anyone can initiate a DIP, but preferably someone with the intention of implementing it.

    Process

    1. Create an issue with the prefix “[DIP]” in the title. The issue will be tagged as “dip” by a committer, and the title will be updated with the current DIP number.
    2. Notify the dingodb@zetyun mailing list that the DIP has been created, use the subject line [DISCUSS] DIP-0 DingoDB Improvement Proposals, the body of the email should look something like Please discuss & subscribe here: https://github.com/dingodb/dingo/issues/1
    3. When writing the issue, fill in the sections as described above in “What should be included in a DIP?”. You can use the template included at the end of this document.
    4. A committer will initiate the discussion, and ensure that there’s enough time to analyze the proposal. Before accepting the DIP, a committer should call for a vote, requiring 3 votes and no vetoes from committers. Votes are timeboxed at 1 week, and conducted through email (with the subject [VOTE]).
    5. Create a pull request implementing the DIP, and referencing the issue.

    Template

    [DIP] Proposal for _

    Motivation

    Description of the problem to be solved.

    Proposed Change

    Describe how the feature will be implemented, or the problem will be solved. If possible, include mocks, screenshots, or screencasts (even if from different tools).

    New or Changed Public Interfaces

    Describe any new additions to the model, views or REST endpoints. Describe any changes to existing sub modules.

    New dependencies

    Describe any packages that are required. Are they actively maintained? What are their licenses?

    Migration Plan and Compatibility

    Describe any database migrations that are necessary, or updates to stored URLs.

    Rejected Alternatives

    Describe alternative approaches that were considered and rejected.

    opened by huzx0608 0
Releases (dingo-v0.4.1)
  • dingo-v0.4.1(Oct 12, 2022)

    1. Feature and Optimization about SQL

    1.1 Features about SQL

    1.1.1 Extended SQL Syntax

    • Support TTL when creating a table using options
    • Support assigning partitions when creating a table

    1.1.2 Features about Complex Data Type

    • Support operations on MAP
    • Support operations on MULTISET
    • Support operations on ARRAY

    1.1.3 Support using variables in SQL statements, such as INSERT, SELECT and DELETE.

    1.1.4 Support strategies to control messages transmitted between operators in the execution plan

    1.1.5 Support new SQL functions

    | No | Function Name | Description about Function |
    |----|---------------|----------------------------|
    | 1 | pow(x,y) | The POW() function returns the value of a number raised to the power of another number |
    | 2 | round(x,y) | The ROUND() function rounds a number to a specified number of decimal places |
    | 3 | ceiling(x) | The CEILING() function returns the smallest integer value that is bigger than or equal to a number |
    | 4 | floor(x) | The FLOOR() function returns the largest integer value that is smaller than or equal to a number |
    | 5 | mod(x,y) | The MOD() function returns the remainder of a number divided by another number |
    | 6 | abs(x) | The ABS() function returns the absolute (positive) value of a number |

    1.2 Optimization about SQL

    • Optimize queries using range filters
    • Optimize range scan queries
    • Optimize Dingo's internal type system
    • Optimize SQL date/time/timestamp functions

    2. Operation of Key-Value

    2.1 Equivalent operation of Key-Value and SQL

    • Support table operations such as create table and drop table using the Key-Value API
    • Support inserting, updating and deleting records in a table using the Key-Value API
    • Support table operations using the Annotation API
    • Operations on tables and records are equivalent between the Key-Value API and SQL
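The Key-Value/SQL equivalence above can be illustrated with a toy in-memory store. This is not the actual DingoDB SDK API (whose class names are not shown in these notes); it is only a sketch of the put/get/delete semantics:

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of the Key-Value <-> SQL equivalence:
// put ~ INSERT/UPDATE by primary key, get ~ SELECT by primary key,
// delete ~ DELETE by primary key.
public class KvSqlEquivalence {
    private final Map<String, Map<String, Object>> table = new HashMap<>();

    // put: like an upsert keyed by the primary key
    public void put(String key, Map<String, Object> record) {
        table.put(key, record);
    }

    // get: like "SELECT * FROM t WHERE pk = ?"
    public Map<String, Object> get(String key) {
        return table.get(key);
    }

    // delete: like "DELETE FROM t WHERE pk = ?"
    public boolean delete(String key) {
        return table.remove(key) != null;
    }

    public static void main(String[] args) {
        KvSqlEquivalence store = new KvSqlEquivalence();
        store.put("1", Map.of("name", "Alice", "age", 30));
        System.out.println(store.get("1").get("name")); // prints Alice
        System.out.println(store.delete("1"));          // prints true
    }
}
```

In DingoDB the two access paths operate on the same underlying table, so a record written via the Key-Value API is visible to SQL and vice versa.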

    2.2 Operation list of the Key-Value API

    2.2.1 Basic Key-Value Operation

    | No | Function Name | Description about Function |
    |----|---------------|----------------------------|
    | 1 | put | insert or update records in a table |
    | 2 | get | query records by user key |
    | 3 | delete | delete records by user key |

    2.2.2 Numerical operations

    | No | Function Name | Description about Function |
    |----|---------------|----------------------------|
    | 1 | add | add values of the same data type |
    | 2 | sum | calculate the sum of columns filtered by keys |
    | 3 | max | calculate the max of columns filtered by keys |
    | 4 | min | calculate the min of columns filtered by keys |

    2.2.3 Compound operation

    | No | Function Name | Description about Function |
    |----|---------------|----------------------------|
    | 1 | Operate | do multiple operations on a single record; the operation list can contain numerical or basic operations |
    | 2 | OperateList | do multiple operations on a single record |
    | 3 | UDF | user-defined functions implemented in Lua script |

    2.2.4 Collection operations

    | No | Type | Function Name | Description about Function |
    |----|------|---------------|----------------------------|
    | 1 | read | size | get the number of elements |
    | 2 | read | get_all | get all the elements of the collection |
    | 3 | read | get_by_key | get all the elements of the collection by input key |
    | 4 | read | get_by_value | get all the elements of the collection by input value |
    | 5 | read | get_by_index_range | get all the elements of the collection by index range |
    | 6 | write | put | append an element to the end |
    | 7 | write | clear | clear all the elements of the collection |
    | 8 | write | remove_by_key | remove the key from the collection |
    | 9 | write | remove_all_by_value | remove all records matching the value |
    | 10 | write | remove_by_index | remove a record by index |

    2.2.5 Filter operations

    • DateFilter

    Query records using a range filter on a Date-type column.

    • NumberRange

    Query records using a range filter on a Numeric-type column.

    • StringRange

    Query records using a range filter on a String-type column.

    • ValueEquals

    Query records with a specified record value.

    3. Optimization about Storage

    3.1 Distributed Consistency Protocol

    • Refactor the implementation of the Raft protocol to replace sofa-jraft
    • Refactor the implementation of log replication and leader election
    • Support new serialization of keys and values

    3.2 Improvements to RocksDB

    • RocksDB can load configuration from files
    • Support TTL features using user timestamps
    • Update the RocksDB version and release the io.dingodb package on Maven Central

    4. Other features

    • Support JDBC connection parameters such as timeout
    • Support EXPLAIN to view the plan of a Dingo SQL statement
    • Release related packages to Maven Central

    | No | Module | Description about module |
    |----|--------|--------------------------|
    | 1 | dingo-driver-client | the JDBC driver client used by SQL |
    | 2 | dingo-sdk | the key-value SDK client for key-value operations |
    | 3 | dingo-rocksdb | extended features on RocksDB |
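With these packages on Maven Central, a build can declare them directly. A hedged sketch of the Gradle coordinates (the group, artifact and version below are assumptions based on the module names above; verify the exact coordinates on Maven Central):

```groovy
dependencies {
    // Hypothetical coordinates; check Maven Central for the exact artifact and version.
    implementation 'io.dingodb:dingo-driver-client:0.4.1'
}
```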

    Source code(tar.gz)
    Source code(zip)
  • dingo-v0.3.0(Jul 2, 2022)

    1. Semantics and Functions of SQL

    1.1 New data type

    • Boolean
    • Date: default format yyyy-MM-dd
    • Time: default format HH:mm:ss
    • Timestamp: default format yyyy-MM-dd HH:mm:ss.SSS

    1.2 Allow assigning a default value to a column, either a constant or an internal function

    1.3 Support Join operation

    • Inner Join
    • Left Join
    • Right Join
    • Full Join
    • Cross Join

    1.4 Function list about String

    | No | Function Names | Notes about Function |
    |----|----------------|----------------------|
    | 1 | Concat | Adds two or more expressions together |
    | 2 | Format | Formats a number to a format like "#,###,###.##", rounded to a specified number of decimal places |
    | 3 | Locate | The LOCATE() function returns the position of the first occurrence of a substring in a string |
    | 4 | Lower | Converts a string to lower-case |
    | 5 | Lcase | Converts a string to lower-case |
    | 6 | Upper | Converts a string to upper-case |
    | 7 | Ucase | Converts a string to upper-case |
    | 8 | Left | Extracts a number of characters from a string (starting from left) |
    | 9 | Right | Extracts a number of characters from a string (starting from right) |
    | 10 | Repeat | Repeats a string as many times as specified |
    | 11 | Replace | Replaces all occurrences of a substring within a string with a new substring |
    | 12 | Trim | Removes leading and trailing spaces from a string |
    | 13 | Ltrim | Removes leading spaces from a string |
    | 14 | Rtrim | Removes trailing spaces from a string |
    | 15 | Mid | Extracts a substring from a string (starting at any position) |
    | 16 | Substring | Extracts a substring from a string (starting at any position) |
    | 17 | Reverse | Reverses a string and returns the result |

    1.5 Function list about Date and Time

    | No | Function Names | Notes about Function |
    |----|----------------|----------------------|
    | 1 | Now | Return the current date and time |
    | 2 | CurrentDate | Return the current date |
    | 3 | Current_date | Return the current date |
    | 4 | CurTime | Return the current time |
    | 5 | Current_time | Return the current time |
    | 6 | Current_timestamp | Return the current date and time |
    | 7 | From_UnixTime | Convert a Unix timestamp to a timestamp |
    | 8 | Unix_Timestamp | Format the time as a Unix timestamp |
    | 9 | Date_Format | Formats a date |
    | 10 | DateDiff | Returns the number of days between two date values |
    | 11 | Time_Format | Formats a time by a specified format |

    2. Management of Replicator

    2.1 Management of metadata

    • A physical table can be split into N partitions based on data size
    • Management of physical table metadata such as table creation time, table status, partition strategy, split conditions, etc.

    2.2 Scheduler of partition replicator

    • Support multiple partition modes, such as one table with one partition, or one table with multiple partitions
    • Support multiple split strategies, such as auto-split or manual split by API
    • Support resource isolation between physical tables

    2.3 Tools of partition management

    • Support viewing partition status, such as leader, followers, etc.
    • Support migrating and splitting partitions by internal API
    • Support viewing partition metrics, such as write/read latency, size and record count

    3. Data access methods for DingoDB

    3.1 JDBC mode

    • Support connecting to Dingo by JDBC

    3.2 SDK client mode

    • Support putting, getting and deleting records in tables in Dingo
    • Support batch-writing records to tables in Dingo

    3.3 Import data from external sources

    • Support importing data from local files in CSV and JSON formats
    • Support importing data from Kafka in JSON and Avro formats

    4. Tools and Monitor

    • Support monitoring the Dingo cluster with Grafana and Prometheus
    • Support managing the partitions of the cluster by API
    • Support adjusting the log level dynamically by tools
    • Support deploying the cluster by Ansible or docker-compose
    • Newly added more than 1300 autotests
    Source code(tar.gz)
    Source code(zip)
  • dingo-v0.2.0(Jul 1, 2022)

    • Architecture

      1. Refactor the DingoDB architecture, abandoning ZooKeeper, Kafka and Helix.
      2. Use Raft as the consensus protocol to reach agreement across multiple nodes on membership selection and data replication.
      3. Region is proposed as the unit of data replication; it can be scheduled, split and managed by the coordinator.
      4. The distributed file system is replaced by a distributed key-value store implemented with Raft and RocksDB.
    • Distributed Storage

      1. Support replicating a region across multiple nodes.
      2. Support splitting a region based on policies such as key count or region size.
      3. Support performing periodic region snapshots.
    • SQL

      1. Support more aggregation functions, such as min, max, avg, etc.
      2. Support INSERT INTO ... SELECT.
    • Client Tools

      1. Thin JDBC driver
    Source code(tar.gz)
    Source code(zip)
  • dingo-v0.1.0(Jul 1, 2022)

    DingoDB 0.1.0 Release Notes

    • Cluster
      1. Distributed computing: cluster nodes are classified into coordinator and executor roles.
      2. Distributed metadata storage: support creating and dropping table metadata.
      3. Coordinators support SQL parsing and optimizing, job creation and distribution, and result collection.
      4. Executors support task execution.
    • Data store
      1. Use RocksDB storage.
      2. Encode and decode in Apache Avro format.
      3. Partition tables by the hash of primary-key columns.
    • SQL parsing and executing
      1. Create (CREATE TABLE) and drop (DROP TABLE) tables.
      2. Support common SQL data types: TINYINT, INT, BIGINT, CHAR, VARCHAR, FLOAT, DOUBLE, BOOLEAN.
      3. Insert into (INSERT INTO TABLE) and delete from (DELETE FROM TABLE) tables.
      4. Query tables (SELECT).
      5. Support filtering and projection in queries.
      6. Support expressions in filter conditions and projected columns.
      7. Support point queries.
    • User interface
      1. Command line interface (CLI).
      2. Support SQL input and execution in the CLI.
      3. Output query results in table format in the CLI.
      4. Output the time consumed by a query in the CLI.
    Source code(tar.gz)
    Source code(zip)
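The 0.1.0 notes above mention table partitioning by hash of the primary columns. A minimal sketch of that routing idea follows; it is not DingoDB's actual partitioning code, and the choice of CRC32 as the hash is an assumption made only to keep the example stable across platforms:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Minimal sketch of "table partitioning by hash of primary columns":
// a record is routed to a partition by a stable hash of its primary key.
public class HashPartitioner {
    private final int numPartitions;

    public HashPartitioner(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    // CRC32 gives a platform-stable hash; real systems pick their own function.
    public int partitionFor(String primaryKey) {
        CRC32 crc = new CRC32();
        crc.update(primaryKey.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % numPartitions);
    }

    public static void main(String[] args) {
        HashPartitioner p = new HashPartitioner(4);
        // The same key always routes to the same partition.
        System.out.println(p.partitionFor("user#42") == p.partitionFor("user#42"));
        int part = p.partitionFor("user#42");
        System.out.println(part >= 0 && part < 4);
    }
}
```

Because the hash depends only on the primary-key bytes, every node computes the same partition for a given key without coordination.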