Apache Calcite

Overview

Apache Calcite is a dynamic data management framework.

It contains many of the pieces that comprise a typical database management system but omits the storage primitives. It provides an industry standard SQL parser and validator, a customisable optimizer with pluggable rules and cost functions, logical and physical algebraic operators, various transformation algorithms from SQL to algebra (and the opposite), and many adapters for executing SQL queries over Cassandra, Druid, Elasticsearch, MongoDB, Kafka, and others, with minimal configuration.

For more details, see the home page.

Comments
  • [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

    Examining a simple query with a flame graph (see the issue) showed a large number of reflective calls, which are not performant. Although the total overhead is less than 1%, I spent some time trying to improve it. Most of the invocations are rooted in ReflectiveVisitDispatcher. The current implementation creates a new instance whenever one is needed and looks up methods by reflection per instance. Because the set of methods is small (68 possible places), caching the methods globally lets ReflectiveVisitDispatcher instances in different threads reuse the lookups. This fundamental change will benefit other similar invocations as well.
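
    The caching idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class name GlobalMethodCache, the string cache key, and the exact-parameter-type lookup are all assumptions made for the example (a real dispatcher would also search superclasses of the argument type).

    ```java
    import java.lang.reflect.Method;
    import java.util.Map;
    import java.util.Optional;
    import java.util.concurrent.ConcurrentHashMap;

    /** Hypothetical sketch of a process-wide reflective method cache. */
    final class GlobalMethodCache {
      /** Shared by all threads; the key is "visitorClass#method(argClass)". */
      private static final Map<String, Optional<Method>> CACHE =
          new ConcurrentHashMap<>();

      private GlobalMethodCache() {}

      /** Looks up a public one-argument method, reflecting at most once per key. */
      static Optional<Method> lookup(Class<?> visitorClass, String name,
          Class<?> argClass) {
        String key = visitorClass.getName() + "#" + name
            + "(" + argClass.getName() + ")";
        return CACHE.computeIfAbsent(key, k -> find(visitorClass, name, argClass));
      }

      private static Optional<Method> find(Class<?> c, String name,
          Class<?> argClass) {
        try {
          return Optional.of(c.getMethod(name, argClass));
        } catch (NoSuchMethodException e) {
          // Cache the miss as well, so a failed lookup is not repeated.
          return Optional.empty();
        }
      }
    }
    ```

    Keying by class name keeps the sketch short; in an environment with several class loaders a key built from the Class objects themselves would be safer.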

    opened by neoremind 27
  • [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

    see:
    https://issues.apache.org/jira/browse/CALCITE-3737 https://issues.apache.org/jira/browse/CALCITE-3780

    Some highlights on this PR:

    1. support HOP as a table function.
    2. support SESSION as a table function.
    3. rename "table-valued function" to "table function" to improve naming.
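
    To illustrate what a hopping window assignment generally looks like, here is a small sketch. It is not Calcite's implementation: the class HopWindows, the use of epoch milliseconds, and the alignment of window starts to multiples of the slide are assumptions chosen for the example.

    ```java
    import java.util.ArrayList;
    import java.util.List;

    /** Hypothetical sketch: which hopping windows contain a timestamp. */
    final class HopWindows {
      private HopWindows() {}

      /**
       * Returns the [start, end) bounds of every window of length sizeMillis,
       * started every slideMillis, that contains ts. Latest window first.
       */
      static List<long[]> assign(long ts, long slideMillis, long sizeMillis) {
        List<long[]> windows = new ArrayList<>();
        // Latest window start that is not after ts.
        long lastStart = Math.floorDiv(ts, slideMillis) * slideMillis;
        // Walk backwards while the window still covers ts (start + size > ts).
        for (long start = lastStart; start > ts - sizeMillis; start -= slideMillis) {
          windows.add(new long[] {start, start + sizeMillis});
        }
        return windows;
      }
    }
    ```

    With a slide of 5 and a size of 10, a row at time 7 falls into two windows, [5, 15) and [0, 10), which is why a HOP table function can emit one input row more than once.
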
    LGTM-will-merge-soon needs-a-final-review 
    opened by amaliujia 27
  • [CALCITE-3272] Support TUMBLE as Table Valued Function including an enumerable implementation, stream.iq and DESCRIPTOR

    At a high level, this PR adds support for the following:

    SELECT * FROM TABLE(TUMBLE(TABLE ORDERS, DESCRIPTOR(ROWTIME), INTERVAL '1' MINUTE))
    

    This PR adds TUMBLE as a table-valued function and also adds stream.iq along with an Enumerable implementation. It is a large PR that is also related to the following JIRAs:

    https://jira.apache.org/jira/browse/CALCITE-3340 https://jira.apache.org/jira/browse/CALCITE-3501 https://jira.apache.org/jira/browse/CALCITE-3499 https://jira.apache.org/jira/browse/CALCITE-3418 https://jira.apache.org/jira/browse/CALCITE-3339

    Note that DESCRIPTOR support is also included in this PR.
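
    As a rough illustration of tumbling-window assignment in general, the sketch below computes the single fixed-size window a timestamp belongs to. It is not the PR's Enumerable implementation; the class TumbleWindow, the epoch-millisecond representation, and the alignment of windows to the epoch are assumptions.

    ```java
    /** Hypothetical sketch: the tumbling window that contains a timestamp. */
    final class TumbleWindow {
      private TumbleWindow() {}

      /** Inclusive start of the window of length sizeMillis containing ts. */
      static long windowStart(long ts, long sizeMillis) {
        // floorDiv rounds towards negative infinity, so timestamps before
        // the epoch are also assigned to the correct window.
        return Math.floorDiv(ts, sizeMillis) * sizeMillis;
      }

      /** Exclusive end of the same window. */
      static long windowEnd(long ts, long sizeMillis) {
        return windowStart(ts, sizeMillis) + sizeMillis;
      }
    }
    ```

    For a one-minute window (60000 ms), a row with timestamp 125000 lands in the window [120000, 180000).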

    needs-a-final-review 
    opened by amaliujia 27
  • [CALCITE-2913] Adapter for Apache Kafka

    Add an adapter to expose Kafka topics as STREAM tables.

    KafkaTableFactory is used here so end users need to specify table-topic mapping one-by-one.

    JIRA: https://issues.apache.org/jira/browse/CALCITE-2913

    CC: @danny0405

    LGTM-will-merge-soon 
    opened by mingmxu 26
  • [CALCITE-2808] Add the JSON_LENGTH function

    JSON_LENGTH(json_doc[, path])
    

    Returns the length of a JSON document, or, if a path argument is given, the length of the value within the document identified by the path. Returns NULL if any argument is NULL or the path argument does not identify a value in the document. An error occurs if the json_doc argument is not a valid JSON document, or if the path argument is not a valid path expression or contains a * or ** wildcard.

    The length of a document is determined as follows:

    • The length of a scalar is 1.

    • The length of an array is the number of array elements.

    • The length of an object is the number of object members.

    • The length does not count the length of nested arrays or objects.
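
    The four rules above can be sketched in a few lines. This is an illustration only, not the function added by the PR: it assumes the JSON has already been parsed into plain Java values (a Map for an object, a Collection for an array, anything else a scalar), and it leaves out path evaluation and SQL NULL handling. The class name JsonLengthSketch is hypothetical.

    ```java
    import java.util.Collection;
    import java.util.Map;

    /** Hypothetical sketch of the JSON_LENGTH counting rules. */
    final class JsonLengthSketch {
      private JsonLengthSketch() {}

      /** Length of an already-parsed JSON value; nested values are not counted. */
      static int length(Object value) {
        if (value instanceof Map) {
          return ((Map<?, ?>) value).size(); // number of object members
        }
        if (value instanceof Collection) {
          return ((Collection<?>) value).size(); // number of array elements
        }
        return 1; // scalar
      }
    }
    ```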

    Example SQL:

    SELECT JSON_LENGTH(v) AS c1
    ,JSON_LENGTH(v, 'lax $.a') AS c2
    ,JSON_LENGTH(v, 'strict $.a[0]') AS c3
    ,JSON_LENGTH(v, 'strict $.a[1]') AS c4
    FROM (VALUES ('{"a": [10, true]}')) AS t(v)
    LIMIT 10;
    

    Result:

    | c1 | c2 | c3 | c4 |
    | ---- | ---- | ---- | ---- |
    | 1 | 2 | 1 | 1 |

    LGTM-will-merge-soon 
    opened by XuQianJin-Stars 24
  • [CALCITE-2601] Add REVERSE function

    Fixes CALCITE-2601.

    MySQL

    mysql> SELECT REVERSE('hello');
    +------------------+
    | REVERSE('hello') |
    +------------------+
    | olleh            |
    +------------------+
    1 row in set (0.00 sec)
    

    SQL Server

    DECLARE @str NVARCHAR(100)
    SET @str = 'ABCD'
    SELECT REVERSE(@str)

    PostgreSQL

    testdb=# SELECT REVERSE('abcd');
     reverse
    ---------
     dcba
    (1 row)
    

    Oracle

    SQL> select reverse('12345') from dual;
    REVER
    -----
    54321

    doc:

    https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_reverse
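
    The behavior shown in the database examples above can be sketched in Java as follows. This is only an illustration of the semantics, not the code in the PR; the class ReverseSketch is hypothetical, and returning null for a null input is an assumption that mirrors the usual SQL convention.

    ```java
    /** Hypothetical sketch of a REVERSE string function. */
    final class ReverseSketch {
      private ReverseSketch() {}

      /** Returns the characters of s in reverse order, or null if s is null. */
      static String reverse(String s) {
        // StringBuilder.reverse keeps surrogate pairs together.
        return s == null ? null : new StringBuilder(s).reverse().toString();
      }
    }
    ```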

    returned-with-feedback 
    opened by ambition119 23
  • [CALCITE-4368] TopDownOptTest fails if applying non-substitution rule first

    Usually, O_INPUTS is applied only to groups with a physical convention. But when AbstractConverter is enabled, the input of an AbstractConverter might be a group with NONE convention. In that case there is no need to apply O_INPUTS; otherwise, it might throw an exception due to an impossible transformation (physical convention -> NONE convention).

    Other changes:

    • some cosmetic fix-ups
    • print the upper bound of the RelSubset
    returned-with-feedback tests-missing 
    opened by chunweilei 22
  • [CALCITE-4787] Replace ImmutableBeans with Immutables

    Replace the use of reflection/dynamic proxies with the annotation processor provided by Immutables.

    NOTE: This is an initial patch that changes only one ImmutableBean to Immutables, to show what the changes look like.

    opened by jacques-n 19
  • [CALCITE-1128] Implement JDBC batch update methods in remote driver

    This commit provides an implementation for:

    • Statement.addBatch(String)
    • PreparedStatement.addBatch()
    • PreparedStatement.executeBatch()

    The implementation is fairly straightforward except for the addition of a new server interface: ProtobufMeta. This is a new interface which the Meta implementation can choose to also implement to provide a "native" implementation on top of Protobuf objects instead of the Avatica POJOs.

    During investigations into Avatica performance before 1.7.0, converting protobufs to POJOs was found to be a very hot code path. This short-circuit avoids creating extra objects on the heap, and the computation needed to create them, in what should be a very hot code path for write workloads.
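
    For context, client code that exercises these batch methods typically looks like the sketch below. It uses only the standard java.sql API; the class BatchInsertSketch, the table "users", and the column "name" are hypothetical, and nothing here is specific to Avatica or ProtobufMeta.

    ```java
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;

    /** Hypothetical sketch of a JDBC batch insert. */
    final class BatchInsertSketch {
      private BatchInsertSketch() {}

      /** Queues one INSERT per name and sends them to the server together. */
      static int[] insertNames(Connection conn, List<String> names)
          throws SQLException {
        // Table and column names are made up for this example.
        String sql = "INSERT INTO users (name) VALUES (?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
          for (String name : names) {
            ps.setString(1, name);
            ps.addBatch(); // queue the current parameter values
          }
          return ps.executeBatch(); // one update count per queued row
        }
      }
    }
    ```

    Sending the rows as one batch avoids a round trip per row, which is where a cheaper server-side representation of each request would matter most.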

    opened by joshelser 19
  • [CALCITE-1386] ITEM operator seems to ignore the value type of collection and assign the value to Object

    Observed behavior is described here: https://issues.apache.org/jira/browse/CALCITE-1386

    Below is the description of this patch:

    • Modify MethodImplementor to cast return value to desired return type when necessary
    • Change ItemImplementor to use NullPolicy.ANY since ITEM can still return null even though both operands are not null
    • Fix Types.castIfNecessary to handle RecordType as an exceptional case (toClass() doesn't handle RecordType and throws Exception)
    • Change Csv tests to test its behavior

    Please let me know if any additional work is needed. Thanks in advance!

    opened by HeartSaVioR 18
  • [CALCITE-4898] Upgrading Elasticsearch version from 7.0.1 to 7.15.2

    This PR upgrades embedded Elasticsearch version from 7.0.1 to 7.15.2.

    Description:

    • New dependency: org.codelibs.elasticsearch.module:scripting-painless-spi, because the module "org.elasticsearch.painless.spi" was removed from lang-painless after ES 7.15.0
    • Third-party Maven repository: the org.codelibs.elasticsearch.module:lang-painless artifact stopped being maintained after ES 7.10.2 and was migrated to https://maven.codelibs.org/
    • RestClient upgrade: the low-level REST client in ES is highly compatible across 7.x (it only issues HTTP requests), and it is also upgraded to 7.15.2

    Self-verification: I ran some tests locally to make sure the new features can be used (these checks are not added to the unit tests).

    • Supported: new features such as RareTerms and minimum_interval in auto_date_histogram can be used successfully; they are not supported in ES 7.0.1
    • Not supported: top_metrics, multi_terms, rate, and other x-pack features are currently not supported. These features can be registered when AnalyticsPlugin is loaded to build the ES node (test environment); however, the dependency org.elasticsearch.plugin:x-pack-analytics is not currently available from Maven Central
    opened by ILuffZhe 17
  • [CALCITE-5452] Add BigQuery LENGTH() as synonym for CHAR_LENGTH()

    Add LENGTH() as a library function that is an alias for the standard CHAR_LENGTH(). Some dependencies of soon-to-be-deprecated standard functions were refactored to avoid null pointer exceptions caused by circular dependencies between the standard and library operators. This decision was made with the help of @mkou.

    opened by tanclary 1
  • [CALCITE-5436] Implement DATE_SUB, TIME_SUB, TIMESTAMP_SUB (compatible w/ BigQuery)

    Add support for BigQuery's DATE_SUB, TIME_SUB, and TIMESTAMP_SUB functions. Create the MINUS_DATE2 operator to handle subtracting an interval from a timestamp, time, or date expression. This differs from the standard MINUS_DATE operator, which takes 3 arguments and is designed to subtract one time expression from another. Add week and quarter as valid time units for intervals in the parser.
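
    The two-argument "expression minus interval" shape can be illustrated with java.time. This sketch is not the MINUS_DATE2 implementation: the class DateSubSketch and its Unit enum are hypothetical, only dates (not times or timestamps) are covered, and a quarter is treated as three calendar months as an assumption.

    ```java
    import java.time.LocalDate;

    /** Hypothetical sketch of subtracting an interval from a date. */
    final class DateSubSketch {
      private DateSubSketch() {}

      /** Interval units for the example, including week and quarter. */
      enum Unit { DAY, WEEK, MONTH, QUARTER, YEAR }

      /** Returns date minus (amount x unit). */
      static LocalDate dateSub(LocalDate date, long amount, Unit unit) {
        switch (unit) {
          case DAY:
            return date.minusDays(amount);
          case WEEK:
            return date.minusWeeks(amount);
          case MONTH:
            return date.minusMonths(amount);
          case QUARTER:
            // Assumption: one quarter is three calendar months.
            return date.minusMonths(Math.multiplyExact(amount, 3L));
          case YEAR:
            return date.minusYears(amount);
          default:
            throw new IllegalArgumentException("unsupported unit: " + unit);
        }
      }
    }
    ```

    Note that java.time clamps to the last valid day of the month when the day does not exist in the target month; whether that matches the SQL function's behavior in every case is not verified here.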

    opened by tanclary 0
Owner
The Apache Software Foundation
Apache Druid: a high performance real-time analytics database.

Website | Documentation | Developer Mailing List | User Mailing List | Slack | Twitter | Download Apache Druid Druid is a high performance real-time a

The Apache Software Foundation 12.3k Jan 1, 2023
Apache Hive

Apache Hive (TM) The Apache Hive (TM) data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storag

The Apache Software Foundation 4.6k Dec 28, 2022
The Chronix Server implementation that is based on Apache Solr.

Chronix Server The Chronix Server is an implementation of the Chronix API that stores time series in Apache Solr. Chronix uses several techniques to o

Chronix 262 Jul 3, 2022
Apache Pinot - A realtime distributed OLAP datastore

What is Apache Pinot? Features When should I use Pinot? Building Pinot Deploying Pinot to Kubernetes Join the Community Documentation License What is

The Apache Software Foundation 4.4k Dec 30, 2022
Apache Ant is a Java-based build tool.

Apache Ant What is it? ----------- Ant is a Java based build tool. In theory it is kind of like "make" without makes wrinkles and with

The Apache Software Foundation 355 Dec 22, 2022
Apache Aurora - A Mesos framework for long-running services, cron jobs, and ad-hoc jobs

NOTE: The Apache Aurora project has been moved into the Apache Attic. A fork led by members of the former Project Management Committee (PMC) can be fo

The Apache Software Foundation 627 Nov 28, 2022
Apache Drill is a distributed MPP query layer for self describing data

Apache Drill Apache Drill is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage sys

The Apache Software Foundation 1.8k Jan 7, 2023
Flink Connector for Apache Doris(incubating)

Flink Connector for Apache Doris (incubating) Flink Doris Connector More information about compilation and usage, please visit Flink Doris Connector L

The Apache Software Foundation 115 Dec 20, 2022
HurricaneDB a real-time distributed OLAP engine, powered by Apache Pinot

HurricaneDB is a real-time distributed OLAP datastore, built to deliver scalable real-time analytics with low latency. It can ingest from batch data sources (such as Hadoop HDFS, Amazon S3, Azure ADLS, Google Cloud Storage) as well as stream data sources (such as Apache Kafka).

GuinsooLab 4 Dec 28, 2022
Calcite Clojure wrapper / integration

calcite-clj - Use Apache Calcite from Clojure Small library to facilitate the implementation of calcite adapters in clojure. It implements org.apache.

Eugen Stan 24 Nov 5, 2022
Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning

Oryx 2 is a realization of the lambda architecture built on Apache Spark and Apache Kafka, but with specialization for real-time large scale machine l

Oryx Project 1.8k Dec 28, 2022
Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning

Oryx 2 is a realization of the lambda architecture built on Apache Spark and Apache Kafka, but with specialization for real-time large scale machine l

Oryx Project 1.7k Mar 12, 2021
Equivalent Exchange 3 (pahimar/Equivalent-Exchange-3): mods for Minecraft. License: Apache 2.

Welcome to Equivalent Exchange 3! All versions are available here Minecraft Forums page Compiling EE3 - For those that want the latest unreleased feat

Rob Davis 709 Dec 15, 2022
Apache Solr is an enterprise search platform written in Java and using Apache Lucene.

Apache Solr is an enterprise search platform written in Java and using Apache Lucene. Major features include full-text search, index replication and sharding, and result faceting and highlighting.

The Apache Software Foundation 630 Dec 28, 2022
FLiP: StreamNative: Cloud-Native: Streaming Analytics Using Apache Flink SQL on Apache Pulsar

StreamingAnalyticsUsingFlinkSQL FLiP: StreamNative: Cloud-Native: Streaming Analytics Using Apache Flink SQL on Apache Pulsar Running on NVIDIA XAVIER

Timothy Spann 5 Dec 19, 2021
Apache Cayenne is an open source persistence framework licensed under the Apache License

Apache Cayenne is an open source persistence framework licensed under the Apache License, providing object-relational mapping (ORM) and remoting services.

The Apache Software Foundation 284 Dec 31, 2022