The official home of the Presto distributed SQL query engine for big data

Overview


Presto is a distributed SQL query engine for big data.

See the User Manual for deployment instructions and end user documentation.

Requirements

  • Mac OS X or Linux
  • Java 8 Update 151 or higher (8u151+), 64-bit. Both Oracle JDK and OpenJDK are supported.
  • Maven 3.3.9+ (for building)
  • Python 2.4+ (for running with the launcher script)

Building Presto

Presto is a standard Maven project. Simply run the following command from the project root directory:

./mvnw clean install

On the first build, Maven will download all the dependencies from the internet and cache them in the local repository (~/.m2/repository), which can take a considerable amount of time. Subsequent builds will be faster.
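
Once the dependencies are cached, you can also build without network access by adding Maven's offline flag (an optional convenience, not a required step):

./mvnw -o clean install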

Presto has a comprehensive set of unit tests that can take several minutes to run. You can disable the tests when building:

./mvnw clean install -DskipTests

Running Presto in your IDE

Overview

After building Presto for the first time, you can load the project into your IDE and run the server. We recommend using IntelliJ IDEA. Because Presto is a standard Maven project, you can import it into your IDE using the root pom.xml file. In IntelliJ, choose Open Project from the Quick Start box or choose Open from the File menu and select the root pom.xml file.

After opening the project in IntelliJ, double check that the Java SDK is properly configured for the project:

  • Open the File menu and select Project Structure
  • In the SDKs section, ensure that a 1.8 JDK is selected (create one if none exist)
  • In the Project section, ensure the Project language level is set to 8.0 as Presto makes use of several Java 8 language features

Presto comes with sample configuration that should work out-of-the-box for development. Use the following options to create a run configuration:

  • Main Class: com.facebook.presto.server.PrestoServer
  • VM Options: -ea -XX:+UseG1GC -XX:G1HeapRegionSize=32M -XX:+UseGCOverheadLimit -XX:+ExplicitGCInvokesConcurrent -Xmx2G -Dconfig=etc/config.properties -Dlog.levels-file=etc/log.properties
  • Working directory: $MODULE_DIR$
  • Use classpath of module: presto-main

The working directory should be the presto-main subdirectory. In IntelliJ, using $MODULE_DIR$ accomplishes this automatically.

Additionally, the Hive plugin must be configured with the location of your Hive metastore Thrift service. Add the following to the list of VM options, replacing localhost:9083 with the correct host and port (or use the value below if you do not have a Hive metastore):

-Dhive.metastore.uri=thrift://localhost:9083

Using SOCKS for Hive or HDFS

If your Hive metastore or HDFS cluster is not directly accessible to your local machine, you can use SSH port forwarding to access it. Set up a dynamic SOCKS proxy with SSH listening on local port 1080:

ssh -v -N -D 1080 server

Then add the following to the list of VM options:

-Dhive.metastore.thrift.client.socks-proxy=localhost:1080

Running the CLI

Start the CLI to connect to the server and run SQL queries:

presto-cli/target/presto-cli-*-executable.jar
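
The CLI also accepts connection options; for example, to point it at a local development server with the hive catalog as the default (the flag values below are illustrative and match the sample development configuration):

presto-cli/target/presto-cli-*-executable.jar --server localhost:8080 --catalog hive --schema default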

Run a query to see the nodes in the cluster:

SELECT * FROM system.runtime.nodes;

In the sample configuration, the Hive connector is mounted in the hive catalog, so you can run the following queries to show the tables in the Hive database default:

SHOW TABLES FROM hive.default;
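
Once you know a table name, you can query it with its fully qualified name; for example (the table name here is hypothetical):

SELECT * FROM hive.default.orders LIMIT 10;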

Code Style

We recommend you use IntelliJ as your IDE. The code style template for the project can be found in the codestyle repository along with our general programming and Java guidelines. In addition to those you should also adhere to the following:

  • Alphabetize sections in the documentation source files (both in table of contents files and other regular documentation files). In general, alphabetize methods/variables/sections if such ordering already exists in the surrounding code.
  • When appropriate, use the Java 8 stream API. However, note that the stream implementation does not perform well, so avoid using it in inner loops or otherwise performance-sensitive sections.
  • Categorize errors when throwing exceptions. For example, PrestoException takes an error code as an argument: PrestoException(HIVE_TOO_MANY_OPEN_PARTITIONS). This categorization lets you generate reports so you can monitor the frequency of various failures (see the sketch after this list).
  • Ensure that all files have the appropriate license header; you can generate the license by running mvn license:format.
  • Consider using String formatting (printf style formatting using the Java Formatter class): format("Session property %s is invalid: %s", name, value) (note that format() should always be statically imported). Sometimes, if you only need to append something, consider using the + operator.
  • Avoid using the ternary operator except for trivial expressions.
  • Use an assertion from Airlift's Assertions class if there is one that covers your case rather than writing the assertion by hand. Over time we may move over to more fluent assertions like AssertJ.
  • When writing a Git commit message, follow these guidelines.
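
The sketch below illustrates the error categorization and format() conventions above; the variables and the exact import paths are approximate and shown for illustration only:

import com.facebook.presto.spi.PrestoException;

import static com.facebook.presto.hive.HiveErrorCode.HIVE_TOO_MANY_OPEN_PARTITIONS;
import static java.lang.String.format;

// Illustrative only: throw a categorized exception with a formatted message
if (openPartitions > maxOpenPartitions) {
    throw new PrestoException(HIVE_TOO_MANY_OPEN_PARTITIONS,
            format("Too many open partitions: %s (max %s)", openPartitions, maxOpenPartitions));
}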

Building the Web UI

The Presto Web UI is composed of several React components and is written in JSX and ES6. This source code is compiled and packaged into browser-compatible JavaScript, which is then checked in to the Presto source code (in the dist folder). You must have Node.js and Yarn installed to execute these commands. To update this folder after making changes, simply run:

yarn --cwd presto-main/src/main/resources/webapp/src install

If no JavaScript dependencies have changed (i.e., no changes to package.json), it is faster to run:

yarn --cwd presto-main/src/main/resources/webapp/src run package

To simplify iteration, you can also run in watch mode, which automatically re-compiles when changes to source files are detected:

yarn --cwd presto-main/src/main/resources/webapp/src run watch

To iterate quickly, simply re-build the project in IntelliJ after packaging is complete. Project resources will be hot-reloaded and changes are reflected on browser refresh.

Release Notes

When authoring a pull request, the PR description should include its relevant release notes. Follow Release Notes Guidelines when authoring release notes.
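
For reference, the release notes section of a PR description uses the block format seen in the PRs below, for example:

== RELEASE NOTES ==

Hive Changes
* Add Amazon S3 Select pushdown for JSON files.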

Comments
  • Fix optimized parquet reader complex hive types processing

    Fix optimized parquet reader complex hive types processing

    • Fix reading repeated fields when the Parquet file consists of multiple pages, so the beginning of a field can be on one page and its end on the next page.

    • Support reading empty arrays

    • Determine null values of optional fields

    • Add tests for hive complex types: arrays, maps and structs

    • Rewrite tests to read Parquet files consisting of multiple pages

    • Add TestDataWritableWriter with a patch for empty arrays and empty maps, because the bug https://issues.apache.org/jira/browse/HIVE-13632 is already fixed in the current Hive version, so Presto should be able to read empty arrays too

    CLA Signed 
    opened by kgalieva 77
  • Add support for prepared statements in JDBC driver

    Add support for prepared statements in JDBC driver

    I'm using presto-jdbc-0.66-SNAPSHOT.jar and trying to execute a Presto query against presto-server from my Java application.

    The sample code below, using a JDBC Statement, works well.

        Class.forName("com.facebook.presto.jdbc.PrestoDriver");
        Connection connection = DriverManager.getConnection("jdbc:presto://192.168.33.33:8080/hive/default", "hive", "hive");
    
        Statement statement = connection.createStatement();
        ResultSet rs = statement.executeQuery("SHOW TABLES");
        while(rs.next()) {
            System.out.println(rs.getString(1));
        }
    

    However, using a JDBC PreparedStatement throws an exception. Does presto-jdbc not support PreparedStatement yet? Here's my test code and exception info.

    Test Code :

        Class.forName("com.facebook.presto.jdbc.PrestoDriver");
        Connection connection = DriverManager.getConnection("jdbc:presto://192.168.33.33:8080/hive/default", "hive", "hive");
    
        PreparedStatement ps = connection.prepareStatement("SHOW TABLES");
        ResultSet rs = ps.executeQuery();
        while(rs.next()) {
            System.out.println(rs.getString(1));
        }
    

    Exception Info :

        java.lang.UnsupportedOperationException: PreparedStatement
    at com.facebook.presto.jdbc.PrestoPreparedStatement.<init>(PrestoPreparedStatement.java:44)
    at com.facebook.presto.jdbc.PrestoConnection.prepareStatement(PrestoConnection.java:93)
    at com.nsouls.frescott.hive.mapper.PrestoConnectionTest.testShowTable(PrestoConnectionTest.java:37)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:74)
    at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:83)
    at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:72)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:231)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:88)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
    at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:71)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:174)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:74)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:202)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:65)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
    
    opened by felika 46
  • Add support for query pushdown to S3 using S3 Select

    Add support for query pushdown to S3 using S3 Select

    This change will allow Presto users to improve the performance of their queries using S3SelectPushdown. It pushes down projections and predicate evaluations to S3. As a result Presto doesn't need to download full S3 objects and only data required to answer the user's query is returned to Presto, thereby improving performance.

    S3SelectPushdown Technical Document: S3SelectPushdown.pdf

    This PR is a continuation of https://github.com/prestodb/presto/pull/11033.

    CLA Signed 
    opened by same3r 42
  • Implement EXPLAIN ANALYZE

    Implement EXPLAIN ANALYZE

    This should work similarly to PostgreSQL (http://www.postgresql.org/docs/9.4/static/sql-explain.html), by executing the query, recording stats, and then rendering the stats along with the plan. A first pass at implementing this could probably be to render similarly to EXPLAIN (TYPE DISTRIBUTED) with the stage & operator stats inserted.
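
    For illustration, the requested usage would look something like this (the table name is hypothetical):

    EXPLAIN ANALYZE SELECT orderstatus, count(*) FROM orders GROUP BY orderstatus;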

    enhancement 
    opened by cberner 42
  • Performance Regressions in Presto 0.206?

    Performance Regressions in Presto 0.206?

    I was recently benchmarking Presto 0.206 vs 0.172. The tests are run on Parquet datasets stored on S3.

    We found that while Presto 0.206 was generally faster on smaller datasets, there were some significant performance regressions on larger datasets. The CPU time reported by EXPLAIN ANALYZE was lower in 0.206 than in 0.172, but the wall time was much longer.

    This possibly indicates either stragglers or some sort of scheduling bug that adversely affects parallelism. Note that the concurrency settings like task.concurrency are the same in both clusters.

    For instance, on the TPCH scale 1000 dataset, query #7 slowed down by a factor of 2x in wall time. The query was:

    SELECT supp_nation,
           cust_nation,
           l_year,
           sum(volume) AS revenue
    FROM
      (SELECT n1.n_name AS supp_nation,
              n2.n_name AS cust_nation,
              substr(l_shipdate, 1, 4) AS l_year,
              l_extendedprice * (1 - l_discount) AS volume
       FROM lineitem_parq,
            orders_parq,
            customer_parq,
            supplier_parq,
            nation_parq n1,
            nation_parq n2
       WHERE s_suppkey = l_suppkey
         AND o_orderkey = l_orderkey
         AND c_custkey = o_custkey
         AND s_nationkey = n1.n_nationkey
         AND c_nationkey = n2.n_nationkey
         AND ((n1.n_name = 'KENYA'
               AND n2.n_name = 'PERU')
              OR (n1.n_name = 'PERU'
                  AND n2.n_name = 'KENYA'))
         AND l_shipdate BETWEEN '1995-01-01' AND '1996-12-31' ) AS shipping
    GROUP BY supp_nation,
             cust_nation,
             l_year
    ORDER BY supp_nation,
             cust_nation,
             l_year;
    

    I compared the output of EXPLAIN ANALYZE from both versions of Presto and cannot find anything that could explain this. Here are some observations:

    • The CPU time reported by each stage was usually lower in 0.206. This probably rules out operator performance regressions.
    • Some of the leaf stages were using ScanProject in 0.172, but they use ScanFilterProject in 0.205. This actually reduces the output rows and leads to drastically lower CPU usage in upper stages of the query tree. This is a big improvement and should have led to faster query processing.

    References

    • Explain analyze from 0.206 - https://gist.github.com/anoopj/40eea820c1c310dff72139d495ac98b0
    • Explain analyze from 0.172 - https://gist.github.com/anoopj/01985fe0ad298dad4c22b1444e1f1e21
    opened by anoopj 39
  • [native] PrestoCpp build from source pipeline

    [native] PrestoCpp build from source pipeline

    A fully automated build-from-source process proposal for presto-native-execution (PrestoCpp and Velox). A README file is added for clarification. We appreciate any and all feedback.

    Prestissimo - Dockerfile build

    💡 PrestoDB repository: Presto - https://github.com/prestodb/presto

    💡 Velox repository: Velox - https://github.com/facebookincubator/velox

    Practical Velox implementation using PrestoCpp

    📝 Note: This README and the build process were adapted from an internal pipeline. You can e-mail the author at [email protected] if you have questions.

    Prestissimo, found in the PrestoDB GitHub repository as 'presto-native-execution', is an effort to make PrestoDB even better using the Velox library as a starting point. Both PrestoCpp and Velox are mainly written in low-level C and C++17, which makes the build-from-scratch process enormously complicated. To simplify this process, the Intel Cloud Native Data Services Team is introducing a 3-stage, fully automated Docker build process based on the unmodified project GitHub repository.

    Quick Start

    1. Clone this repository

    git clone https://github.com/prestodb/presto prestodb
    

    2. (Optional) Define and export Docker registry, image name and image tag variables

    📝 Note: Remember to end your IMAGE_REGISTRY with a / as this is required for full tag generation.

    💡 Tip: Depending on your configuration, you may need to run all of the commands below as the root user; to switch, type sudo su as your first command.

    💡 Tip: If IMAGE_REGISTRY is not specified, IMAGE_PUSH should be set to '0' or the Docker image push stage will fail.

    Type in your console, changing variable values to meet your needs:

    # defaults to 'avx', more info on Velox GitHub
    export CPU_TARGET="avx"
    # defaults to 'presto/prestissimo-${CPU_TARGET}-centos'
    export IMAGE_NAME='presto/prestissimo-${CPU_TARGET}-centos'
    # defaults to 'latest'
    export IMAGE_TAG='latest'
    # defaults to ''
    export IMAGE_REGISTRY='https://my_docker_registry.com/'
    # defaults to '0'
    export IMAGE_PUSH='0'
    

    3. Make sure Docker daemon is running

    (Ubuntu users) Type in your console:

    systemctl status docker
    

    4. Build Dockerfile repo

    Type in your console:

    cd prestodb/presto-native-execution
    make runtime-container
    

    The process is fully automated and requires no user interaction. Building the images for the first time can take up to a couple of hours (~1-2 hours using 10 processor cores).

    5. Run container

    📝 Note: Remember that you should start the Presto Java server first

    Depending on the values you have set, the container tag is defined as

    PRESTO_CPP_TAG="${IMAGE_REGISTRY}${IMAGE_NAME}:${IMAGE_TAG}"

    For the default values, this will be just:

    PRESTO_CPP_TAG=presto/prestissimo-avx-centos:latest

    To run a container built with the default tag, execute:

    docker run "presto/prestissimo-avx-centos:latest" \
                --use-env-params \
                --discovery-uri=http://localhost:8080 \
                --http-server-port=8080
    

    To run the container interactively, without executing the entrypoint file:

    docker run -it --entrypoint=/bin/bash "presto/prestissimo-avx-centos:latest"
    

    Container manual build

    For a manual build outside the Intel network, or without access to the Cloud Native Data Services Poland Docker registry, follow the steps below. In your terminal - in the same session in which you want to build the images - define and export environment variables:

    export CPU_TARGET="avx"
    export IMAGE_NAME='presto/prestissimo-${CPU_TARGET}-centos'
    export IMAGE_TAG='latest'
    export IMAGE_REGISTRY='some-registry.my-domain.com/'
    export IMAGE_PUSH='0'
    export PRESTODB_REPOSITORY=$(git config --get remote.origin.url)
    export PRESTODB_CHECKOUT=$(git show -s --format="%H" HEAD)
    

    Here IMAGE_NAME and IMAGE_TAG will be the Prestissimo release image name and tag, and IMAGE_REGISTRY will be the registry that the image will be tagged with and which will be used to download the images from previous stages if there are no cached images locally. CPU_TARGET will be unchanged in most cases; for more info read the Velox documentation. PRESTODB_REPOSITORY and PRESTODB_CHECKOUT will be used as the build repository and branch inside the container. You can set them manually or populate them using the git commands shown.

    Then, for example, to build the containers from behind a proxy server, change to the Dockerfile directory and type:

    cd presto-native-execution/scripts/release-centos-dockerfile
    docker build \
        --network=host \
        --build-arg http_proxy  \
        --build-arg https_proxy \
        --build-arg no_proxy    \
        --build-arg CPU_TARGET  \
        --build-arg PRESTODB_REPOSITORY \
        --build-arg PRESTODB_CHECKOUT \
        --tag "${IMAGE_REGISTRY}${IMAGE_NAME}:${IMAGE_TAG}" .
    

    Build process - more info - prestissimo (with artifacts ~35 GB, without ~10 GB)

    Most of the runtime and build-time dependencies are downloaded, configured and installed in this step. The result of this step is a starting point for both the second and third stages. This container will be built 'once per breaking change' in either repository, and can be used as a starting point for CI/CD-integrated systems. This step installs Maven, Java 8, Python3-Dev, libboost-dev and lots of other large frameworks, libraries and applications, and ensures that all steps from stage 2 will run with no errors.

    On top of the container from step 1, the repository is initialized, Velox and its submodules are updated, and adapters, connectors and side dependencies are built and configured. A full native PrestoDB repository build is done using the Meta Maven wrapper mvnw. After all of those partial steps, make and build are run for PrestoCpp and Velox with Parquet, ORC, and the Hive connector with Thrift, with the S3-EMRFS filesystem implementation (scheme s3://) and the Hadoop filesystem implementation.

    ### DIRECTORY AND MAIN BUILD ARTIFACTS
    ## Native Presto JAVA build artifacts:
    /root/.m2/
    
    ## Build, third party dependencies, mostly for adapters
    /opt/dependency/
    /opt/dependency/aws-sdk-cpp
    /opt/dependency/install/
    /opt/dependency/install/run/
    /opt/dependency/install/bin/
    /opt/dependency/install/lib64/
    
    ## Root PrestoDB application directory
    /opt/presto/
    
    ## Root GitHub clone of PrestoDB repository
    /opt/presto/_repo/
    
    ## Root PrestoCpp subdirectory
    /opt/presto/_repo/presto-native-execution/
    
    ## Root Velox GitHub repository directory, as PrestoDB submodule
    /opt/presto/_repo/presto-native-execution/Velox
    
    ## Root build results directory for PrestoCpp with Velox
    /opt/presto/_repo/presto-native-execution/_build/release/
    /opt/presto/_repo/presto-native-execution/_build/release/velox/
    /opt/presto/_repo/presto-native-execution/_build/release/presto_cpp/
    

    The release container build contains mostly just the must-have runtime files, including the presto_server executable and some libraries. What is included in the final released container depends on user needs and can be adjusted.

    Prestissimo - runtime configuration and settings

    ⚠️ Notice: The presto-native-execution binary requires 32 GB of RAM at runtime to start (default settings). To override this and overcome the runtime error, add the line node.memory_gb=8 to node.properties.

    The Presto server with all dependencies can be found inside /opt/presto/; the runtime name is presto_server. There are two ways of starting PrestoCpp using the provided entry point /opt/entrypoint.sh.

    1) Quick start - pass parameters to entrypoint

    This is valid both when running with Docker and with Kubernetes. Using this method is not advised; users should prefer mounting configuration files using Kubernetes.

    "/opt/entrypoint.sh --use-env-params --discovery-uri=http://presto-coordinaator.default.svc.cluster.local:8080 --http-server-port=8080"
    

    2) Using in Kubernetes environment:

    Mount a config file inside the container as /opt/presto/node.properties.template. Replace each variable with your configuration values or leave it as is:

    Notice: set the same values for the Java coordinator as for PrestoCpp - version, location and environment should be the same or you will get connection errors.

    presto.version=0.273.3
    node.location=datacenter-warsaw
    node.environment=test-environment
    node.data-dir=/var/presto/data
    catalog.config-dir=/opt/presto/catalog
    plugin.dir=/opt/presto/plugin
    # node.id is generated and filled during machine startup if not specified
    

    Mount a config file inside the container as /opt/presto/config.properties.template. Replace each variable with your configuration values:

    coordinator=false
    http-server.http.port=8080
    discovery.uri=http://presto-coordinaator.default.svc.cluster.local:8080
    

    3) Hive-Metastore connector and S3 configuration:

    For the minimum required configuration, just mount the file /opt/presto/catalog/hive.properties inside the container at the given path (fill hive.metastore.uri with your metastore endpoint address):

    connector.name=hive-hadoop2
    hive.metastore.uri=thrift://hive-metastore-service.default.svc:9098
    hive.pushdown-filter-enabled=true
    cache.enabled=true
    

    Settings required by the S3 connector and the Velox query engine; replace with your values and refer to the Presto Hive connector settings documentation:

    hive.s3.path-style-access={{ isPathstyle }}
    hive.s3.endpoint={{ scheme }}://{{ serviceFqdnTpl . }}:{{ portRest }}
    hive.s3.aws-access-key={{ accessKey }}
    hive.s3.aws-secret-key={{ secretKey }}
    hive.s3.ssl.enabled={{ sslEnabled }}
    hive.s3select-pushdown.enabled={{ s3selectPushdownFilterEnabled }}
    hive.parquet.pushdown-filter-enabled={{ parquetPushdownFilterEnabled }}
    


    Signed-off-by: Linkiewicz, Milosz [email protected]

    opened by Mionsz 36
  • Add support for query pushdown to S3 using S3Select

    Add support for query pushdown to S3 using S3Select

    This change will allow Presto users to improve the performance of their queries using S3SelectPushdown. It pushes down projections and predicate evaluations to S3. As a result Presto doesn't need to download full S3 objects and only data required to answer the user's query is returned to Presto, thereby improving performance.

    S3SelectPushdown Technical Document: S3SelectPushdown.pdf

    PR UPDATE: Closed this PR as it was slow to work with due to the large volume of comments. Created a new PR to continue the work: https://github.com/prestodb/presto/pull/11970

    CLA Signed 
    opened by same3r 36
  • Support multiple columns in IN predicate

    Support multiple columns in IN predicate

    Support queries like:

    presto:sf1> select count(*) from lineitem where (orderkey, linenumber) IN (SELECT orderkey, linenumber from lineitem);
    Query 20161018_062422_00016_uqzsf failed: line 1:60: Multiple columns returned by subquery are not yet supported. Found 2
    

    It should be easy to implement once https://github.com/prestodb/presto/issues/6384 gets implemented.

    opened by kokosing 36
  • Prune Nested Fields for Parquet Columns

    Prune Nested Fields for Parquet Columns

    Read only the necessary fields for Parquet nested columns. Currently, Presto will read all the fields in a struct for Parquet columns, e.g.

    select s.a, s.b
    from t
    

    If it is a Parquet file with a struct column s: {a int, b double, c long, d float}, current Presto will read a, b, c, d from s and output just a and b.

    For columnar storage such as Parquet or ORC, we could do better by reading just the necessary fields. In the previous example, just read {a int, b double} from s, skipping the other fields to save IO.

    This patch introduces an optional NestedFields in ColumnHandle. When optimizing the plan, the PruneNestedColumns optimizer visits expressions and puts candidate nested fields into the ColumnHandle. When scanning Parquet files, the record reader can use NestedFields to read only the necessary fields.

    This has a dependency on @jxiang's https://github.com/prestodb/presto/pull/4714, which gives us the flexibility to specify metastore schemas differently from Parquet file schemas.

    @dain @martint @electrum @cberner @erichwang any comments are appreciated

    CLA Signed 
    opened by zhenxiao 36
  • Cassandra connector IN query very slow planning on large list

    Cassandra connector IN query very slow planning on large list

    A query like -

    select col1
    from table
    where col2 in (<long list of integers>)
    and col3 in (<long list of string>)
    and col4 in (<another long list of integers>)
    and col1 is not null
    group by col1;
    

    takes more than 5 minutes just in planning. The Cassandra table being queried has a lot of partitions, and the IN list lengths I was experimenting with were anywhere between 50 and 200. <col2, col3, col4> together form the partition key, so I don't expect a full table scan to take place during planning or execution. Any ideas?

    opened by aandis 34
  • Add InMemory connector

    Add InMemory connector

    Add connector that stores all data in memory on the workers.

    The rationale behind it is to serve as storage for SQL query benchmarking. Writing JMH unit benchmarks from scratch is time-consuming to set up; it's often much easier to write a query against TPCH. Previous benchmarks had the significant drawback that generating data in the TPCH connector used most of the CPU time. With the InMemory connector that's no longer the case.

    The connector is based on BlackHole, and the first commit is just a copy/paste with some renames.

    CLA Signed ready-to-merge 
    opened by pnowojski 33
  • folly compiled failed when running setup-macos.sh

    folly compiled failed when running setup-macos.sh

    When running setup-macos.sh in presto-native-execution/scripts, I got the following errors while it compiled the folly project.

    • run_and_time install_folly
    • install_folly
    • github_checkout facebook/folly v2022.07.11.00
    • local REPO=facebook/folly
    • shift
    • local VERSION=v2022.07.11.00
    • shift
    • local GIT_CLONE_PARAMS= ++ basename facebook/folly
    • local DIRNAME=folly
    • cd /Users/ericdoug/Documents/mydev/presto/presto-native-execution/scripts
    • '[' -z folly ']'
    • '[' -d folly ']'
    • prompt 'folly already exists. Delete?' folly already exists. Delete? [Y, n] Y
    • rm -rf folly
    • '[' '!' -d folly ']'
    • git clone -q -b v2022.07.11.00 [email protected]:facebook/folly.git Note: switching to '4ba3bfed38ad14d0951d82b154c44235d380f59b'.

    You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by switching back to a branch.

    If you want to create a new branch to retain commits you create, you may do so (now or later) by using -c with the switch command. Example:

    git switch -c

    Or undo this operation with:

    git switch -

    Turn off this advice by setting config variable advice.detachedHead to false

    • cd folly ++ brew --prefix [email protected]
    • OPENSSL_ROOT_DIR=/usr/local/opt/[email protected]
    • cmake_install -DBUILD_TESTS=OFF +++ pwd ++ basename /Users/ericdoug/Documents/mydev/presto/presto-native-execution/scripts/folly
    • local NAME=folly
    • local BINARY_DIR=_build
    • '[' -d _build ']'
    • mkdir -p _build
    • CPU_TARGET=avx ++ get_cxx_flags avx ++ local CPU_ARCH=avx ++ local OS +++ uname ++ OS=Darwin ++ local MACHINE +++ uname -m ++ MACHINE=x86_64 ++ ADDITIONAL_FLAGS= ++ '[' -z avx ']' ++ case $CPU_ARCH in ++ echo -n '-mavx2 -mfma -mavx -mf16c -mlzcnt -std=c++17 -mbmi2 '
    • COMPILER_FLAGS='-mavx2 -mfma -mavx -mf16c -mlzcnt -std=c++17 -mbmi2 '
    • cmake -Wno-dev -B_build -GNinja -DCMAKE_POSITION_INDEPENDENT_CODE=ON -DCMAKE_CXX_STANDARD=17 '' '' '-DCMAKE_CXX_FLAGS=-mavx2 -mfma -mavx -mf16c -mlzcnt -std=c++17 -mbmi2 ' -DBUILD_TESTING=OFF -DBUILD_TESTS=OFF CMake Warning: Ignoring empty string ("") provided on the command line.

    CMake Warning: Ignoring empty string ("") provided on the command line.

    -- The CXX compiler identification is AppleClang 14.0.0.14000029 -- The C compiler identification is AppleClang 14.0.0.14000029 -- The ASM compiler identification is Clang with GNU-like command-line -- Found assembler: /Library/Developer/CommandLineTools/usr/bin/cc -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc - skipped -- Detecting C compile features -- Detecting C compile features - done -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success -- Found Threads: TRUE
    -- Found Boost: /usr/local/lib/cmake/Boost-1.79.0/BoostConfig.cmake (found suitable version "1.79.0", minimum required is "1.51.0") found components: context filesystem program_options regex system thread -- Found DoubleConversion: /usr/local/lib/libdouble-conversion.a
    -- Could NOT find gflags (missing: LIBGFLAGS_LIBRARY LIBGFLAGS_INCLUDE_DIR) -- Could NOT find Glog (missing: GLOG_LIBRARY GLOG_INCLUDE_DIR) -- Found libevent: /usr/local/lib/libevent.dylib -- Found ZLIB: /Library/Developer/CommandLineTools/SDKs/MacOSX13.1.sdk/usr/lib/libz.tbd (found version "1.2.11") -- Found OpenSSL: /usr/local/opt/[email protected]/lib/libcrypto.dylib (found suitable version "1.1.1q", minimum required is "1.1.1")
    -- Looking for ASN1_TIME_diff -- Looking for ASN1_TIME_diff - not found -- Found BZip2: /Library/Developer/CommandLineTools/SDKs/MacOSX13.1.sdk/usr/lib/libbz2.tbd (found version "1.0.8") -- Looking for BZ2_bzCompressInit -- Looking for BZ2_bzCompressInit - not found -- Looking for lzma_auto_decoder in /Library/Developer/CommandLineTools/SDKs/MacOSX13.1.sdk/usr/lib/liblzma.tbd -- Looking for lzma_auto_decoder in /Library/Developer/CommandLineTools/SDKs/MacOSX13.1.sdk/usr/lib/liblzma.tbd - not found -- Looking for lzma_easy_encoder in /Library/Developer/CommandLineTools/SDKs/MacOSX13.1.sdk/usr/lib/liblzma.tbd -- Looking for lzma_easy_encoder in /Library/Developer/CommandLineTools/SDKs/MacOSX13.1.sdk/usr/lib/liblzma.tbd - not found -- Looking for lzma_lzma_preset in /Library/Developer/CommandLineTools/SDKs/MacOSX13.1.sdk/usr/lib/liblzma.tbd -- Looking for lzma_lzma_preset in /Library/Developer/CommandLineTools/SDKs/MacOSX13.1.sdk/usr/lib/liblzma.tbd - not found -- Could NOT find LibLZMA (missing: LIBLZMA_HAS_AUTO_DECODER LIBLZMA_HAS_EASY_ENCODER LIBLZMA_HAS_LZMA_PRESET) (found version "5.2.5") -- Found LZ4: /usr/local/lib/liblz4.dylib
    -- Found LZ4: /usr/local/lib/liblz4.dylib -- Could NOT find ZSTD (missing: ZSTD_LIBRARY ZSTD_INCLUDE_DIR) -- Could NOT find SNAPPY (missing: SNAPPY_LIBRARY SNAPPY_INCLUDE_DIR) -- Could NOT find LIBDWARF (missing: LIBDWARF_LIBRARY LIBDWARF_INCLUDE_DIR) -- Could NOT find LIBIBERTY (missing: LIBIBERTY_LIBRARY LIBIBERTY_INCLUDE_DIR) -- Could NOT find LIBAIO (missing: LIBAIO_LIBRARY LIBAIO_INCLUDE_DIR) -- Could NOT find LIBURING (missing: LIBURING_LIBRARY LIBURING_INCLUDE_DIR) -- Found LIBSODIUM: /usr/local/lib/libsodium.dylib
    -- Found Libsodium: /usr/local/lib/libsodium.dylib -- Could NOT find LIBUNWIND (missing: LIBUNWIND_LIBRARY) -- Looking for swapcontext -- Looking for swapcontext - found -- Looking for C++ include elf.h -- Looking for C++ include elf.h - not found -- Looking for backtrace -- Looking for backtrace - found -- backtrace facility detected in default set of libraries -- Found Backtrace: /Library/Developer/CommandLineTools/SDKs/MacOSX13.1.sdk/usr/include
    -- Setting FOLLY_USE_SYMBOLIZER: OFF -- Setting FOLLY_HAVE_ELF: -- Setting FOLLY_HAVE_DWARF: FALSE -- Performing Test FOLLY_CPP_ATOMIC_BUILTIN -- Performing Test FOLLY_CPP_ATOMIC_BUILTIN - Success -- Performing Test FOLLY_STDLIB_LIBSTDCXX -- Performing Test FOLLY_STDLIB_LIBSTDCXX - Failed -- Performing Test FOLLY_STDLIB_LIBSTDCXX_GE_9 -- Performing Test FOLLY_STDLIB_LIBSTDCXX_GE_9 - Failed -- Performing Test FOLLY_STDLIB_LIBCXX -- Performing Test FOLLY_STDLIB_LIBCXX - Success -- Performing Test FOLLY_STDLIB_LIBCXX_GE_9 -- Performing Test FOLLY_STDLIB_LIBCXX_GE_9 - Success -- Performing Test FOLLY_STDLIB_LIBCPP -- Performing Test FOLLY_STDLIB_LIBCPP - Failed -- Looking for C++ include jemalloc/jemalloc.h -- Looking for C++ include jemalloc/jemalloc.h - not found -- Performing Test COMPILER_HAS_UNKNOWN_WARNING_OPTION -- Performing Test COMPILER_HAS_UNKNOWN_WARNING_OPTION - Success -- Performing Test COMPILER_HAS_W_SHADOW_LOCAL -- Performing Test COMPILER_HAS_W_SHADOW_LOCAL - Failed -- Performing Test COMPILER_HAS_W_SHADOW_COMPATIBLE_LOCAL -- Performing Test COMPILER_HAS_W_SHADOW_COMPATIBLE_LOCAL - Failed -- Performing Test COMPILER_HAS_W_NOEXCEPT_TYPE -- Performing Test COMPILER_HAS_W_NOEXCEPT_TYPE - Success -- Performing Test COMPILER_HAS_W_NULLABILITY_COMPLETENESS -- Performing Test COMPILER_HAS_W_NULLABILITY_COMPLETENESS - Success -- Performing Test COMPILER_HAS_W_INCONSISTENT_MISSING_OVERRIDE -- Performing Test COMPILER_HAS_W_INCONSISTENT_MISSING_OVERRIDE - Success -- Performing Test COMPILER_HAS_F_ALIGNED_NEW -- Performing Test COMPILER_HAS_F_ALIGNED_NEW - Success -- Performing Test COMPILER_HAS_F_OPENMP -- Performing Test COMPILER_HAS_F_OPENMP - Failed -- Looking for pthread_atfork -- Looking for pthread_atfork - found -- Looking for accept4 -- Looking for accept4 - not found -- Looking for getrandom -- Looking for getrandom - not found -- Looking for preadv -- Looking for preadv - found -- Looking for pwritev -- Looking for pwritev - found -- Looking for clock_gettime -- Looking for clock_gettime - found -- Looking for pipe2 -- Looking for pipe2 - not found -- Looking for sendmmsg -- Looking for sendmmsg - not found -- Looking for recvmmsg -- Looking for recvmmsg - not found -- Looking for malloc_usable_size -- Looking for malloc_usable_size - not found -- Performing Test FOLLY_HAVE_IFUNC -- Performing Test FOLLY_HAVE_IFUNC - Failed -- Performing Test FOLLY_HAVE_STD__IS_TRIVIALLY_COPYABLE -- Performing Test FOLLY_HAVE_STD__IS_TRIVIALLY_COPYABLE - Success -- Performing Test FOLLY_HAVE_UNALIGNED_ACCESS -- Performing Test FOLLY_HAVE_UNALIGNED_ACCESS - Success -- Performing Test FOLLY_HAVE_VLA -- Performing Test FOLLY_HAVE_VLA - Success -- Performing Test FOLLY_HAVE_WEAK_SYMBOLS -- Performing Test FOLLY_HAVE_WEAK_SYMBOLS - Failed -- Performing Test FOLLY_HAVE_LINUX_VDSO -- Performing Test FOLLY_HAVE_LINUX_VDSO - Failed -- Performing Test FOLLY_HAVE_WCHAR_SUPPORT -- Performing Test FOLLY_HAVE_WCHAR_SUPPORT - Success -- Performing Test FOLLY_HAVE_EXTRANDOM_SFMT19937 -- Performing Test FOLLY_HAVE_EXTRANDOM_SFMT19937 - Failed -- Performing Test HAVE_VSNPRINTF_ERRORS -- Performing Test HAVE_VSNPRINTF_ERRORS - Failed -- arch does not match x86_64, skipping setting SSE2/AVX2 compile flags for LtHash SIMD code -- Performing Test COMPILER_HAS_M_PCLMUL -- Performing Test COMPILER_HAS_M_PCLMUL - Success -- compiler has flag pclmul, setting compile flag for 
/Users/ericdoug/Documents/mydev/presto/presto-native-execution/scripts/folly/folly/hash/detail/ChecksumDetail.cpp;/Users/ericdoug/Documents/mydev/presto/presto-native-execution/scripts/folly/folly/hash/detail/Crc32CombineDetail.cpp;/Users/ericdoug/Documents/mydev/presto/presto-native-execution/scripts/folly/folly/hash/detail/Crc32cDetail.cpp -- Configuring done CMake Error in CMakeLists.txt: Target "folly_deps" INTERFACE_INCLUDE_DIRECTORIES property contains path:

    "/Users/ericdoug/Documents/mydev/presto/presto-native-execution/scripts/folly/GLOG_INCLUDE_DIR-NOTFOUND"
    

    which is prefixed in the source directory.

    -- Generating done CMake Warning: Manually-specified variables were not used by the project:

    BUILD_TESTING
    

    CMake Generate step failed. Build files cannot be regenerated correctly.

    opened by eric-doug 1
  • Reduce TaskHandle lock contention

    Reduce TaskHandle lock contention

    Avoids synchronizing on TaskHandle when checking whether it has been destroyed by making the destroyed flag volatile instead. The previously synchronized isDestroyed method is called once at the end of each driver processing interval by worker threads to check whether the split is finished, which could create unnecessary lock contention when all threads are active on tasks with many splits that frequently block or complete quickly.

    Also includes a minor improvement to PrioritizedSplitRunner#process() that avoids redundant calls to System.nanoTime() and reduces logging overhead in TaskExecutor by avoiding system calls and string concatenation when debug logging is not enabled.
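
    A minimal sketch of the volatile-flag approach described above (class shape and names are illustrative, not the actual Presto code):

        // Illustrative sketch only, assuming a simplified TaskHandle
        class TaskHandle
        {
            // volatile makes the flag visible to worker threads without locking
            private volatile boolean destroyed;

            // hot path: read the flag without acquiring the monitor
            public boolean isDestroyed()
            {
                return destroyed;
            }

            // destruction still synchronizes to coordinate other state changes
            public synchronized void destroy()
            {
                destroyed = true;
                // ... release splits and other resources
            }
        }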

    == NO RELEASE NOTE ==
    
    opened by pettyjamesm 0
  • WIP: Add S3 Select pushdown JSON tests

    WIP: Add S3 Select pushdown JSON tests

    Test plan

    Locally tested:

    [INFO] --- maven-surefire-plugin:3.0.0-M7:test (default-test) @ presto-hive-hadoop2 ---
    [INFO] Tests will run in random order. To reproduce ordering use flag -Dsurefire.runOrder.random.seed=114273716603401
    [INFO] Using auto detected provider org.apache.maven.surefire.testng.TestNGProvider
    [INFO] 
    [INFO] -------------------------------------------------------
    [INFO]  T E S T S
    [INFO] -------------------------------------------------------
    [INFO] Running com.facebook.presto.hive.s3select.TestHiveFileSystemS3SelectJsonPushdown
    WARNING: An illegal reflective access operation has occurred
    WARNING: Illegal reflective access by org.apache.hadoop.fs.HadoopExtendedFileSystemCache (file:/Users/dnnanuti/.m2/repository/com/facebook/presto/presto-hive-common/0.279-SNAPSHOT/presto-hive-common-0.279-SNAPSHOT.jar) to field java.lang.reflect.Field.modifiers
    WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.fs.HadoopExtendedFileSystemCache
    WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
    WARNING: All illegal access operations will be denied in a future release
    2023-01-05T17:29:59.421-0600 WARNING Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    2023-01-05T17:29:59.523-0600 INFO Successfully loaded & initialized native-bzip2 library system-native
    2023-01-05T17:29:59.528-0600 INFO Successfully loaded & initialized native-zlib library
    2023-01-05T17:30:01.824-0600 WARNING NoSuchMethodException was thrown when disabling normalizeUri. This indicates you are using an old version (< 4.5.8) of Apache http client. It is recommended to use http client version >= 4.5.9 to avoid the breaking change introduced in apache client 4.5.7 and the latency in exception handling. See https://github.com/aws/aws-sdk-java/issues/1919 for more information
    2023-01-05T17:30:03.178-0600 INFO io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
    2023-01-05T17:30:03.397-0600 INFO Got brand-new decompressor [.bz2]
    2023-01-05T17:30:03.498-0600 INFO Got brand-new decompressor [.bz2]
    2023-01-05T17:30:03.834-0600 INFO Got brand-new decompressor [.gz]
    [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.838 s - in com.facebook.presto.hive.s3select.TestHiveFileSystemS3SelectJsonPushdown
    [INFO] 
    [INFO] Results:
    [INFO] 
    [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0
    [INFO] 
    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD SUCCESS
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time:  28.323 s
    [INFO] Finished at: 2023-01-05T23:30:04Z
    [INFO] ------------------------------------------------------------------------
    + EXIT_CODE=0
    + set -e
    + popd
    ~/workplace/connectors/github-repositories/presto
    + cleanup_docker_containers
    + docker-compose -f ./presto-hive-hadoop2/bin/../conf/docker-compose.yml down
    Stopping conf_hadoop-master_1 ... done
    Removing conf_hadoop-master_1 ... done
    Removing network conf_default
    + wait
    + exit 0
    
    

    Needs rebasing and will be merged after: https://github.com/prestodb/presto/pull/18901

    == NO RELEASE NOTE ==
    
    opened by dnanuti 0
  • WIP: Add S3 Select pushdown for JSON files

    WIP: Add S3 Select pushdown for JSON files

    Add S3 Select pushdown for JSON files

    • Small refactoring for IonSqlQueryBuilder to support query generation for JSON
    • Pushdown logic works for base columns, same as for CSV
    • Tests for JSON support will be added in a separate PR, as they require more changes: https://github.com/prestodb/presto/pull/18902

    Needs rebasing and will be merged after: https://github.com/prestodb/presto/pull/18786 https://github.com/prestodb/presto/pull/18798

    == RELEASE NOTES ==
    
    Hive Changes
    * Add Amazon S3 Select pushdown for JSON files.
    
    opened by dnanuti 0
  • find_first UDF cannot distinguish between NULL returned as a value and NULL returned because of no match

    find_first UDF cannot distinguish between NULL returned as a value and NULL returned because of no match

    The find_first function added in https://github.com/prestodb/presto/pull/18316 returns NULL if no match is found. However, that NULL cannot be distinguished from a NULL returned as a value.

    For example, both SELECT FIND_FIRST(ARRAY[NULL, 1], x->x is NULL) is NULL; and SELECT FIND_FIRST(ARRAY[1], x->x is NULL) is NULL; return true.

    opened by feilong-liu 1