ClickHouse AST Parser & Visitor

Overview

ClickHouse AST Parser & Visitor

Introduction

ClickHouse AST Parser, it is much more than a parser. It is a convenient toolbox that provides services related to ClickHouse AST. With ClickHouse AST Parser, you are able to easily convert ClickHouse SQL statement to AST (Abstract Syntax Tree), and further utilize the parsing results. You can operate on key objects such as CST, CST parser, CST visitor, AST, AST parser and AST visitor throughout the parsing process.

String sql = "SELECT t1.id, count(1) as total_count FROM my_db1.table1 t1 LEFT JOIN my_db2.table2 t2 ON t1.id = t2.id GROUP BY t1.id";
AstParser astParser = new AstParser();
INode ast = (INode) astParser.parse(sql);

Use cases

Find out related tables

Functions such as hotspot analysis and caching rely on this basic parsing capability.

tables = referredTablesDetector.searchTables(ast);">
String sql = "SELECT t1.id, count(1) as total_count FROM my_db1.table1 t1 LEFT JOIN my_db2.table2 t2 ON t1.id = t2.id GROUP BY t1.id";
AstParser astParser = new AstParser();
INode ast = (INode) astParser.parse(sql);
ReferredTablesDetector referredTablesDetector = new ReferredTablesDetector();
// tables should be ["my_db1.table1", "my_db2.table2"] in this case
List<String> tables = referredTablesDetector.searchTables(ast);

Find out related partitions

Related partitions can also be found. But this time we need to implement MetadataService to tell it how to get the metadata since ClickHouse SQL Parser needs to know the partition column name of a specified table.

partitionRangeList = referredPartitionsDetector.searchTablePartitions(ast);">
// we need to implement MetadataService first
MetadataService metadataService = new MetadataService() {
            @Override
            public String getPartitionColName(String tableFullName) {
                // TODO: implement this method
                return null;
            }

            @Override
            public List<String> getTables() {
                // TODO: implement this method
                return null;
            }
        };
String todayDate = "2022-01-01"; // for parsing UDF like today() and yesterday() in the SQL
String targetIP = "127.0.0.1"; // the node to get metadata
ReferredPartitionsDetector referredPartitionsDetector = new ReferredPartitionsDetector(todayDate, targetIp, metadataService);
List<String> partitionRangeList = referredPartitionsDetector.searchTablePartitions(ast);

Extract arguments of Distributed engine

Although Distributed engine is well designed to organize data, sometimes we do need to extract the detail information such as related cluster and local table name. However, it is not convenient to extract them with regular expression because it is easy to make mistake, especially when there are complex comments in a CREATE SQL. ClickHouse SQL Parser solve this problem.

String sql = "CREATE TABLE my_db.my_tbl (date Date, name String) Engine = Distributed('my_cluster', 'my_db', 'my_tbl_local', rand())";
DistributedTableInfoDetector distributedTableInfoDetector = new DistributedTableInfoDetector();
// clusterName is "my_cluster"
String clusterName = distributedTableInfoDetector.searchCluster(sql);
// tableFullName is "my_db.my_tbl_local"
String tableFullName = distributedTableInfoDetector.searchLocalTableFullName(sql);

Rewrite SQL adding "global" keywords

This allows us to optimize a slow SQL by rewriting it. Adding "global" keyword to the SQL is such a case. Develop more rewriters based on this ClickHouse SQL Parser will bring more benefits.

String sql = "SELECT t1.id, count(1) as total_count FROM my_db1.table1 t1 LEFT JOIN my_db2.table2 t2 ON t1.id = t2.id GROUP BY t1.id";
AstParser astParser = new AstParser(false);
SelectUnionQuery ast = (SelectUnionQuery) astParser.parse(sql);
GlobalJoinAstRewriter globalJoinAstRewriter = new GlobalJoinAstRewriter();
String rewrittenSql =  globalJoinAstRewriter.visit((INode) ast);
// the rewritten SQL should be:
// SELECT t1.id, count(1) as total_count FROM my_db1.table1 t1 GLOBAL LEFT JOIN my_db2.table2 t2 ON t1.id = t2.id GROUP BY t1.id
You might also like...

Fast and Easy mapping from database and csv to POJO. A java micro ORM, lightweight alternative to iBatis and Hibernate. Fast Csv Parser and Csv Mapper

Simple Flat Mapper Release Notes Getting Started Docs Building it The build is using Maven. git clone https://github.com/arnaudroger/SimpleFlatMapper.

Dec 17, 2022

Jwks RSA - JSON Web Key Set parser.

jwks-rsa Install Maven dependency groupIdcom.auth0/groupId artifactIdjwks-rsa/artifactId version0.17.0/version /dependency

Dec 30, 2022

Java 1-15 Parser and Abstract Syntax Tree for Java, including preview features to Java 13

JavaParser This project contains a set of libraries implementing a Java 1.0 - Java 14 Parser with advanced analysis functionalities. This includes pre

Jan 9, 2023

jsoup: the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.

jsoup: Java HTML Parser jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting a

Jan 4, 2023

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

ANTLR v4 Build status ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating

Jan 3, 2023

A pure-Java Markdown processor based on a parboiled PEG parser supporting a number of extensions

: DEPRECATION NOTE : Although still one of the most popular Markdown parsing libraries for the JVM, pegdown has reached its end of life. The pro

Nov 24, 2022

Fast JSON parser for java projects

ig-json-parser Fast JSON parser for java projects. Getting started The easiest way to get started is to look at maven-example. For more comprehensive

Dec 26, 2022

The MAVLink parser for the Telestion backend.

telestion-extension-template A template for creating new Telestion extensions. This template contains an initialized project structure with configured

Dec 14, 2021

Rekex parser generator - grammar as algebraic datatypes

Rekex PEG parser generator for Java 17 grammar as algebraic datatypes A context-free grammar has the form of A = A1 | A2 A1 = B C ... which looks ex

Dec 18, 2022

Representational State Transfer + Structured Query Language(RSQL): Demo application using RSQL parser to filter records based on provided condition(s)

Representational State Transfer + Structured Query Language: RSQL Demo application using RSQL parser to filter records based on provided condition(s)

Nov 23, 2022

Reflectionless command line parser

jbock is a command line parser, which uses the same annotation names as JCommander and picocli. However it does not use reflection. It is an annotatio

Jan 4, 2023

Parser of the table of contents file of the 1C platform syntax helper

Parser of the table of contents file of the 1C platform syntax helper

Парсер файла оглавления синтакс-помощника платформы 1С Что делает? Парсит вот это: Оглавление представляет собой файл без расширения, лежит в файле sh

Jan 27, 2022

JavaOTTF - Official OTTF parser and composer for JVM languages

JavaOTTF - Official OTTF parser and composer for JVM languages

JavaOTTF Official OTTF parser and composer for JVM languages. Documentation Please refer to the Wiki Section. Installation Maven Add repository into p

Nov 21, 2022

Simple Ini Parser for Java or SIni4j is a simple INI parse made in Java

Simple Ini Parser for Java or SIni4j is a simple INI parse made in Java

Simple Ini Parser for Java or SIni4j is a simple INI parse made in Java, built for provide a simple and easiest way to load INI files in Java

Mar 18, 2022

A Parser That parses OpenAPI definitions in JSON or YAML format and Generates Pact files That contain HTTP request/response interactions by parsing Given Open API specifications

This is a Parser That parses OpenAPI definitions in JSON or YAML format and Generates Pact files That contain HTTP request/response interactions by parsing Given Open API specifications.

Mar 19, 2022

A Parser tool which actually tries to convert XML data into JSON data

A Parser tool which actually tries to convert XML data into JSON data

SpringBoot A Parser tool which actually tries to convert XML data into JSON data Tools Required Postman (Testing API's) IDE - Eclipse / NetBeans/ Inte

Jan 27, 2022

A simple command-line argument parser for Java applications that relies on records.

RecordArgs A simple command-line argument parser for Java applications that relies on records. Specifically, it uses their component names to parse co

Apr 4, 2022

Idk. Simple argument parser for u. Probably needs some changes xd

SimpleArgumentParser Maven repositories repository idjitpack.io/id urlhttps://jitpack.io/url /repository /repositories de

Sep 30, 2022

A diagnostic library for parser / interpreter usages.

A diagnostic library for parser / interpreter usages.

Nenggao Makes diagnostic easier! A diagnostic library for parser / interpreter usages. About Nenggao is a diagnostic library mainly designed for Yakou

Nov 4, 2022
Comments
  • can't parse create table

    can't parse create table

    CREATE TABLE my_db.my_tbl on cluster my_cluster Engine = Distributed('my_cluster', 'my_db', 'my_tbl_local', rand()) as my_db.my_tbl_local CREATE TABLE mydb.mytb (uuid UUID DEFAULT generateUUIDv4(), cktime DateTime DEFAULT now() COMMENT 'c', openid String, username String, appid String, from_channel String, source_channel String, source String, regtime DateTime, brandid String, devicecode String, actiontime DateTime, ismingamelogin String, version String, platform String, project String, plat String, source_openid String COMMENT 'a', event Int16 COMMENT 'b') ENGINE = ReplicatedMergeTree('/clickhouse/mydb/mytb/{shard}', '{replica}') PARTITION BY toYYYYMM(cktime) ORDER BY (regtime, appid, openid) SETTINGS index_granularity = 8192

    opened by inertance 1
Owner
Jiaming Mai
Only those who have seen prosperity and touched the abyss are qualified to say the word "ordinary".
Jiaming Mai
Clickhouse storage backend for Janusgraph

Clickhouse storage backend for Janusgraph Overview Clickhouse implementation of Janusgraph storage backend. Features New version 0.6.1 of JanusGraph c

null 3 Nov 30, 2022
Inria 1.4k Dec 29, 2022
JDBC driver for ClickHouse

This is a basic and restricted implementation of jdbc driver for ClickHouse. It has support of a minimal subset of features to be usable.

ClickHouse 1.1k Jan 1, 2023
Clickhouse storage backend for Janusgraph

Clickhouse storage backend for Janusgraph Overview Clickhouse implementation of Janusgraph storage backend. Features New version 0.6.1 of JanusGraph c

null 3 Nov 30, 2022
A command line parser generator

jbock is a command line parser that works similar to airline and picocli. While most of these other tools scan for annotations at runtime, jbock is an

H90 73 Dec 13, 2022
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

ANTLR v4 Build status ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating

Antlr Project 13.6k Dec 28, 2022
JavaCC - a parser generator for building parsers from grammars. It can generate code in Java, C++ and C#.

JavaCC Java Compiler Compiler (JavaCC) is the most popular parser generator for use with Java applications. A parser generator is a tool that reads a

null 971 Dec 27, 2022
Java 1-15 Parser and Abstract Syntax Tree for Java, including preview features to Java 13

JavaParser This project contains a set of libraries implementing a Java 1.0 - Java 14 Parser with advanced analysis functionalities. This includes pre

JavaParser 4.5k Jan 5, 2023
High-performance JSON parser

HikariJSON A High-performance JSON parser. HikariJSON is targeted exclusively at Java 8. If you need legacy support, there are several decent librarie

Brett Wooldridge 454 Dec 31, 2022
A fast JSON parser/generator for Java.

fastjson Fastjson is a Java library that can be used to convert Java Objects into their JSON representation. It can also be used to convert a JSON str

Alibaba 25.1k Dec 31, 2022