Rekex parser generator - grammar as algebraic datatypes

Overview

Rekex

PEG parser generator for Java 17

grammar as algebraic datatypes

A context-free grammar has the form

A  = A1 | A2
A1 = B C
...

which looks exactly like the definition of algebraic datatypes. This is no coincidence: both formalisms reflect an underlying model in which we build complex concepts from constituents.

Given this correspondence, we could express a context-free grammar entirely as datatypes in a programming language. For example, in Java 17,

sealed interface A permits A1, A2{}

record A1(B b, C c) implements A{}

Such datatypes can be transliterated into a grammar, which is fed to a particular parser generator to build a parser. During the parsing process, constructors of datatypes are invoked, eventually outputting a parse tree in the very same datatypes.

PegParser<A> parser = PegParser.of(A.class);

A a = parser.matchFull(input);
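
To make the idea concrete without relying on Rekex itself, here is a minimal hand-written sketch (not Rekex's implementation or API). It completes the example grammar with assumed rules `A2 = B`, `B = 'b'` and `C = 'c'`, then matches input by PEG-style ordered choice, invoking the record constructors as rules succeed. The names `GrammarAsDatatypes`, `Parser` and the static `matchFull` helper are hypothetical and exist only for this illustration.

```java
final class GrammarAsDatatypes {
    // Grammar (A2, B and C are assumptions added for this sketch):
    //   A  = A1 | A2
    //   A1 = B C
    //   A2 = B
    //   B  = 'b'
    //   C  = 'c'
    sealed interface A permits A1, A2 {}
    record A1(B b, C c) implements A {}
    record A2(B b) implements A {}
    record B() {}
    record C() {}

    /** Hand-written matcher: each rule method returns null on failure and restores the position. */
    static final class Parser {
        private final String input;
        private int pos = 0;

        Parser(String input) { this.input = input; }

        private boolean literal(char ch) {
            if (pos < input.length() && input.charAt(pos) == ch) { pos++; return true; }
            return false;
        }

        B b() { return literal('b') ? new B() : null; }
        C c() { return literal('c') ? new C() : null; }

        A1 a1() {
            int start = pos;
            B b = b();
            C c = (b == null) ? null : c();
            if (c == null) { pos = start; return null; } // backtrack
            return new A1(b, c);                         // constructor call builds the tree node
        }

        A2 a2() {
            B b = b();
            return (b == null) ? null : new A2(b);
        }

        A a() {
            A1 first = a1();                 // ordered choice: try A1 first,
            return first != null ? first : a2(); // then fall back to A2
        }
    }

    /** Parses the whole input as A, or throws if it does not match completely. */
    static A matchFull(String input) {
        Parser p = new Parser(input);
        A a = p.a();
        if (a == null || p.pos != input.length())
            throw new IllegalArgumentException("input does not match A: " + input);
        return a;
    }

    public static void main(String[] args) {
        System.out.println(matchFull("bc"));
        System.out.println(matchFull("b"));
    }
}
```

The parse tree comes back as ordinary `A1` and `A2` values, so client code can inspect it with the usual record accessors.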

Rekex is a PEG parser generator that implements this novel idea for Java 17. It aims to be the simplest, most intuitive way to write parsers.

Created by Zhong Yu. I am looking for a job; help is appreciated.

Comments
  • Annotations should allow ElementType=FIELD for use in singleton enums.

    When I try to add an @Str or @Ch annotation to an enum field, I get "Syntax error: Type annotations are illegal here." This is probably because those annotations are not declared with a @Target that includes ElementType.FIELD (needed for enum fields).

    enhancement 
    opened by raptor494 5
  • Is incremental parsing supported?

    First of all, thanks for your amazing work, @zhong-j-yu! I'm developing a DSL these days and have already written an ugly parser. Because of its bad design, each change to my DSL's grammar takes a lot of time, so I'm considering using Rekex instead. The design of Rekex amazed me; using annotations on record parameters is a genius idea! But one thing still worries me: incremental parsing. I want to use one parser in both compilers and IDEs, and I have no time to maintain more than one parser. Could you please tell me whether incremental parsing is supported? If not, is there a simple way to add it? Thank you!

    opened by Superice666 3
  • Semantic predicate failures as declared Exceptions

    If invoking a ctor throws an Exception whose type is declared in the ctor's throws clause, it is considered a semantic predicate failure: the corresponding production rule is considered to have failed to match the input. This is not fatal; the parser will try alternatives if there are any.

    If a ctor throws any other Exception, it is a Fatal error that stops the parser immediately.

    If a ctor throws an Error, we do not catch or handle it; it propagates out of the parser. It is more fatal than Fatal.
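
The three cases above can be sketched with plain reflection. This is only an illustration of the rule, not Rekex's code: the class `CtorOutcomeSketch`, the `Outcome` enum, the `Percent` datatype and `OutOfRangeException` are all hypothetical names invented for the example.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;

final class CtorOutcomeSketch {
    enum Outcome { MATCHED, PREDICATE_FAILED, FATAL }

    /** Hypothetical checked exception used as a semantic predicate. */
    static final class OutOfRangeException extends Exception {}

    /** Hypothetical datatype: its ctor declares the predicate failure in its throws clause. */
    static final class Percent {
        final int value;

        Percent(String digits) throws OutOfRangeException {
            int v = Integer.parseInt(digits); // may throw NumberFormatException (undeclared)
            if (v < 0 || v > 100) throw new OutOfRangeException();
            this.value = v;
        }
    }

    /** Invokes the ctor and classifies what happened, following the rule described above. */
    static Outcome invoke(Constructor<?> ctor, Object... args) throws ReflectiveOperationException {
        try {
            ctor.newInstance(args);
            return Outcome.MATCHED;
        } catch (InvocationTargetException e) {
            Throwable cause = e.getCause();
            if (cause instanceof Error) throw (Error) cause; // Errors are not handled; they propagate
            for (Class<?> declared : ctor.getExceptionTypes())
                if (declared.isInstance(cause))
                    return Outcome.PREDICATE_FAILED;         // declared: the rule simply fails to match
            return Outcome.FATAL;                            // undeclared Exception: stop parsing
        }
    }

    /** Convenience wrapper for the example; wraps reflection problems in an unchecked exception. */
    static Outcome tryPercent(String digits) {
        try {
            return invoke(Percent.class.getDeclaredConstructor(String.class), digits);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tryPercent("50"));
        System.out.println(tryPercent("150"));
        System.out.println(tryPercent("abc"));
    }
}
```

A parser built on this rule would move on to the next alternative after `PREDICATE_FAILED` and abort after `FATAL`.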

    enhancement 
    opened by zhong-j-yu 1
  • release 1.0.0

    Currently we are waiting for approval of the groupId: https://issues.sonatype.org/browse/OSSRH-71991

    Once that's approved, tag and release v1.0.0.0 and publish to Maven Central. If everything is OK, update the documentation on HEAD about Maven.

    enhancement 
    opened by zhong-j-yu 1
  • @Regex on Void

    @Regex()Void is similar to @Regex()String, except that the string value isn't retained. This can save both time and space. Users can use Void to express the intention that the exact value doesn't matter and is ignored. Also, Void is shorter than String.

    enhancement 
    opened by zhong-j-yu 0
  • Add ParseInfo

    Introduce a special datatype, ParseInfo, which can be inserted at any position in a ctor signature. It retains information about the input regions that matched the rule, as well as its subrules.

    enhancement 
    opened by zhong-j-yu 0
  • Return type of ctor must be equal to datatype

    Currently, we allow the return type of a ctor to be a subtype of the target type. Therefore

    public A1 a1(...){...}
    

    is a ctor for type A, given A1 <: A.

    This makes sense from Java's point of view, where it's always safe to make the return type a more specific type. But it is problematic for mapping between grammar rules and ctors. Typically, a complex grammar contains rules like

    A = A1 | A2
    ...
    A1 = ...
    ...
    A2 = ...
    

    It's not easy to review a ctor catalog to find all ctors for A and confirm that they are in the correct order.

    If the user explicitly declares ctors Ai->A

    public A a1(A1 a1){ ... }
    

    it is unclear how to handle them together with ctors returning subtypes.

    We should take a simpler approach, one that maps more easily to grammar rules, is easier to reason about, and gives us more flexibility:

    The return type of ctors for A must be exactly A. It is then easy to see all ctors for A in declaration order. If no such ctors are found, add implicit ctors Ai->A for the direct subtypes, as if they were declared as

    public A a1(A1 a1){ return a1; }
    

    The order of these ctors is the order of the subtypes, which is also easy to see. This order is stable, and ctors for the subtypes can be arranged in any order.

    The user may have good reasons to declare Ai->A explicitly: to limit the subtypes, to order them differently, to make ctors resemble grammar rules more closely, or to do some transformation or test a semantic predicate in the method body.

    public A a1(A1 a1){ ... }
    public A a2(A2 a2){ ... }
    // or: public A a(Alt2<A1,A2> alt){...}
    
    public A1 a1(...){...}
    public A2 a2(...){...}
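
As a rough sketch of the proposed rule (not Rekex's implementation), the reflection code below selects ctors for a type by exact return type and falls back to implicit ctors for the permitted subtypes when none are declared. All names here (`ExactReturnTypeSketch`, `ExplicitCatalog`, `ImplicitCatalog`, `ctorsFor`) are hypothetical. Reflection does not promise declaration order for methods or permitted subclasses, so the sketch sorts the names instead; a real implementation would need another way to recover the order discussed above.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ExactReturnTypeSketch {
    sealed interface A permits A1, A2 {}
    record A1() implements A {}
    record A2() implements A {}

    /** Hypothetical catalog with explicit ctors Ai->A alongside ctors for the subtypes. */
    static final class ExplicitCatalog {
        public A fromA1(A1 a1) { return a1; }
        public A fromA2(A2 a2) { return a2; }
        public A1 a1() { return new A1(); } // a ctor for A1 only, even though A1 <: A
        public A2 a2() { return new A2(); }
    }

    /** Hypothetical catalog that declares no ctor returning exactly A. */
    static final class ImplicitCatalog {
        public A1 a1() { return new A1(); }
        public A2 a2() { return new A2(); }
    }

    /** Names of the ctors for 'type' in 'catalog', sorted alphabetically for determinism. */
    static List<String> ctorsFor(Class<?> type, Class<?> catalog) {
        List<String> names = new ArrayList<>();
        for (Method m : catalog.getDeclaredMethods())
            if (!m.isSynthetic() && m.getReturnType() == type) // exact match, subtypes excluded
                names.add(m.getName());
        if (names.isEmpty() && type.isSealed())                // no explicit ctors: add implicit ones
            for (Class<?> sub : type.getPermittedSubclasses())
                names.add("implicit " + sub.getSimpleName() + "->" + type.getSimpleName());
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(ctorsFor(A.class, ExplicitCatalog.class));
        System.out.println(ctorsFor(A1.class, ExplicitCatalog.class));
        System.out.println(ctorsFor(A.class, ImplicitCatalog.class));
    }
}
```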
    
    enhancement 
    opened by zhong-j-yu 0
  • Writing a Tutorial

    If someone could write a tutorial, I'd appreciate it very much and link to your article. It's better coming from someone other than the author of this library.

    documentation help wanted good first issue 
    opened by zhong-j-yu 2
Owner
Zhong Yu
Java Programmer. See my latest project http://rekex.org/