This library provides facilities to match an input string against a collection of regex patterns.

Overview

Regex Matcher


This library provides facilities to match an input string against a collection of regex patterns. It is a wrapper around the popular Chimera library that makes Chimera usable from Java.

Chimera is a software regular expression matching engine that is a hybrid of Hyperscan and PCRE. The design goals of Chimera are to fully support PCRE syntax as well as to take advantage of the high performance nature of Hyperscan. Hyperscan is a high-performance multiple regex matching library available as open source with a C API.


Build Process

A released version of the Java jar file is available from JitPack. The jar file contains the C++ library in its resources and is ready to use. If you are a contributor, or you want to build the jar file from scratch, this section describes the build process.

The library consists of two parts: Java source files and C++ source files. The two parts are connected through the Java Native Interface (JNI), so each part has its own compilation and build process.
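As a rough illustration of how a jar can ship a native library in its resources and hand it to the JVM, here is a minimal sketch. It is not this project's loader: the class name, method names, and resource path are hypothetical, and it assumes an uncompressed library file, so it ignores the tar.gz packaging used by this project.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical loader: class name, resource path and method names are illustrative only.
final class NativeLibraryLoaderSketch {

    // Copies a bundled library stream to a temporary file so the OS loader can open it.
    static Path extractToTempFile(InputStream library, String suffix) throws IOException {
        Path target = Files.createTempFile("native-lib-", suffix);
        target.toFile().deleteOnExit();
        Files.copy(library, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    // Looks up a resource on the classpath, extracts it, and hands it to the JVM.
    static void loadFromResources(String resourcePath) throws IOException {
        try (InputStream in = NativeLibraryLoaderSketch.class.getResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new IOException("native library not found on classpath: " + resourcePath);
            }
            Path extracted = extractToTempFile(in, ".so");
            // System.load requires an absolute path to the library file.
            System.load(extracted.toAbsolutePath().toString());
        }
    }
}
```

Extracting to a temporary file is needed because the operating system's dynamic loader cannot open a library that lives inside a jar archive.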

Building the C++ JNI library

Whenever the C++ source code of the regex matcher changes, the JNI library must be rebuilt. After it is built, the C++ part of this project is placed in the Maven resources folder as a Linux library file, which the Java program uses at run time.

Compiling the C++ code requires the Chimera library to be installed, so all build requirements are provided in a Dockerfile. The Dockerfile sets up an isolated environment with everything the C++ code needs, builds the code automatically, and places the generated library file (.so) in compressed form (tar.gz) in the path where the Java side expects it. As a result, the only requirement for rebuilding the module is to have Docker installed.

First, make your changes to the C++ source code. Then run the following command:

src/main/cpp/build_lib_jni.sh

If the changes are correct, Docker builds the library and copies the new library file to the appropriate path. You can verify the result by running the Java tests in the next step.

Building the Java wrapper library

Whether you changed the source files of the Java wrapper or the C++ library, you should build and test the Java library. Even if the changes in the C++ library do not affect the corresponding Java interfaces, you should still rebuild the Java library, because the output jar file contains the whole C++ library in its resources. To build the Java library and run its tests, call Maven:

mvn clean package

After confirming the changes, you can publish the changes as a new version.

Sample usage

Suppose you have a set of regular expression (regex) patterns against which you want to check a variety of input strings. First, create a new matcher instance and add the patterns you want to match. Each pattern is registered under a numeric ID, and a boolean flag specifies whether the pattern is case sensitive:

RegexMatcher regexMatcher = new RegexMatcher();
regexMatcher.addPattern(1L, ".*w.*", true); // matches any input containing a lowercase 'w'
regexMatcher.addPattern(1L, ".*y.*", true); // matches any input containing a lowercase 'y'
regexMatcher.addPattern(2L, ".*z.*", false); // matches any input containing 'z' or 'Z'

Multiple patterns may share the same ID:

regexMatcher.addPattern(3L, ".*t.*", true); // matches any input containing a lowercase 't'
regexMatcher.addPattern(3L, ".*T.*", true); // matches any input containing an uppercase 'T'

The patterns are dynamic: you can add or remove patterns over time:

regexMatcher.removePattern(3L);

Next, match the input strings against the patterns. For the patterns to take effect, you must first call the preparePatterns method:

regexMatcher.preparePatterns();

assertEquals(new HashSet<>(Arrays.asList(1L)), regexMatcher.match("text-contains-w-letter"));
assertEquals(new HashSet<>(Arrays.asList(1L)), regexMatcher.match("text-contains-y-letter"));
assertEquals(new HashSet<>(Arrays.asList(2L)), regexMatcher.match("text-contains-z-letter"));
assertEquals(new HashSet<>(Arrays.asList(1L, 2L)), regexMatcher.match("text-contains-wz-letter"));
assertEquals(new HashSet<>(Arrays.asList(1L, 2L)), regexMatcher.match("text-contains-wyz-letter"));
assertTrue(regexMatcher.match("t").isEmpty());
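To make the expected results above easier to reason about, the following sketch reproduces the same ID-based matching semantics using only java.util.regex. It is a stand-in for illustration and says nothing about how the Chimera-backed RegexMatcher works internally. The class SimpleIdMatcher is hypothetical, and it assumes that removing an ID drops every pattern registered under it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical stand-in built on java.util.regex; it is NOT the Chimera-backed RegexMatcher.
final class SimpleIdMatcher {
    // Each ID may own several compiled patterns.
    private final Map<Long, List<Pattern>> patternsById = new HashMap<>();

    void addPattern(long id, String regex, boolean caseSensitive) {
        int flags = caseSensitive ? 0 : Pattern.CASE_INSENSITIVE;
        patternsById.computeIfAbsent(id, k -> new ArrayList<>())
                    .add(Pattern.compile(regex, flags));
    }

    // Assumption: removing an ID drops every pattern registered under it.
    void removePattern(long id) {
        patternsById.remove(id);
    }

    // Returns the IDs that have at least one pattern found in the input.
    Set<Long> match(String input) {
        Set<Long> ids = new HashSet<>();
        for (Map.Entry<Long, List<Pattern>> entry : patternsById.entrySet()) {
            for (Pattern pattern : entry.getValue()) {
                if (pattern.matcher(input).find()) {
                    ids.add(entry.getKey());
                    break; // one hit is enough to report this ID
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        SimpleIdMatcher matcher = new SimpleIdMatcher();
        matcher.addPattern(1L, ".*w.*", true);
        matcher.addPattern(1L, ".*y.*", true);
        matcher.addPattern(2L, ".*z.*", false);
        matcher.addPattern(3L, ".*t.*", true);
        matcher.removePattern(3L);
        System.out.println(matcher.match("text-contains-wz-letter")); // contains IDs 1 and 2
        System.out.println(matcher.match("t").isEmpty());             // true
    }
}
```

An ID appears in the result once, no matter how many of its patterns match, which is why the input containing both 'w' and 'y' still yields a single ID 1.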

At the end, do not forget to release the resources by closing the matcher after the job is finished:

regexMatcher.close();
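Because the matcher holds native resources, it is safest to close it in a finally block so that it is released even when matching throws. The sketch below shows the pattern with a hypothetical stand-in class (FakeNativeMatcher), since it has to run without the native library. Only the close() call mirrors the real API; everything else is an assumption made for illustration.

```java
// Hypothetical stand-in for a matcher that owns native memory; only close() mirrors the real API.
final class FakeNativeMatcher {
    private boolean closed = false;

    int match(String input) {
        if (closed) {
            throw new IllegalStateException("matcher already closed");
        }
        if (input == null) {
            throw new IllegalArgumentException("input must not be null");
        }
        return input.length();
    }

    void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

final class CloseInFinallyDemo {
    // Runs one match and always releases the matcher, even when match() throws.
    static boolean matchAndClose(FakeNativeMatcher matcher, String input) {
        try {
            matcher.match(input);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        } finally {
            matcher.close();
        }
    }

    public static void main(String[] args) {
        FakeNativeMatcher matcher = new FakeNativeMatcher();
        boolean ok = matchAndClose(matcher, null); // match() throws, finally still runs
        System.out.println(ok + " " + matcher.isClosed()); // prints "false true"
    }
}
```

If RegexMatcher implements AutoCloseable, which is not stated here, a try-with-resources statement would achieve the same effect more concisely.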
Releases(1.0.2)
  • 1.0.2(Oct 15, 2022)

    What's Changed

    • Fix small issue in regex matcher by @Borjianamin98 in https://github.com/sahabpardaz/regex-matcher/pull/7

    Full Changelog: https://github.com/sahabpardaz/regex-matcher/compare/1.0.1...1.0.2

  • 1.0.1(Sep 26, 2022)

    What's Changed

    • Improve README by @Borjianamin98 in https://github.com/sahabpardaz/regex-matcher/pull/4
    • Fix some thread safe issues of regex matcher by @salehsagharchi in https://github.com/sahabpardaz/regex-matcher/pull/6

    New Contributors

    • @salehsagharchi made their first contribution in https://github.com/sahabpardaz/regex-matcher/pull/6

    Full Changelog: https://github.com/sahabpardaz/regex-matcher/compare/1.0.0...1.0.1

Owner
Sahab