gMark: a domain- and query language-independent graph instance and query workload generator

Overview

gMark

gMark is a domain- and query language-independent graph instance and query workload generator. The original version of gMark is available on GitHub at gbagan/gmark. The goal of the version in this repository is to rewrite gMark so that it is easier to extend and has better documented code. The rewrite currently focuses on query generation, but the end goal is full feature parity with the original version of gMark. The rewrite has not reached that point yet: it is notably still missing graph generation, RPQ (regular path query) based queries, and several output formats. However, the current version also offers some features not present in the original version of gMark, such as the ability to generate CPQ (conjunctive path query) based queries and a graphical user interface for the program.

Documentation (javadoc) can be found at gmark.docs.roanh.dev. More details on gMark itself can be found in the technical report at arxiv.org/abs/1511.08386. Details regarding the aforementioned CPQ-based queries can be found in my report titled Conjunctive Path Query Generation for Benchmarking.

Getting started with gMark

To support a wide variety of use cases, gMark is available in a number of different formats.

Command line usage

When using gMark on the command line the following arguments are supported:

usage: gmark [-c] [-f] [-g] [-h] [-o] [-s] [-w]
 -c,--config      The workload and graph configuration file
 -f,--force       Overwrite existing files if present
 -g,--graph       Triggers graph generation, a graph size can be provided (overrides the ones set in the configuration file)
 -h,--help        Prints this help text
 -o,--output      The folder to write the generated output to
 -s,--syntax      The concrete syntax(es) to output
 -w,--workload    Triggers workload generation, a previously generated input workload can optionally be provided to generate concrete syntaxes for instead

For example, a workload of queries in SQL format can be generated using:

gmark -c config.xml -o ./output -s sql -w

An example configuration XML file can be found both in this repository and in the graphical interface of the standalone executable. The example RPQ workload configuration files included in the original gMark repository are also compatible and can be found in the use-cases folder.

Executable download

gMark is available as a standalone portable executable that has both a graphical interface and a command line interface. The graphical interface will only be launched when no command line arguments are passed. This version of gMark requires Java 8 or higher to run.

All releases: releases
GitHub repository: RoanH/gMark

Command line usage of the standalone executable

The following commands show how to generate a workload of queries in SQL format using the standalone executable.

Windows executable
./gMark.exe -c config.xml -o ./output -s sql -w
Runnable Java archive
java -jar gMark.jar -c config.xml -o ./output -s sql -w
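Since the runnable Java archive needs Java 8 or higher, it can be convenient to check the installed Java version before launching it. The following is a minimal sketch of such a check, not something shipped with gMark: the function names java_major and check_java are made up for this illustration, and the sketch assumes that java -version prints a line containing version "..." on standard error, as common JDK distributions do.

```shell
#!/bin/sh
# Illustrative helpers, not part of gMark.

# Extract the major Java version from a version string such as
# "1.8.0_292" (Java 8) or "17.0.2" (Java 17).
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;                 # legacy scheme: 1.8 means Java 8
        *)   echo "$1" | cut -d. -f1 | cut -d- -f1 ;;   # modern scheme: 17.0.2, 17-ea
    esac
}

# Succeed only if a Java runtime of version 8 or higher is on the PATH.
# Assumption: `java -version` writes a line containing `version "<x>"` to stderr.
check_java() {
    version=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    [ -n "$version" ] && [ "$(java_major "$version")" -ge 8 ]
}
```

With these helpers defined, the archive could be started with: check_java && java -jar gMark.jar -c config.xml -o ./output -s sql -w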

Docker image

gMark is available as a Docker image on Docker Hub. The image can be obtained using the following command:

docker pull roanh/gmark:latest

Using the image then works much the same as the regular command line version of gMark. For example, we can generate the example workload of queries in SQL format using the following command:

docker run --rm -v "$PWD/data:/data" roanh/gmark:latest -c /data/config.xml -o /data/queries -s sql -w

Note that we mount a local folder called data into the container to pass our configuration file and to retrieve the generated queries.
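The Docker invocation above can be wrapped in a small helper that first verifies the configuration file is present in the mounted folder. This is only a sketch under stated assumptions: the function name run_gmark_docker and the DRY_RUN switch are hypothetical and not part of gMark, while the image name, mount point, and flags are the ones from the command above.

```shell
#!/bin/sh
# Illustrative wrapper, not part of gMark.
# run_gmark_docker and DRY_RUN are hypothetical names used for this sketch.

run_gmark_docker() {
    data_dir="$1"   # absolute path of the local folder mounted at /data
    syntax="$2"     # value passed to -s, for example: sql

    # The configuration file must exist before the container starts.
    if [ ! -f "$data_dir/config.xml" ]; then
        echo "missing $data_dir/config.xml" >&2
        return 1
    fi

    # Same command as in the text, with the folder and syntax filled in.
    set -- docker run --rm -v "$data_dir:/data" roanh/gmark:latest \
        -c /data/config.xml -o /data/queries -s "$syntax" -w

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"   # print the command instead of running it
    else
        "$@"
    fi
}
```

For example, run_gmark_docker "$PWD/data" sql runs the container against the local data folder, and prefixing the call with DRY_RUN=1 only prints the command that would be executed.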

Maven artifact

gMark is available on Maven Central as an artifact, so it can be included directly in another Java project using Gradle or Maven. This makes it possible to directly use all the implemented constructs and utilities. A hosted version of the javadoc for gMark can be found at gmark.docs.roanh.dev.

Gradle
repositories{
	mavenCentral()
}

dependencies{
	implementation 'dev.roanh.gmark:gmark:1.0'
}
Maven
<dependency>
	<groupId>dev.roanh.gmark</groupId>
	<artifactId>gmark</artifactId>
	<version>1.0</version>
</dependency>

Development of gMark

This repository contains an Eclipse & Gradle project with Util and Apache Commons CLI as the only dependencies. Development work can be done using the Eclipse IDE or any other Gradle-compatible IDE. Continuous integration checks that all source files use Unix-style line endings (LF) and that all functions and fields have valid documentation. Unit tests cover core functionality, and CI also uses these tests to check for regressions. A hosted version of the javadoc for gMark can be found at gmark.docs.roanh.dev. The runnable Java archive (JAR) release of gMark can be compiled with Gradle using the following command in the gMark directory:

./gradlew clientJar

The generated JAR can then be found in the build/libs directory. On Windows, ./gradlew.bat should be used instead of ./gradlew.
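Because continuous integration rejects source files that do not use Unix-style line endings, it can help to look for carriage returns locally before pushing. The sketch below is one possible local check and is not the check the project's CI actually runs: the function name find_crlf is made up for this illustration, and restricting the search to .java files is an assumption.

```shell
#!/bin/sh
# Illustrative local check, not the project's actual CI step.

# Print every .java file under the given directory that contains a
# carriage return, which indicates Windows-style (CRLF) line endings.
find_crlf() {
    cr=$(printf '\r')
    find "$1" -type f -name '*.java' -exec grep -l "$cr" {} +
}
```

Running find_crlf . in the gMark directory lists the offending files. No output means no carriage returns were found in the searched files.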

History

Project development started: 25th of September, 2021.
