Parallel boolean circuit evaluation

Overview

splitmap

Build Status Coverage Status

This library builds on top of RoaringBitmap to provide a parallel implementation of boolean circuits (multidimensional filters) and arbitrary aggregations over filters.

For instance, to compute a sum product on a dataset filtered such that only one of two conditions holds:

    PrefixIndex<ChunkedDoubleArray> quantities = ...
    PrefixIndex<ChunkedDoubleArray> prices = ...
    SplitMap februarySalesIndex = ...
    SplitMap luxuryProductsIndex = ...
    QueryContext<String, PriceQty> context = new QueryContext<>(
    Map.ofEntries(entry("luxuryProducts", luxuryProductsIndex), entry("febSales", februarySalesIndex), 
    Map.ofEntries(entry(PRICE, prices), entry(QTY, quantities)))); 

    double februaryRevenueFromLuxuryProducts = 
            Circuits.evaluateIfKeysIntersect(context, slice -> slice.get("febSales").and(slice.get("luxuryProducts")), "febSales", "luxuryProducts")
            .stream()
            .parallel()
            .mapToDouble(partition -> partition.reduceDouble(SumProduct.<PriceQty>reducer(price, quantities)))
            .sum();

Which, over millions of quantities and prices, can be computed in under 200 microseconds on a modern processor, where parallel streams may take upwards of 20ms.

It is easy to write arbitrary routines combining filtering, calculation and aggregation. For example statistical calculations evaluated with filter criteria.

  public double productMomentCorrelationCoefficient() {
    // calculate the correlation coefficient between prices observed on different exchanges
    PrefixIndex<ChunkedDoubleArray> exchange1Prices = ...
    PrefixIndex<ChunkedDoubleArray> exchange2Prices = ...
    SplitMap beforeClose = ...
    SplitMap afterOpen = ...
    QueryContext<Exchange, PriceQty> context = new QueryContext<>(
    Map.ofEntries(entry(BEFORE_CLOSE, beforeClose), entry(AFTER_OPEN, afterOpen), 
    Map.ofEntries(entry(NASDAQ, exchange1Prices), entry(LSE, exchange2Prices)))); 
    // evaluate product moment correlation coefficient 
    return Circuits.evaluate(context, slice -> slice.get(BEFORE_CLOSE).or(slice.get(AFTER_OPEN)), 
            BEFORE_CLOSE, AFTER_OPEN) 
            .stream()
            .parallel()
            .map(partition -> partition.reduce(SimpleLinearRegression.<Exchanges>reducer(exchange1Prices, exchange2Prices)))
            .collect(SimpleLinearRegression.pmcc());
  }
You might also like...
Comments
  • Performance Tuning

    Performance Tuning

    With a purposefully high level API there are several abstraction costs being incurred.

    • [x] Container.forEach preventing vectorisation.
    • [x] ChunkedDoubleArray.get being used rather than iterating over its pages.
    • [ ] If the model of the index is an enum, EnumMap should be used.
    opened by richardstartin 1
  • Investigate index on keys of PrefixIndex

    Investigate index on keys of PrefixIndex

    It can take quite a while to iterate over sometimes hundreds of empty key words. It might be worthwhile putting a 64 word index on the key words which contain keys, so they can be skipped over.

    opened by richardstartin 1
Owner
Richard Startin
Richard Startin
Parallel programming quick sort and parallel sum examples with Fork-join, RecursiveTask, RecursiveAction

QuickSortMultiThreading Parallel programming quick sort and parallel sum examples with Fork-join, RecursiveTask<T>, RecursiveAction Fork-Join Fork-Joi

Güven TUNCAY 4 Jun 12, 2022
The Java implementation of "A Survey of Trajectory Distance Measures and Performance Evaluation". VLDBJ 2020

A Survey of Trajectory Distance Measures and Performance Evaluation The Java implementation of the following paper: Han Su, Shuncheng Liu, Bolong Zhen

Shuncheng Liu 99 Oct 19, 2022
A circuit breaker design pattern for dropwizard

Status Circuit Breaker Library This library provides a simple implementation of a circuit breaker design pattern. It uses dropwizard metrics to provid

null 41 Apr 7, 2022
Spring boot microservice example with Eureka Server + Eureka Client + Spring Cloud API Gateway + OAuth2.0 + Circuit Breaker + Resilience4J + FeignClient + RestTemplate

Spring boot microservice example Spring boot microservice example with Eureka Server + Eureka Client + Spring Cloud API Gateway + OAuth2.0 + Circuit B

Subhash Lamba 47 Dec 29, 2022
LWJGL is a Java library that enables cross-platform access to popular native APIs useful in the development of graphics (OpenGL, Vulkan), audio (OpenAL), parallel computing (OpenCL, CUDA) and XR (OpenVR, LibOVR) applications.

LWJGL - Lightweight Java Game Library 3 LWJGL (https://www.lwjgl.org) is a Java library that enables cross-platform access to popular native APIs usef

Lightweight Java Game Library 4k Dec 29, 2022
LWJGL is a Java library that enables cross-platform access to popular native APIs useful in the development of graphics (OpenGL, Vulkan), audio (OpenAL), parallel computing (OpenCL, CUDA) and XR (OpenVR, LibOVR) applications.

LWJGL - Lightweight Java Game Library 3 LWJGL (https://www.lwjgl.org) is a Java library that enables cross-platform access to popular native APIs usef

Lightweight Java Game Library 4k Dec 29, 2022
Program finds average number of words in each comment given a large data set by use of hadoop's map reduce to work in parallel efficiently.

Finding average number of words in all the comments in a data set ?? Mapper Function In the mapper function we first tokenize entire data and then fin

Aleezeh Usman 3 Aug 23, 2021
MFP (Mathematic language For Parallel computing) Android library

MFPAndroLib This is MFP (Mathematic language For Parallel computing) Android library project. MFP is a novel scripting programming language designed a

Tony Cui 6 Sep 5, 2022