Persistent (immutable) collections for Java and Kotlin

Overview

What are Dexx Collections?

Dexx Collections are a port of Scala's immutable, persistent collection classes to pure Java.

Persistent in the context of functional data structures means the data structure preserves the previous version of itself when modified. This means any reference to a collection is effectively immutable. However, modifications can be made by returning a new version of the data structure, leaving the original structure unchanged.

Here's an example using Dexx's Sets (examples are in Kotlin for conciseness, but the collections are pure Java):

val set1 = Sets.of(1, 2, 3)
val set2 = set1.add(4)
val set3 = set1.remove(1)
println(set1) // Prints Set(1, 2, 3)
println(set2) // Prints Set(1, 2, 3, 4)
println(set3) // Prints Set(2, 3)

From the above example we can see that although we've made modifications to set1 to create set2 and set3, the contents of set1 remain unchanged.

Note: There's now first class support for Kotlin - see the kollection module README for more information.

Why port?

Scala's collections can be directly used from Java, but the resulting code is far from idiomatic. Scala's standard library is also large and binary incompatible between versions.

Secondly, a pure Java implementation of functional persistent collections is usable from not only Java, but other JVM languages that interoperate with Java such as Kotlin, Ceylon or GWT. In fact, the collections have been specifically designed for use with Kotlin.

Overview

The diagram below shows Dexx's class hierarchy (interfaces are in blue and concrete implementations are in green).

Dexx Collections Overview

Note that the interfaces such as Map, Set and List are not related to the java.util equivalents as persistent collections require all modification methods such as add and remove to return a new collection instance.

Status

  • All collections have been implemented
  • HashSet, TreeSet, HashMap, TreeMap and Vector are ports from Scala
  • ConsList and ArrayList have been written from scratch.
  • Helper classes for construction and adapters to java.util collections are available
  • Test coverage is fairly comprehensive: 95% line and 90% branch at present

Dependencies

  • There are no runtime dependencies
  • JetBrain's annotations (@NotNull and @Nullable) are used in the source to support Kotlin's nullable types, but they are not required at runtime.
  • The tests are written in Kotlin, but again this is not a runtime dependency

Roadmap

  • Explore annotating methods that return a new collection with @CheckReturnValue to allow static verification of collection usage.
  • Active development is essentially complete. Further work is expected to be bug fixes and refinements.

Release Notes

  • 0.7:
    • Fixes #11 - a balancing error in red black trees
  • 0.6:
    • Added OSGI metadata (thanks ajs6f)
    • Make internal fields final (thanks mkull)
    • Performance improvement to first and last of TreeMap & TreeSet (thanks mkull)
  • 0.5:
    • Updated to 1.0.0
    • Added toImmutableMap() conversions from existing Maps
  • 0.4:
    • Updated to 1.0.0-rc-1036
    • Removed accidental assertJ compile dependency in kollection (thanks @brianegan)
  • 0.3.1:
    • Added a native Kotlin api in the kollection module
    • Converted the build to gradle from maven
    • Renamed dexx-collections artifact to collection
  • 0.2:
    • Add LinkedLists support with ConsList as the default implementation
    • Add ArrayList as an alternative IndexedList implementation
    • Formalise the Builder contract and enforce at runtime
  • 0.1:
    • Includes ports of Scala's HashSet, TreeSet, HashMap, TreeMap and Vector

License

This project is licensed under a MIT license. Portions ported from Scala are Scala's 3-clause BSD license.

Usage

Adding to your project

Version 0.7 has been released and is available in Maven Central here. You can use it via the following gradle dependency:

'com.github.andrewoma.dexx:collection:0.7' // For Java
'com.github.andrewoma.dexx:kollection:0.7' // For Kotlin
Constructing collections

Each of the leaf interfaces (Set, SortedSet, Map, SortedMap, IndexedList and LinkedList) have associated companion classes with static methods for construction.

The companion class uses the plural form of the interface. e.g. Set has a companion class of Sets.

To build a collection from a fixed number of elements, use the overloaded of() methods. e.g.

val set = Sets.of(1, 2, 3)

To build a collection from a java.util collection, use the copyOf() methods. e.g.

val set = Sets.copyOf(javaCollection)

Builders should be used when incrementally constructing a collection. This allows for more efficient structures to be used internally during construction. In the case of LinkedList, using a builder is important as LinkedList does not support appending without copying the entire collection.

val builder = Sets.builder<Int>()
for (i in 1..100) {
    builder.add(i)
}
val set = builder.build()
Viewing as java.util collections

Unfortunately, the java.util collection interfaces are not compatible with persistent collections as modifications such as add() must return a new collection instance, leaving the original untouched.

However, all collections can be viewed as an immutable form of their java.util equivalent by using the the as...() methods.

val javaSet = Sets.of(1, 2, 3).asSet() // Now a java.util.Set
Where are filter(), map() and friends?

Such transformations are deliberately not supported:

Here's an example of using lazy evaluation in a functional style with Kotlin:

val set = SortedSets.of(1, 2, 3, 4, 5, 6).asSequence()
        .filter { it % 2 == 0 }
        .map { "$it is even" }
        .take(2)
        .toImmutableSet()

assertEquals(SortedSets.of("2 is even", "4 is even"), set)

The example above uses Kotlins in-built extension function that converts any Iterable into a Sequence. It also uses the following extension functions to add Sequence<T>.toImmutableSet() to cleanly convert the sequence back into a Dexx Collection.

fun <T, R> Sequence<T>.build(builder: Builder<T, R>): R {
    this.forEach { builder.add(it) }
    return builder.build()
}

fun <T> Sequence<T>.toImmutableSet(): SortedSet<T> = build(SortedSets.builder<T>())

Performance

Benchmarking is still a work in progress (all the warnings about JVM benchmarks apply). The results so far running on Mac OS X 10.11.1 x86_64 with JDK 1.8.0_65 (Oracle Corporation 25.65-b01) are here.

My conclusions so far are that the collections perform adequately to be used as a drop-in replacement for the majority of use cases. While slower, slow is generally referring to millions of operations per second.

In general, mutating methods incur a overhead of 2-5 times that of java.util equivalents and reading operations are 1-1.5 time slower.

@ptitjes Has done some more rigorous benchmarks here: https://github.com/ptitjes/benchmark-immutables/blob/master/results/2016-10-02-23:56:36.pdf

Development

  • Dexx is built with gradle. Use ./gradlew install to build and install into your local repository.
  • To run the benchmarks, use ./gradlew check -PdexxTestMode=BENCHMARK --info | grep '^ BENCHMARK:'.
  • To generate coverage reports, use:
./gradlew :collection:clean :collection:check :collection:jacocoTestReport
 open collection/build/jacocoHtml/index.html
  • By default, a quick version of tests are run. Getting better test coverage of Vectors requires large collections. To run tests with complete coverage use: ./gradlew -PdexxTestMode=COMPLETE :collection:clean :collection:check :collection:jacocoTestReport

Method counts

For android developers, here are method counts:

  • 'com.github.andrewoma.dexx:collection:0.6' = 1036 methods
  • 'com.github.andrewoma.dexx:kollection:0.6' = 1213 methods

Build Status

Comments
  • Set< A > Does not use .equals as comments contend.

    Set< A > Does not use .equals as comments contend.

    I had an object consisting of three scalar values the .equals() method used the equals() method on these for comparison.

    The set continued to contain duplicates. Implementing hashCode() with respect to the scalar values was the only solution.

    opened by greyson 5
  • OSGi metadata?

    OSGi metadata?

    It would be useful for the Dexx artifacts to include OSGi metadata, and the cost is very low. I include a PR that does the job for the Java artifact. Thanks!

    opened by ajs6f 4
  • Add method count for Android developers

    Add method count for Android developers

    Android has a 64k method limit, so a lot of Android developers take method count into consideration when choosing libraries.

    As this library appears to be pretty compact (especially compared to some of the bloat-ridden alternatives like Guava) highlighting the method count might encourage more use by Android developers.

    opened by vaughandroid 3
  • Remove all instanceof to test for black/red trees

    Remove all instanceof to test for black/red trees

    There was a performance bottleneck with all the instanceof tests done to recursively test for black or red sub-trees. Adding 100 elements is now almost 3 times faster.

    Benchmark                      (implementation)  (size)  Mode  Cnt      Score      Error   Units
    Sets.append                        Dexx TreeSet     100  avgt   20  24352.449  ± 654.566   ns/op
    Sets.append               Modified Dexx TreeSet     100  avgt   20   7647.562  ± 282.896   ns/op
    
    opened by ptitjes 3
  • Migrate AssertJ to a test compile dependency

    Migrate AssertJ to a test compile dependency

    Hello! Thanks for the cool library. Been playing with it on the side and it's working really nicely :)

    This is a simple change.

    I've moved AssertJ into the list of testCompile dependencies. This should reduce the method count (a win for Android development) and reduce version conflicts for folks consuming this library.

    Let me know if this works from your end!

    opened by brianegan 3
  • Documentation request

    Documentation request

    Is there anyway to add benchmarks of what Dexx's performance is compared to Java acting in the same way? I've been using the java.util collections in an immutable fashion for a while, and the most telling benchmark in there for immutable use of the java collections is prepending elements to a list.

    Also, a side note, it may be nice to have Sets.collector(), Lists.collector() etc to work with Java 1.8's collection streaming interface; I've done a couple of trivial ones that I might be able to make a pull request for when I finish with the project I'm currently working on (scheduled for delivery mid-to-late March).

    opened by greyson 2
  • "Defect: unexpected empty zipper while computing range" for some range queries

    import com.github.andrewoma.dexx.collection.TreeMap;
    
    public class Test {
    	public static void main(String[] args) {
    		TreeMap<Integer, String> map = new TreeMap<>();
    		map = map.put(new Integer(0), "0");
    		map = map.put(new Integer(50), "50");		
    		System.out.println(map.to(new Integer(101), false).values());
    	}	
    }```
    
    This little harmless looking piece of code generates an exception:
    java.lang.RuntimeException: Defect: unexpected empty zipper while computing range
    	at com.github.andrewoma.dexx.collection.internal.redblack.RedBlackTree.findDepth(RedBlackTree.java:514)
    	at com.github.andrewoma.dexx.collection.internal.redblack.RedBlackTree.findDepth(RedBlackTree.java:517)
    	at com.github.andrewoma.dexx.collection.internal.redblack.RedBlackTree.rebalance(RedBlackTree.java:533)
    	at com.github.andrewoma.dexx.collection.internal.redblack.RedBlackTree.doUntil(RedBlackTree.java:390)
    	at com.github.andrewoma.dexx.collection.internal.redblack.RedBlackTree.until(RedBlackTree.java:128)
    	at com.github.andrewoma.dexx.collection.TreeMap.to(TreeMap.java:169)
    
    Note that adding another pair, adding the values in a different order or querying for a key that is available prevents the exception from being thrown.
    
    Please fix
    opened by mfischerq 2
  • New release to Maven Central?

    New release to Maven Central?

    The version of Dexx Collections available from Maven Central is 0.2, but it seems that there have been several moves ahead since 2014 (when that Maven Central release was made). Is it possible to please have a more recent release supplied? Thanks!

    opened by ajs6f 2
  • Add ArrayLists factory class and optimize copyOf(E[] es)

    Add ArrayLists factory class and optimize copyOf(E[] es)

    Sometimes, I directly create ArrayLists from arrays and I am forced to use a builder. The proposed new ArrayLists.of|copyOf methods fix that.

    Also, all the other types of lists have a direct XXXList.builder() whereas for ArrayList you are force to do ArrayList.<E>factory().newBuilder() which doesn't play really nice with type inference of completion in IntelliJ IDEA.

    opened by ptitjes 1
  • README provides Maven instructions instead of Gradle

    README provides Maven instructions instead of Gradle

    * Dexx is currently built with maven. Usemvn installto build and install into your local repository. should become * Dexx is currently built with gradle. Use./gradlew installto build and install into your local repository.

    Not sure what should become of this block:

    * By default, a quick version of tests are run. Getting better test coverage of `Vectors` requires large
      collections. To run tests with complete coverage use: `mvn -Ddexx.test.mode=COMPLETE -P cobertura clean cobertura:cobertura`
    * To run the benchmarks, use `mvn -Ddexx.test.mode=BENCHMARK test` (results are lines starting with BENCHMARK).
    
    opened by jmmk 1
  • final fields

    final fields

    Immutable Objects must have all-final fields else the JVM wont grant a happens-before relation when publishing to other threads.

    Also changed tree-first()/last/() to avoid throwing and handling an exception.

    opened by MarkusKull 0
  • Constructing a map from java.util.Map

    Constructing a map from java.util.Map

    The following does not work:

    java.util.Map<String, Integer> javaMap = new java.util.HashMap<String, Integer>();
    Map<String, Integer> immutableMap = Maps.copyOf(javaMap.entrySet());
    

    Compile error:

    build/plugins/java/Dexx.java:20: error: no suitable method found for copyOf(java.util.Map<String,Integer>)
    Map<String, Integer> immutableMap = Maps.copyOf(javaMap);
                                            ^
        method Maps.<K#1,V#1>copyOf(Iterable<Pair<K#1,V#1>>) is not applicable
          (cannot infer type-variable(s) K#1,V#1
            (argument mismatch; java.util.Map<String,Integer> cannot be converted to Iterable<Pair<K#1,V#1>>))
        method Maps.<K#2,V#2>copyOf(Iterator<Pair<K#2,V#2>>) is not applicable
          (cannot infer type-variable(s) K#2,V#2
            (argument mismatch; java.util.Map<String,Integer> cannot be converted to Iterator<Pair<K#2,V#2>>))
        method Maps.<K#3,V#3>copyOf(Pair<K#3,V#3>[]) is not applicable
          (cannot infer type-variable(s) K#3,V#3
            (argument mismatch; java.util.Map<String,Integer> cannot be converted to Pair<K#3,V#3>[]))
      where K#1,V#1,K#2,V#2,K#3,V#3 are type-variables:
        K#1 extends Object declared in method <K#1,V#1>copyOf(Iterable<Pair<K#1,V#1>>)
        V#1 extends Object declared in method <K#1,V#1>copyOf(Iterable<Pair<K#1,V#1>>)
        K#2 extends Object declared in method <K#2,V#2>copyOf(Iterator<Pair<K#2,V#2>>)
        V#2 extends Object declared in method <K#2,V#2>copyOf(Iterator<Pair<K#2,V#2>>)
        K#3 extends Object declared in method <K#3,V#3>copyOf(Pair<K#3,V#3>[])
        V#3 extends Object declared in method <K#3,V#3>copyOf(Pair<K#3,V#3>[])
    

    It would be nice to have a method in Maps that allows to construct a Map from a java.util.Map

    opened by skogsbaer 0
  • Please consider renaming collection artifact back to dexx-collections

    Please consider renaming collection artifact back to dexx-collections

    See relevant issue https://github.com/ben-manes/caffeine/issues/364

    The case is it often happens that people place artifact jars to a single folder, and they use just artifact name for the file name.

    If different artifacts use collection for the artifactId, then it would cause trouble (like with com.github.ben-manes.caffeine:guava which happened to produce guava.jar :-/ )

    opened by vlsi 0
  • Support null elements in sets

    Support null elements in sets

    Hello,

    I noticed that sets don't support null elements.

    java.lang.NullPointerException
    	at com.github.andrewoma.dexx.collection.internal.hashmap.CompactHashMap.put(CompactHashMap.java:69)
    	at com.github.andrewoma.dexx.collection.HashSet.add(HashSet.java:88)
    	at com.github.andrewoma.dexx.collection.HashSet.add(HashSet.java:36)
    	[...]
    
    java.lang.NullPointerException
    	at com.github.andrewoma.dexx.collection.internal.hashmap.CompactHashMap.put(CompactHashMap.java:69)
    	at com.github.andrewoma.dexx.collection.HashSet.add(HashSet.java:88)
    	at com.github.andrewoma.dexx.collection.Sets.copyOf(Sets.java:133)
    	[...]
    

    As there is no @NotNull annotation for it, I assume it is a bug.

    opened by jcornaz 4
Owner
Andrew O'Malley
Andrew O'Malley
A Persistent Java Collections Library

PCollections A Persistent Java Collections Library Overview PCollections serves as a persistent and immutable analogue of the Java Collections Framewo

harold cooper 708 Dec 28, 2022
Eclipse Collections is a collections framework for Java with optimized data structures and a rich, functional and fluent API.

English | 中文 | Deutsch | Español | Ελληνικά | Français | 日本語 | Norsk (bokmål) | Português-Brasil | Русский | हिंदी Eclipse Collections is a comprehens

Eclipse Foundation 2.1k Dec 29, 2022
Immutable key/value store with efficient space utilization and fast reads. They are ideal for the use-case of tables built by batch processes and shipped to multiple servers.

Minimal Perfect Hash Tables About Minimal Perfect Hash Tables are an immutable key/value store with efficient space utilization and fast reads. They a

Indeed Engineering 92 Nov 22, 2022
Immutable in-memory R-tree and R*-tree implementations in Java with reactive api

rtree In-memory immutable 2D R-tree implementation in java using RxJava Observables for reactive processing of search results. Status: released to Mav

Dave Moten 999 Dec 20, 2022
RTree2D is a 2D immutable R-tree with STR (Sort-Tile-Recursive) packing for ultra-fast nearest and intersection queries

RTree2D RTree2D is a 2D immutable R-tree with STR (Sort-Tile-Recursive) packing for ultra-fast nearest and intersection queries. Goals Main our requir

Andriy Plokhotnyuk 121 Dec 14, 2022
The Java collections framework provides a set of interfaces and classes to implement various data structures and algorithms.

Homework #14 Table of Contents General Info Technologies Used Project Status Contact General Information Homework contains topics: Sorting an ArrayLis

Mykhailo 1 Feb 12, 2022
Java Collections till the last breadcrumb of memory and performance

Koloboke A family of projects around collections in Java (so far). The Koloboke Collections API A carefully designed extension of the Java Collections

Roman Leventov 967 Nov 14, 2022
fasttuple - Collections that are laid out adjacently in both on- and off-heap memory.

FastTuple Introduction There are lots of good things about working on the JVM, like a world class JIT, operating system threads, and a world class gar

BMC TrueSight Pulse (formerly Boundary) 137 Sep 30, 2022
High Performance Primitive Collections for Java

HPPC: High Performance Primitive Collections Collections of primitive types (maps, sets, stacks, lists) with open internals and an API twist (no java.

Carrot Search 890 Dec 28, 2022
Java port of a concurrent trie hash map implementation from the Scala collections library

About This is a Java port of a concurrent trie hash map implementation from the Scala collections library. It is almost a line-by-line conversion from

null 147 Oct 31, 2022
A beginner's guide to Learn Java Collections Framework

Collections-Framework A beginner's guide to Learn Java Collections Framework Topics included: Collection Interface ArrayList Iterator Stack Queue and

Anuj Kumar Sharma 97 Dec 30, 2022
gRPC and protocol buffers for Android, Kotlin, and Java.

Wire “A man got to have a code!” - Omar Little See the project website for documentation and APIs. As our teams and programs grow, the variety and vol

Square 3.9k Jan 5, 2023
A modern I/O library for Android, Kotlin, and Java.

Okio See the project website for documentation and APIs. Okio is a library that complements java.io and java.nio to make it much easier to access, sto

Square 8.2k Dec 31, 2022
Table-Computing (Simplified as TC) is a distributed light weighted, high performance and low latency stream processing and data analysis framework. Milliseconds latency and 10+ times faster than Flink for complicated use cases.

Table-Computing Welcome to the Table-Computing GitHub. Table-Computing (Simplified as TC) is a distributed light weighted, high performance and low la

Alibaba 34 Oct 14, 2022
A Java library for quickly and efficiently parsing and writing UUIDs

fast-uuid fast-uuid is a Java library for quickly and efficiently parsing and writing UUIDs. It yields the most dramatic performance gains when compar

Jon Chambers 142 Jan 1, 2023
High Performance data structures and utility methods for Java

Agrona Agrona provides a library of data structures and utility methods that are a common need when building high-performance applications in Java. Ma

Real Logic 2.5k Jan 5, 2023
Hollow is a java library and toolset for disseminating in-memory datasets from a single producer to many consumers for high performance read-only access.

Hollow Hollow is a java library and toolset for disseminating in-memory datasets from a single producer to many consumers for high performance read-on

Netflix, Inc. 1.1k Dec 25, 2022