A Persistent Java Collections Library

Overview

PCollections

A Persistent Java Collections Library

Maven Central Javadoc

Overview

PCollections serves as a persistent and immutable analogue of the Java Collections Framework. This includes efficient, thread-safe, generic, immutable, and persistent stacks, maps, vectors, sets, and bags, compatible with their Java Collections counterparts.

Persistent and immutable datatypes are increasingly appreciated as a simple, design-friendly, concurrency-friendly, and sometimes more time- and space-efficient alternative to mutable datatypes.

Persistent versus Unmodifiable

Note that these immutable collections are very different from the immutable collections returned by Java's Collections.unmodifiableCollection() and similar methods. The difference is that Java's unmodifiable collections have no producers, whereas PCollections have very efficient producers.

Thus if you have an unmodifiable Collection x and you want a new Collection x2 consisting of the elements of x in addition to some element e, you would have to do something like:

Collection x2 = new HashSet(x);
x2.add(e);

which involves copying all of x, using linear time and space.

If, on the other hand, you have a PCollection y you can simply say:

PCollection y2 = y.plus(e);

which still leaves y untouched but generally requires little or no copying, using time and space much more efficiently.

Usage

PCollections are created using producers and static factory methods. Some example static factory methods are HashTreePSet.empty() which returns an empty PSet, while HashTreePSet.singleton(e) returns a PSet containing just the element e, and HashTreePSet.from(collection) returns a PSet containing the same elements as collection. See Example Code below for an example of using producers.

The same empty(), singleton(), and from() factory methods are found in each of the PCollections implementations, which currently include one concrete implementation for each abstract type:

  • HashTreePMap provides a PMap implementation, analogous to Java's HashMap.
  • ConsPStack provides a PStack implementation, analogous to Java's LinkedList.
  • TreePVector provides a PVector implementation, analogous to Java's ArrayList.
  • HashTreePSet provides a PSet implementation, analogous to Java's HashSet.
  • HashTreePBag provides a PBag implementation, which is unordered like a set but can contain duplicate elements.

PCollections are highly interoperable with Java Collections: every PCollection is a java.util.Collection, every PMap is a java.util.Map, every PSequence — including every PStack and PVector — is a java.util.List, and every PSet is a java.util.Set.

PCollections uses Semantic Versioning, which establishes a strong correspondence between API changes and version numbering.

PCollections is in the Maven Central repository, under org.pcollections. Thus the Maven coordinates for PCollections are:

<dependency>
    <groupId>org.pcollections</groupId>
    <artifactId>pcollections</artifactId>
    <version>3.1.4</version>
</dependency>

or Gradle:

compile 'org.pcollections:pcollections:3.1.4'

Example Code

The following gives a very simple example of using PCollections, including the static factory method HashTreePSet.empty() and the producer plus(e):

import org.pcollections.*;

public class Example {
  public static void main(String... args) {
    PSet<String> set = HashTreePSet.empty();
    set = set.plus("something");

    System.out.println(set);
    System.out.println(set.plus("something else"));
    System.out.println(set);
  }
}

Running this program gives the following output:

[something]
[something else, something]
[something]

Building form source

To build the project from source clone the repository and then run ./gradlew

Related Work

Clojure and Scala also provide persistent collections on the JVM, but they are less interoperable with Java. Both Guava and java.util.Collections provide immutable collections but they are not persistent—that is, they do not provide efficient producers—so they are not nearly as useful. See Persistent versus Unmodifiable above.

Comments
  • Work around a type inference change in javac

    Work around a type inference change in javac

    The javac compiler's behavior when handling wildcards and "capture" type variables has been changed in Java 9 [1]. This change allows the code to be compiled with the latest version of javac.

    [1] https://bugs.openjdk.java.net/browse/JDK-8039214

    opened by cushon 11
  • OrderedPSet.minus is linear time

    OrderedPSet.minus is linear time

    OrderedPSet.minus calls TreePVector.minus which loops through all the entries searching for the right one to remove.

    Possible fix?

    Rather than augmenting the PVector with a PSet, use something like a PMap<E, Integer> that maps each entry to its index in the vector.

    Boxing all these ints would add overhead, but at least it would have the right asymptotic complexity.

    Alternately, it might be a good idea to have an IntTree that maps ints to ints

    enhancement 
    opened by prdoyle 10
  • [Patch] POrderedMap, and array-based implementations of that, POrderedSet, and PVector

    [Patch] POrderedMap, and array-based implementations of that, POrderedSet, and PVector

    From [email protected] on 2011-08-18T10:01:21Z

    Hi Harold,

    this patch adds the following new stuff:

    • POrderedMap: interface for PMaps that preserve the insertion order
    • ArrayPMap: an implementation of POrderedMap based on arrays
    • ArrayPVector: an implementation of PVector based on arrays
    • ArrayPSet: an implementation of POrderedSet based on arrays
    • ArrayPVectorTest: a junit test for ArrayPVector

    All these array-based variants are especially well-suited (efficient, both performance and memory wise) for small collections.

    We use them (since yesterday!) for collection-valued attributes on nodes and edges of graphs in our graph library JGraLab. There you have thousands or even millions of nodes and edges, but their collection-valued attributes are all rather small, so small memory footprints are very important for us.

    I think there are many similar use-cases that can benefit from these variants, too, so it'd be great if you included them in the pcollections proper.

    Original issue: http://code.google.com/p/pcollections/issues/detail?id=19

    opened by blackdrag 10
  • Null keys & values should be supported

    Null keys & values should be supported

    Currently, the contract of PMap is "non-null keys and non-null values"; as it stands HashPMap forbids null keys, but actually allows null values. That isn't the problem though: in fact, there is a very strong case to explicitly support both null keys and null values:

    • The "for all K and V" implied by <K,V> should really mean "for all". Since null is an element of all K and V, it should be allowed so the types can tell the truth.
    • It's much simpler to use a collection that is guaranteed to work for any unknown values than one that imposes special runtime constraints.
    • While using nulls in code is generally a bad idea, this is none of PCollection's business. Collections shouldn't know about or constrain their contents, unless there's a good performance or correctness or ordering reason, which there isn't here.
    • Prohibiting nulls makes PCollections less useful and less safe in exchange for no gain. In particular, PCollections is unusable for my current usecase, and there is no good reason for this to be so. It's a shame, because there are so few good immutable collection libraries for Java.

    This can mostly be fixed by replacing key.hashCode() and key.equals with Objects.hashCode and Objects.equals. I'm happy to knock up a PR if you are convinced.

    opened by kenbot 9
  • Create a maven repository

    Create a maven repository

    From ian.clarke on 2010-05-05T17:09:24Z

    We use Maven for our dependency management, and therefore it is very difficult for us to use Java libraries that are not available via a public Maven repository.

    Would be greatly appreciated if you could create one for pcollections.

    Original issue: http://code.google.com/p/pcollections/issues/detail?id=10

    opened by blackdrag 9
  • Remove personal details from older commits

    Remove personal details from older commits

    I have realised that my personal details have leaked in some of the commits that have been merged in master a few years ago - can those commits be amended to remove them by any chance?

    opened by ghost 7
  • Upload artifacts on Maven Central

    Upload artifacts on Maven Central

    From [email protected] on 2011-10-27T11:13:37Z

    The binary JAR, sources JAR and documentation may be uploaded to Maven Central or a suitable Maven repository. Makes for an easy inclusion in projects.

    Original issue: http://code.google.com/p/pcollections/issues/detail?id=21

    bug duplicate 
    opened by blackdrag 7
  • Sorted maps and sets (fixes issue #20)

    Sorted maps and sets (fixes issue #20)

    This pull request addresses issue #20 (SortedSet/TreeSet alternative?), by adding the following:

    • new interface PSortedMap extending both PMap and java.util.NavigableMap
    • new interface PSortedSet extending both PSet and java.util.NavigableSet
    • new classes TreePMap and TreePSet implementing those interfaces (respectively)
    • a few new classes that support the implementation of TreePMap and TreePSet
    • corresponding unit-tests and documentation
    • new methods Empty.sortedMap and Empty.sortedSet

    I realize this is kind of huge; if you're on board with the general approach, but want to go back-and-forth with separate pull requests for separate pieces, I'm open to that.

    opened by ran-arigur 6
  • HashTreePSet not serializable

    HashTreePSet not serializable

    HashTreePSet is not serializable. Only tested on HashTreePSet.empty(). Tested on pcollections 3.1.0.

    Simple test case:

    import org.pcollections.HashTreePSet;
    import org.pcollections.PSet;
    
    import java.io.ByteArrayOutputStream;
    import java.io.ObjectOutputStream;
    
    public class SerializePSet {
    
      public static void main(String... args) throws Exception {
        PSet set = HashTreePSet.empty();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(set);
      }
    
    }
    

    Output:

    Exception in thread "main" java.io.NotSerializableException: org.pcollections.IntTreePMap$1
    	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
    	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    	at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
    	at test.SerializePSet.main(SerializePSet.java:28)
    
    bug 
    opened by JimN-LiveAction 6
  • SortedSet/TreeSet alternative?

    SortedSet/TreeSet alternative?

    From [email protected] on 2011-10-11T15:52:20Z

    I was wondering if any of the collections offered by pcollections offer an alternative to java.util.SortedSet and java.util.TreeSet.

    I would thus like to have a persistent SortedSet implementation with at least the following methods (with semantics as defined by java.util.SortedSet):

    • first
    • last
    • headSet
    • tailSet

    Original issue: http://code.google.com/p/pcollections/issues/detail?id=20

    opened by blackdrag 6
  • PStack is not a stack

    PStack is not a stack

    From [email protected] on 2010-03-29T14:26:41Z

    Stacks should support the operations: push, pop, peek.

    Original issue: http://code.google.com/p/pcollections/issues/detail?id=7

    bug 
    opened by blackdrag 6
  • optimize `PSet.intersect()`

    optimize `PSet.intersect()`

    • [ ] if the passed list is a Set, then iterate over whichever set is smaller, so we get O(mlgn) complexity, with m<n
    • [ ] benchmark that this is actually faster
    enhancement 
    opened by hrldcpr 1
  • add a utility method for deep conversion from java collections

    add a utility method for deep conversion from java collections

    Thank you for this great library that makes good stuff from Clojure available to Java developers.

    In the context of my book Data-Oriented programming, I am considering to use pcollections to illustrate to Java developers the power of data immutability when building a software system.

    A fundamental principle of Data-Oriented programming is to represent data with generic immutable data structures instead of instances of classes.

    As an example, a book in a library management system (that's the example I use in the book) is represented with a nested map combining maps and lists:

    Map watchmen = Map.of(
                          "isbn", "978-1779501127",
                          "title", "Watchmen",
                          "publicationYear", 1987,
                          "authors", List.of("alan-moore", "dave-gibbons"),
                          "bookItems", List.of(
                                               Map.of(
                                                      "id", "book-item-1",
                                                      "rackId", "rack-17",
                                                      "isLent", true
                                                      ),
                                               Map.of (
                                                       "id", "book-item-2",
                                                       "rackId", "rack-17",
                                                       "isLent", false
                                                       )
                                               )
                          );
    

    Now, I'd like to represent the same data as an immutable collection.

    The only way I found to recursively convert my nested map into persistent collection is by explicitly converting every map and list.

    Map watchmen = HashTreePMap.from(Map.of(
                          "isbn", "978-1779501127",
                          "title", "Watchmen",
                          "publicationYear", 1987,
                          "authors", TreePVector.from(List.of("alan-moore", "dave-gibbons")),
                          "bookItems", TreePVector.from(List.of(
                                               HashTreePMap.from(Map.of(
                                                      "id", "book-item-1",
                                                      "rackId", "rack-17",
                                                      "isLent", true
                                                      )),
                                               HashTreePMap.from(Map.of(
                                                       "id", "book-item-2",
                                                       "rackId", "rack-17",
                                                       "isLent", false
                                                       ))
                                               ))
                          ));
    

    Two questions:

    1. Is there a way to make this conversion less verbose?
    2. Is there a way to recursively converts a map into a pmap?
    enhancement good first issue 
    opened by viebel 4
  • Provide Stream collectors for persistent collections.

    Provide Stream collectors for persistent collections.

    One of the major differences between pcollections and other persistent collections libraries like vavr is the desire to remain interoperable with the standard java collections framework. However, because persistent collections do not allow for mutation, we miss out on a good chunk of the standard library's methods for working with collections.

    As an example, PVector implements java.util.List but does not support any of the mutating methods on that interface. For some there are natural replacements - add, addAll, remove and removeAll are supplanted by plus, plusAll, minus, and minusAll. For others though, there is no natural replacement: sort, retainAll, and clear have no direct analogue.

    This contrasts with classes like vavr's Vector which are fairly full featured in terms of the methods they support on account of not needing to maintain this compatibility.

    io.vavr.Vector.of(1, -1, 2, 3).filter(x -> x <= 2).sorted();
    

    The analogue with regular java collections pre-java 8 would be

    List<Integer> xs = new ArrayList<>();
    xs.add(1);
    xs.add(-1);
    xs.add(2);
    xs.add(3);
    xs.retainAll(x -> x <= 2);
    xs.sort();
    

    But with java 8, we get the streams api, which lets us do those larger functional transformations in the same manner as vavr albeit via explicitly targeting a collection type as the end result..

    List.of(1, -1, 2, 3)
      .stream()
      .filter(x -> x <= 2)
      .sorted()
      .collect(Collectors.toList());
    

    Which is all a long winded way of saying, it would make a lot of sense to provide collector implementations for targeting persistent collections.

    It matches up with the standard java way of doing things and would be a decent usability win. The main downside is that to do this would require a major version bump to make Java 8 the minimum and maybe working out some of the other issues with regards to performance. Specifically, if there is ever added a notion of transients or some other kind of mutable intermediate representation it would be transparent to swap the implementation of a collector provided by the library, perhaps without exposing that functionality until it is fully designed or maybe never.

    public interface PVector<E> extends PSequence<E> {
        public static <T> Collector<T, ?, PVector<T>> collector() {
            return Collector.of(Empty::pvector, PVector::plus, PVector::plusAll);
        }
        // ...
    }
    
    TreePVector.from(List.of(1, -1, 2, 3))
      .stream()
      .filter(x -> x <= 2)
      .sorted()
      .collect(PVector.collector());
    

    There would be other benefits to moving to Java 8 as well - chaos and major version bumps are ladders. Perhaps a PVector.empty() could supplant Empty.pvector() or similar api improvements and deprecations.

    enhancement help wanted 
    opened by bowbahdoe 1
  • try to reuse existing collections whenever possible

    try to reuse existing collections whenever possible

    …for example in minus and minusAll methods, if nothing is removed then ideally the original collection should be returned. Same for plusAll(empty())

    • [ ] find any methods that don't achieve this
    • [ ] fix them if it won't hurt performance, otherwise just add a comment

    for example see 4720fa7

    enhancement 
    opened by hrldcpr 0
  • fix deprecation warnings

    fix deprecation warnings

    mvn clean compile -Dmaven.compiler.showDeprecation=true

    …they don't indicate any actual problem, they're just because we have to implement mutators to fulfill Collections interfaces, but we have also deprecated mutators in PCollections so that people get IDE warnings if they try to use them

    possibly can suppress the warning with a suppression comment?

    enhancement good first issue 
    opened by hrldcpr 3
Owner
harold cooper
dilettante @recursecenter @mit
harold cooper
A big, fast and persistent queue based on memory mapped file.

Big Queue A big, fast and persistent queue based on memory mapped file. Notice, bigqueue is just a standalone library, for a high-throughput, persiste

bulldog 520 Dec 30, 2022
A better compressed bitset in Java

RoaringBitmap Bitsets, also called bitmaps, are commonly used as fast data structures. Unfortunately, they can use too much memory. To compensate, we

Roaring bitmaps: A better compressed bitset 2.9k Dec 29, 2022
A lightning fast, transactional, file-based FIFO for Android and Java.

Tape by Square, Inc. Tape is a collection of queue-related classes for Android and Java. QueueFile is a lightning-fast, transactional, file-based FIFO

Square 2.4k Dec 30, 2022
gRPC and protocol buffers for Android, Kotlin, and Java.

Wire “A man got to have a code!” - Omar Little See the project website for documentation and APIs. As our teams and programs grow, the variety and vol

Square 3.9k Dec 23, 2022
Example of implementing data structures in Java

Data Structures Example of implementing data structures in Java Implemented data structure : Queue Stack Linked List Binary Search Tree (BST) Trie Hea

Ismaïl BENHALLAM 2 Sep 27, 2021
This repository contains codes for various data structures and algorithms in C, C++, Java, Python, C#, Go, JavaScript and Kotlin.

Overview The goal of this project is to have codes for various data structures and algorithms - in C, C++, Java, Python, C#, Go, JavaScript and Kotlin

Manan 25 Mar 2, 2022
Material de apoio da turma de introdução ao JAVA da Let's Code

Introdução ao Java Esse repositório foi criado com o intuito de servir como material de apoio ao módulo de Introdução a Java da Let's Code. Sumário Li

Bruno Pereira Pinho 5 Oct 25, 2022
A repository that contains Data Structure and Algorithms coded on Java

A repository that contains Data Structure and Algorithms coded on Java . It will also contain solutions of questions from Leetcode.

Akshat Gupta 6 Oct 15, 2022
🎓☕ Repository of lessons and exercises from loiane.training's course on data structure with Java

☕ Curso estrutura de dados com Java by @loiane.training Repositório com as aulas e exercícios do curso de estrutura de dados com Java da loiane.traini

Leticia Campos 2 Feb 1, 2022
Facebook Clone created using java based on Graph data Structure

Facebook Clone Facebook Clone created using java based on Graph data Structure Representation of Social Media using Graph Data Structure in Java It is

yogita pandurang chaudhari 1 Jan 16, 2022
JAVA DASAR PROGRAMING

JAVA DASAR PROGRAMING Halo , Selamat datang di repository kami ! BELAJAR JAVA DASAR HANYA DALAM WAKTU 14 HARI Ayo berpartisipasi menjadi bagian dari J

Radja Aulia Al Ramdani 2 Nov 5, 2022
Worker-queue implementation on top of Java and database

Database Queue Library provides worker-queue implementation on top of Java and database. Fintech company YooMoney uses db-queue in cases where reliabi

null 17 Dec 12, 2022
Data structures and algorithms exercises in java

Data structure and algorithms in Java About The Project [] In this repository you can find examples of data structure exercises solved in java and som

Luis Perez Contreras 1 Nov 25, 2021
C++ / JAVA Optimised Solutions for LeetCode Problems

LeetCode Solutions Algorithms Math Bit Manipulation Array String Matrix Hash Table Binary Search Sorting Two Pointers Dynamic Programming My Most Vote

Akshaya Amar 191 Jan 5, 2023
Data structures & algorithms implemented in Java and solutions to leetcode problems.

Hello, World! ?? Hey everyone, I'm Sharad ☃ , and I'm a Software Engineer ?? at eGain! This repository ?? is all about data structures & algorithms an

Sharad Dutta 16 Dec 16, 2022
GS Collections has been migrated to the Eclipse Foundation, re-branded as Eclipse Collections. https://www.eclipse.org/collections/

GS Collections is now Eclipse Collections We are pleased to announce that GS Collections has been migrated to the Eclipse Foundation, re-branded as Ec

null 1.8k Dec 30, 2022
A Persistent Java Collections Library

PCollections A Persistent Java Collections Library Overview PCollections serves as a persistent and immutable analogue of the Java Collections Framewo

harold cooper 708 Dec 28, 2022
A Persistent Java Collections Library

PCollections A Persistent Java Collections Library Overview PCollections serves as a persistent and immutable analogue of the Java Collections Framewo

harold cooper 708 Dec 28, 2022
Persistent (immutable) collections for Java and Kotlin

What are Dexx Collections? Dexx Collections are a port of Scala's immutable, persistent collection classes to pure Java. Persistent in the context of

Andrew O'Malley 208 Sep 30, 2022
Eclipse Collections is a collections framework for Java with optimized data structures and a rich, functional and fluent API.

English | 中文 | Deutsch | Español | Ελληνικά | Français | 日本語 | Norsk (bokmål) | Português-Brasil | Русский | हिंदी Eclipse Collections is a comprehens

Eclipse Foundation 2.1k Jan 5, 2023