Java large off-heap cache

Overview

OHC - An off-heap-cache

Features

  • asynchronous cache loader support
  • optional per entry or default TTL/expireAt
  • entry eviction and expiration without a separate thread
  • capable of maintaining huge amounts of cache memory
  • suitable for tiny/small entries with low overhead using the chunked implementation
  • runs with Java 8 and Java 11 - support for Java 7 and earlier was dropped in version 0.7.0
  • to build OHC from source, Java 11 or newer (tested with Java 11 + 15) is required

Performance

OHC aims to provide good performance on both commodity hardware and large systems with non-uniform memory architectures (NUMA).

No performance test results are available yet - you may try the ohc-benchmark tool; see the instructions below. A very basic impression of the speed can be found in the _Benchmarking_ section.

Requirements

A 64-bit Java 8 VM with sun.misc.Unsafe support is required (Oracle JVMs on x64 Intel CPUs).

OHC is targeted for Linux and OSX. It should work on Windows and other Unix OSs.

Architecture

OHC provides two implementations for different cache entry characteristics:

  • The _linked_ implementation allocates off-heap memory for each entry individually and works best for medium and big entries.
  • The _chunked_ implementation allocates off-heap memory for each hash segment as a whole and is intended for small entries.

Linked implementation

The number of segments is configured via org.caffinitas.ohc.OHCacheBuilder, defaults to # of cpus * 2 and must be a power of 2. Entries are distributed over the segments using the most significant bits of the 64-bit hash code. Accesses on each segment are synchronized.
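
For illustration only (this is not the OHC source), picking a segment from the most significant hash bits could look like the following sketch, assuming a power-of-2 segment count of at least 2:

static int segmentForHash(long hash, int segmentCount)
{
    int segmentBits = Integer.numberOfTrailingZeros(segmentCount);  // log2 of the power-of-2 segment count
    return (int) (hash >>> (64 - segmentBits));                     // the top bits select the segment
}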

Each hash-map entry is allocated individually. Entries are freed (deallocated) when they are no longer referenced by the off-heap map itself or by any external reference like org.caffinitas.ohc.DirectValueAccess or an org.caffinitas.ohc.CacheSerializer.

The design of this implementation keeps the time a segment is locked very short. Put/replace operations allocate memory first, call the org.caffinitas.ohc.CacheSerializer to serialize the key and value, and only then put the fully prepared entry into the segment.

Eviction is performed using an LRU algorithm. A linked list through all cached elements per segment is used to keep track of the eldest entries.

Chunked implementation

An off-heap implementation based on chunked memory allocation.

The purpose of this implementation is to reduce the overhead for relatively small cache entries compared to the linked implementation, since the memory for a whole segment is pre-allocated. This implementation is suitable for small entries with fast (de)serialization implementations of org.caffinitas.ohc.CacheSerializer.

Segmentation is the same as in the linked implementation: the number of segments is configured via org.caffinitas.ohc.OHCacheBuilder, defaults to # of cpus * 2 and must be a power of 2. Entries are distributed over the segments using the most significant bits of the 64-bit hash code. Accesses on each segment are synchronized.

Each segment is divided into multiple chunks. Each segment is responsible for a portion of the total capacity (capacity / segmentCount). This amount of memory is allocated once up-front during initialization and logically divided into a configurable number of chunks. The size of each chunk is configured using the chunkSize option in org.caffinitas.ohc.OHCacheBuilder.

Like the linked implementation, hash entries are serialized into a temporary buffer first, before the actual put into a segment occurs (segment operations are synchronized).

New entries are placed into the current write chunk. When that chunk is full, the next empty chunk will become the new write chunk. When all chunks are full, the least recently used chunk, including all the entries it contains, is evicted.

Specifying the fixedKeyLength and fixedValueLength builder properties reduces the memory footprint by 8 bytes per entry.

Serialization, direct access and get-with-loader functions are not supported in this implementation.

To enable the chunked implementation, specify the chunkSize in org.caffinitas.ohc.OHCacheBuilder.
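
A configuration sketch for the chunked implementation might look as follows; the setter for the fixed key/value lengths is an assumption based on the properties mentioned above, so check OHCacheBuilder's javadoc for the exact method names:

OHCache chunkedCache = OHCacheBuilder.newBuilder()
                                     .keySerializer(yourKeySerializer)
                                     .valueSerializer(yourValueSerializer)
                                     .capacity(256L * 1024 * 1024)    // total off-heap capacity for data
                                     .chunkSize(64 * 1024)            // enables the chunked implementation
                                     .fixedEntrySize(16, 64)          // assumed setter for fixedKeyLength/fixedValueLength
                                     .build();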

Note: the chunked implementation should still be considered experimental.

Eviction algorithms

OHC supports three eviction algorithms:

  • LRU: The oldest (least recently used) entries are evicted to make room for new entries.
  • Window Tiny-LFU: Entries with lower usage frequency are evicted to make room for new entries. The goal of this eviction algorithm is to prevent heavily used entries from being evicted. Note that the maximum size of entries is limited to the size of the eden generation, which is currently fixed at 20% of the segment size (i.e. overall capacity / number of segments). Each OHC cache segment is divided into an eden and a main "generation". New entries start in the eden generation to give them time to build up their usage frequencies. When the eden generation becomes full, entries in the eden generation have to pass the admission filter, which checks the frequencies of the entries in the eden generation against the frequencies of the oldest (least recently used) entries in the main generation. See this article for a more thorough description. (Only supported in the _linked_ implementation, not supported by the chunked implementation)
  • None: OHC performs no eviction on its own. It is up to the caller to check the return values and monitor free capacity. (Only supported in the _linked_ implementation, not supported by the chunked implementation)
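
The algorithm is chosen when building the cache. A sketch, assuming an Eviction enum (org.caffinitas.ohc.Eviction) and a corresponding builder option - check OHCacheBuilder's javadoc for the exact name:

OHCache ohCache = OHCacheBuilder.newBuilder()
                                .eviction(Eviction.W_TINY_LFU)   // assumed enum: LRU, W_TINY_LFU or NONE
                                .keySerializer(yourKeySerializer)
                                .valueSerializer(yourValueSerializer)
                                .build();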

Configuration

Use the class OHCacheBuilder to configure all necessary parameters, such as the following (a configuration sketch follows the list):

  • number of segments (must be a power of 2), defaults to number-of-cores * 2
  • hash table size (must be a power of 2), defaults to 8192
  • load factor, defaults to .75
  • capacity for data over the whole cache
  • key and value serializers
  • default TTL
  • optional unlocked mode
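
Put together, an explicit configuration might look like the sketch below. The builder method names are assumed to follow the option names listed above - consult OHCacheBuilder's javadoc for the exact signatures; default TTL and unlocked mode are configured the same way, and the serializers are placeholders:

OHCache ohCache = OHCacheBuilder.newBuilder()
                                .segmentCount(64)                  // must be a power of 2
                                .hashTableSize(16384)              // buckets per segment, must be a power of 2
                                .loadFactor(.75f)
                                .capacity(1024L * 1024 * 1024)     // total off-heap capacity for data (1 GiB)
                                .keySerializer(yourKeySerializer)
                                .valueSerializer(yourValueSerializer)
                                .build();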

Generally you should work with a large hash table. The larger the hash table, the shorter the linked list in each hash partition - that means fewer linked-list walks and better performance.

The total amount of required off-heap memory is the total capacity plus the hash tables. Each hash bucket (currently) requires 8 bytes - so the formula is capacity + segment_count * hash_table_size * 8.
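For example, with a 1 GiB capacity, 16 segments and the default hash table size of 8192, the hash tables add 16 * 8192 * 8 = 1,048,576 bytes (1 MiB), so roughly 1 GiB + 1 MiB of off-heap memory is needed - before any allocator overhead or fragmentation.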

OHC allocates off-heap memory directly, bypassing the JVM's direct-memory limit. This means that memory allocated by OHC is not counted towards -XX:MaxDirectMemorySize.

Memory & jemalloc

Since the linked implementation in particular performs alloc/free operations for each individual entry, be aware that memory fragmentation can occur.

Also leave some headroom, since some allocations might still be in flight and "the other stuff" (operating system, JVM, etc.) needs memory as well. How much headroom is necessary depends on the usage pattern. Note that the linked implementation allocates memory during write operations _before_ it is counted towards the segments, which will evict older entries. This means: do not dedicate all available memory to OHC.

We recommend using jemalloc to keep fragmentation low. On Unix operating systems, preload jemalloc.

OSX usually does not require jemalloc for performance reasons. Also make sure that you are using a recent version of jemalloc - some Linux distributions still provide quite old versions.

To preload jemalloc on Linux, use export LD_PRELOAD=<path-to-libjemalloc.so>; to preload jemalloc on OSX, use export DYLD_INSERT_LIBRARIES=<path-to-libjemalloc.so>. A script template for preloading can be found in the Apache Cassandra project.

Usage

Quickstart:

OHCache ohCache = OHCacheBuilder.newBuilder()
                                .keySerializer(yourKeySerializer)
                                .valueSerializer(yourValueSerializer)
                                .build();

This quickstart uses the minimal configuration - defaults for everything except the serializers:

  • total cache capacity of 64MB or 16 * number-of-cpus, whichever is smaller
  • number of segments is 2 * number of cores
  • 8192 buckets per segment
  • load factor of .75
  • your custom key serializer
  • your custom value serializer
  • no maximum serialized cache entry size

See the javadoc of OHCacheBuilder for a complete list of options.
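
Once built, the cache is used much like a map. A minimal usage sketch (string key/value serializers assumed; that close() releases the off-heap memory is an assumption - check the OHCache javadoc):

ohCache.put("some key", "some value");
String value = (String) ohCache.get("some key");   // null if the entry is absent or was evicted
boolean present = ohCache.containsKey("some key");
ohCache.close();                                   // assumed to release the cache's off-heap memory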

Key and value serializers need to implement the CacheSerializer interface. This interface has three methods:

  • int serializedSize(T t) to return the serialized size of the given object
  • void serialize(T t, DataOutput out) to serialize the given object to the data output
  • T deserialize(DataInput in) to deserialize an object from the data input
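
A minimal String serializer following the method list above might look like the sketch below. The DataOutput/DataInput signatures and the throws IOException clauses are assumptions based on the description above; newer OHC versions may use ByteBuffer instead, so adapt it to the CacheSerializer javadoc of the version you use.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.caffinitas.ohc.CacheSerializer;

public class StringSerializer implements CacheSerializer<String>
{
    public int serializedSize(String value)
    {
        // 4-byte length prefix followed by the UTF-8 bytes
        return 4 + value.getBytes(StandardCharsets.UTF_8).length;
    }

    public void serialize(String value, DataOutput out) throws IOException
    {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    public String deserialize(DataInput in) throws IOException
    {
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}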

Building from source

Clone the git repo to your local machine. Either use the stable master branch or a release tag.

git clone https://github.com/snazy/ohc.git

You need OpenJDK 11 or newer to build from source. Just execute

mvn clean install

Benchmarking

You need to build OHC from source because the big benchmark artifacts are not uploaded to Maven Central.

Execute java -jar ohc-benchmark/target/ohc-benchmark-0.7.1-SNAPSHOT.jar -h (when building from source) to get some help information.

Generally the benchmark tool starts a bunch of threads and performs _get_ and _put_ operations concurrently using configurable key distributions for _get_ and _put_ operations. Value size distribution also needs to be configured.

Available command line options:

-cap <arg>    size of the cache
-d <arg>      benchmark duration in seconds
-h            help, print this command
-lf <arg>     hash table load factor
-r <arg>      read-write ratio (as a double 0..1 representing the chance for a read)
-rkd <arg>    hot key use distribution - default: uniform(1..10000)
-sc <arg>     number of segments (number of individual off-heap-maps)
-t <arg>      threads for execution
-vs <arg>     value sizes - default: fixed(512)
-wkd <arg>    hot key use distribution - default: uniform(1..10000)
-wu <arg>     warm up - <work-secs>,<sleep-secs>
-z <arg>      hash table size
-cs <arg>     chunk size - if specified it will use the "chunked" implementation
-fks <arg>    fixed key size in bytes
-fvs <arg>    fixed value size in bytes
-mes <arg>    max entry size in bytes
-unl          do not use locking - only appropriate for single-threaded mode
-hm <arg>     hash algorithm to use - MURMUR3, XX, CRC32
-bh           show bucket histogram in stats
-kl <arg>     enable bucket histogram. Default: false

Distributions for read keys, write keys and value sizes can be configured using the following functions:

EXP(min..max)                        An exponential distribution over the range [min..max]
EXTREME(min..max,shape)              An extreme value (Weibull) distribution over the range [min..max]
QEXTREME(min..max,shape,quantas)     An extreme value, split into quantas, within which the chance of selection is uniform
GAUSSIAN(min..max,stdvrng)           A gaussian/normal distribution, where mean=(min+max)/2, and stdev is (mean-min)/stdvrng
GAUSSIAN(min..max,mean,stdev)        A gaussian/normal distribution, with explicitly defined mean and stdev
UNIFORM(min..max)                    A uniform distribution over the range [min, max]
FIXED(val)                           A fixed distribution, always returning the same value
Preceding the name with ~ will invert the distribution, e.g. ~exp(1..10) will yield 10 most, instead of least, often
Aliases: extr, qextr, gauss, normal, norm, weibull

(Note: these are similar to the Apache Cassandra stress tool - if you know one, you know both ;)

Quick example with a read/write ratio of .9, approx 1.5GB max capacity, 16 threads that runs for 30 seconds:

java -jar ohc-benchmark/target/ohc-benchmark-0.5.1-SNAPSHOT.jar

(Note that the version in the jar file name might differ.)
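
Using the flags documented above, such a run could look roughly like the following - an illustrative combination, not taken from the project; 1.5 GB is about 1610612736 bytes:

java -jar ohc-benchmark/target/ohc-benchmark-0.5.1-SNAPSHOT.jar -r .9 -cap 1610612736 -t 16 -d 30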

On a 2.6GHz Core i7 system (OSX) the following numbers are typical running the above benchmark (.9 read/write ratio):

  • # of gets per second: 2500000
  • # of puts per second: 270000

Why off-heap memory

With a huge number of objects in a very large heap, virtual machines suffer from increased GC pressure, since the GC basically has to inspect each and every object to decide whether it can be collected and has to touch all memory pages. A cache should keep a hot set of objects accessible for fast access (e.g. to avoid disk or network round trips). The only solution is to use native memory - and there you end up with the choice of either using native code (C/C++) via JNI or using direct memory access.

Native code using C/C++ via JNI has the drawback that you naturally have to write and maintain C/C++ code for each and every platform. Although most Unix OSs (Linux, OSX, BSD, Solaris) are quite similar when dealing with things like compare-and-swap or POSIX libraries, you usually also want to support the other platform (Windows).

Both native code and direct memory access have the drawback that they have to "leave" the JVM "context" - that is to say, access to off-heap memory is slower than access to data in the Java heap, and each JNI call has some "escape from JVM context" cost.

But off-heap memory is great when you have to deal with many gigabytes of cache memory, since that does not put any pressure on the Java garbage collector. Let the Java GC do its job for the application while this library does its job for the cached data.

Why not use ByteBuffer.allocateDirect()?

TL;DR allocating off-heap memory directly and bypassing ByteBuffer.allocateDirect is very gentle to the GC and we have explicit control over memory allocation and, more importantly, free. The stock implementation in Java frees off-heap memory during a garbage collection - also: if no more off-heap memory is available, it likely triggers a Full-GC, which is problematic if multiple threads run into that situation concurrently since it means lots of Full-GCs sequentially. Further, the stock implementation uses a global, synchronized linked list to track off-heap memory allocations.

This is why OHC allocates off-heap memory directly and recommends preloading jemalloc on Linux systems to improve memory management performance.

History

OHC was developed in 2014/15 for Apache Cassandra 2.2 and 3.0 to be used as the new row-cache backend.

Since there were no suitable fully off-heap cache implementations available, the decision was made to build a completely new one - and that's OHC. It turned out that OHC alone might also be usable by other projects - that's why OHC is a separate library.

Contributors

A big 'thank you' has to go to Benedict Elliott Smith and Ariel Weisberg from DataStax for their very useful input to OHC!

Thanks also to Ben Manes, the author of Caffeine, the highly configurable on-heap cache using W-TinyLFU.

Developer: Robert Stupp

License

Copyright (C) 2014 Robert Stupp, Koeln, Germany, robert-stupp.de

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Comments
  • Your machine is a bit too slow (tests error)

    I'm building OHC for fedora-rawhide and while executing tests on ohc-core I'm stuck, getting this error message: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.101-2.b14.fc26.x86_64/jre/bin/java: symbol lookup error: /tmp/snappy-1.1.2.4-a36a540c-a5ee-471b-a91c-a2ec39e1703e-libsnappyjava.so: undefined symbol: _ZN6snappy19MaxCompressedLengthEm

    Do you have any Idea where the problem might be? Thanks in advance, Tomas

    opened by trepik 22
  • Evaluate TinyLFU

    TinyLFU is an admission filter that improves the hit rates of caches by eagerly discarding entries with a low historic frequency. This differs from an eviction policy, which tries to organize entries to find a good victim but always accepts new entries into the cache. As many entries are low value, e.g. one hit wonders, this results in "cache pollution".

    Policies like FIFO and LRU are the most susceptible to this by allowing new arrivals to flow through the entire cache before being evicted. Policies like SLRU, S4LRU, ARC, and LIRS try to reduce this by having a smaller "probation" period to monitor incoming entries for a second access, which promotes them to a "protected" region. To further increase hit rates, ARC and LIRS use ghost entries (evicted keys) to retain a longer history than the working set.

    TinyLFU (paper) retains its history more compactly through a CountMinSketch. Instead of retaining ghost keys on recency ordered lists, it hashes into a counter matrix and ages them periodically. This allows it to identify the "heavy hitters" in a stream of events and decide if the new arrival has a higher value than the eviction policy's victim by comparing their frequencies.

    W-TinyLFU (HighScalability article) is a variant of the policy based on tuning using various simulations. It copes better with recency-skewed traces by delaying the TinyLFU filter using an admission window. It then uses SLRU so that a better victim is chosen. In most workloads the simpler TinyLFU has equal performance, but for robustness across a wider variety of workloads W-TinyLFU is recommended.

    Caffeine uses a 4-bit CountMinSketch. It does not use the paper's idea of a "doorkeeper", but does include it in the simulator. One or both of these could be ported easily. It is likely that only TinyLFU is needed in this cache's workload types, but if not then W-TinyLFU would require a little more work to integrate. However, either should not be difficult to add and test.

    The benefit of this approach is quite astounding.

    opened by ben-manes 15
  • replace ReentrantLock with CAS-lock

    ReentrantLock is kind-of heavy compared to the operations that need to be performed in locked-state within OHC impl.

    Therefore it's reasonable to replace ReentrantLock with AtomicLongUpdate and use CAS to lock (and use a spinning try-lock).

    opened by snazy 7
  • Support for multiple keys

    A common implementation to support multiple keys is to create a single object which combines a couple of keys. ie.

    class CombinedKey { public PrimaryKey primaryKey; public SecondaryKey secondaryKey; ... }

    That works fine, but it results in a lot of heap allocation if you're attempting to achieve high performance (it increases GC pressure, although I do wonder if keeping it short-lived enough would keep the object effectively on the stack).

    Looking at the linked implementation, the first action most of the methods perform is to serialize the Key object to a KeyBuffer. Would it be difficult to allow var args for the key and pass these through to the KeySerializer? Instead of CombinedKey as above you could then use:

    map.get(primaryKey, secondaryKey);

    I've just been refactoring some other on-heap code to pull apart a combined key like above to reduce the number of objects being allocated. It would be nice to carry this through to the off heap cache.

    opened by oobles 5
  • Test failure on Fedora 26

    I'm using Fedora native dependencies to build ohc (0.5.1), not the bundled ones. There are some tests failing, and I'm eager to fix it. Here is a log. Can you suggest what deps could be causing this behavior?

    Tests run: 139, Failures: 5, Errors: 0, Skipped: 3, Time elapsed: 51.074 sec <<< FAILURE! - in TestSuite
    testBasics(org.caffinitas.ohc.chunked.ChunkedFixedCacheImplTest)  Time elapsed: 0.017 sec  <<< FAILURE!
    java.lang.AssertionError: expected [hello world ????] but found [hello world ????]
    	at org.caffinitas.ohc.chunked.ChunkedFixedCacheImplTest.testBasics(ChunkedFixedCacheImplTest.java:103)
    
    testHotKeyBufferIterator(org.caffinitas.ohc.chunked.ChunkedFixedCacheImplTest)  Time elapsed: 0.092 sec  <<< FAILURE!
    java.lang.AssertionError: count is 8 but should be >= 10 expected [true] but found [false]
    	at org.caffinitas.ohc.chunked.ChunkedFixedCacheImplTest.testHotKeyBufferIterator(ChunkedFixedCacheImplTest.java:499)
    
    testHotKeyIterator(org.caffinitas.ohc.chunked.ChunkedFixedCacheImplTest)  Time elapsed: 0.061 sec  <<< FAILURE!
    java.lang.AssertionError: count is 8 but should be >= 10 expected [true] but found [false]
    	at org.caffinitas.ohc.chunked.ChunkedFixedCacheImplTest.testHotKeyIterator(ChunkedFixedCacheImplTest.java:365)
    
    testHotKeyBufferIterator(org.caffinitas.ohc.chunked.ChunkedCacheImplTest)  Time elapsed: 0.128 sec  <<< FAILURE!
    java.lang.AssertionError: count is 8 but should be >= 10 expected [true] but found [false]
    	at org.caffinitas.ohc.chunked.ChunkedCacheImplTest.testHotKeyBufferIterator(ChunkedCacheImplTest.java:573)
    
    testHotKeyIterator(org.caffinitas.ohc.chunked.ChunkedCacheImplTest)  Time elapsed: 0.084 sec  <<< FAILURE!
    java.lang.AssertionError: count is 8 but should be >= 10 expected [true] but found [false]
    	at org.caffinitas.ohc.chunked.ChunkedCacheImplTest.testHotKeyIterator(ChunkedCacheImplTest.java:417)
    
    opened by trepik 4
  • NullPointerException for OHCacheLinkedImpl.stats()

    Am seeing a NullPointerException for OHCacheLinkedImpl.stats() on line 706 for v0.5.1. This corresponds to line 716 in the develop branch. This is the code in question:

            for (OffHeapLinkedMap map : maps)
                rehashes += map.rehashes();
    

    map is null for some reason.

    opened by davidhoyt 4
  • Simple put and get fails

    Thank you for your excellent work -- I have a simple test where I create a cache with all the default settings (using a string key/value serializer) and then do:

    cache.put("a", "exists");
    assert(cache.get("a") != null);
    

    Unfortunately doing a get right after a put sometimes works and sometimes doesn't. Am I misunderstanding how it should work?

    If I place put in a loop then it works reliably:

    while (!cache.containsKey("a")) {
      cache.put("a", "exists");
    }
    

    However, it can take several attempts before the entry is added. I'm using version 0.5.1 on OS X (Sierra), Oracle Java v1.8.0_112.

    opened by davidhoyt 4
  • getWithLoader needs contract for null/non-existent values

    When using the getWithLoader for a key that doesn't exist, the value is still passed to the serializedSize method. Would it be better to check for null values and not store null entries?

    Line 395: try { value = loader.load(key);

                            long entryExpireAt = expireAt;
                            if (entryExpireAt > 0L && entryExpireAt <= System.currentTimeMillis())
                            {
                                segment.removeEntry(sentinelHashEntryAdr);
    
                                // already expired
                                return null;
                            }
    
                            // not already expired
    
                            long valueLen = valueSerializer.serializedSize(value);
    
    opened by oobles 4
  • java.lang.NoSuchMethodError: java.nio.ByteBuffer.flip()

    The official build of version 0.7.0 produces the following exception:

    java.lang.NoSuchMethodError: java.nio.ByteBuffer.flip()Ljava/nio/ByteBuffer; when calling serializeHotNEntries.

    It's actually a compilation issue experienced by many other projects:

    https://github.com/hazelcast/hazelcast/issues/14214 https://github.com/plasma-umass/doppio/issues/497 https://github.com/functional-streams-for-scala/fs2/issues/1357 https://github.com/eclipse/jetty.project/issues/3244 https://github.com/apache/felix/pull/114

    opened by maltalex 3
  • Improve lock: replace spin-on-compare-and-set with spin-on-read

    Improve the spin lock implementation. Replace spin-on-test-and-set with spin-on-read; it is friendlier to SMP per-CPU caches and gives better performance (roughly a 3%-7% speedup when running the benchmark on my machine (Ubuntu 17.10) with different numbers of threads).

    opened by killme2008 3
  • Evaluate use of CRC32 as hash function

    Current x86 CPUs have built-in support for CRC32, which is available from Java 8. CRC32 performance with Java 8's CRC intrinsics is very interesting.

    So - evaluate whether it's fine to use CRC32 as the hash function instead of murmur3.

    See also: https://issues.apache.org/jira/browse/CASSANDRA-8614

    opened by snazy 3
  • Bump log4j-core from 2.16.0 to 2.17.1

    Bumps log4j-core from 2.16.0 to 2.17.1.

    dependencies 
    opened by dependabot[bot] 0
  • Bump log4j-api from 2.16.0 to 2.17.1

    Bumps log4j-api from 2.16.0 to 2.17.1.

    dependencies 
    opened by dependabot[bot] 0
  • There may be a memory leak here ?

    https://github.com/snazy/ohc/blob/75a28f5042f95e4e8e0b2080d7842b5523ebc70f/ohc-core/src/main/java/org/caffinitas/ohc/linked/OffHeapLinkedMap.java#L426

    The previous code deletes all the pointers, and there is no retry to release the memory here. Will there be a memory leak under high concurrency?

    opened by DoubleGOO 1
  • Is OHC still in active development?

    The repository activity seems very quiet. I am wondering if OHC is still in active development. I acknowledge a library may enter a maturity phase and after that it is fine to not have rapid new development but wanted to know the current state before I take a dependency on it.

    opened by swaranga 1
  • Actual off-heap used memory is much bigger than configured capacity

    When I configured the max capacity to 10GB, the actual off-heap memory usage is 16GB. Where does the extra 6GB come from?

    Scenario:

    • capacity: 10GB
    • segment: 512
    • hashtablesize: 32768 (make sure no rehash happens)
    • linkedLRUMap
    • key+value size: 1KB
    • using jemalloc
    opened by kfchu 2