Chronicle Bytes has a similar purpose to Java NIO's ByteBuffer with many extensions

Chronicle-Bytes

Overview

Chronicle Bytes contains all the low level memory access wrappers. It is built on Chronicle Core’s direct memory and OS system call access.

Chronicle Bytes has a similar purpose to Java NIO’s ByteBuffer with some extensions.

The API supports:

  • 64-bit sizes
  • UTF-8 and ISO-8859-1 encoded strings
  • thread-safe off-heap memory operations
  • deterministic release of resources via reference counting
  • compressed data types such as stop-bit encoding
  • elastic ByteBuffer wrappers which resize as required
  • parsing and writing text directly to off-heap bytes
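As a minimal sketch, the snippet below combines several of these features; the method names are taken from the examples later in this document:

Bytes<ByteBuffer> bytes = Bytes.elasticHeapByteBuffer(64); // elastic wrapper which resizes as required
bytes.writeInt(42);          // binary primitive
bytes.writeStopBit(123_456); // stop-bit compressed long
bytes.writeUtf8("hello");    // UTF-8 string with a length prefix

System.out.println(bytes.readInt());      // 42
System.out.println(bytes.readStopBit());  // 123456
System.out.println(bytes.readUtf8());     // hello

bytes.releaseLast();         // deterministic release via reference counting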

Data types supported

| operation                    | indexed or streaming | binary                                                                                        | text                                                      |
|------------------------------|----------------------|-----------------------------------------------------------------------------------------------|-----------------------------------------------------------|
| read/write binary primitives | both                 | float, double, boolean and unsigned/signed byte, short, 24-bit int, int, long, incompleteLong  | double, int, long, char, double with precision            |
| read/write text              | both                 | 8-bit/UTF-8 string with length or limit                                                        | 8-bit/UTF-8 string                                        |
| read/write other             | streaming            | histogram, named enum                                                                          | BigDecimal, BigInteger, date/time/zone, UUID, hexadecimal |
| data driven tests            | from files           | no                                                                                             | yes                                                       |
| CAS                          | indexed              | int, long                                                                                      |                                                           |
| volatile read/write          | indexed              | byte, short, int, long, float, double                                                          |                                                           |
| peek                         | both                 | unsigned byte                                                                                  |                                                           |
| stop bit compression         | streaming            | int, long, double, float, char                                                                 |                                                           |
| search                       | from start           | indexOf string, findByte                                                                       |                                                           |
| addAndGet                    | indexed              | float, double, int, long                                                                       |                                                           |
| copy                         | from start           | write, copy                                                                                    |                                                           |
| hash                         | from start           | byteSum, fastHash                                                                              |                                                           |
| bytes marshallable           | streaming            | nested data structures, expected types only                                                    |                                                           |

Data types explained

| operation                    | explained                                                                                                                                                                                                               |
|------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| read/write binary primitives | to read and write primitive data structures stored in a binary form                                                                                                                                                      |
| read/write text              | to read and write text data                                                                                                                                                                                               |
| read/write other             | to read and write other data types                                                                                                                                                                                        |
| data driven tests            | https://en.wikipedia.org/wiki/Data-driven_testing                                                                                                                                                                         |
| CAS                          | an atomic instruction used in multithreading to achieve synchronization. It compares the contents of a memory location with a given value and, only if they are the same, modifies the contents of that memory location to a new given value. |
| volatile read/write          | http://tutorials.jenkov.com/java-concurrency/volatile.html                                                                                                                                                                |
| peek                         | an operation which returns the value of the bytes without affecting the read position                                                                                                                                     |
| stop bit compression         | https://github.com/OpenHFT/RFC/tree/master/Stop-Bit-Encoding                                                                                                                                                              |
| search                       | any algorithm which solves the search problem, namely, to retrieve information stored within some data structure                                                                                                          |
| addAndGet                    | atomically adds the given value to the current value                                                                                                                                                                      |
| copy                         | to transfer data from one structure to another                                                                                                                                                                            |
| hash                         | https://en.wikipedia.org/wiki/Hash_function                                                                                                                                                                               |
| bytes marshallable           | a serialization function                                                                                                                                                                                                  |
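For example, peek and addAndGet can be combined as sketched below. This is a minimal sketch: peekUnsignedByte() returns the next byte without moving the read position, and addAndGetInt(offset, delta) is assumed to be the indexed int variant of the addAndGet row above.

Bytes<ByteBuffer> bytes = Bytes.elasticHeapByteBuffer(64);
bytes.writeInt(10);

// peek: look at the next unsigned byte without moving the read position
int next = bytes.peekUnsignedByte();    // 10, the first byte of the little-endian int
int value = bytes.readInt();            // the read position moves here, value == 10

// addAndGet: atomically add to the int at offset 0 (assumed indexed variant)
int updated = bytes.addAndGetInt(0, 5); // 15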

Creating Bytes

Bytes which wraps an on heap ByteBuffer
Bytes<ByteBuffer> bytes = Bytes.elasticHeapByteBuffer(64);
ByteBuffer bb = bytes.underlyingObject();
Bytes which wraps a direct ByteBuffer
Bytes<ByteBuffer> bytes = Bytes.elasticByteBuffer(64);
ByteBuffer bb = bytes.underlyingObject();
Bytes which wraps some native memory
Bytes bytes = Bytes.allocateElasticDirect(64);
long address = bytes.addressForRead(0);
bytes.releaseLast(); // when it can be freed.
Bytes which will wrap some native memory when used
Bytes bytes = Bytes.allocateElasticDirect();
// use the bytes
bytes.releaseLast(); // when it can be freed.

Flipping Bytes

ByteBuffer needs to be flipped to switch between reading and writing.

Bytes holds a read position and a write position allowing you to write and immediately read without flipping.

Note
The writePosition is the readLimit.
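A minimal sketch, assuming an elastic heap buffer as created above:

Bytes<ByteBuffer> bytes = Bytes.elasticHeapByteBuffer(64);

// writing moves the writePosition only
bytes.writeInt(1);
bytes.writeUtf8("text");

// reading starts at the readPosition; no flip() is needed
int i = bytes.readInt();     // 1
String s = bytes.readUtf8(); // "text"

// the readLimit is the current writePosition
assert bytes.readLimit() == bytes.writePosition();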

Writing to a Hexadecimal dump

Writing to a hexadecimal dump is useful for documenting the format of the messages written. The hexadecimal dump format is used in the examples below.

Writing primitives as binary and dumping
// only used for documentation
HexDumpBytes bytes = new HexDumpBytes();
bytes.comment("true").writeBoolean(true);
bytes.comment("s8").writeByte((byte) 1);
bytes.comment("u8").writeUnsignedByte(2);
bytes.comment("s16").writeShort((short) 3);
bytes.comment("u16").writeUnsignedShort(4);
bytes.comment("char").writeUnsignedShort('5'); // char
bytes.comment("s24").writeInt24(-6_666_666);
bytes.comment("u24").writeUnsignedInt24(16_666_666);
bytes.comment("s32").writeInt(6);
bytes.comment("u32").writeUnsignedShort(7);
bytes.comment("s64").writeLong(8);
bytes.comment("f32").writeFloat(9);
bytes.comment("f64").writeDouble(10);

System.out.println(bytes.toHexString());

prints

59                                              # true
01                                              # s8
02                                              # u8
03 00                                           # s16
04 00                                           # u16
35                                              # char
56 46 9a                                        # s24
2a 50 fe                                        # u24
06 00 00 00                                     # s32
07 00 00 00                                     # u32
08 00 00 00 00 00 00 00                         # s64
00 00 10 41                                     # f32
00 00 00 00 00 00 24 40                         # f64

To read this data you can use:

Reading the primitive values above
boolean flag = bytes.readBoolean();
byte s8 = bytes.readByte();
int u8 = bytes.readUnsignedByte();
short s16 = bytes.readShort();
int u16 = bytes.readUnsignedShort();
char ch = bytes.readStopBitChar();
int s24 = bytes.readInt24();
long u24 = bytes.readUnsignedInt24();
int s32 = bytes.readInt();
long u32 = bytes.readUnsignedInt();
long s64 = bytes.readLong();
float f32 = bytes.readFloat();
double f64 = bytes.readDouble();

Writing and reading using offsets

Instead of streaming the data, sometimes you need to control the placement of data, possibly at random.

Write and read primitive by offset
Bytes<ByteBuffer> bytes = Bytes.elasticHeapByteBuffer(64);
bytes.writeBoolean(0, true);
bytes.writeByte(1, (byte) 1);
bytes.writeUnsignedByte(2, 2);
bytes.writeShort(3, (short) 3);
bytes.writeUnsignedShort(5, 4);
bytes.writeInt(7, 6);
bytes.writeUnsignedInt(11, 7);
bytes.writeLong(15, 8);
bytes.writeFloat(23, 9);
bytes.writeDouble(27, 10);
bytes.writePosition(35);

System.out.println(bytes.toHexString());

boolean flag = bytes.readBoolean(0);
byte s8 = bytes.readByte(1);
int u8 = bytes.readUnsignedByte(2);
short s16 = bytes.readShort(3);
int u16 = bytes.readUnsignedShort(5);
int s32 = bytes.readInt(7);
long u32 = bytes.readUnsignedInt(11);
long s64 = bytes.readLong(15);
float f32 = bytes.readFloat(23);
double f64 = bytes.readDouble(27);

prints

00000000 59 01 02 03 00 04 00 06  00 00 00 07 00 00 00 08 Y······· ········
00000010 00 00 00 00 00 00 00 00  00 10 41 00 00 00 00 00 ········ ··A·····
00000020 00 24 40                                         ·$@
Note
While HexDumpBytes supports the offset methods, you need to provide the offset within both the binary and the dump, making it more complex to use.

Volatile read and ordered write

Chronicle Bytes supports variants of the write primitives with a store barrier (writeOrderedXxxx) and reads with a load barrier (readVolatileXxxx).

Note
An ordered write doesn’t stall the pipeline to wait for the write to complete, so it is possible for another thread to briefly read an old value after the ordered write.
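A minimal sketch, assuming the indexed int variants writeOrderedInt and readVolatileInt:

Bytes<ByteBuffer> bytes = Bytes.elasticHeapByteBuffer(64);

bytes.writeOrderedInt(0, 42);        // store with a store-store barrier; does not stall the pipeline
bytes.writePosition(4);              // make the four bytes readable from this Bytes

int seen = bytes.readVolatileInt(0); // load with a load barrier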

Working with text

You can also write and read text to Bytes for low level, direct to native memory text processing.

Writing primitives as text
Bytes<ByteBuffer> bytes = Bytes.elasticHeapByteBuffer(64);
bytes.append(true).append('\n');
bytes.append(1).append('\n');
bytes.append(2L).append('\n');
bytes.append('3').append('\n');
bytes.append(4.1f).append('\n');
bytes.append(5.2).append('\n');
bytes.append(6.2999999, 3).append('\n');

System.out.println(bytes.toHexString());

prints

00000000 54 0a 31 0a 32 0a 33 0a  34 2e 31 0a 35 2e 32 0a T·1·2·3· 4.1·5.2·
00000010 36 2e 33 30 30 0a                                6.300·
Reading primitives as text
boolean flag = bytes.parseBoolean();
int s32 = bytes.parseInt();
long s64 = bytes.parseLong();
String ch = bytes.parseUtf8(StopCharTesters.SPACE_STOP);
float f32 = bytes.parseFloat();
double f64 = bytes.parseDouble();
double f64b = bytes.parseDouble();
Note
There are fewer methods for text, as 8-, 16- and 24-bit values can use the int methods, and unsigned int can use the long method.

Reading and Writing Strings

Chronicle Bytes supports two encodings, ISO-8859-1 and UTF-8. It also supports writing these as binary with a length prefix, or as text terminated by a stop character. Bytes expects Strings to be read into a buffer for further processing, possibly with a String pool.

HexDumpBytes bytes = new HexDumpBytes();
bytes.comment("write8bit").write8bit("£ 1");
bytes.comment("writeUtf8").writeUtf8("£ 1");
bytes.comment("append8bit").append8bit("£ 1").append('\n');
bytes.comment("appendUtf8").appendUtf8("£ 1").append('\n');

System.out.println(bytes.toHexString());

prints

03 a3 20 31                                     # write8bit
04 c2 a3 20 31                                  # writeUtf8
a3 20 31 0a                                     # append8bit
c2 a3 20 31 0a                                  # appendUtf8
String a = bytes.read8bit();
String b = bytes.readUtf8();
String c = bytes.parse8bit(StopCharTesters.CONTROL_STOP);
String d = bytes.parseUtf8(StopCharTesters.CONTROL_STOP);
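Where allocation matters, the text can instead be read into a reusable buffer. A minimal sketch, assuming the readUtf8(StringBuilder) overload which returns false when the stored string was null:

StringBuilder sb = new StringBuilder();
if (bytes.readUtf8(sb)) {
    // sb now holds the decoded text and can be pooled/interned if required
    String pooled = sb.toString();
}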

Binary strings are prefixed with a Stop Bit Encoded length.

HexDumpBytes bytes = new HexDumpBytes();
bytes.comment("write8bit").write8bit((String) null);
bytes.comment("writeUtf8").writeUtf8(null);

System.out.println(bytes.toHexString());

String a = bytes.read8bit();
String b = bytes.readUtf8();
assertEquals(null, a);
assertEquals(null, b);

prints

80 00                                           # write8bit
80 00                                           # writeUtf8
Note
80 00 is the stop bit encoding for -1 or ~0

Compare and Set operation

In binary, you can atomically replace an int or long, provided it currently holds an expected value.

Write two fields, remember where the int and long are
HexDumpBytes bytes = new HexDumpBytes();

bytes.comment("s32").writeUtf8("s32");
long s32 = bytes.writePosition();
bytes.writeInt(0);

bytes.comment("s64").writeUtf8("s64");
long s64 = bytes.writePosition();
bytes.writeLong(0);

System.out.println(bytes.toHexString());

prints

03 73 33 32 00 00 00 00                         # s32
03 73 36 34 00 00 00 00 00 00 00 00             # s64
CAS two fields
assertTrue(bytes.compareAndSwapInt(s32, 0, Integer.MAX_VALUE));
assertTrue(bytes.compareAndSwapLong(s64, 0, Long.MAX_VALUE));

System.out.println(bytes.toHexString());

prints

03 73 33 32 ff ff ff 7f                         # s32
03 73 36 34 ff ff ff ff ff ff ff 7f             # s64

INFO: You might wonder how the hex dump is updated as well as the binary. In HexDumpBytes the write position encodes the position in both, which is why it has to be captured in this case.

Stop bit compression

Stop Bit encoding is a simple form of compression. Each byte holds seven bits of the value, with the high bit set when another byte follows.

See the Stop Bit Encoding RFC for more details.

Writing with stop bit encoding
HexDumpBytes bytes = new HexDumpBytes();

for (long i : new long[]{
        0, -1,
        127, -127,
        128, -128,
        1 << 14, 1 << 21,
        1 << 28, 1L << 35,
        1L << 42, 1L << 49,
        1L << 56, Long.MAX_VALUE,
        Long.MIN_VALUE}) {
    bytes.comment(i + "L").writeStopBit(i);
}

for (double d : new double[]{
        0.0,
        -0.0,
        1.0,
        1.0625,
        -128,
        -Double.MIN_NORMAL,
        Double.NEGATIVE_INFINITY,
        Double.NaN,
        Double.POSITIVE_INFINITY}) {
    bytes.comment(d + "").writeStopBit(d);
}

System.out.println(bytes.toHexString());

prints

00                                              # 0L
80 00                                           # -1L
7f                                              # 127L
fe 00                                           # -127L
80 01                                           # 128L
ff 00                                           # -128L
80 80 01                                        # 16384L
80 80 80 01                                     # 2097152L
80 80 80 80 01                                  # 268435456L
80 80 80 80 80 01                               # 34359738368L
80 80 80 80 80 80 01                            # 4398046511104L
80 80 80 80 80 80 80 01                         # 562949953421312L
80 80 80 80 80 80 80 80 01                      # 72057594037927936L
ff ff ff ff ff ff ff ff 7f                      # 9223372036854775807L
ff ff ff ff ff ff ff ff ff 00                   # -9223372036854775808L
00                                              # 0.0
40                                              # -0.0
9f 7c                                           # 1.0
9f fc 20                                        # 1.0625
e0 18                                           # -128.0
c0 04                                           # -2.2250738585072014E-308
ff 7c                                           # -Infinity
bf 7e                                           # NaN
bf 7c                                           # Infinity

To read these values, use either long x = bytes.readStopBit() or double d = bytes.readStopBitDouble().
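For example, a minimal sketch of reading back the values written above, in the order they were written:

long first = bytes.readStopBit();     // 0L
long second = bytes.readStopBit();    // -1L
// ... read the remaining 13 longs the same way, then:
double d = bytes.readStopBitDouble(); // 0.0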

BytesMarshallable objects

Chronicle Bytes supports serializing simple objects where the type is not stored. This is similar to RawWire in Chronicle Wire.

@NotNull MyByteable mb1 = new MyByteable(false, (byte) 1, (short) 2, '3', 4, 5.5f, 6, 7.7);
@NotNull MyByteable mb2 = new MyByteable(true, (byte) 11, (short) 22, 'T', 44, 5.555f, 66, 77.77);
ZonedDateTime zdt1 = ZonedDateTime.parse("2017-11-06T12:35:56.775Z[Europe/London]");
ZonedDateTime zdt2 = ZonedDateTime.parse("2016-10-05T01:34:56.775Z[Europe/London]");
UUID uuid1 = new UUID(0x123456789L, 0xABCDEF);
UUID uuid2 = new UUID(0x1111111111111111L, 0x2222222222222222L);
@NotNull MyScalars ms1 = new MyScalars("Hello", BigInteger.ONE, BigDecimal.TEN, zdt1.toLocalDate(), zdt1.toLocalTime(), zdt1.toLocalDateTime(), zdt1, uuid1);
@NotNull MyScalars ms2 = new MyScalars("World", BigInteger.ZERO, BigDecimal.ZERO, zdt2.toLocalDate(), zdt2.toLocalTime(), zdt2.toLocalDateTime(), zdt2, uuid2);
@NotNull MyNested mn1 = new MyNested(mb1, ms1);
@NotNull MyNested mn2 = new MyNested(mb2, ms2);
bytes.comment("mn1").writeUnsignedByte(1);
mn1.writeMarshallable(bytes);
bytes.comment("mn2").writeUnsignedByte(2);
mn2.writeMarshallable(bytes);
MyByteable data structure
class MyByteable implements BytesMarshallable {
    boolean flag;
    byte b;
    short s;
    char c;
    int i;
    float f;
    long l;
    double d;

    public MyByteable(boolean flag, byte b, short s, char c, int i, float f, long l, double d) {
        this.flag = flag;
        this.b = b;
        this.s = s;
        this.c = c;
        this.i = i;
        this.f = f;
        this.l = l;
        this.d = d;
    }
MyScalars data structure
class MyScalars implements BytesMarshallable {
    String s;
    BigInteger bi;
    BigDecimal bd;
    LocalDate date;
    LocalTime time;
    LocalDateTime dateTime;
    ZonedDateTime zonedDateTime;
    UUID uuid;

    public MyScalars(String s, BigInteger bi, BigDecimal bd, LocalDate date, LocalTime time, LocalDateTime dateTime, ZonedDateTime zonedDateTime, UUID uuid) {
        this.s = s;
        this.bi = bi;
        this.bd = bd;
        this.date = date;
        this.time = time;
        this.dateTime = dateTime;
        this.zonedDateTime = zonedDateTime;
        this.uuid = uuid;
    }

prints

01                                              # mn1
                                                # byteable
      4e                                              # flag
      01                                              # b
      02 00                                           # s
      33                                              # c
      04 00 00 00                                     # i
      00 00 b0 40                                     # f
      06 00 00 00 00 00 00 00                         # l
      cd cc cc cc cc cc 1e 40                         # d
                                                # scalars
      05 48 65 6c 6c 6f                               # s
      01 31                                           # bi
      02 31 30                                        # bd
      0a 32 30 31 37 2d 31 31 2d 30 36                # date
      0c 31 32 3a 33 35 3a 35 36 2e 37 37 35          # time
      17 32 30 31 37 2d 31 31 2d 30 36 54 31 32 3a 33 # dateTime
      35 3a 35 36 2e 37 37 35 27 32 30 31 37 2d 31 31 # zonedDateTime
      2d 30 36 54 31 32 3a 33 35 3a 35 36 2e 37 37 35
      5a 5b 45 75 72 6f 70 65 2f 4c 6f 6e 64 6f 6e 5d # uuid
      24 30 30 30 30 30 30 30 31 2d 32 33 34 35 2d 36
      37 38 39 2d 30 30 30 30 2d 30 30 30 30 30 30 61
      62 63 64 65 66
02                                              # mn2
                                                # byteable
      59                                              # flag
      0b                                              # b
      16 00                                           # s
      54                                              # c
      2c 00 00 00                                     # i
      8f c2 b1 40                                     # f
      42 00 00 00 00 00 00 00                         # l
      e1 7a 14 ae 47 71 53 40                         # d
                                                # scalars
      05 57 6f 72 6c 64                               # s
      01 30                                           # bi
      01 30                                           # bd
      0a 32 30 31 36 2d 31 30 2d 30 35                # date
      0c 30 31 3a 33 34 3a 35 36 2e 37 37 35          # time
      17 32 30 31 36 2d 31 30 2d 30 35 54 30 31 3a 33 # dateTime
      34 3a 35 36 2e 37 37 35 2c 32 30 31 36 2d 31 30 # zonedDateTime
      2d 30 35 54 30 31 3a 33 34 3a 35 36 2e 37 37 35
      2b 30 31 3a 30 30 5b 45 75 72 6f 70 65 2f 4c 6f
      6e 64 6f 6e 5d 24 31 31 31 31 31 31 31 31 2d 31 # uuid
      31 31 31 2d 31 31 31 31 2d 32 32 32 32 2d 32 32
      32 32 32 32 32 32 32 32 32 32

Data driven tests

The purpose of a Lambda function is to create a simple, highly reproducible, easily testable component.

Once you have your data dumped as hexadecimal, you can create tests using that data, and make variations of those tests.

What do we mean by a Lambda function?

In this context a Lambda function is one which is entirely input driven and produces a list of messages (one or more outputs).

The simplest Lambda function is stateless; however, this has limited application. Stateless functions are useful for message translation.

If you need a stateful Lambda function, you can consider the input to the function to be every message it has ever consumed. Obviously this is inefficient; however, with appropriate caches in your lambda function, you can process messages and produce results incrementally.

Data in and out

We model a Lambda function as having an interface for inputs and another for outputs. These interfaces can be the same.

Sample interface for Lambda function
interface IBytesMethod {
    @MethodId(0x81L) // (1)
    void myByteable(MyByteable byteable);

    @MethodId(0x82L)
    void myScalars(MyScalars scalars);

    @MethodId(0x83L)
    void myNested(MyNested nested);
}
  1. assign a unique id to each method to simplify decoding/encoding.

Each method needs a DTO to describe the data for that message.

class MyByteable implements BytesMarshallable {
    boolean flag;
    byte b;
    short s;
    char c;
    int i;
    float f;
    long l;
    double d;
....
class MyScalars implements BytesMarshallable {
    String s;
    BigInteger bi;
    BigDecimal bd;
    LocalDate date;
    LocalTime time;
    LocalDateTime dateTime;
    ZonedDateTime zonedDateTime;
    UUID uuid;
....
class MyNested implements BytesMarshallable {
    MyByteable byteable;
    MyScalars scalars;
....

The implementation needs to take its output interface and implement the input interface.

A simple pass through implementation
static class IBMImpl implements IBytesMethod {
    final IBytesMethod out;

    IBMImpl(IBytesMethod out) { this.out = out; }

    @Override
    public void myByteable(MyByteable byteable) { out.myByteable(byteable); }

    @Override
    public void myScalars(MyScalars scalars) { out.myScalars(scalars); }

    @Override
    public void myNested(MyNested nested) { out.myNested(nested); }
}

Once we have interfaces, DTOs, and an implementation, we can set up a test harness.

Setup a test harness for a Lambda function
protected void btmttTest(String input, String output)
throws IOException {
    BytesTextMethodTester tester = new BytesTextMethodTester<>(
            input,
            IBMImpl::new,
            IBytesMethod.class,
            output);
    tester.run();
    assertEquals(tester.expected(), tester.actual());
}

This allows us to give two files, one for expected inputs and one for expected outputs.

@Test
public void run()
throws IOException {
    btmttTest("btmtt/prim-input.txt", "btmtt/prim-output.txt");
}
Note
In this case the input and outputs are expected to be the same.
Sample input/output file
81 01                                           # myByteable
   4e                                              # flag
   01                                              # b
   02 00                                           # s
   33                                              # c
   04 00 00 00                                     # i
   00 00 b0 40                                     # f
   06 00 00 00 00 00 00 00                         # l
   cd cc cc cc cc cc 1e 40                         # d
### End Of Block
81 01                                           # myByteable
   59                                              # flag
   0b                                              # b
   16 00                                           # s
   54                                              # c
   2c 00 00 00                                     # i
   8f c2 b1 40                                     # f
   42 00 00 00 00 00 00 00                         # l
   e1 7a 14 ae 47 71 53 40                         # d
### End Of Block
82 01                                           # myScalars
   05 48 65 6c 6c 6f                               # s
   01 31                                           # bi
   02 31 30                                        # bd
   0a 32 30 31 37 2d 31 31 2d 30 36                # date
   0c 31 32 3a 33 35 3a 35 36 2e 37 37 35          # time
   17 32 30 31 37 2d 31 31 2d 30 36 54 31 32 3a 33 # dateTime
   35 3a 35 36 2e 37 37 35 27 32 30 31 37 2d 31 31 # zonedDateTime
   2d 30 36 54 31 32 3a 33 35 3a 35 36 2e 37 37 35
   5a 5b 45 75 72 6f 70 65 2f 4c 6f 6e 64 6f 6e 5d # uuid
   24 30 30 30 30 30 30 30 31 2d 32 33 34 35 2d 36
   37 38 39 2d 30 30 30 30 2d 30 30 30 30 30 30 61
   62 63 64 65 66
### End Of Block
83 01                                           # myNested
                                                # byteable
      59                                              # flag
      0b                                              # b
      16 00                                           # s
      54                                              # c
      2c 00 00 00                                     # i
      8f c2 b1 40                                     # f
      42 00 00 00 00 00 00 00                         # l
      e1 7a 14 ae 47 71 53 40                         # d
                                                # scalars
      05 57 6f 72 6c 64                               # s
      01 30                                           # bi
      01 30                                           # bd
      0a 32 30 31 36 2d 31 30 2d 30 35                # date
      0c 30 31 3a 33 34 3a 35 36 2e 37 37 35          # time
      17 32 30 31 36 2d 31 30 2d 30 35 54 30 31 3a 33 # dateTime
      34 3a 35 36 2e 37 37 35 2c 32 30 31 36 2d 31 30 # zonedDateTime
      2d 30 35 54 30 31 3a 33 34 3a 35 36 2e 37 37 35
      2b 30 31 3a 30 30 5b 45 75 72 6f 70 65 2f 4c 6f
      6e 64 6f 6e 5d 24 31 31 31 31 31 31 31 31 2d 31 # uuid
      31 31 31 2d 31 31 31 31 2d 32 32 32 32 2d 32 32
      32 32 32 32 32 32 32 32 32 32
### End Of Block
### End Of Test

In this case, the test calls the methods with the matching method ids, which in turn use the same ids to encode the output.

Note
Creating and maintaining such tests can be an overhead you don’t need. In this case, you can use Chronicle Wire’s YAML testing format to check functionality. Wire can be used for most of the tests even if you intend to use Bytes in production.

Comparison of access to native memory

| Access                                                | ByteBuffer     | Netty IOBuffer | Aeron UnsafeBuffer      | Chronicle Bytes               |
|-------------------------------------------------------|----------------|----------------|-------------------------|-------------------------------|
| Read/write primitives in native memory                | yes            | yes            | yes                     | yes                           |
| Separate Mutable interfaces                           | run time check | run time check | yes                     | yes                           |
| Read/Write UTF-8 strings                              | no             | no             | String                  | any CharSequence + Appendable |
| Read/Write ISO-8859-1 strings                         | no             | no             | ?                       | any CharSequence + Appendable |
| Support Endianness                                    | Big and Little | Big and Little | Big and Little          | Native only                   |
| Size of buffer                                        | 31-bit         | 31-bit         | 31-bit                  | 63-bit                        |
| Elastic ByteBuffers                                   | no             | yes            | no                      | yes                           |
| Disable bounds checks                                 | no             | no             | set globally            | by buffer                     |
| Wrap an address                                       | no             | no             | yes                     | yes                           |
| Thread safe read/write, CAS and atomic add operations | no             | no             | int; long               | int; long; float and double   |
| Streaming access                                      | yes            | yes            | no                      | yes                           |
| Deterministic release of memory                       | Internal API   | Internal API   | Caller's responsibility | yes                           |
| Separate read and write position                      | no             | yes            | n/a                     | yes                           |

View Chronicle-Bytes in the debugger

When using IntelliJ IDEA, you can set up a custom renderer to view the bytes; see the images below:

customize data views menu
customize data views

Performing code coverage

When performing code coverage, you might want to exclude AbstractBytes, as this significantly slows down the running of unit tests.

Comments
  • Polish: convert to OpenHFT @NotNull and @Nullable

    Based on @peter-lawrey answer to this SO question, this PR convert the NotNull and Nullable annotations to the those provided in OpenHFT and removes the IntelliJ annotation dependency.

    opened by JanStureNielsen 24
  • NativeBytesStore.toTemporaryDirectByteBuffer() doesn't strongly reference the memory it points to

    I was having issues where deserialization would fail with an out of range value here https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/tools/fqltool/Dump.java#L84

    If I change it toByteArray() it works consistently. java.nio.DirectByteBuffer has an attachment field you can set to whatever it needs to strongly reference.

    I can't say with 100% certainty this is what is happening, since it is surprising to get so unlucky. It fails the first time almost every time.

    opened by aweisberg 17
  • CAS operation fails in readMarshallable()

    Hi, please check this code example:

        public void testCAS() {
            final File dir = getTmpDir();
            try (final SingleChronicleQueue queue = SingleChronicleQueueBuilder.binary(dir)
                    .build()) {
                final ExcerptAppender appender = queue.acquireAppender();
                final ExcerptTailer tailer = queue.createTailer();
                final WriteBytesMarshallable writeBytesMarshallable = bytes -> bytes.writeLong(123L);
                final ReadBytesMarshallable readBytesMarshallable = bytes -> bytes.compareAndSwapLong(bytes.readPosition(), 123L, 321L);
    
                final int iterations = 5; // fails at 5th iteration
                for (int i = 0; i < iterations; i++) {
                    appender.writeBytes(writeBytesMarshallable);
                    tailer.readBytes(readBytesMarshallable);
                }
            }
        }
    

    It fails with exception:

    net.openhft.chronicle.core.util.MisAlignedAssertionError at net.openhft.chronicle.core.UnsafeMemory.compareAndSwapLong(UnsafeMemory.java:983) at net.openhft.chronicle.bytes.internal.NativeBytesStore.compareAndSwapLong(NativeBytesStore.java:342) at net.openhft.chronicle.bytes.MappedBytesStore.compareAndSwapLong(MappedBytesStore.java:187) at net.openhft.chronicle.bytes.internal.ChunkedMappedBytes.compareAndSwapLong(ChunkedMappedBytes.java:607) at net.openhft.chronicle.queue.AcquireReleaseTest.lambda$testCAS$5(AcquireReleaseTest.java:147) at net.openhft.chronicle.wire.MarshallableIn.readBytes(MarshallableIn.java:69) at net.openhft.chronicle.queue.AcquireReleaseTest.testCAS(AcquireReleaseTest.java:152)

    I guess it is somehow connected with padding/alignment. There were some changes in Chronicle Core library, addressing padding. My environment is Intel Macbook Pro 2013, Oracle JDK 1.8.0_201. Currently we are using 5.21ea44 in production, there is no such problem.

    opened by garikjan 14
  • Improve handling of large Strings.

    The current implementation for BytesInternal fails to write if the String is longer than 1/4 of the block size of a queue. parseUtf8_SB1 also fails to read Strings of more than 1 MB correctly.

    enhancement 
    opened by peter-lawrey 11
  • Add address checking using assert

Many memory abstractions, for example HeapBytesStore, lack address range checking. We should add assertions to these accessors to be able to capture potential problems at least when running tests.

    bug 
    opened by minborg 10
  • bytes.copyTo(outputStream) fails

    Hi there, Is there a known issue with bytes.copyTo(outputStream)? I experience a strange behavior where bytes.copyTo(outputStream) stops at 2GB. However simply doing the following works fine:

    byte[] buffer = new byte[4096];
                int len;
                while ((len = bytes.read(buffer)) > 0) {
                    outputStream.write(buffer, 0, len);
                }
    
    opened by joa23 10
  • AbstractInterner equalBytes() returning true when strings are not equal.

    There appears to be an issue in the string interning implementation (AbstractInterner). The following unit tests demonstrates a case where invoking the intern method can return a different value to that passed in. This seems to be caused by equalBytes returning true when strings are not equal.

    This has been reproduced in 1.16.23 and also the most recent version of chronicle-bytes.

    @Test
    public void internFailingTest() throws IORuntimeException {
        UTF8StringInterner utf8StringInterner = new UTF8StringInterner(4096);
    
        utf8StringInterner.intern(Bytes.from("DELEGATE: 912796UF4 Notional is > limit 25.00mm"));
        utf8StringInterner.intern(Bytes.from("TW-TRSY-20181217-NY572677_3256N1"));
        String intern = utf8StringInterner.intern(Bytes.from("TW-TRSY-20181217-NY572677_3256N15"));
        assertThat(intern, equalTo("TW-TRSY-20181217-NY572677_3256N15"));
    }
    
    
    @Test
    public void equalBytesFailingTest() throws IORuntimeException {
        BytesStore store1 = Bytes.from("TW-TRSY-20181217-NY572677_3256N1");
        BytesStore store2 = Bytes.from("TW-TRSY-20181217-NY572677_3256N15");
        assertThat(store1.equalBytes(store2, 33), equalTo(false));
    }
    
    opened by chrisjeg 9
  • Replace regexp with state machine, Fix #306

    Most of the regular expression engines use backtracking to try all possible execution paths of the regular expression when evaluating an input, in some cases, it can cause performance issues, called catastrophic backtracking situations. In the worst case, the complexity of the regular expression is exponential in the size of the input, this means that a small carefully-crafted input (like 20 chars) can trigger catastrophic backtracking and cause a denial of service of the application. Super-linear regex complexity can lead to the same impact too with, in this case, a large carefully-crafted input (thousands of chars).

    This PR replaces the use of regexp with a custom-made state machine, thereby preventing catastrophic backtracking.

    opened by minborg 8
  • Patch for JDK-9

    Hi @epickrram,

    This is my patch for JDK-9. There are two things I don't know how to solve:

    1. https://github.com/OpenHFT/Chronicle-Bytes/compare/master...MartyIX:jdk9-patch?expand=1#diff-600376dffeb79835ede4a0b285078036R112 - this should be in the profile I guess

    2. https://github.com/OpenHFT/Chronicle-Bytes/compare/master...MartyIX:jdk9-patch?expand=1#diff-77fac035fff19bf1e53777e3b28955d6R27 - I don't know how to wrap the class properly for it to work on released JDKs and JDK 9+

    Tests pass for me. Hopefully, the patch will be of some use.

    Regards, Martin

    opened by MartyIX 8
  • Boundary underflow in chunked mapped file in multiple operations near chunk boundary

    Exception in thread "main" net.openhft.chronicle.bytes.util.DecoratedBufferUnderflowException: Acquired the next BytesStore, but still not room to add 2 when realCapacity 15728640
    	at net.openhft.chronicle.bytes.internal.ChunkedMappedBytes.writeCheckOffset(ChunkedMappedBytes.java:266)
    	at net.openhft.chronicle.bytes.AbstractBytes.writeOffsetPositionMoved(AbstractBytes.java:1023)
    	at net.openhft.chronicle.bytes.AbstractBytes.writeOffsetPositionMoved(AbstractBytes.java:1016)
    	at net.openhft.chronicle.bytes.AbstractBytes.writeShort(AbstractBytes.java:1043)
    	at org.example.Main.main(Main.java:38)
    
    opened by alamar 7
  • bytes.append(double) and bytes.parseDouble() problematic for some numbers

    bytes.append(6.85202d).toString() results in 6.8520200000000008

    In this case we behave differently to Double.toString and Double.parseDouble - see the unit test.

    Seems that underlying problem is in BytesInternal.asDouble. Have changed the test to call this more directly.

    opened by JerryShea 7
  • RandomDataInput.copyTo(ByteBuffer) does not use readPosition

    Some time ago @alamar modified Bytes.copy() and Bytes.copyTo(BytesStore) to take readPosition into account (see #300).

    But it seems that RandomDataInput.copyTo(ByteBuffer) has the same problem:

    • RandomDataInput.copyTo(byte[]) uses readPosition and readRemaining.
    • RandomDataInput.copyTo(ByteBuffer) uses readRemaining, but not readPosition (it uses start() instead).
    • BytesStore implements RandomDataInput and it has its own copyTo methods, which since #300 use readPosition and readRemaining.
    opened by gortiz 0
  • Improve Bytes performance tests

    1. ensure they show up in TC statistics panel
    2. make sure all JLBH tests ran
    3. extra specialisations of ContentEqualsJLBHTest for smaller size and also compare vs non-vectorized
    opened by JerryShea 0
  • UTF8 vs ASCII 8bit encoding performance issues.

    The encoding/decoding for ASCII and UTF8 show small but annoying performance differences which shouldn't exist and are not consistent from one run of the JVM to another. This makes performance regression tests hard to implement.

    opened by peter-lawrey 1