A novel implementation of the Raft consensus algorithm

Overview

Copycat


Copycat has moved!

Copycat 2.x is now atomix-raft and includes a variety of improvements over Copycat 1.x:

  • Multiple state machines per cluster
  • Multiple sessions per client
  • Index-free memory mapped log
  • Per-state-machine snapshots
  • Framework agnostic serialization
  • Partitioning
  • etc

This repository is no longer officially maintained.

Comments
  • ISE: inconsistent index

    ISE: inconsistent index

    I haven't done much debugging on this yet, but thought I'd log this first.

    Here's what I was doing:

    • 3 node cluster
    • A separate client is submitting commands to the cluster at a rate of a few hundred per second
    • Randomly kill a node, bring it back up, and verify everything is still running OK.

    Here's what I did to see this error:

    • Kill the leader
    • Noticed another node was elected as new leader
    • This exception is logged on the new leader:

    12:21:00.424 [copycat-server-Address[localhost/127.0.0.1:5003]] ERROR i.a.c.u.c.SingleThreadContext - An uncaught exception occurred
    java.lang.IllegalStateException: inconsistent index: 1605633
        at io.atomix.catalyst.util.Assert.state(Assert.java:69) ~[classes/:na]
        at io.atomix.copycat.server.storage.Segment.get(Segment.java:319) ~[classes/:na]
        at io.atomix.copycat.server.storage.Log.get(Log.java:319) ~[classes/:na]
        at io.atomix.copycat.server.state.LeaderState$Replicator.entriesCommit(LeaderState.java:1040) ~[classes/:na]
        at io.atomix.copycat.server.state.LeaderState$Replicator.commit(LeaderState.java:980) ~[classes/:na]
        at io.atomix.copycat.server.state.LeaderState$Replicator.lambda$commit$126(LeaderState.java:876) ~[classes/:na]
        at io.atomix.copycat.server.state.LeaderState$Replicator$$Lambda$111/982607974.apply(Unknown Source) ~[na:na]
        at java.util.Map.computeIfAbsent(Map.java:957) ~[na:1.8.0_25]
        at io.atomix.copycat.server.state.LeaderState$Replicator.commit(LeaderState.java:874) ~[classes/:na]
        at io.atomix.copycat.server.state.LeaderState$Replicator.access$100(LeaderState.java:818) ~[classes/:na]
        at io.atomix.copycat.server.state.LeaderState.accept(LeaderState.java:652) ~[classes/:na]
        at io.atomix.copycat.server.state.ServerState.lambda$registerHandlers$20(ServerState.java:421) ~[classes/:na]
        at io.atomix.copycat.server.state.ServerState$$Lambda$34/1604755402.handle(Unknown Source) ~[na:na]
        at io.atomix.catalyst.transport.NettyConnection.handleRequest(NettyConnection.java:102) ~[classes/:na]
        at io.atomix.catalyst.transport.NettyConnection.lambda$0(NettyConnection.java:90) ~[classes/:na]
        at io.atomix.catalyst.transport.NettyConnection$$Lambda$50/1312252238.run(Unknown Source) ~[na:na]
        at io.atomix.catalyst.util.concurrent.Runnables.lambda$logFailure$7(Runnables.java:20) ~[classes/:na]
        at io.atomix.catalyst.util.concurrent.Runnables$$Lambda$4/1471868639.run(Unknown Source) [classes/:na]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_25]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_25]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_25]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_25]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_25]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_25]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_25]

    bug 
    opened by madjam 41
  • Ensure tombstones are retained in the log until applied on all servers

    Ensure tombstones are retained in the log until applied on all servers

    This PR fixes #38 by recovering the concept of a globalIndex. The globalIndex is indicative of the highest index stored on all servers. The global index is used to determine when it's safe to remove tombstones from the log during major compaction. If a server is partitioned, failure to replicate a tombstone entry to that server before compaction will result in inconsistent state across the cluster.

    This PR also fixes some issues with the previous implementation of globalIndex. Previously, only active cluster members were considered in the calculation of the global index. But because passive members maintain state machines while they catch up to the rest of the cluster, it's important that they be included in counts. Similarly, this PR fixes an issue wherein state for passive servers wasn't reset when a leader first transitioned.

    Note that this PR does still allow major compaction to operate normally aside from removing tombstones. It stores two compaction indexes in the log: minorIndex and majorIndex. The minorIndex is the highest index for which normal entries can be safely removed from the log (based on events received by clients), and majorIndex is the highest index for which tombstones can be safely removed from the log. If the major compaction process is compacting a section of the log with indexes greater than the majorIndex, it will simply retain any tombstones.
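
    For illustration, a minimal sketch of the retention rule described above. The names here (canRemove, the tombstone flag, the index parameters) are illustrative only, not Copycat's actual internals:

    // Sketch of the tombstone-retention rule applied during major compaction.
    // Names below are illustrative, not Copycat's real API.
    final class MajorCompactionSketch {

      /** Returns true if the entry at the given index may be removed from the log. */
      static boolean canRemove(long index, boolean tombstone, long minorIndex, long majorIndex) {
        if (tombstone) {
          // Tombstones may only be removed once every server has stored the entry,
          // i.e. once the globally replicated index (majorIndex) has passed it.
          return index <= majorIndex;
        }
        // Normal entries may be removed once clients have received the related events.
        return index <= minorIndex;
      }

      public static void main(String[] args) {
        long minorIndex = 100, majorIndex = 50;
        System.out.println(canRemove(40, true, minorIndex, majorIndex));  // true: replicated everywhere
        System.out.println(canRemove(60, true, minorIndex, majorIndex));  // false: retained as a tombstone
        System.out.println(canRemove(60, false, minorIndex, majorIndex)); // true: normal entry below minorIndex
      }
    }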

    I'm going very carefully through the entire project to document and verify the algorithms. This is a big step in the right direction. Excited about finding some good potential issues.

    opened by kuujo 19
  • Client connection/sequencing bug fixes

    Client connection/sequencing bug fixes

    This PR fixes a bug in the Copycat client when concurrent requests have been submitted to the cluster and ultimately time out. When that happens, the client currently resets the connection on every failed request. This causes the client to reconnect repeatedly, causing more timeouts, causing more reconnections, and more or less resulting in recursion between the client and server. To prevent this, we simply ensure the current connection hasn't already been reset before resetting it.
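
    A minimal sketch of that guard, with hypothetical field and method names (the real fix lives in the Copycat client's connection handling):

    // Illustrative-only sketch: reset the connection at most once per connection instance.
    final class ConnectionResetSketch {
      private Object connection = new Object(); // stands in for the client's current connection

      /** Called when a request fails; resets the connection only if it hasn't been reset already. */
      synchronized void onRequestFailed(Object failedConnection) {
        if (failedConnection == connection) {
          // First failure observed on this connection: reset it.
          connection = new Object();
        }
        // Otherwise another failed request already triggered a reset; do nothing,
        // avoiding reconnect storms and recursive timeouts.
      }
    }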

    @jhall11 I'm assigning you to review this

    bug 
    opened by kuujo 18
  • Support dynamic replacement of voting members

    Support dynamic replacement of voting members

    This PR is the last significant refactoring that will be done before the 1.0 release. Once this is merged, a release candidate will be released.

    The idea here is to allow clusters to dynamically resize themselves according to a configured quorumHint. With this change, servers can be in one of three states:

    • ACTIVE - active voting members of the Raft cluster. The cluster will always attempt to maintain quorumHint active members
    • PASSIVE - passive members are members that receive AppendRequests from followers and maintain a state machine. In the event that an ACTIVE server becomes unavailable, it can be quickly replaced by the most up-to-date PASSIVE member
    • RESERVE - reserve members are stateless members that simply await promotion. Reserve servers receive only configuration updates from followers

    This results in a cluster that can rapidly replace unavailable Raft members with available passive members. Each server periodically sends a HeartbeatRequest to the leader, and heartbeats are tracked in a manner similar to sessions. When an ACTIVE member stops sending heartbeats to the leader, it will, if possible, be demoted to RESERVE and a PASSIVE or RESERVE server will be promoted to ACTIVE to take its place. The promotion will happen by first promoting a new ACTIVE member and then demoting the UNAVAILABLE voting member.
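
    A rough sketch of that replacement order (promote a replacement before demoting the unavailable member). The types, fields, and the way the most up-to-date PASSIVE server is chosen are simplified and hypothetical:

    import java.util.Comparator;
    import java.util.List;
    import java.util.Optional;

    // Illustrative sketch of replacing an unavailable ACTIVE member; not Copycat's actual code.
    final class MembershipSketch {
      enum Type { ACTIVE, PASSIVE, RESERVE }

      static final class Member {
        Type type;
        boolean available;
        long matchIndex; // how far this member's log has caught up
        Member(Type type, boolean available, long matchIndex) {
          this.type = type; this.available = available; this.matchIndex = matchIndex;
        }
      }

      /** Promote the most up-to-date available PASSIVE member first, then demote the unavailable ACTIVE member. */
      static void replace(Member unavailable, List<Member> members) {
        Optional<Member> replacement = members.stream()
            .filter(m -> m.type == Type.PASSIVE && m.available)
            .max(Comparator.comparingLong((Member m) -> m.matchIndex));
        replacement.ifPresent(r -> {
          r.type = Type.ACTIVE;            // promote first so quorumHint voting members are maintained
          unavailable.type = Type.RESERVE; // then demote the member that stopped heartbeating
        });
      }
    }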

    This, in theory, allows large clusters to be started and operated with little management. Servers can join and leave the cluster at will, and the system will always attempt to maintain exactly quorumHint voting members. This is illustrated in the testReplace test method: https://github.com/atomix/copycat/blob/dynamic-configurations/test/src/test/java/io/atomix/copycat/test/ClusterTest.java#L105-L122

    This test creates a 3 node cluster, adds three nodes to the cluster, and then removes the original three nodes, all while submitting writes. Initially, the 3 nodes are added as RESERVE and then promoted to PASSIVE members (the backupCount dictates the number of passive members). As ACTIVE servers are shut down, PASSIVE members take their place to allow writes to continue without losing fault tolerance. (This is actually not implemented totally correctly at the moment. While unavailable servers are replaced by first promoting and then demoting, when an active server leaves the cluster it is first removed from the configuration before a new member is promoted. This should not be the case).

    The use case for this feature is to simplify the management of embedded Copycat clusters. The number of servers in the cluster does not need to relate to the Raft algorithm in any way; only the quorumHint and backupCount need to relate to the Raft algorithm. One place where this is useful is in Vert.x (to which I'm a significant contributor). With the additional scalability, an Atomix Vert.x ClusterManager becomes much more approachable for Vert.x users who are used to embedded HA systems which scale well.

    Note that the use of dynamic quorums is entirely optional. Starting a cluster in the typical way will still yield the expected result. If no quorumHint is specified, all configured members are assumed to be Raft voting members. Once the cluster has been started, the quorum hint remains the same size as the initial number of core members, and any additional servers will be added in the reserve/passive state. So, it's possible to scale down the number of voting members, but it cannot currently be scaled up beyond the initial configuration.

    Because of the significant refactoring that went into this, this PR also fixes a couple other issues. It adds a MetaStore which stores the term, vote, and configuration of each server on disk. Additionally, a significant change in this PR is that servers now have separate Servers for client and server communication. So, each server can define both a client and server Address. Clients receive client Addresses through KeepAliveResponses, and servers propagate that information to each other in configurations. I wanted to get the client Address feature in before the 1.0 release because a big goal of post-1.0 development will be to expose an HTTP/REST interface for clients. So, ultimately the client interface will provide an abstraction to allow clients to interact with servers using different protocols without forcing servers to use the same protocol.

    @madjam I'm assigning this to you. Let me know if anything seems unclear. I'm pretty tired right now and feel like what I'm writing is all over the place. What are your thoughts on this?

    opened by kuujo 18
  • Lagging follower is forced to truncate its log repeatedly

    Lagging follower is forced to truncate its log repeatedly

    I noticed this during a test run. This is a 3-node setup. One of the followers keeps resetting itself (truncating its log) because it always discovers that the globalIndex is greater than the one it is aware of. A brief look at the logic in PassiveState::checkGlobalIndex tells me that this should not happen on startup.

    Here are some log lines. (This pattern keeps repeating)

    2016-02-18 23:18:49,572 | DEBUG | FollowerState | 10.254.1.207/10.254.1.207:9876 - Received AppendRequest[term=1, leader=184429489, logIndex=0, logTerm=0, entries=[0], commitIndex=3288, globalIndex=3288]
    2016-02-18 23:18:49,572 | DEBUG | SegmentManager | Closing segment: 1
    2016-02-18 23:18:49,573 | DEBUG | SegmentManager | Created segment: Segment[id=1, version=1, index=0, length=0]
    2016-02-18 23:18:49,573 | DEBUG | FollowerState | 10.254.1.207/10.254.1.207:9876 - Sent AppendResponse[status=OK, term=1, succeeded=true, logIndex=0]
    2016-02-18 23:18:49,573 | DEBUG | FollowerState | 10.254.1.207/10.254.1.207:9876 - Received AppendRequest[term=1, leader=184429489, logIndex=0, logTerm=0, entries=[0], commitIndex=3292, globalIndex=3292]
    2016-02-18 23:18:49,573 | DEBUG | SegmentManager | Closing segment: 1
    2016-02-18 23:18:49,573 | DEBUG | SegmentManager | Created segment: Segment[id=1, version=1, index=0, length=0]
    2016-02-18 23:18:49,573 | DEBUG | FollowerState | 10.254.1.207/10.254.1.207:9876 - Sent AppendResponse[status=OK, term=1, succeeded=true, logIndex=0]
    2016-02-18 23:18:49,574 | DEBUG | FollowerState | 10.254.1.207/10.254.1.207:9876 - Received ConfigureRequest[term=1, leader=184429489, index=3295, members=[ServerMember[type=ACTIVE, status=AVAILABLE, serverAddress=/10.254.1.207:9876, clientAddress=/10.254.1.207:9876], ServerMember[type=ACTIVE, status=AVAILABLE, serverAddress=/10.254.1.201:9876, clientAddress=/10.254.1.201:9876], ServerMember[type=ACTIVE, status=AVAILABLE, serverAddress=/10.254.1.202:9876, clientAddress=/10.254.1.202:9876]]]
    2016-02-18 23:18:49,574 | DEBUG | FollowerState | 10.254.1.207/10.254.1.207:9876 - Sent ConfigureResponse[status=OK]
    2016-02-18 23:18:49,750 | DEBUG | FollowerState | 10.254.1.207/10.254.1.207:9876 - Received AppendRequest[term=1, leader=184429489, logIndex=0, logTerm=0, entries=[180], commitIndex=3294, globalIndex=3294]
    2016-02-18 23:18:49,750 | DEBUG | SegmentManager | Closing segment: 1
    2016-02-18 23:18:49,750 | DEBUG | SegmentManager | Created segment: Segment[id=1, version=1, index=0, length=0]
    

    I'll continue looking into this more but thought I'd log an issue in case @kuujo you already know what is causing this.

    bug 
    opened by madjam 15
  • introduce session unstability timeout

    introduce session unstability timeout

    Allows the user to set an "unstableTimeout" (disabled by default).

    If set, then once the session reaches the UNSTABLE state, it will switch to UNSTABLE_PLUS after unstableTimeout; once UNSTABLE_PLUS is reached, the DefaultCopycatClient is closed automatically.

    @kuujo It is missing docs/unit tests, which I will add if the approach looks OK to you. Also, feel free to suggest alternate names for the things introduced.
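
    A rough sketch of the proposed behavior, assuming a hypothetical state listener and timer (the PR's actual names may differ):

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.ScheduledFuture;
    import java.util.concurrent.TimeUnit;

    // Illustrative sketch only: after a session has been UNSTABLE for longer than unstableTimeout,
    // treat it as UNSTABLE_PLUS and close the client automatically.
    final class UnstableTimeoutSketch {
      private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
      private final long unstableTimeoutMillis; // <= 0 means the feature is disabled
      private ScheduledFuture<?> pending;

      UnstableTimeoutSketch(long unstableTimeoutMillis) {
        this.unstableTimeoutMillis = unstableTimeoutMillis;
      }

      /** Invoked whenever the session state changes. */
      synchronized void onStateChange(String state, Runnable closeClient) {
        if (pending != null) {
          pending.cancel(false); // leaving UNSTABLE cancels the pending escalation
          pending = null;
        }
        if ("UNSTABLE".equals(state) && unstableTimeoutMillis > 0) {
          // Still unstable after the timeout: escalate to UNSTABLE_PLUS and close the client.
          pending = scheduler.schedule(closeClient, unstableTimeoutMillis, TimeUnit.MILLISECONDS);
        }
      }
    }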

    opened by himanshug 14
  • Full sequential consistency for all operation responses and events

    Full sequential consistency for all operation responses and events

    This PR implements sequential consistency for operation responses and events and fixes several bugs with sequencing operations in FIFO order on servers, in particular when clients switch between followers and leaders for reads and writes respectively. To implement sequencing for responses/events, we remove support for retrying client queries. Commands are still retried internally for linearizable semantics, but because failed queries don't have the potential for side effects and because queries can switch between the leader and followers, we found removing retries for queries altogether to be the simplest approach at least for the initial implementation. This forces the user to retry queries externally, implying multiple operations. In the future, we can at least make queries more likely to succeed by implementing retries for a single connection communicating with a single follower, but switching followers poses significant challenges for retrying queries and maintaining FIFO order.

    The goal of this PR is to entirely replace linearizable events with sequential consistency for all events received by a client (including responses). Once linearizable events were removed, the algorithms with which servers sequence commands and queries in FIFO order were cleaned up to ensure the correct behavior. We assume that requests arriving on a follower will always arrive in FIFO order given an ordered protocol, so to apply a query on a follower we simply wait for the application of the prior command. In contrast, commands sent to leaders may be proxied through followers and therefore have the potential to arrive out of order when switching servers, and they're therefore sequenced on the leader before being written to the Raft log. Additional issues were resolved in how queries applied on leaders are sequenced as well. Previously, there was some potential that a query arriving after a command could be applied before it if the earlier command is not committed in a single heartbeat. Therefore, leaders now appropriately await the application of the prior command sequence number to the state machine before evaluating a query.

    Finally, with command and query requests properly sequenced on servers, client-side sequencing was implemented to ensure that clients see command and query responses in the order in which their counterpart requests were applied to state machines and to sequence events received from the cluster within those responses. The ClientSequencer works by first placing responses in the order in which they were submitted to the cluster. For each response, if the prior event has not yet been received, the response is enqueued until the relevant criteria are met. See the ClientSequencer documentation for a more in-depth explanation of the algorithm it uses.
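
    A simplified sketch of that client-side sequencing idea (not the actual ClientSequencer implementation): responses are completed strictly in the order their requests were submitted, even if they arrive out of order.

    import java.util.ArrayDeque;
    import java.util.Queue;
    import java.util.function.Consumer;

    // Illustrative sketch: deliver responses in submission order, buffering any that arrive early.
    final class ResponseSequencerSketch<T> {

      private static final class Slot<T> {
        T response;
        final Consumer<T> callback;
        boolean completed;
        Slot(Consumer<T> callback) { this.callback = callback; }
      }

      private final Queue<Slot<T>> slots = new ArrayDeque<>();

      /** Called when a request is submitted; reserves the next position in the sequence. */
      synchronized Slot<T> register(Consumer<T> callback) {
        Slot<T> slot = new Slot<>(callback);
        slots.add(slot);
        return slot;
      }

      /** Called when a response arrives; delivered only once all earlier responses have completed. */
      synchronized void complete(Slot<T> slot, T response) {
        slot.response = response;
        slot.completed = true;
        // Drain from the head: only deliver responses whose predecessors are all complete.
        while (!slots.isEmpty() && slots.peek().completed) {
          Slot<T> head = slots.remove();
          head.callback.accept(head.response);
        }
      }
    }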

    opened by kuujo 14
  • Implementing counting state machines

    Implementing counting state machines

    If I want to implement a state machine that tracks the number of times it was updated, then the current implementation presents some difficulties. We work under the assumption that entries that no longer contribute to state machine evolution need not be replicated. But for operation-counting state machines all operations are meaningful. I guess if we annotate the command entry as a tombstone we can clean it up after it is replicated to all instances. But a cluster configuration change to add new members poses a challenge.

    I'm opening this issue to get some thoughts on this subject. My initial thought was that for some state machine types snapshotting is the only way to go if we want to garbage collect log entries from time to time.
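
    For illustration, a standalone counter sketch (deliberately not using Copycat's state machine or snapshot API) of why a snapshot is enough to let older entries be compacted for this kind of state machine:

    // Standalone sketch: every applied command matters to the count, so instead of retaining
    // all entries, the current count is captured in a snapshot and entries up to the snapshot
    // index can then be compacted.
    final class CountingStateMachineSketch {
      private long count;          // the only state: number of updates applied
      private long lastApplied;    // index of the last applied log entry

      void apply(long index) {
        count++;                   // every command contributes to state; none is obsolete
        lastApplied = index;
      }

      /** Snapshot: the count alone reconstructs the state, so the log can be truncated behind it. */
      long[] snapshot() {
        return new long[] { lastApplied, count };
      }

      void install(long[] snapshot) {
        lastApplied = snapshot[0];
        count = snapshot[1];
      }
    }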

    opened by madjam 13
  • Support for a Leader to leave the cluster cleanly

    Support for a Leader to leave the cluster cleanly

    When a node that is the current leader leaves the cluster, it will do so without properly updating the cluster configuration. This happens because the state transition from LEADER to LEAVE prevents committing the updated cluster configuration.

    bug 
    opened by madjam 13
  • Possible log file corruption when a node is terminated

    Possible log file corruption when a node is terminated

    I happened to notice this occasionally during some recent runs. Doesn't happen every time.

    Here's a scenario I was running:

    • 3 node cluster. s1, s2 and s3.
    • A separate client is logging commands on this cluster at a rate of a few hundred per second.
    • Terminate a follower node (kill the Java process).
    • Try bringing it back up and you see the exception trace below.

    09:04:31.681 [copycat-server-Address[localhost/127.0.0.1:5003]] ERROR i.a.c.u.c.SingleThreadContext - An uncaught exception occurred
    java.lang.IndexOutOfBoundsException: null
        at io.atomix.catalyst.buffer.AbstractBytes.checkOffset(AbstractBytes.java:47) ~[catalyst-buffer-1.0.0-rc3.jar:na]
        at io.atomix.catalyst.buffer.AbstractBytes.checkRead(AbstractBytes.java:54) ~[catalyst-buffer-1.0.0-rc3.jar:na]
        at io.atomix.catalyst.buffer.FileBytes.readUnsignedShort(FileBytes.java:286) ~[catalyst-buffer-1.0.0-rc3.jar:na]
        at io.atomix.catalyst.buffer.AbstractBuffer.readUnsignedShort(AbstractBuffer.java:528) ~[catalyst-buffer-1.0.0-rc3.jar:na]
        at io.atomix.copycat.server.storage.Segment.<init>(Segment.java:81) ~[classes/:na]
        at io.atomix.copycat.server.storage.SegmentManager.loadDiskSegment(SegmentManager.java:345) ~[classes/:na]
        at io.atomix.copycat.server.storage.SegmentManager.loadSegment(SegmentManager.java:332) ~[classes/:na]
        at io.atomix.copycat.server.storage.SegmentManager.loadSegments(SegmentManager.java:404) ~[classes/:na]
        at io.atomix.copycat.server.storage.SegmentManager.open(SegmentManager.java:95) ~[classes/:na]
        at io.atomix.copycat.server.storage.SegmentManager.<init>(SegmentManager.java:58) ~[classes/:na]
        at io.atomix.copycat.server.storage.Log.<init>(Log.java:135) ~[classes/:na]
        at io.atomix.copycat.server.storage.Storage.open(Storage.java:280) ~[classes/:na]
        at io.atomix.copycat.server.state.ServerContext.lambda$0(ServerContext.java:72) ~[classes/:na]
        at io.atomix.copycat.server.state.ServerContext$$Lambda$3/846238611.run(Unknown Source) ~[na:na]
        at io.atomix.catalyst.util.concurrent.Runnables.lambda$logFailure$17(Runnables.java:20) ~[catalyst-common-1.0.0-rc3.jar:na]
        at io.atomix.catalyst.util.concurrent.Runnables$$Lambda$4/1663166483.run(Unknown Source) [catalyst-common-1.0.0-rc3.jar:na]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_25]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_25]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_25]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_25]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_25]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_25]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_25]

    opened by madjam 12
  • Expire client sessions when command/query fail with unknown session

    Expire client sessions when command/query fail with unknown session

    #276 describes an issue wherein clients don't properly expire their session when they receive an UnknownSessionException in response to a command/query. This prevents users of the client API from realistically being able to retry commands/queries across sessions since the client must wait for a KeepAliveRequest to fail before recovering their session. This PR adds a check for UnknownSessionException in ClientSubmitter and proactively expires the session if one is received for any command or query.
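
    A rough sketch of that proactive-expiry check, with hypothetical handler names standing in for the actual ClientSubmitter change:

    // Illustrative sketch: if a command/query response indicates the session is unknown,
    // expire the local session immediately instead of waiting for a keep-alive to fail.
    final class SessionExpirySketch {

      interface Session {
        void expire(); // hypothetical: marks the local session expired so the caller can recover it
      }

      static void handleFailure(Throwable error, Session session) {
        if (isUnknownSession(error)) {
          session.expire();
        }
      }

      private static boolean isUnknownSession(Throwable error) {
        // In the real client this would check for UnknownSessionException;
        // here we match on the exception's name to keep the sketch self-contained.
        return error != null && error.getClass().getSimpleName().contains("UnknownSession");
      }
    }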

    bug 
    opened by kuujo 11
  • ServerCommit GC Warning

    ServerCommit GC Warning

    If you don't release a ServerCommit because you don't want it to be compacted, but you don't keep a reference to the commit either, it is eventually GCed. Why does that make the commit log dirty?

    https://github.com/atomix/copycat/blob/master/server/src/main/java/io/atomix/copycat/server/state/ServerCommitPool.java#L73

    opened by JPWatson 0
  • Allow KeepAliveRequest to manage multiple sessions

    Allow KeepAliveRequest to manage multiple sessions

    In ONOS, the Copycat client's sequencing and the variation of queries/commands and consistency levels seem to significantly limit performance across primitives. Ideally, each primitive should be able to have a separate logical client that does sequencing independently of other clients. Currently, though, only one CopycatClient can be used for each session registered with the cluster, and the overhead of keeping a session alive makes it impractical to use multiple clients on a single node.

    Multiple Copycat clients on the same node should be able to share a session manager and connections while performing request/response/event sequencing independent of other clients.

    opened by kuujo 2
  • question about pessimistic case

    question about pessimistic case

    Hi, I'm trying to understand how to use your library. The documentation is clear, but there is one thing I don't understand: it explains what happens when everything goes well in a transaction, but what about the opposite case? Suppose there is a key/value storage S (LSM-like) and nodes F, A, B, C, D. When you receive the command "put" K, X (to save an object X), you see there is a conflict with another version of the object X; for example, the optimistic lock is not correct. In this case, what do you do? The Commit class doesn't contain a rollback. Maybe throw an exception (received also by the sender)? How do you notify the cluster nodes that it is not possible to commit? How do you handle the error on the sender?

    In my mind I imagined pseudo-code like this:

    node F.put(K, X) {
        TransactionalLog.putEntry("put", K, X);
        copycat.send command -> {
            when command is accepted by quorum ->
                storage.save(X);
                TransactionalLog.removeEntry("put", X);
            when command is rejected by quorum ->
                TransactionalLog.removeEntry("put", X);
                throw rollbackException();
        }
    }

    If the responses from nodes A, B, C, D are (A KO, B KO, C KO, D OK), the quorum is KO and I have to roll back. A TransactionalLog is probably already present in Copycat, so it may not be necessary. How do I intercept accepted/rejected? Another example: if the responses from A, B, C, D are (A OK, B OK, C KO, D OK), the quorum is OK but I have to repair node C; in this case I must not close the StateMachine commit on node C until I have repaired the inconsistent state.
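
    One possible approach, sketched with illustrative names only: validate the optimistic lock inside the state machine's command handler and throw when it fails, on the assumption (worth verifying against the Copycat docs) that an exception raised while applying a command is returned to the submitting client as a failed operation. Because every server applies the same command deterministically, each replica reaches the same accept/reject decision, and a rejected command never mutates the state machine, so no separate rollback step is needed.

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative sketch: reject a put when the expected version doesn't match.
    final class OptimisticPutSketch {

      static final class Versioned {
        final Object value;
        final long version;
        Versioned(Object value, long version) { this.value = value; this.version = version; }
      }

      private final Map<String, Versioned> store = new HashMap<>();

      /** Applies a put only if the caller's expected version matches the current one. */
      long put(String key, Object value, long expectedVersion) {
        Versioned current = store.get(key);
        long currentVersion = current == null ? 0 : current.version;
        if (currentVersion != expectedVersion) {
          // Rejecting by throwing: nothing was modified, so there is nothing to roll back.
          throw new IllegalStateException("version conflict for " + key
              + ": expected " + expectedVersion + " but found " + currentVersion);
        }
        long next = currentVersion + 1;
        store.put(key, new Versioned(value, next));
        return next;
      }
    }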

    opened by publicocean0 1
  • copycat 1.2.4 server fails build test

    copycat 1.2.4 server fails build test

    When building Copycat on Windows under Eclipse, the build tests fail.

    Build environment:

    Eclipse Neon3 4.6.3
    Buildship: Eclipse Plug-ins for Gradle 1.0.21.v20161010-1640 org.eclipse.buildship.feature.group Eclipse Buildship
    Code Recommenders for Java Developers 2.4.6.v20170307-1041 org.eclipse.recommenders.rcp.feature.feature.group Eclipse Code Recommenders
    Code Recommenders Mylyn Integration 2.4.6.v20170307-1041 org.eclipse.recommenders.mylyn.rcp.feature.feature.group Eclipse Code Recommenders
    Code Recommenders Snipmatch 2.4.6.v20170307-1041 org.eclipse.recommenders.snipmatch.rcp.feature.feature.group Eclipse Code Recommenders
    Eclipse IDE for Java Developers 4.6.3.20170314-1500 epp.package.java Eclipse Packaging Project
    Eclipse Java Development Tools 3.12.3.v20170301-0400 org.eclipse.jdt.feature.group Eclipse.org
    Eclipse XML Editors and Tools 3.8.2.v201702270442 org.eclipse.wst.xml_ui.feature.feature.group Eclipse Web Tools Platform
    Git integration for Eclipse 4.6.1.201703071140-r org.eclipse.egit.feature.group Eclipse EGit
    Git integration for Eclipse - Task focused interface 4.6.1.201703071140-r org.eclipse.egit.mylyn.feature.group Eclipse EGit
    Java implementation of Git 4.6.1.201703071140-r org.eclipse.jgit.feature.group Eclipse JGit
    m2e - Maven Integration for Eclipse (includes Incubating components) 1.7.0.20160603-1933 org.eclipse.m2e.feature.feature.group Eclipse.org - m2e
    m2e - slf4j over logback logging (Optional) 1.7.0.20160603-1933 org.eclipse.m2e.logback.feature.feature.group Eclipse.org - m2e
    Mylyn Builds Connector: Hudson/Jenkins 1.13.0.v20160806-1446 org.eclipse.mylyn.hudson.feature.group Eclipse Mylyn
    Mylyn Context Connector: Eclipse IDE 3.21.0.v20160912-1820 org.eclipse.mylyn.ide_feature.feature.group Eclipse Mylyn
    Mylyn Context Connector: Java Development 3.21.0.v20160701-1337 org.eclipse.mylyn.java_feature.feature.group Eclipse Mylyn
    Mylyn Task List 3.21.0.v20160914-0252 org.eclipse.mylyn_feature.feature.group Eclipse Mylyn
    Mylyn Task-Focused Interface 3.21.0.v20160815-2336 org.eclipse.mylyn.context_feature.feature.group Eclipse Mylyn
    Mylyn Tasks Connector: Bugzilla 3.21.0.v20160909-1813 org.eclipse.mylyn.bugzilla_feature.feature.group Eclipse Mylyn
    Mylyn Versions Connector: Git 1.13.0.v20160630-2022 org.eclipse.mylyn.git.feature.group Eclipse Mylyn
    Mylyn WikiText 2.10.1.v20161129-1925 org.eclipse.mylyn.wikitext_feature.feature.group Eclipse Mylyn

    Java version: JDK 1.8.0_121

    The following output is observed for the Copycat server.

    22:52:55.318 [test-server] DEBUG i.a.c.server.storage.SegmentManager - Created segment: Segment[id=1, version=1, index=0, length=0]
    22:52:55.321 [test-server] DEBUG i.a.c.server.storage.SegmentManager - Created segment: Segment[id=1, version=1, index=0, length=0]
    22:52:55.329 [test-server] DEBUG i.a.c.server.storage.SegmentManager - Created segment: Segment[id=1, version=1, index=0, length=0]
    22:52:55.334 [test-server] DEBUG i.a.c.server.storage.SegmentManager - Created segment: Segment[id=1, version=1, index=0, length=0]
    22:52:55.338 [test-server] DEBUG i.a.c.server.storage.SegmentManager - Created segment: Segment[id=1, version=1, index=0, length=0]
    22:52:55.343 [test-server] DEBUG i.a.c.server.storage.SegmentManager - Created segment: Segment[id=1, version=1, index=0, length=0]
    22:52:55.345 [test-server] DEBUG i.a.c.server.state.ServerContext - localhost/127.0.0.1:5000 - Set term 2
    22:52:55.346 [test-server] DEBUG i.a.c.server.state.PassiveState - localhost/127.0.0.1:5000 - Rejected AppendRequest[term=1, leader=2130712285, logIndex=2, logTerm=2, entries=[0], commitIndex=0, globalIndex=0]: request term is less than the current term (2)
    22:52:55.348 [test-server] DEBUG i.a.c.server.storage.SegmentManager - Created segment: Segment[id=1, version=1, index=0, length=0]
    22:52:55.351 [test-server] DEBUG i.a.c.server.state.ServerContext - localhost/127.0.0.1:5000 - Set term 1

    Tests run: 760, Failures: 5, Errors: 0, Skipped: 7, Time elapsed: 771.047 sec <<< FAILURE! - in TestSuite
    testDeleteMetaStore(io.atomix.copycat.server.storage.MetaStoreTest) Time elapsed: 0.002 sec <<< FAILURE!
    java.lang.AssertionError: expected [0] but found [1]
        at io.atomix.copycat.server.storage.MetaStoreTest.testDeleteMetaStore(MetaStoreTest.java:115)

    cleanupStorage(io.atomix.copycat.server.storage.MetaStoreTest) Time elapsed: 0.002 sec <<< FAILURE! java.nio.file.FileSystemException: target\test-logs\6054e526-7877-406c-9518-3a341d428deb\test.meta: The process cannot access the file because it is being used by another process.

    at io.atomix.copycat.server.storage.MetaStoreTest.cleanupStorage(MetaStoreTest.java:124)
    

    testDescriptorBuilder(io.atomix.copycat.server.storage.SegmentDescriptorTest) Time elapsed: 0.001 sec <<< FAILURE! java.lang.AssertionError: expected [1491047575354] but found [0] at io.atomix.copycat.server.storage.SegmentDescriptorTest.testDescriptorBuilder(SegmentDescriptorTest.java:59)

    deleteDescriptor(io.atomix.copycat.server.storage.SegmentDescriptorTest) Time elapsed: 0.001 sec <<< FAILURE! java.nio.file.FileSystemException: descriptor.log: The process cannot access the file because it is being used by another process.

    at io.atomix.copycat.server.storage.SegmentDescriptorTest.deleteDescriptor(SegmentDescriptorTest.java:142)
    

    cleanLogDir(io.atomix.copycat.server.storage.FileLogTest) Time elapsed: 0 sec <<< FAILURE! java.nio.file.FileSystemException: target\test-logs\6054e526-7877-406c-9518-3a341d428deb\test.meta: The process cannot access the file because it is being used by another process.

    Results :

    Failed tests: FileLogTest>AbstractLogTest.cleanLogDir:124 » FileSystem target\test-logs\6054... io.atomix.copycat.server.storage.MetaStoreTest.cleanupStorage(io.atomix.copycat.server.storage.MetaStoreTest) Run 1: MetaStoreTest.cleanupStorage:124 » FileSystem target\test-logs\6054e526-7877-4... Run 2: PASS Run 3: PASS

    MetaStoreTest.testDeleteMetaStore:115 expected [0] but found [1] io.atomix.copycat.server.storage.SegmentDescriptorTest.deleteDescriptor(io.atomix.copycat.server.storage.SegmentDescriptorTest) Run 1: SegmentDescriptorTest.deleteDescriptor:142 » FileSystem descriptor.log: The pr... Run 2: PASS Run 3: PASS

    SegmentDescriptorTest.testDescriptorBuilder:59 expected [1491047575354] but found [0]

    Tests run: 756, Failures: 5, Errors: 0, Skipped: 3

    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD FAILURE
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 13:21 min
    [INFO] Finished at: 2017-04-01T22:53:43+11:00
    [INFO] Final Memory: 16M/162M

    opened by nvySub 2
  • Nodes can't communicate over the internet

    Nodes can't communicate over the internet

    I'm trying to deploy a Copycat cluster on Azure and running into issues when the nodes aren't on the same private network.

    If I try to start a server with the public IP of the machine, it fails:

    [copycat-server-/<publicIp>:10101-copycat] netty.NettyServer.listen - Binding to /<publicIp>:10101
    [copycat-server-/<publicIp>:10101-copycat] server.CopycatServer.lambda$start$30 - Failed to start server!
    

    which makes sense, since it's not on any of the interfaces.

    If I try to bind it to 0.0.0.0, it works, but then 0.0.0.0:X gets advertised as the server (member) address in requests and nodes can't send back any responses.

    opened by adagys 1
Owner

Atomix
A polyglot framework for building fault-tolerant distributed systems