Fault tolerance and resilience patterns for the JVM

Overview

Failsafe

Failsafe is a lightweight, zero-dependency library for handling failures in Java 8+, with a concise API for handling everyday use cases and the flexibility to handle everything else. It works by wrapping executable logic with one or more resilience policies, which can be combined and composed as needed. Current policies include Retry, Timeout, Fallback, and CircuitBreaker.

Usage

Visit the Failsafe website.

Contributing

Check out the contributing guidelines.

License

Copyright Jonathan Halterman and friends. Released under the Apache 2.0 license.

Comments
  • RetryPolicy thread safety

    Is it safe to use RetryPolicy from multiple threads?

    I can see it doesn't set its own members, but it does add members to (array)lists of predicates. This is potentially not thread-safe and can break when the same RetryPolicy object is used from multiple threads.

    Is there a standard way to do this? For now I can copy the object when I modify it on specific threads, but perhaps it could be made thread-safe on your end.
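
One way to sidestep the shared-list concern raised above is to treat the policy configuration as immutable, so that adding a predicate returns a new object instead of mutating a list other threads may be reading. The sketch below uses plain JDK types only; `ImmutableRetryConfig` and its methods are hypothetical names for illustration and are not Failsafe's RetryPolicy API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration, not part of Failsafe.
final class ImmutableRetryConfig {
    private final int maxRetries;
    private final List<Predicate<Throwable>> failureConditions;

    private ImmutableRetryConfig(int maxRetries, List<Predicate<Throwable>> failureConditions) {
        this.maxRetries = maxRetries;
        this.failureConditions = Collections.unmodifiableList(failureConditions);
    }

    static ImmutableRetryConfig defaults() {
        return new ImmutableRetryConfig(2, new ArrayList<>());
    }

    // Each "with"/"handle" call copies the state, so existing instances never change.
    ImmutableRetryConfig withMaxRetries(int newMaxRetries) {
        return new ImmutableRetryConfig(newMaxRetries, new ArrayList<>(failureConditions));
    }

    ImmutableRetryConfig handleIf(Predicate<Throwable> condition) {
        List<Predicate<Throwable>> copy = new ArrayList<>(failureConditions);
        copy.add(condition);
        return new ImmutableRetryConfig(maxRetries, copy);
    }

    int maxRetries() {
        return maxRetries;
    }

    // With no conditions configured, every failure is considered retryable.
    boolean shouldRetry(Throwable failure) {
        if (failureConditions.isEmpty()) {
            return true;
        }
        for (Predicate<Throwable> condition : failureConditions) {
            if (condition.test(failure)) {
                return true;
            }
        }
        return false;
    }
}
```

Because every instance is fully built before it is shared and never modified afterwards, a thread can derive its own variant without affecting the base configuration seen by other threads.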

    enhancement 3.0 
    opened by reutsharabani 49
  • How to reset failsafe?

    Hi sir, Could you help me with my requirement? My requirement is to carry out a JDBC operation with retries. I am using the following approach:

    RetryPolicy<Object> retryPolicy = new RetryPolicy<Object>()
            .withDelay(Duration.ofMillis(retryIntervalMillis))
            .withMaxRetries(retryLimit)
            .onFailedAttempt(event -> {
                // the generic type parameter makes the casts to ExecutionAttemptedEvent unnecessary
                LOG.warn("Error encountered while establishing connection or doing the " +
                                "read/write operation {}", event.getLastFailure().getMessage());
            })
            .onRetry(event -> {
                Throwable failure = event.getLastFailure();
                // log the retry as it seems to be a network/connection failure
                LOG.warn("Connection error encountered {}", failure.getMessage(),
                        failure);
                LOG.warn("Retrying {}th time to proceed again with a new connection",
                        event.getAttemptCount());
            })
            .handleIf(failure -> isRetryRequired(failure));
    

    isRetryRequired() checks the exception's message and decides whether to retry or not. I am not showing its body here.

    Next, this is how failsafe is used:

    try {
        Failsafe
                .with(retryPolicy)
                .run(() -> {
                    getJdbcConnection();
                    createStatement();
                    executeQueries();
                });
    } catch (FailsafeException e1) {
        // todo: a checked exception such as SQLException arrives wrapped; see e1.getCause()
    } finally {
        closeConnection();
    }
    
    void executeQueries() {
      for (String query : queryList) {
       // execute the query
      }
    }
    

    My question is this: if any of the methods fails (getJdbcConnection, createStatement, or executeQueries), I want to retry, and that is how I have written the code. However, suppose I have configured three retries and successfully obtain the connection on the second retry. Once the connection succeeds, I want to reset Failsafe so that it can perform three retries again if the connection fails the next time. How is this possible? How can I reset Failsafe? Or what approach do you suggest?
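
One possible approach to the question above is to give each step its own retry budget, so that a successful connection does not consume the attempts available to a later step. The sketch below is a plain-JDK illustration rather than Failsafe code; `Retries.withRetries` is a hypothetical helper, and the failing "connect" step in `main` is simulated.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper, not part of Failsafe.
final class Retries {
    private Retries() {}

    // Runs the task once and retries up to maxRetries more times on failure.
    // The attempt counter is local, so every call starts with a fresh budget.
    static <T> T withRetries(int maxRetries, Supplier<T> task) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must not be negative");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        AtomicInteger connectCalls = new AtomicInteger();
        // Simulated connection that fails twice before succeeding.
        String connection = withRetries(3, () -> {
            if (connectCalls.incrementAndGet() < 3) {
                throw new IllegalStateException("connection refused");
            }
            return "connected";
        });
        // The query step gets its own three retries, independent of the step above.
        String result = withRetries(3, () -> connection + ": query ok");
        System.out.println(result);
    }
}
```

Splitting the work into separately wrapped steps trades a single shared retry count for one count per step, which is the "reset" behaviour the question asks for.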

    opened by Syed-SnapLogic 34
  • Dynamic delay

    This PR is in response to #110. It adds a delay function property to RetryPolicy that is used, if set, to compute the next delay from the previous result or exception.

    There are some awkward bits here:

    • I used net.jodah.failsafe.util.Duration in signature of the delay function, though I really wanted to use java.time.Duration.
    • Combining delay factor other than 1 with delay function would be meaningless, but I've done nothing to make them mutually exclusive.
    • The one included test is pretty crude, passing if the actual delay is within a window of the requested delay.
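
The idea of a delay function that is used, if set, to compute the next delay from the previous result or exception can be sketched with JDK types as follows. This is an illustration under assumptions, not the code from the PR: `DelayPlanner` is a hypothetical name, it uses java.time.Duration, and it falls back to the fixed delay when the function is unset or returns null or a negative value.

```java
import java.time.Duration;
import java.util.function.BiFunction;

// Hypothetical illustration, not Failsafe's RetryPolicy.
final class DelayPlanner<R> {
    private final Duration fixedDelay;
    private final BiFunction<R, Throwable, Duration> delayFunction; // null when unset

    DelayPlanner(Duration fixedDelay, BiFunction<R, Throwable, Duration> delayFunction) {
        this.fixedDelay = fixedDelay;
        this.delayFunction = delayFunction;
    }

    // Prefers the computed delay; otherwise uses the fixed delay.
    Duration nextDelay(R lastResult, Throwable lastFailure) {
        if (delayFunction != null) {
            Duration computed = delayFunction.apply(lastResult, lastFailure);
            if (computed != null && !computed.isNegative()) {
                return computed;
            }
        }
        return fixedDelay;
    }

    public static void main(String[] args) {
        // Here the "result" is a retry-after hint in seconds, or null if none was given.
        DelayPlanner<Integer> planner = new DelayPlanner<>(
                Duration.ofMillis(500),
                (retryAfterSeconds, failure) ->
                        retryAfterSeconds == null ? null : Duration.ofSeconds(retryAfterSeconds));
        System.out.println(planner.nextDelay(3, null));
        System.out.println(planner.nextDelay(null, new RuntimeException("no hint")));
    }
}
```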
    opened by Tembrel 25
  • Support basic Java Executor interface

    Currently you can provide either an ExecutorService or a ScheduledExecutorService, or implement a custom Failsafe Scheduler. It would be good if Failsafe also accepted the java.util.concurrent.Executor interface, since it is a common interface returned by other libraries. My current use case is gRPC: when splitting a context for multi-threading, gRPC methods such as #fixedContextExecutor return a plain Executor.
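
A plain Executor has no notion of delays, so one way to support it is to let a small scheduler thread handle the waiting and hand the actual work to the caller's Executor. The following is a minimal JDK-only sketch of that idea; `DelayedExecutorAdapter` and `runAfter` are hypothetical names and do not describe how Failsafe implements scheduling.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Failsafe.
final class DelayedExecutorAdapter {
    private final Executor executor;
    private final ScheduledExecutorService delayer;

    DelayedExecutorAdapter(Executor executor, ScheduledExecutorService delayer) {
        this.executor = executor;
        this.delayer = delayer;
    }

    // Waits on the delayer thread, then runs the task on the supplied Executor.
    CompletableFuture<Void> runAfter(Runnable task, long delay, TimeUnit unit) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        delayer.schedule(() -> {
            try {
                executor.execute(() -> {
                    try {
                        task.run();
                        done.complete(null);
                    } catch (Throwable t) {
                        done.completeExceptionally(t);
                    }
                });
            } catch (RuntimeException rejected) {
                // The Executor refused the task, for example because it was shut down.
                done.completeExceptionally(rejected);
            }
        }, delay, unit);
        return done;
    }

    public static void main(String[] args) {
        ScheduledExecutorService delayer = Executors.newSingleThreadScheduledExecutor();
        Executor direct = Runnable::run; // stands in for an Executor returned by another library
        new DelayedExecutorAdapter(direct, delayer)
                .runAfter(() -> System.out.println("ran after delay"), 50, TimeUnit.MILLISECONDS)
                .join();
        delayer.shutdown();
    }
}
```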

    enhancement 3.0 
    opened by paulius-p 23
  • Async: How to gracefully shut down?

    Hello,

    I'm trying Failsafe with:

    final ScheduledExecutorService executor = Executors.newScheduledThreadPool(2);
    final RetryPolicy retryPolicy = new RetryPolicy()
            .withDelay(100, TimeUnit.MILLISECONDS)
            .retryOn(RuntimeException.class)
            .retryWhen(false);

    // this task will fail 99% of the time.
    Failsafe
            .with(retryPolicy)
            .with(executor)
            .get((ctx) -> {
                double i = Math.random();
                if (i > 0.99) {
                    return true;
                } else if (i < 0.1) {
                    throw new RuntimeException(" i = " + i);
                }
                return false;
            });
    

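    Then I add a shutdown hook:

    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
        executor.shutdown();
        try {
            // awaitTermination throws the checked InterruptedException
            executor.awaitTermination(1, TimeUnit.HOURS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }));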
    

    Is there a way to do something like:

    FailSafe.awaitAllTaskComplete();
    executor.shutdown();
    executor.awaitTermination(1, TimeUnit.HOURS);
    

    Or do I have to store all the futures that were created and check whether they have all completed?
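
The second option mentioned above, keeping track of the created futures and waiting for all of them before shutting down, can be sketched with JDK classes alone. This assumes the pending work is available as CompletableFuture instances; `GracefulShutdown.awaitAllThenShutdown` is a hypothetical helper, not a Failsafe method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Failsafe.
final class GracefulShutdown {
    private GracefulShutdown() {}

    // Waits for all tracked futures, then shuts the executor down.
    // Returns true only if everything finished within the given timeouts.
    static boolean awaitAllThenShutdown(ExecutorService executor,
                                        List<CompletableFuture<?>> pending,
                                        long timeout, TimeUnit unit) {
        boolean allDone = true;
        try {
            CompletableFuture.allOf(pending.toArray(new CompletableFuture<?>[0])).get(timeout, unit);
        } catch (ExecutionException e) {
            // At least one task failed; all tasks are still complete, so shutdown can proceed.
        } catch (TimeoutException e) {
            allDone = false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            allDone = false;
        }
        executor.shutdown();
        try {
            return executor.awaitTermination(timeout, unit) && allDone;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        List<CompletableFuture<?>> pending = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            int n = i;
            pending.add(CompletableFuture.supplyAsync(() -> n * n, executor));
        }
        System.out.println(awaitAllThenShutdown(executor, pending, 5, TimeUnit.SECONDS));
    }
}
```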

    opened by pgoergler 23
  • Implement Timeout policy

    For async executions we use ForkJoinPool by default, currently with a CompletableFuture used internally. Unfortunately, neither CompletableFuture nor ForkJoinPool supports cancelling tasks with interrupts, so cancellation will only be effective for tasks that are still waiting to run.

    • The Timeout policy should be configurable to support interrupts or not.
    • The Timeout policy should (probably) fail with TimeoutException so that it can be bubbled up to outer policies and easily recognized by them as a failure (similar to CircuitBreakerOpenException).
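
For comparison, a Future returned by a regular ExecutorService does honour cancel(true) by interrupting the running thread, whereas CompletableFuture.cancel ignores its mayInterruptIfRunning argument. The sketch below shows a timeout with a configurable interrupt built on that behaviour. It is a JDK-only illustration, not the proposed policy; `TimeoutRunner` and `TimeoutExceededException` are hypothetical names.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration, not Failsafe's Timeout policy.
final class TimeoutRunner {
    private TimeoutRunner() {}

    static final class TimeoutExceededException extends RuntimeException {
        TimeoutExceededException(Duration timeout) {
            super("Timed out after " + timeout);
        }
    }

    static <T> T callWithTimeout(ExecutorService executor, Callable<T> task,
                                 Duration timeout, boolean interruptOnTimeout) {
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // cancel(true) interrupts the worker thread; cancel(false) only marks the future cancelled.
            future.cancel(interruptOnTimeout);
            throw new TimeoutExceededException(timeout);
        } catch (InterruptedException e) {
            future.cancel(interruptOnTimeout);
            Thread.currentThread().interrupt();
            throw new IllegalStateException("caller was interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            callWithTimeout(executor, () -> {
                Thread.sleep(5_000);
                return "late";
            }, Duration.ofMillis(200), true);
        } catch (TimeoutExceededException e) {
            System.out.println(e.getMessage());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Throwing a dedicated exception type on timeout mirrors the second bullet above: an outer layer can recognise the timeout as a failure without inspecting messages.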
    2.2 
    opened by jhalterman 22
  • Added support to easily create Proxy instances

    It can be beneficial to wrap a whole interface with Failsafe easily and to provide a RetryPolicy and CircuitBreaker for it as a whole.

    By leveraging the JRE's built-in proxy construction, Failsafe can add support for creating proxy instances for interfaces (Byte Buddy and other libraries can provide the ability to create proxy classes for concrete types).
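
The JRE mechanism referred to here is java.lang.reflect.Proxy. As a rough idea of what wrapping a whole interface can look like, the sketch below retries every method call made through the proxy. It does not reproduce the PR's code; `RetryingProxy`, `wrap`, and the `Greeter` interface are hypothetical, and a plain retry loop stands in for a Failsafe policy.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

// Hypothetical example interface.
interface Greeter {
    String greet(String name);
}

// Hypothetical helper, not part of Failsafe.
final class RetryingProxy {
    private RetryingProxy() {}

    // Returns a proxy whose every method call is retried up to maxRetries times.
    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> iface, T target, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must not be negative");
        }
        InvocationHandler handler = (proxy, method, args) -> {
            Throwable last = null;
            for (int attempt = 0; attempt <= maxRetries; attempt++) {
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    // Unwrap the exception thrown by the target method itself.
                    last = e.getCause();
                }
            }
            throw last;
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        Greeter flaky = name -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("temporarily unavailable");
            }
            return "hello " + name;
        };
        Greeter resilient = wrap(Greeter.class, flaky, 3);
        System.out.println(resilient.greet("world") + " after " + calls[0] + " calls");
    }
}
```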

    Fixes #107

    opened by fzakaria 22
  • Restrictive time window for circuit breaker to record failures

    Right now it's possible to configure failure thresholds in terms of consecutive failures:

    • withFailureThreshold(4, 5), four failures out of five consecutive executions
    • withFailureThreshold(5, 10), five failures out of ten consecutive executions
    • etc.

    There is no notion of time right now. What I'm hoping for would be the ability to specify something like:

    • 10 failures within 1 second
    • 50 failures out of 200 executions within a minute

    My idea behind this is to effectively disable the circuit breaker in low-traffic scenarios while letting it kick in if traffic suddenly increases. For low-traffic endpoints (fewer than ~10 requests per second) we found that circuit breakers actually make matters worse, since they tend to stay open for longer once they open. And ultimately, circuit breakers are a means to protect an application from being overloaded, which doesn't really happen with only a few requests.
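
A threshold such as "10 failures within 1 second" can be sketched as a sliding window of failure timestamps. The class below is a hypothetical JDK-only illustration of the request, not an existing Failsafe feature; timestamps are passed in explicitly so the behaviour is easy to follow and test.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration, not part of Failsafe's CircuitBreaker.
final class TimeWindowFailureCounter {
    private final int failureThreshold;
    private final long windowMillis;
    private final Deque<Long> failureTimes = new ArrayDeque<>();

    TimeWindowFailureCounter(int failureThreshold, long windowMillis) {
        if (failureThreshold <= 0 || windowMillis <= 0) {
            throw new IllegalArgumentException("threshold and window must be positive");
        }
        this.failureThreshold = failureThreshold;
        this.windowMillis = windowMillis;
    }

    // Records a failure and reports whether the threshold was reached
    // within the window ending at nowMillis.
    synchronized boolean recordFailure(long nowMillis) {
        failureTimes.addLast(nowMillis);
        long cutoff = nowMillis - windowMillis;
        while (!failureTimes.isEmpty() && failureTimes.peekFirst() <= cutoff) {
            failureTimes.removeFirst(); // too old to count
        }
        return failureTimes.size() >= failureThreshold;
    }

    public static void main(String[] args) {
        TimeWindowFailureCounter counter = new TimeWindowFailureCounter(3, 1_000);
        // Low traffic: failures two seconds apart never accumulate.
        System.out.println(counter.recordFailure(0));
        System.out.println(counter.recordFailure(2_000));
        // Burst: three failures inside one second reach the threshold.
        System.out.println(counter.recordFailure(2_100));
        System.out.println(counter.recordFailure(2_200));
    }
}
```

With this shape, sparse failures age out of the window before they can accumulate, which matches the low-traffic behaviour described above.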

    opened by whiskeysierra 19
  • Improve handling of shutdown ExecutorService

    Hello all,

    first of all: thanks for this neat library, I appreciate it!

    But for my current use case, I ran into a somewhat weird issue. Several experiments later, it seems I have been able to isolate the problem a bit. But from the beginning:

    [Java 16 / Windows / Failsafe 2.4.0]

    I have a task that spawns some other tasks inside a local ExecutorService, so the ExecutorService needs to be shut down at the end. To harden this task, I want to use Failsafe with, among other things, a Timeout. In some cases this Timeout does not work as expected: it gets swallowed, so the task is not interrupted. (The tasks are responsive to interrupts.)

    I have put together a somewhat verbose example to illustrate the issue:

    import java.time.Duration;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.TimeUnit;
    import net.jodah.failsafe.Failsafe;
    import net.jodah.failsafe.Timeout;
    
    public class FailsafeShutdownDemo {
    
        public static void main(String[] args) throws InterruptedException {
            
    /* A */ ExecutorService executorService = Executors.newScheduledThreadPool(4);
    /* B */ // ExecutorService executorService = ForkJoinPool.commonPool();
    
            System.out.println("Start");
    
            Failsafe.with(Timeout.of(Duration.ofSeconds(3))
                                 .withCancel(true))
        /* 1 */     .with(executorService)  // TIMEOUT FAILS (ExecutorService)  -  AS EXPECTED (ForkJoinPool)
        /* 2 */  // .with((ScheduledExecutorService) executorService)  // AS EXPECTED (ExecutorService cast to ScheduledExecutorService)
                    .onComplete(complete -> System.out.println("onComplete  ->  " + (complete.getFailure() == null
                                                                                     ? "No Failure / Timeout didn't work. :-("
                                                                                     : "Failure is " + complete.getFailure() + " as expected. :-)")))
                    .getAsync(() -> {
                        System.out.println("runAsync()");
                        TimeUnit.SECONDS.sleep(5);
                        System.out.println("Hello World!  <- (after Timeout!)");
                        return "Success!";
                    });
    
            // prevent race-condition being a cause of this problem
            TimeUnit.SECONDS.sleep(1);
    
            System.out.println("Shutdown executorService  <- prevents Failsafe from executing the Timeout, "
                               + "IF a DelegatingScheduler is used (depending on method-overloading, see [1]). \n\t"
                               + "IF the ScheduledExecutorService is used directly (choose via cast in [2]), the Timeout works as expected.");
    /* X */ executorService.shutdown();
    
            // prevent daemons from exiting early
            TimeUnit.SECONDS.sleep(5);
    
            System.out.println("EXIT");
        }
    
    }
    
    

    The core is getAsync(CheckedSupplier), where the supplier represents my potentially long-running, heavy-lifting task, which should be interrupted.

    I have set up Failsafe to interrupt the thread after the Timeout, which works as long as I do not shut down the executorService (marked by comment X).

    But when I enable the shutdown (which I need; and since it's not shutdownNow(), I want to call it early to prevent further tasks from being added), things get interesting:

    • If I supply the ScheduledExecutorService to Failsafe as its supertype ExecutorService (see A + 1), the task won't be cancelled.
    • If I supply the same ScheduledExecutorService (A + 2), the task will be cancelled! This led me to the overloaded FailsafeExecutor.with(...) methods, which decide whether a DelegatingScheduler is used.
    • If I use a ForkJoinPool (B + 1), it also works as expected and the task gets cancelled.

    From these observations, it seems to be a problem related to the scheduling of the timeout trigger in the DelegatingScheduler. I don't think it can be a race condition between shutdown and adding the trigger, as I used a sleep to rule that out. And shutdown() does not interrupt any existing tasks, so this should be fine as well (and it is when no DelegatingScheduler is involved).

    Did I miss something? I hope the example makes it clear. In any case, it seems dangerous to get such different results just from different executors or different method overloads for the same (!) scheduler.

    opened by brainbytes42 18
  • Null policy

    Is there a way to get null/noop version of FailsafeExecutor and perhaps even of the individual policies? I would like to add a requirement to pass FailsafeExecutor to some APIs, but I have to make sure this feature can be easily disabled by submitting some kind of null/noop FailsafeExecutor that just executes everything as if the code was executed directly.

    I see that Failsafe.with() will throw IllegalArgumentException if I give it an empty list of policies. I could probably find a workaround, for example by calling .handleIf(e -> false) on RetryPolicy. But is there a clean, concise solution with minimal overhead?

    opened by robertvazan 18
  • Fallback success and failure policy listeners

    Hi,

    I've a question on using policy listeners with Fallback policy. I understand that onSuccess() is executed when the fallback is executed successfully.

    However, I'm observing something I didn't quite expect. For example, with the Fallback policy below, configured to execute on a null result, I would not expect "Got from fallback" to be printed, because the main call returns non-null and so the fallback logic itself should not be executed.

            Fallback<String> fallback = Fallback.of("hello")
                    .handleResult(null)
                    .onSuccess(e -> System.out.println("Got from fallback"))
                    .onFailure(e -> System.out.println("Failed to get from fallback"));
    
            String result = Failsafe.with(fallback)
                    .get(() -> "world");
    
            System.out.println("Result is " + result);
    

    But I get the below output -

    Got from fallback
    Result is world
    

    Why did the onSuccess() listener get executed?

    And if I change the main call to return null, then the onFailure() listener is getting executed even though the fallback executes successfully and returns the fallback value.

            Fallback<String> fallback = Fallback.of("hello")
                    .handleResult(null)
                    .onSuccess(e -> System.out.println("Got from fallback"))
                    .onFailure(e -> System.out.println("Failed to get from fallback"));
    
            String result = Failsafe.with(fallback)
                    .get(() -> null);
    
            System.out.println("Result is " + result);
    

    Output -

    Failed to get from fallback
    Result is hello
    

    Perhaps my understanding is incorrect or I'm being daft. 😅

    bug 
    opened by sanoopps 17
  • FYI: Very compact "lean" version of DelegatingScheduler

    This is the continuation of https://github.com/failsafe-lib/failsafe/issues/349

    Here: https://github.com/magicprinc/failsafe/commits/leap_of_faith

    Final memory balance:

    • -1 fat object: CompletableFuture
    • -1 lambda Callable in DelegatingScheduler.schedule
    • -1 Callable-Runnable wrapper in delayer().schedule (Runnables are wrapped as Callables in the FutureTask constructor)
    • +1 very lean object: ScheduledCompletableFuture implements ScheduledFuture, Callable (not a CompletableFuture anymore)

    I am sure this is the final step and one can't optimize this class further. Not a single unused byte in memory!

    opened by magicprinc 1
  • Feature: micrometer.io metrics integration

    If you are looking for new ideas: https://micrometer.io/ Metrics would be great!

    It is the new SLF4J for metrics, and everyone I know uses it as the de facto standard.

    If you need something for an inspiration: https://github.com/brettwooldridge/HikariCP/tree/dev/src/main/java/com/zaxxer/hikari/metrics/micrometer

    https://github.com/micrometer-metrics/micrometer/blob/main/micrometer-core/src/main/java/io/micrometer/core/instrument/binder/cache/CaffeineCacheMetrics.java

    https://github.com/micrometer-metrics/micrometer/blob/main/micrometer-core/src/main/java/io/micrometer/core/instrument/binder/okhttp3/OkHttpMetricsEventListener.java

    opened by magicprinc 3
  • FailsafeCall micro refactoring, plus World-Wide Nr.1 duplicated utility method for OkHttp

    As you can see here: https://github.com/magicprinc/failsafe/commit/c517e3ef01aec35cd6b6aaa23779873f8e89ffab

    FailsafeCall micro refactoring:

    1. AtomicBoolean fields are final

    2. lambda expression instead of code block

    3. World-Wide Nr.1 duplicated utility method for OkHttp:

    /** [OkHttp Callback to JDK CompletableFuture]
      Helps eliminate dozens of utility classes World-wide with exactly this same method.
      Can be the first small step towards FailSafe.
      Returns normal JDK {@link CompletableFuture} without FailSafe policies. */
    public static CompletableFuture asPromise (okhttp3.Call call)

    All around the World, people write this method again and again. I have done it too. We really need "The Chosen One". I recommend you to be this one :-)

    If you like it, I will send it as PR.

    opened by magicprinc 4
  • DelegatingScheduler singletons in modern style

    DelegatingScheduler uses the old double-checked locking singleton idiom, with volatile fields and synchronized blocks. The Bill Pugh singleton implementation (the initialization-on-demand holder idiom) is better, shorter, and in some cases uses less memory. In addition, the fields become "static final", so the JVM can apply further optimizations.
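
The holder idiom mentioned here relies on the JVM's lazy, thread-safe class initialization instead of explicit locking. A minimal sketch follows; `SharedDelayer` is a hypothetical stand-in for illustration and not the actual DelegatingScheduler code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

// Hypothetical stand-in, not the actual DelegatingScheduler.
final class SharedDelayer {
    private SharedDelayer() {}

    // The nested class is initialized on first access to get(), and class
    // initialization is thread-safe, so no volatile or synchronized is needed.
    private static final class Holder {
        static final ScheduledExecutorService INSTANCE =
                Executors.newSingleThreadScheduledExecutor(runnable -> {
                    Thread thread = new Thread(runnable, "shared-delayer");
                    thread.setDaemon(true); // do not keep the JVM alive
                    return thread;
                });
    }

    static ScheduledExecutorService get() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        System.out.println(get() == get());
    }
}
```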

    opened by magicprinc 9
  • Support accrual failure detection

    As Failsafe already supports policies that are useful for networked operations, it would make sense to support phi accrual (or other accrual algorithms) failure detection for situations where fixed timeouts don't adequately account for changing load conditions.

    This could be implemented as a new policy which measures execution times over a number of executions, to determine if some threshold is crossed which represents a failure. Phi accrual could be one strategy supported by the policy, but there could be others. When the threshold is crossed, a fallback-like function could be called, for example, to fail over a system from one node that has failed to another. In that sense, the policy would be like a time-based fallback (rather than result based), except unlike a fallback it would be stateful.
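
To make the idea of measuring execution times against a threshold more concrete, here is a simplified accrual-style sketch. It assumes execution times follow an exponential distribution with the observed mean, which is a simplification chosen to keep the arithmetic short (the original phi accrual formulation uses a normal distribution). Under that assumption, phi = -log10(P(duration > elapsed)) = elapsed / (mean × ln 10). `PhiEstimator` and its methods are hypothetical names, not a proposed Failsafe API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration of an accrual-style detector.
final class PhiEstimator {
    private final int maxSamples;
    private final Deque<Double> samples = new ArrayDeque<>();
    private double sum;

    PhiEstimator(int maxSamples) {
        if (maxSamples <= 0) {
            throw new IllegalArgumentException("maxSamples must be positive");
        }
        this.maxSamples = maxSamples;
    }

    // Records one observed execution time, keeping only the newest maxSamples.
    synchronized void record(double durationMillis) {
        samples.addLast(durationMillis);
        sum += durationMillis;
        if (samples.size() > maxSamples) {
            sum -= samples.removeFirst();
        }
    }

    // Suspicion level for an execution that has been running for elapsedMillis.
    synchronized double phi(double elapsedMillis) {
        if (samples.isEmpty()) {
            return 0.0; // nothing observed yet, so no basis for suspicion
        }
        double mean = sum / samples.size();
        if (mean <= 0.0) {
            return elapsedMillis > 0.0 ? Double.POSITIVE_INFINITY : 0.0;
        }
        return elapsedMillis / (mean * Math.log(10.0));
    }

    boolean isSuspect(double elapsedMillis, double threshold) {
        return phi(elapsedMillis) >= threshold;
    }

    public static void main(String[] args) {
        PhiEstimator estimator = new PhiEstimator(100);
        estimator.record(90);
        estimator.record(100);
        estimator.record(110);
        System.out.println(estimator.phi(150));
        System.out.println(estimator.isSuspect(1_000, 3.0));
    }
}
```

A phi of 1 corresponds to roughly a 10% chance, under the model, that a healthy execution would take at least this long; a phi of 3 corresponds to roughly 0.1%. The threshold therefore adapts as the recorded mean shifts with load.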

    Alternatively, this could be implemented as a Timeout option, where the timeout is stateful and adapts to execution time distributions.

    One open question for this policy is, similar to a circuit breaker or rate limiter, at what point should it "reset" after triggering a failure, or should it even reset?

    Any ideas for how this should work or what the policy should be named are welcome!

    enhancement new-policy 
    opened by jhalterman 4
Owner
Jonathan Halterman