Fault tolerance and resilience patterns for the JVM

Overview

Failsafe


Failsafe is a lightweight, zero-dependency library for handling failures in Java 8+, with a concise API for handling everyday use cases and the flexibility to handle everything else. It works by wrapping executable logic with one or more resilience policies, which can be combined and composed as needed. Current policies include Retry, CircuitBreaker, RateLimiter, Timeout, Bulkhead, and Fallback.

Usage

Visit failsafe.dev for usage info, docs, and additional resources.

Contributing

Check out the contributing guidelines.

License

Copyright Jonathan Halterman and friends. Released under the Apache 2.0 license.

Comments
  • RetryPolicy thread safety

    Is it safe to use RetryPolicy from multiple threads?

    I can see it doesn't set its own members, but it does add members to (array)lists of predicates. This is potentially not thread-safe and can break when the same RetryPolicy object is used from multiple threads.

    Is there a standard way to do this? For now I can copy the object when I modify it on specific threads, but perhaps making it thread-safe is possible on your end.

    enhancement 3.0 
    opened by reutsharabani 49
  • How to reset failsafe?

    Hi sir, Could you help me with my requirement? My requirement is to carry out a JDBC operation with retries. I am using the following approach:

    RetryPolicy<Object> retryPolicy = new RetryPolicy<Object>()
            .withDelay(Duration.ofMillis(retryIntervalMillis))
            .withMaxRetries(retryLimit)
            .onFailedAttempt(event ->
                LOG.warn("Error encountered while establishing connection or doing the " +
                                "read/write operation {}", event.getLastFailure().getMessage()))
            .onRetry(event -> {
                Throwable failure = event.getLastFailure();
                // log the retry as it seems to be a network/connection failure
                LOG.warn("Connection error encountered {}", failure.getMessage(),
                        failure);
                LOG.warn("Retrying {}th time to proceed again with a new connection",
                        event.getAttemptCount());
            })
            .handleIf(failure -> isRetryRequired(failure));
    

    isRetryRequired() checks the exception's message and decides whether to retry or not. I am not showing its body here.

    Next, this is how failsafe is used:

    try {
        Failsafe
                .with(retryPolicy)
                .run(() -> {
                    getJdbcConnection();
                    createStatement();
                    executeQueries();
                });
    } catch (SQLException e1) {
        // todo
    } finally {
        closeConnection();
    }
    
    void executeQueries() {
      for (String query : queryList) {
       // execute the query
      }
    }
    

    My question is: if any of the methods fails (getJdbcConnection, createStatement or executeQueries), I want to retry, and that's how I have written the code. However, suppose I had configured three retries and successfully obtained the connection on the second retry. Once the connection is successful, I want to reset Failsafe so that it can do three retries again if the connection fails next time. How is this possible? How can I reset Failsafe? Or what approach do you suggest?
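One way to read this requirement is that each logical operation should get its own retry budget. The sketch below illustrates that idea in plain Java rather than with Failsafe's API; `RetryBudgetSketch` and `withRetries` are hypothetical names. Because the attempt counter is a local variable, every call starts counting from zero again, so nothing needs to be reset.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: every call to withRetries() starts with a fresh attempt counter. */
final class RetryBudgetSketch {

    /** Runs the task once, then retries up to maxRetries times if it keeps failing. */
    static <T> T withRetries(int maxRetries, Callable<T> task) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        RuntimeException last = null;
        // The counter is local to this call, so the budget is per operation.
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = new RuntimeException("attempt " + (attempt + 1) + " failed", e);
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        Callable<String> flaky = () -> {
            // Fails on the first two calls of each round, then succeeds.
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("connection lost");
            }
            return "connected";
        };

        System.out.println(withRetries(3, flaky));
        calls.set(0); // simulate the connection failing again later
        System.out.println(withRetries(3, flaky)); // a full budget of 3 retries is available again
    }
}
```

In this shape, the connection setup and each batch of queries could be wrapped in separate `withRetries` calls if they should not share a budget.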

    opened by Syed-SnapLogic 34
  • Dynamic delay

    This PR is in response to #110. It adds a delay function property to RetryPolicy that is used, if set, to compute the next delay from the previous result or exception.

    There are some awkward bits here:

    • I used net.jodah.failsafe.util.Duration in signature of the delay function, though I really wanted to use java.time.Duration.
    • Combining delay factor other than 1 with delay function would be meaningless, but I've done nothing to make them mutually exclusive.
    • The one included test is pretty crude, passing if the actual delay is within a window of the requested delay.
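The idea of a delay function can be sketched in plain Java as follows. This is not the code from the PR: the names `DynamicDelaySketch`, `nextDelay` and `RetryAfterException` are hypothetical, and `java.time.Duration` is used only for illustration. The function sees the previous result and failure and may return a delay; when it returns nothing usable, the fixed delay applies.

```java
import java.time.Duration;
import java.util.function.BiFunction;

final class DynamicDelaySketch {

    /** Hypothetical failure type that carries a server-suggested wait time. */
    static final class RetryAfterException extends RuntimeException {
        final Duration retryAfter;

        RetryAfterException(Duration retryAfter) {
            super("retry after " + retryAfter);
            this.retryAfter = retryAfter;
        }
    }

    /** Picks the next delay: the function's answer if it gives one, otherwise the fixed delay. */
    static <R> Duration nextDelay(Duration fixedDelay,
                                  BiFunction<R, Throwable, Duration> delayFn,
                                  R lastResult,
                                  Throwable lastFailure) {
        Duration computed = delayFn == null ? null : delayFn.apply(lastResult, lastFailure);
        return (computed == null || computed.isNegative()) ? fixedDelay : computed;
    }

    public static void main(String[] args) {
        // Use the hint carried by the failure when there is one.
        BiFunction<String, Throwable, Duration> fn = (result, failure) ->
                failure instanceof RetryAfterException
                        ? ((RetryAfterException) failure).retryAfter
                        : null;

        Duration fixed = Duration.ofMillis(100);
        System.out.println(nextDelay(fixed, fn, null, new RetryAfterException(Duration.ofSeconds(2))));
        System.out.println(nextDelay(fixed, fn, null, new IllegalStateException("other failure")));
    }
}
```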
    opened by Tembrel 25
  • Support basic Java Executor interface

    Currently you can provide either ExecutorService, ScheduledExecutorService or implement custom Failsafe Scheduler. It would be good if it would also accept java.util.concurrent.Executor interface since it is a common interface returned by other libs. My current use case is in gRPC, when splitting context for multi-threading, and gRPC methods #fixedContextExecutor return base Executor.
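One way a plain Executor could be accommodated, sketched in plain Java: a separate timer handles only the delay and then hands the actual work to the caller's Executor. `ExecutorDelaySketch` and `supplyAfter` are hypothetical names for illustration and are not part of Failsafe's API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class ExecutorDelaySketch {

    /** Waits on the timer thread, then runs the task on the supplied Executor. */
    static <T> CompletableFuture<T> supplyAfter(long delay, TimeUnit unit,
                                                Executor executor,
                                                ScheduledExecutorService timer,
                                                Supplier<T> task) {
        CompletableFuture<T> result = new CompletableFuture<>();
        timer.schedule(() -> {
            try {
                // The timer thread only dispatches; the work runs on the caller's Executor.
                executor.execute(() -> {
                    try {
                        result.complete(task.get());
                    } catch (Throwable t) {
                        result.completeExceptionally(t);
                    }
                });
            } catch (RuntimeException rejected) {
                // For example, the Executor refused the task.
                result.completeExceptionally(rejected);
            }
        }, delay, unit);
        return result;
    }

    public static void main(String[] args) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        Executor direct = Runnable::run; // stand-in for an Executor obtained from another library
        try {
            String value = supplyAfter(50, TimeUnit.MILLISECONDS, direct, timer, () -> "done").join();
            System.out.println(value);
        } finally {
            timer.shutdown();
        }
    }
}
```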

    enhancement 3.0 
    opened by paulius-p 23
  • Async: How to gracefully shut down?

    Hello,

    I'm trying Failsafe with:

    final ScheduledExecutorService executor = Executors.newScheduledThreadPool(2);
    final RetryPolicy retryPolicy = new RetryPolicy()
                    .withDelay(100, TimeUnit.MILLISECONDS)
                    .retryOn(RuntimeException.class)
                    .retryWhen(false)
                    ;
    
    // this task will fail 99% of time.
    Failsafe
                    .with(retryPolicy)
                    .with(executor)
                    .get((ctx) -> {
                            double i = Math.random();
                            if( i > 0.99 ) {
                                return true;
                            } else if (i < 0.1 ) {
                                throw new RuntimeException(" i = " + i);
                            }
                           return false;
                        })
            ;
    

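    Then I add a shutdown hook:

    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
        executor.shutdown();
        try {
            executor.awaitTermination(1, TimeUnit.HOURS);
        } catch (InterruptedException e) { // checked exception, must be handled inside the lambda
            Thread.currentThread().interrupt();
        }
    }));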
    

    Is there a way to do something like:

    FailSafe.awaitAllTaskComplete();
    executor.shutdown();
    executor.awaitTermination(1, TimeUnit.HOURS);
    

    Or do I have to store all the futures that were created and check whether they have all completed?
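The second option, tracking the futures yourself, can be sketched in plain Java as below. This assumes the async calls hand back `CompletableFuture` instances; `TrackedTasksSketch` and its methods are hypothetical names, not a Failsafe feature.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

final class TrackedTasksSketch {
    private final List<CompletableFuture<?>> inFlight = new CopyOnWriteArrayList<>();
    private final ExecutorService executor;

    TrackedTasksSketch(ExecutorService executor) {
        this.executor = executor;
    }

    /** Submits a task and remembers its future until it completes. */
    <T> CompletableFuture<T> submit(Supplier<T> task) {
        CompletableFuture<T> future = CompletableFuture.supplyAsync(task, executor);
        inFlight.add(future);
        future.whenComplete((result, error) -> inFlight.remove(future));
        return future;
    }

    /** Waits for the tracked tasks, then shuts the executor down. Returns true if all finished in time. */
    boolean awaitAllThenShutdown(long timeout, TimeUnit unit) {
        CompletableFuture<Void> all =
                CompletableFuture.allOf(inFlight.toArray(new CompletableFuture<?>[0]));
        boolean finished;
        try {
            all.get(timeout, unit);
            finished = true;
        } catch (ExecutionException e) {
            finished = true; // every task completed, at least one exceptionally
        } catch (TimeoutException e) {
            finished = false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            finished = false;
        }
        executor.shutdown();
        return finished;
    }

    public static void main(String[] args) {
        TrackedTasksSketch tasks = new TrackedTasksSketch(Executors.newFixedThreadPool(2));
        CompletableFuture<Integer> answer = tasks.submit(() -> 21 * 2);
        boolean finished = tasks.awaitAllThenShutdown(5, TimeUnit.SECONDS);
        System.out.println(finished + " " + answer.join());
    }
}
```

A shutdown hook could then call `awaitAllThenShutdown` instead of shutting the executor down directly.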

    opened by pgoergler 23
  • Implement Timeout policy

    For async, we use ForkJoinPool by default, currently with a CompletableFuture being used internally. Unfortunately, neither CompletableFuture nor ForkJoinPool supports cancellation with interrupts for its tasks, so cancellation will only be effective for tasks waiting to be run.

    • The Timeout policy should be configurable to support interrupts or not.
    • The Timeout policy should (probably) fail with TimeoutException so that it can be bubbled up to outer policies and easily recognized by them as a failure (similar to CircuitBreakerOpenException).
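The CompletableFuture limitation described above can be observed with a small plain-Java check. `CancelWithoutInterruptSketch` is a hypothetical name; the behavior itself follows from the JDK documentation, which states that the `mayInterruptIfRunning` argument of `CompletableFuture.cancel` has no effect.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class CancelWithoutInterruptSketch {

    /** Returns true if the already-running task saw an interrupt after cancel(true). */
    static boolean taskWasInterrupted() {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);

        CompletableFuture<Void> future = CompletableFuture.runAsync(() -> {
            started.countDown();
            try {
                Thread.sleep(300); // would throw if the worker thread were interrupted
            } catch (InterruptedException e) {
                interrupted.set(true);
            } finally {
                finished.countDown();
            }
        });

        try {
            started.await();                      // make sure the task is really running
            future.cancel(true);                  // marks the future cancelled, nothing more
            finished.await(5, TimeUnit.SECONDS);  // the task still runs to the end of its sleep
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return interrupted.get();
    }

    public static void main(String[] args) {
        System.out.println("interrupted: " + taskWasInterrupted());
    }
}
```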
    2.2 
    opened by jhalterman 22
  • Added support to easily create Proxy instances

    It can be beneficial to easily wrap a whole interface with Failsafe and apply a RetryPolicy and CircuitBreaker to it as a whole.

    By leveraging JRE's built-in proxy construction, Failsafe can add support for creating proxy instances for interfaces (byteBuddy and other libraries can provide ability to create proxy classes of concrete types).
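As an illustration of the JDK mechanism involved (not the code from this PR), the sketch below wraps an interface with `java.lang.reflect.Proxy` and applies a simple hand-written retry loop to every method call. `RetryingProxySketch`, `withRetries` and `Greeter` are hypothetical names; a real integration would delegate to Failsafe policies instead of the loop.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;

final class RetryingProxySketch {

    /** Hypothetical service interface used for the demonstration. */
    interface Greeter {
        String greet(String name);
    }

    /** Returns a proxy that retries every interface method up to maxRetries times. */
    @SuppressWarnings("unchecked")
    static <T> T withRetries(Class<T> iface, T target, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                (proxy, method, args) -> {
                    Throwable last = null;
                    for (int attempt = 0; attempt <= maxRetries; attempt++) {
                        try {
                            return method.invoke(target, args);
                        } catch (InvocationTargetException e) {
                            last = e.getCause(); // unwrap the exception thrown by the target
                        }
                    }
                    throw last;
                });
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        Greeter flaky = name -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("temporary failure");
            }
            return "hello " + name;
        };

        Greeter resilient = withRetries(Greeter.class, flaky, 3);
        System.out.println(resilient.greet("world") + " after " + calls.get() + " calls");
    }
}
```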

    Fixes #107

    opened by fzakaria 22
  • Restrictive time window for circuit breaker to record failures

    Right now it's possible to configure failure thresholds in terms of consecutive failures:

    • withFailureThreshold(4, 5), four failures out of five consecutive executions
    • withFailureThreshold(5, 10), five failures out of ten consecutive executions
    • etc.

    There is no notion of time right now. What I'm hoping for would be the ability to specify something like:

    • 10 failures within 1 second
    • 50 failures out of 200 executions within a minute

    My idea behind this is to effectively disable the circuit breaker in low-traffic scenarios while letting it kick in if traffic suddenly increases. For endpoints with low traffic (fewer than ~10 requests per second) we found that circuit breakers actually make matters worse, since they tend to stay open longer once they open. And ultimately, circuit breakers are a means of protecting an application from overload, which doesn't really happen with few requests.
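The "N failures within a time window" rule can be sketched in plain Java with a queue of failure timestamps. `TimeWindowFailureCounterSketch` is a hypothetical name and this is only an illustration of the requested behavior, not how Failsafe implements its thresholds. With sparse traffic, old failures fall out of the window before the threshold is reached, so the breaker would never trip.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class TimeWindowFailureCounterSketch {
    private final int failureThreshold;
    private final long windowMillis;
    private final Deque<Long> failureTimes = new ArrayDeque<>();

    TimeWindowFailureCounterSketch(int failureThreshold, long windowMillis) {
        if (failureThreshold <= 0 || windowMillis <= 0) {
            throw new IllegalArgumentException("threshold and window must be positive");
        }
        this.failureThreshold = failureThreshold;
        this.windowMillis = windowMillis;
    }

    /** Records a failure at nowMillis and reports whether the threshold is reached within the window. */
    synchronized boolean recordFailure(long nowMillis) {
        failureTimes.addLast(nowMillis);
        // Drop failures that are older than the window.
        while (!failureTimes.isEmpty() && nowMillis - failureTimes.peekFirst() >= windowMillis) {
            failureTimes.removeFirst();
        }
        return failureTimes.size() >= failureThreshold;
    }

    public static void main(String[] args) {
        // "3 failures within 1 second"
        TimeWindowFailureCounterSketch sparse = new TimeWindowFailureCounterSketch(3, 1_000);
        System.out.println(sparse.recordFailure(0));
        System.out.println(sparse.recordFailure(5_000));
        System.out.println(sparse.recordFailure(10_000)); // failures too far apart to trip

        TimeWindowFailureCounterSketch burst = new TimeWindowFailureCounterSketch(3, 1_000);
        burst.recordFailure(0);
        burst.recordFailure(100);
        System.out.println(burst.recordFailure(200)); // three failures inside one window
    }
}
```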

    opened by whiskeysierra 19
  • Improve handling of shutdown ExecutorService

    Hello all,

    first of all: thanks for this neat library, I appreciate it!

    But for my current usecase, I ran into a somewhat weird issue... Several experiments later, it seems I might have been able to isolate the problem a bit... But from the beginning:

    [Java 16 / Windows / Failsafe 2.4.0]

    I have a task, which will spawn some other tasks inside a local ExecutorService. Therefore, the ExecutorService needs to be shut down at the end. To harden this task, I want to use Failsafe, containing amongst other things a Timeout. But this Timeout does not work as expected in some cases, but gets swallowed so that the Task is not interrupted. (The tasks are responsive to interrupt.)

    I have put together a somewhat verbose example to illustrate the issue:

    import java.time.Duration;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.TimeUnit;
    import net.jodah.failsafe.Failsafe;
    import net.jodah.failsafe.Timeout;
    
    public class FailsafeShutdownDemo {
    
        public static void main(String[] args) throws InterruptedException {
            
    /* A */ ExecutorService executorService = Executors.newScheduledThreadPool(4);
    /* B */ // ExecutorService executorService = ForkJoinPool.commonPool();
    
            System.out.println("Start");
    
            Failsafe.with(Timeout.of(Duration.ofSeconds(3))
                                 .withCancel(true))
        /* 1 */     .with(executorService)  // TIMEOUT FAILS (ExecutorService)  -  AS EXPECTED (ForkJoinPool)
        /* 2 */  // .with((ScheduledExecutorService) executorService)  // AS EXPECTED (ExecutorService cast to ScheduledExecutorService)
                    .onComplete(complete -> System.out.println("onComplete  ->  " + (complete.getFailure() == null
                                                                                     ? "No Failure / Timeout didn't work. :-("
                                                                                     : "Failure is " + complete.getFailure() + " as expected. :-)")))
                    .getAsync(() -> {
                        System.out.println("runAsync()");
                        TimeUnit.SECONDS.sleep(5);
                        System.out.println("Hello World!  <- (after Timeout!)");
                        return "Success!";
                    });
    
            // prevent race-condition being a cause of this problem
            TimeUnit.SECONDS.sleep(1);
    
            System.out.println("Shutdown executorService  <- prevents Failsafe from executing the Timeout, "
                               + "IF a DelegatingScheduler is used (depending on method-overloading, see [1]). \n\t"
                               + "IF the ScheduledExecutorService is used directly (choose via cast in [2]), the Timeout works as expected.");
    /* X */ executorService.shutdown();
    
            // prevent daemons from exiting early
            TimeUnit.SECONDS.sleep(5);
    
            System.out.println("EXIT");
        }
    
    }
    
    

    The core is getAsync(CheckedSupplier), where the supplier represents my potentially long-running, heavy-lifting task, which should be interruptible.

    I have set up Failsafe to interrupt the thread after the timeout, which works as long as I do not shut down the executorService (marked by comment X).

    But when I enable the shutdown (which I need, and since it is shutdown() rather than shutdownNow(), I want to call it early to prevent further tasks from being added), things get interesting:

    • If I supply the ScheduledExecutorService to Failsafe typed as its supertype ExecutorService (see A + 1), the task won't be cancelled.
    • If I supply the same object typed as ScheduledExecutorService (A + 2), the task will be cancelled! This led me to the overloaded methods FailsafeExecutor.with(.), which decide whether a DelegatingScheduler is used.
    • If I use a ForkJoinPool (B + 1), it also works as expected and the task gets cancelled.

    From these observations, it seems to be a Problem somewhat related to the scheduling of the timeout-trigger in the DelegatingScheduler...? I think it cannot be a race-condition between shutdown and adding the trigger, as I used a sleep for that as well. And shutdown() does not interrupt any existing tasks, so this should be fine as well (and it is for non-DelegatingScheduler)...

    Did I miss something? I hope the example makes it clear. In any case, it seems dangerous to get those different results, just by different executors or different method-overloads for the same (!) scheduler...

    opened by brainbytes42 18
  • Null policy

    Is there a way to get null/noop version of FailsafeExecutor and perhaps even of the individual policies? I would like to add a requirement to pass FailsafeExecutor to some APIs, but I have to make sure this feature can be easily disabled by submitting some kind of null/noop FailsafeExecutor that just executes everything as if the code was executed directly.

    I see that Failsafe.with() will throw IllegalArgumentException if I give it an empty list of policies. I could probably find a workaround, for example by calling .handleIf(e -> false) on RetryPolicy. But is there a clean, concise solution with minimal overhead?

    opened by robertvazan 18
  • Fallback success and failure policy listeners

    Hi,

    I've a question on using policy listeners with Fallback policy. I understand that onSuccess() is executed when the fallback is executed successfully.

    However, I'm observing something I didn't quite expect. For example, with the Fallback policy below, configured to execute on a null result, I would not expect "Got from fallback" to be printed, because the main call returns a non-null value and so the fallback logic itself should not be executed.

            Fallback<String> fallback = Fallback.of("hello")
                    .handleResult(null)
                    .onSuccess(e -> System.out.println("Got from fallback"))
                    .onFailure(e -> System.out.println("Failed to get from fallback"));
    
            String result = Failsafe.with(fallback)
                    .get(() -> "world");
    
            System.out.println("Result is " + result);
    

    But I get the following output:

    Got from fallback
    Result is world
    

    Why did the onSuccess() listener get executed?

    And if I change the main call to return null, then the onFailure() listener is getting executed even though the fallback executes successfully and returns the fallback value.

            Fallback<String> fallback = Fallback.of("hello")
                    .handleResult(null)
                    .onSuccess(e -> System.out.println("Got from fallback"))
                    .onFailure(e -> System.out.println("Failed to get from fallback"));
    
            String result = Failsafe.with(fallback)
                    .get(() -> null);
    
            System.out.println("Result is " + result);
    

    Output -

    Failed to get from fallback
    Result is hello
    

    Perhaps my understanding is incorrect or I'm being daft. 😅

    bug 
    opened by sanoopps 17
  • FYI: Very compact "lean" version of DelegatingScheduler

    This is the continuation of https://github.com/failsafe-lib/failsafe/issues/349

    Here: https://github.com/magicprinc/failsafe/commits/leap_of_faith

    Final memory balance:

    • -1 fat object CompletableFuture
    • -1 lambda Callable in DelegatingScheduler.schedule
    • -1 Callable-Runnable wrapper in delayer().schedule (Runnables are wrapped as Callables in the FutureTask constructor)
    • +1 very lean object ScheduledCompletableFuture implements ScheduledFuture, Callable (not a CompletableFuture anymore)

    I am sure this is the final step and one can't optimize this class further. Not a single unused byte in memory!

    opened by magicprinc 1
  • Feature: micrometer.io metrics integration

    If you are looking for new ideas: https://micrometer.io/ Metrics would be great!

    It is the new SLF4J for metrics, and everyone I know uses it as the de facto standard.

    If you need something for an inspiration: https://github.com/brettwooldridge/HikariCP/tree/dev/src/main/java/com/zaxxer/hikari/metrics/micrometer

    https://github.com/micrometer-metrics/micrometer/blob/main/micrometer-core/src/main/java/io/micrometer/core/instrument/binder/cache/CaffeineCacheMetrics.java

    https://github.com/micrometer-metrics/micrometer/blob/main/micrometer-core/src/main/java/io/micrometer/core/instrument/binder/okhttp3/OkHttpMetricsEventListener.java

    opened by magicprinc 3
  • FailsafeCall micro refactoring, plus World-Wide Nr.1 duplicated utility method for OkHttp

    As you can see here: https://github.com/magicprinc/failsafe/commit/c517e3ef01aec35cd6b6aaa23779873f8e89ffab

    FailsafeCall micro refactoring:

    1. AtomicBoolean fields are final

    2. lambda expression instead of code block

    3. World-Wide Nr.1 duplicated utility method for OkHttp:

        /**
         * [OkHttp Callback to JDK CompletableFuture]
         * Helps eliminate dozens of utility classes world-wide with exactly this same method.
         * Can be the first small step towards FailSafe.
         * Returns normal JDK {@link CompletableFuture} without FailSafe policies.
         */
        public static CompletableFuture asPromise (okhttp3.Call call)

    All around the World, people write this method again and again. I have done it too. We really need "The Chosen One". I recommend you to be this one :-)

    If you like it, I will send it as PR.

    opened by magicprinc 4
  • DelegatingScheduler singletons in modern style

    DelegatingScheduler uses an old singleton idiom: double-checked locking with volatile and synchronized. The Bill Pugh singleton implementation (a static holder class) is better, shorter and, in some cases, uses less memory. In addition, the fields become "static final", so the JVM can apply further optimizations.
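For reference, the holder idiom looks roughly like this in plain Java. `HolderIdiomSketch`, `DelayerHolder` and `delayer()` are hypothetical names for illustration and this is not the actual DelegatingScheduler code. The JVM initializes the nested class lazily and thread-safely on first access, so no volatile field or synchronized block is needed.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

final class HolderIdiomSketch {

    private HolderIdiomSketch() {
    }

    /** Not loaded until delayer() is first called; class initialization is thread-safe. */
    private static final class DelayerHolder {
        static final ScheduledExecutorService DELAYER =
                Executors.newSingleThreadScheduledExecutor(runnable -> {
                    Thread thread = new Thread(runnable, "sketch-delayer");
                    thread.setDaemon(true); // do not keep the JVM alive
                    return thread;
                });
    }

    /** Returns the lazily created, shared instance. */
    static ScheduledExecutorService delayer() {
        return DelayerHolder.DELAYER;
    }

    public static void main(String[] args) {
        System.out.println(delayer() == delayer()); // both calls see the same instance
    }
}
```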

    opened by magicprinc 9
  • Support accrual failure detection

    As Failsafe already supports policies that are useful for networked operations, it would make sense to support phi accrual (or other accrual algorithms) failure detection for situations where fixed timeouts don't adequately account for changing load conditions.

    This could be implemented as a new policy which measures execution times over a number of executions, to determine if some threshold is crossed which represents a failure. Phi accrual could be one strategy supported by the policy, but there could be others. When the threshold is crossed, a fallback-like function could be called, for example, to fail over a system from one node that has failed to another. In that sense, the policy would be like a time-based fallback (rather than result based), except unlike a fallback it would be stateful.
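To make the threshold idea concrete, here is a minimal plain-Java sketch of an accrual-style suspicion value. It is a deliberate simplification and not the proposed policy: it assumes durations follow an exponential distribution, in which case phi = -log10(P(duration > elapsed)) reduces to elapsed / (mean * ln 10), whereas the usual phi accrual formulation fits a normal distribution. `PhiAccrualSketch` and its methods are hypothetical names.

```java
final class PhiAccrualSketch {
    private final double[] samples; // ring buffer of recent durations in milliseconds
    private int size;
    private int next;

    PhiAccrualSketch(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.samples = new double[windowSize];
    }

    /** Records one observed duration, overwriting the oldest sample once the window is full. */
    void record(double millis) {
        samples[next] = millis;
        next = (next + 1) % samples.length;
        if (size < samples.length) {
            size++;
        }
    }

    /** Mean of the recorded durations, or 0 if nothing was recorded yet. */
    double mean() {
        if (size == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (int i = 0; i < size; i++) {
            sum += samples[i];
        }
        return sum / size;
    }

    /** Suspicion level for an execution that has been running for elapsedMillis. */
    double phi(double elapsedMillis) {
        double mean = mean();
        if (mean <= 0.0) {
            return 0.0; // no usable history yet
        }
        return elapsedMillis / (mean * Math.log(10.0));
    }

    public static void main(String[] args) {
        PhiAccrualSketch detector = new PhiAccrualSketch(100);
        for (int i = 0; i < 10; i++) {
            detector.record(100.0);
        }
        double threshold = 3.0; // hypothetical: treat phi above this as a failure
        System.out.println(detector.phi(150.0) > threshold);
        System.out.println(detector.phi(1_000.0) > threshold);
    }
}
```

Because the mean follows recent observations, the same elapsed time yields a lower phi when the system has been slow lately, which is the adaptive behavior a fixed timeout lacks.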

    Alternatively, this could be implemented as a Timeout option, where the timeout is stateful and adapts to execution time distributions.

    One open question for this policy is, similar to a circuit breaker or rate limiter, at what point should it "reset" after triggering a failure, or should it even reset?

    Any ideas for how this should work or what the policy should be named are welcome!

    enhancement new-policy 
    opened by jhalterman 4