Hello all,
first of all: thanks for this neat library, I appreciate it!
But for my current use case, I ran into a somewhat weird issue. Several experiments later, I think I have been able to isolate the problem a bit. But from the beginning:
[Java 16 / Windows / Failsafe 2.4.0]
I have a task that spawns some other tasks inside a local ExecutorService, so the ExecutorService needs to be shut down at the end. To harden this task, I want to use Failsafe with, among other things, a Timeout. In some cases this Timeout does not work as expected: it gets swallowed, so the task is not interrupted. (The tasks are responsive to interrupts.)
I have put together a somewhat verbose example to illustrate the issue:
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import net.jodah.failsafe.Failsafe;
import net.jodah.failsafe.Timeout;

public class FailsafeShutdownDemo {
    public static void main(String[] args) throws InterruptedException {
        // Pick exactly one of A or B.
        /* A */ ExecutorService executorService = Executors.newScheduledThreadPool(4);
        /* B */ // ExecutorService executorService = ForkJoinPool.commonPool();

        System.out.println("Start");
        Failsafe.with(Timeout.of(Duration.ofSeconds(3))
                .withCancel(true)) // interrupt the running task on timeout
            // Pick exactly one of 1 or 2.
            /* 1 */ .with(executorService) // TIMEOUT FAILS (ExecutorService) - AS EXPECTED (ForkJoinPool)
            /* 2 */ // .with((ScheduledExecutorService) executorService) // AS EXPECTED (ExecutorService cast to ScheduledExecutorService)
            .onComplete(complete -> System.out.println("onComplete -> " + (complete.getFailure() == null
                ? "No Failure / Timeout didn't work. :-("
                : "Failure is " + complete.getFailure() + " as expected. :-)")))
            .getAsync(() -> {
                System.out.println("runAsync()");
                TimeUnit.SECONDS.sleep(5); // longer than the 3 s timeout
                System.out.println("Hello World! <- (after Timeout!)");
                return "Success!";
            });

        // prevent race-condition being a cause of this problem
        TimeUnit.SECONDS.sleep(1);
        System.out.println("Shutdown executorService <- prevents Failsafe from executing the Timeout, "
            + "IF a DelegatingScheduler is used (depending on method-overloading, see [1]). \n\t"
            + "IF the ScheduledExecutorService is used directly (choose via cast in [2]), the Timeout works as expected.");
        /* X */ executorService.shutdown();

        // prevent daemons from exiting early
        TimeUnit.SECONDS.sleep(5);
        System.out.println("EXIT");
    }
}
The core is getAsync(CheckedSupplier), where the supplier represents my potentially long-running, heavy-lifting task, which should be interrupted on timeout.
I have set up Failsafe to interrupt the thread after the Timeout, which works as long as I do not shut down the executorService (marked by comment X).
But when I enable the shutdown (which I need, and since it is shutdown() and not shutdownNow(), I want to call it early to prevent further tasks from being added), things get interesting:
- If I supply the ScheduledExecutorService as its supertype ExecutorService to Failsafe (see A + 1), the task won't be cancelled.
- If I supply the same ScheduledExecutorService cast to its own type (A + 2), the task will be cancelled! This led me to the overloaded methods FailsafeExecutor.with(...), which decide whether a DelegatingScheduler is used.
- If I use a ForkJoinPool (B + 1), it also works as expected and the task gets cancelled.
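To illustrate why the cast alone changes the outcome for the same object: Java picks an overload at compile time from the declared type of the argument, not from its runtime type. The following is only a minimal sketch of that language rule. The two `with` methods are hypothetical stand-ins I made up for the demo, not Failsafe's actual code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

public class OverloadResolutionDemo {

    // Hypothetical stand-ins for two overloads; NOT Failsafe's implementation.
    static String with(ExecutorService executor) {
        return "ExecutorService overload";
    }

    static String with(ScheduledExecutorService executor) {
        return "ScheduledExecutorService overload";
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduled = Executors.newScheduledThreadPool(1);
        ExecutorService sameObject = scheduled; // same instance, wider declared type

        // The declared (static) type of the argument selects the overload.
        System.out.println(with(sameObject));                            // ExecutorService overload
        System.out.println(with(scheduled));                             // ScheduledExecutorService overload
        System.out.println(with((ScheduledExecutorService) sameObject)); // ScheduledExecutorService overload

        scheduled.shutdown();
    }
}
```

So variant 1 and variant 2 in my example end up in different methods even though the object is identical, which would fit the different behavior I observe.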
From these observations, the problem seems to be related to how the timeout trigger is scheduled in the DelegatingScheduler...? I don't think it can be a race condition between the shutdown and the registration of the trigger, since I added a sleep to rule that out. And shutdown() does not interrupt any running tasks, so this should be fine as well (and it is when no DelegatingScheduler is involved)...
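For reference, here is a small JDK-only sketch of what shutdown() does on a scheduled thread pool, using the default policies: a running task is not interrupted, a delayed task that was scheduled before the shutdown still runs, but anything submitted after the shutdown is rejected. The class and method names are made up for this demo. Whether a late, rejected submission of the timeout trigger is what actually happens inside the DelegatingScheduler is only my assumption; I have not verified it against the Failsafe source.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class ShutdownSemanticsDemo {

    /** Returns {runningTaskFinished, delayedTaskRan, newSubmissionRejected}. */
    static boolean[] observe() {
        ScheduledExecutorService executor = Executors.newScheduledThreadPool(2);
        AtomicBoolean runningTaskFinished = new AtomicBoolean();
        AtomicBoolean delayedTaskRan = new AtomicBoolean();
        boolean newSubmissionRejected = false;
        try {
            // A task that is already running when shutdown() is called.
            executor.execute(() -> {
                try {
                    TimeUnit.MILLISECONDS.sleep(300);
                    runningTaskFinished.set(true);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // only reached if something interrupts us
                }
            });
            // A delayed task registered BEFORE shutdown().
            executor.schedule(() -> delayedTaskRan.set(true), 300, TimeUnit.MILLISECONDS);

            TimeUnit.MILLISECONDS.sleep(100); // make sure the first task has started
            executor.shutdown();

            // A task submitted AFTER shutdown().
            try {
                executor.schedule(() -> { }, 10, TimeUnit.MILLISECONDS);
            } catch (RejectedExecutionException e) {
                newSubmissionRejected = true;
            }
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new boolean[] {runningTaskFinished.get(), delayedTaskRan.get(), newSubmissionRejected};
    }

    public static void main(String[] args) {
        boolean[] result = observe();
        System.out.println("running task finished:   " + result[0]);
        System.out.println("delayed task still ran:  " + result[1]);
        System.out.println("new submission rejected: " + result[2]);
    }
}
```

With the default settings of the JDK's scheduled thread pool I would expect all three lines to print true. If the timeout trigger were handed to the executor only when the timeout elapses (after my shutdown), it would fall into the third case, whereas a trigger scheduled up front on the ScheduledExecutorService would fall into the second.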
Did I miss something? I hope the example makes it clear. In any case, it seems dangerous to get such different results just from using different executors, or different method overloads for the same (!) scheduler...