- All Superinterfaces:
- Thread.UncaughtExceptionHandler
- All Known Implementing Classes:
- RabitTracker
public interface IRabitTracker
extends Thread.UncaughtExceptionHandler
Interface for Rabit tracker implementations with three public methods:
- start(timeout): Start the Rabit tracker and wait for worker connections, with a given
timeout value (in milliseconds).
- getWorkerEnvs(): Return the environment variables needed to initialize Rabit clients.
- waitFor(timeout): Wait for the worker nodes to finish executing the task, for at most
`timeout` milliseconds.
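A minimal caller-side sketch of the order in which these three methods are typically used. The `Tracker` interface below is a hypothetical stand-in, not the library's declaration: the return types (`boolean`, `Map<String, String>`, `int`), the `runJob` helper, the `StubTracker` class, and the `EXAMPLE_TRACKER_HOST` key are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class TrackerLifecycleSketch {
    // Hypothetical stand-in for the tracker interface; return types are assumed.
    interface Tracker {
        boolean start(long timeoutMillis);
        Map<String, String> getWorkerEnvs();
        int waitFor(long timeoutMillis);
    }

    // Hypothetical helper: start the tracker, fetch worker envs, then wait for the task.
    static int runJob(Tracker tracker, long connectTimeoutMillis, long taskTimeoutMillis) {
        if (!tracker.start(connectTimeoutMillis)) {
            return -1; // assumed convention: the tracker could not be started
        }
        Map<String, String> envs = tracker.getWorkerEnvs();
        // A real driver would hand `envs` to each worker before it initializes Rabit.
        System.out.println("worker envs: " + envs);
        return tracker.waitFor(taskTimeoutMillis);
    }

    // Stub that always succeeds, used only so the sketch can run on its own.
    static class StubTracker implements Tracker {
        @Override
        public boolean start(long timeoutMillis) {
            return true;
        }

        @Override
        public Map<String, String> getWorkerEnvs() {
            Map<String, String> envs = new HashMap<>();
            envs.put("EXAMPLE_TRACKER_HOST", "127.0.0.1"); // placeholder key, not a real variable name
            return envs;
        }

        @Override
        public int waitFor(long timeoutMillis) {
            return 0; // assumed convention: 0 means the task finished normally
        }
    }
}
```

Calling `TrackerLifecycleSketch.runJob(new TrackerLifecycleSketch.StubTracker(), 1000L, 1000L)` walks through all three calls in order against the stub.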
Each implementation is expected to implement a callback function
public void uncaughtException(Thread t, Throwable e) { ... }
to interrupt waitFor() in order to prevent the tracker from hanging indefinitely.
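One way such a callback might release a blocked waitFor() is sketched below with a `CountDownLatch`. This is an illustrative pattern under stated assumptions, not the RabitTracker implementation: the `Tracker` stand-in interface, its return types, the `LatchTracker` class, and the 0/1 return codes are all hypothetical.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class TrackerCallbackSketch {
    // Hypothetical stand-in for the tracker interface; return types are assumed.
    interface Tracker extends Thread.UncaughtExceptionHandler {
        boolean start(long timeoutMillis);
        Map<String, String> getWorkerEnvs();
        int waitFor(long timeoutMillis);
    }

    static class LatchTracker implements Tracker {
        private final CountDownLatch done = new CountDownLatch(1);
        private volatile boolean failed = false;

        @Override
        public boolean start(long timeoutMillis) {
            return true; // nothing to start in this sketch
        }

        @Override
        public Map<String, String> getWorkerEnvs() {
            return Collections.emptyMap();
        }

        @Override
        public int waitFor(long timeoutMillis) {
            try {
                // Blocks until the latch is released or the timeout elapses.
                boolean finished = done.await(timeoutMillis, TimeUnit.MILLISECONDS);
                return (finished && !failed) ? 0 : 1; // assumed codes: 0 = ok, 1 = failure or timeout
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return 1;
            }
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            // Record the failure, then release waitFor() so it returns without waiting out the timeout.
            failed = true;
            done.countDown();
        }
    }

    // Runs a worker thread that fails immediately and returns what waitFor() reports.
    static int demo() {
        LatchTracker tracker = new LatchTracker();
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("simulated worker failure");
        });
        worker.setUncaughtExceptionHandler(tracker);
        worker.start();
        return tracker.waitFor(10_000L);
    }
}
```

In `demo()`, the worker's exception reaches `uncaughtException`, which releases the latch, so `waitFor` returns the failure code well before the 10-second timeout.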
The Rabit tracker handles connections from distributed workers, assigns ranks to workers, and
brokers connections between workers.