ramp_engine.remote.DaskWorker

class ramp_engine.remote.DaskWorker(config, submission)

Dask distributed worker

This worker relies on dask.distributed and can be used both on a local machine and on a remote cluster.

It uses a conda environment to dispatch submissions on the remote machine. The dask worker must run on the remote machine with the same dependency versions as on the local machine.

Parameters
config : dict

Configuration dictionary to set the worker. The following parameters should be set:

  • ‘conda_env’: the name of the remote conda environment to use;

  • ‘kit_dir’: path to the remote directory of the RAMP kit;

  • ‘data_dir’: path to the remote directory of the data;

  • ‘submissions_dir’: path to the local directory containing the submissions;

  • ‘logs_dir’: path to the local directory where the log of the submission will be stored;

  • ‘predictions_dir’: path to the local directory where the predictions of the submission will be stored;

  • ‘dask_scheduler’: URL of the dask scheduler used for submissions;

  • ‘timeout’: number of seconds after which the running worker times out. If not provided, a default of 7200 is used.

submission : str

Name of the RAMP submission to be handled by the worker.
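The configuration keys listed above can be assembled as in the following minimal sketch. Only the key names and the 7200-second default come from this page; every path, the environment name, the scheduler URL, and the helpers `missing_keys` and `effective_timeout` are hypothetical and are not part of ramp_engine.

```python
# Keys come from the parameter list above; every value is a hypothetical placeholder.
config = {
    "conda_env": "ramp-iris",                      # remote conda environment (assumed name)
    "kit_dir": "/remote/ramp-kits/iris",           # remote path (assumed)
    "data_dir": "/remote/ramp-data/iris",          # remote path (assumed)
    "submissions_dir": "/local/ramp/submissions",  # local path (assumed)
    "logs_dir": "/local/ramp/logs",                # local path (assumed)
    "predictions_dir": "/local/ramp/predictions",  # local path (assumed)
    "dask_scheduler": "tcp://127.0.0.1:8786",      # scheduler URL (assumed)
    # "timeout" is left out on purpose to illustrate the documented default.
}

# Every documented key except the optional "timeout".
REQUIRED_KEYS = {
    "conda_env",
    "kit_dir",
    "data_dir",
    "submissions_dir",
    "logs_dir",
    "predictions_dir",
    "dask_scheduler",
}

DEFAULT_TIMEOUT = 7200  # seconds, as documented for a missing "timeout"


def missing_keys(cfg):
    """Return the documented keys that are absent from cfg, sorted by name."""
    return sorted(REQUIRED_KEYS - cfg.keys())


def effective_timeout(cfg):
    """Illustrate the documented fallback; not the library's own code."""
    return cfg.get("timeout", DEFAULT_TIMEOUT)
```

A dictionary of this shape would then be passed as the first argument, for example `DaskWorker(config, "starting_kit")`, where the submission name `"starting_kit"` is also only an assumed example.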

Attributes
status : str

The status of the worker. It should be one of the following states:

  • ‘initialized’: the worker has been instantiated.

  • ‘setup’: the worker has been set up.

  • ‘error’: the setup failed or the training could not be started.

  • ‘running’: the worker is training the submission.

  • ‘finished’: the worker finished training the submission.

  • ‘collected’: the results of the training have been collected.
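Code that inspects the status string may find it convenient to validate it against the values listed above. The sketch below is an illustration only: the `WorkerStatus` enum and the two helper functions are hypothetical names and are not provided by ramp_engine; only the six status strings come from this page.

```python
from enum import Enum


class WorkerStatus(str, Enum):
    """The six documented status strings (hypothetical wrapper type)."""

    INITIALIZED = "initialized"
    SETUP = "setup"
    ERROR = "error"
    RUNNING = "running"
    FINISHED = "finished"
    COLLECTED = "collected"


def has_failed(status):
    """True if the status string is 'error'; raises ValueError if unknown."""
    return WorkerStatus(status) is WorkerStatus.ERROR


def results_collected(status):
    """True if the status string is 'collected'; raises ValueError if unknown."""
    return WorkerStatus(status) is WorkerStatus.COLLECTED
```

Looking a string up through the enum makes a misspelled or undocumented status fail loudly with a `ValueError` instead of being silently treated as "not finished".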

__init__(config, submission)

Initialize self. See help(type(self)) for accurate signature.