qlauncher.workflow.slurm_job_manager
Summary
Classes:
Reference
- class qlauncher.workflow.slurm_job_manager.SlurmJobManager(sbatch_exe: str = 'sbatch', scancel_exe: str = 'scancel', slurm_options: dict[str, Any] | None = None, env_setup: list[str] | None = None)
Bases: BaseJobManager
- submit(function, cores: int = 1, **kwargs) -> str
Creates a QLauncher instance from problem, algorithm and backend and forwards it to submit_launcher().
- Parameters:
- Returns:
Slurm job ID returned by sbatch.
- Return type:
str
- Raises:
RuntimeError – If sbatch returns a non-zero exit code.
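The contract described above (a job ID string on success, RuntimeError on a non-zero exit code) can be illustrated with a minimal sketch. This is not the SlurmJobManager implementation: `submit_script`, its injectable `run` parameter, and the regular expression are hypothetical names introduced for illustration. The only assumption about Slurm is that sbatch prints `Submitted batch job <id>` on success, which is its default output.

```python
import re
import subprocess


def submit_script(script_path, sbatch_exe="sbatch", run=subprocess.run):
    """Submit a batch script and return the Slurm job ID as a string.

    Hypothetical helper, not part of qlauncher. `run` is injectable so the
    logic can be exercised without a cluster.
    """
    proc = run([sbatch_exe, script_path], capture_output=True, text=True)
    if proc.returncode != 0:
        # Mirror the documented behaviour: a non-zero exit code is an error.
        raise RuntimeError(
            f"{sbatch_exe} exited with code {proc.returncode}: {proc.stderr.strip()}"
        )
    # Default sbatch output looks like "Submitted batch job 12345".
    match = re.search(r"Submitted batch job (\d+)", proc.stdout)
    if match is None:
        raise RuntimeError(f"could not find a job ID in sbatch output: {proc.stdout!r}")
    return match.group(1)
```

Returning the ID as a string rather than an integer matches the documented return type and lets it be passed unchanged to later calls that take a job ID.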
- wait_for_a_job(job_id: str | None = None, timeout: float | None = None)
Waits until a Slurm job finishes and returns its ID.
- Parameters:
job_id (str | None, optional) – ID of the job to wait for. If None, the first job in jobs that is not yet marked as finished is selected. Defaults to None.
timeout (float | None, optional) – Maximum time to wait in seconds. If None, wait indefinitely. Defaults to None.
- Raises:
ValueError – If job_id is None and there are no jobs left.
TimeoutError – If the timeout is exceeded before the job finishes.
RuntimeError – If the job disappears from squeue without producing a result file, or if it finishes in a non-successful state.
- Returns:
ID of the finished job.
- Return type:
str
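The timeout semantics above (None waits indefinitely, otherwise TimeoutError once the limit is exceeded) follow a common polling pattern. The sketch below shows that pattern in isolation. `wait_until`, `is_finished`, and `poll_interval` are hypothetical names, and the squeue and result-file checks are replaced here by a plain callable, so this should be read as an illustration rather than as the manager's actual code.

```python
import time


def wait_until(is_finished, timeout=None, poll_interval=0.01):
    """Poll `is_finished()` until it returns True or `timeout` seconds elapse.

    Hypothetical helper: `timeout=None` means wait indefinitely.
    """
    # A monotonic clock is unaffected by system clock adjustments.
    deadline = None if timeout is None else time.monotonic() + timeout
    while not is_finished():
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(f"job did not finish within {timeout} seconds")
        time.sleep(poll_interval)
```

A caller would pass a callable that checks the job state, for example `wait_until(lambda: result_path.exists(), timeout=600)`, where `result_path` is an assumed `pathlib.Path` pointing to the job's output file.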
- cancel(job_id: str) -> None
Cancel a given Slurm job via scancel.
- Parameters:
job_id (str) – Slurm job ID returned by submit().
- Raises:
KeyError – If job_id is not known to this manager.
RuntimeError – If scancel fails.
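One way to combine submit(), wait_for_a_job() and cancel() is to cancel a job that exceeds its time budget. The sketch below relies only on the method signatures documented on this page. `submit_and_wait` is a hypothetical helper, and `manager` is assumed to be any object exposing those three methods, such as a SlurmJobManager instance.

```python
def submit_and_wait(manager, function, timeout, cores=1, **kwargs):
    """Submit `function`, wait up to `timeout` seconds, cancel the job on timeout.

    Hypothetical helper built only on the documented submit(),
    wait_for_a_job() and cancel() methods.
    """
    job_id = manager.submit(function, cores=cores, **kwargs)
    try:
        # Returns the ID of the finished job.
        return manager.wait_for_a_job(job_id, timeout=timeout)
    except TimeoutError:
        # Release the allocation instead of leaving the job queued or running.
        manager.cancel(job_id)
        raise
```

Re-raising after the cancel keeps the timeout visible to the caller. Note that cancel() may itself raise RuntimeError if scancel fails, in which case that error propagates instead.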
- run(function: Callable[[...], Any], cores: int = 1, **kwargs) -> Any
Convenience method that submits a job, waits for completion, reads the results, and cleans up.
It handles the complete lifecycle of a single job execution.
- Parameters:
function (Callable[[...], Any]) – Function to be executed.
cores (int) – Number of CPU cores per task.
**kwargs – Manager-specific additional arguments.
- Returns:
Result object produced by the job.
- Return type:
Any