pymatgen.io.abinit.scheduler_error_parsers module

class AbstractError(errmsg, meta_data)[source]

Bases: object

Error base class

application_adapter_solutions

To be implemented by concrete errors, returning a list of tuples defining corrections. The first element of each tuple should be a string naming one of the methods in CorrectorProtocolApplication; the second element should contain the arguments.

last_resort_solution()[source]

What to do if everything else fails.

name
scheduler_adapter_solutions

To be implemented by concrete errors, returning a list of tuples defining corrections. The first element of each tuple should be a string naming one of the methods in CorrectorProtocolScheduler; the second element should contain the arguments.
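The tuple convention described above can be sketched as follows. This is a minimal illustration, not the pymatgen implementation: SketchError, SketchTimeError, RecordingCorrector and apply_first_solution are hypothetical stand-ins, and treating the second tuple element as a tuple of positional arguments is an assumption.

```python
class SketchError:
    """Stand-in for an error base class (hypothetical, not pymatgen's AbstractError)."""

    def __init__(self, errmsg, meta_data):
        self.errmsg = errmsg
        self.meta_data = meta_data

    @property
    def scheduler_adapter_solutions(self):
        # Base class offers no corrections; concrete errors override this.
        return []

    @property
    def application_adapter_solutions(self):
        return []


class SketchTimeError(SketchError):
    """Concrete error listing corrections as (method name, arguments) tuples."""

    @property
    def scheduler_adapter_solutions(self):
        # Assumption: arguments are given as a tuple of positional arguments.
        return [("increase_time", ())]

    @property
    def application_adapter_solutions(self):
        return [("speed_up", ())]


class RecordingCorrector:
    """Hypothetical corrector that only records which methods were called."""

    def __init__(self):
        self.calls = []

    def increase_time(self):
        self.calls.append("increase_time")
        return True


def apply_first_solution(solutions, corrector):
    """Try each (method_name, args) tuple until one correction reports success."""
    for method_name, args in solutions:
        method = getattr(corrector, method_name, None)
        if method is not None and method(*args):
            return method_name
    return None


err = SketchTimeError("time limit exceeded", {})
corrector = RecordingCorrector()
applied = apply_first_solution(err.scheduler_adapter_solutions, corrector)
```

Looking the method up by name keeps the error classes independent of any particular corrector implementation; the error only states which protocol method it would like to have called.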

class AbstractErrorParser(err_file, out_file=None, run_err_file=None, batch_err_file=None)[source]

Bases: object

Abstract class for parsing errors originating from the scheduler system and errors that are not reported by the program itself, e.g. segmentation faults.

A concrete implementation of this class for a specific scheduler needs a class attribute ERRORS containing a dictionary that specifies the errors:

ERRORS = {
    ErrorClass: {
        'file_specifier': {
            'string': "the string to be looked for",
            'meta_filter': "string specifying the regular expression to obtain the meta data"
        }
    }
}
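As an illustration of the ERRORS layout, the sketch below defines one entry and scans in-memory lines for it. It is not the parser's actual code: SketchTimeError and scan are hypothetical names, the file specifier key "err" and the regular expression are assumptions, and the sample line is one of the PBS messages quoted in the PBSErrorParser description.

```python
import re


class SketchTimeError:
    """Stand-in error class (hypothetical, not pymatgen's TimeCancelError)."""

    def __init__(self, errmsg, meta_data):
        self.errmsg = errmsg
        self.meta_data = meta_data


# Same nesting as the ERRORS attribute described above. The key "err" and
# the regular expression are assumptions made for this sketch.
ERRORS = {
    SketchTimeError: {
        "err": {
            "string": "job killed: walltime",
            "meta_filter": r"walltime (\d+) exceeded limit (\d+)",
        }
    }
}


def scan(files, errors):
    """Return one error instance per definition whose string occurs in its file."""
    found = []
    for error_class, per_file in errors.items():
        for file_specifier, rule in per_file.items():
            lines = files.get(file_specifier, [])
            hits = [line for line in lines if rule["string"] in line]
            if not hits:
                continue
            # Collect the regex groups of every matching line as metadata.
            matches = (re.search(rule["meta_filter"], hit) for hit in hits)
            meta = [m.groups() for m in matches if m]
            found.append(error_class(rule["string"], meta))
    return found


# Lines keyed by file specifier instead of real files, to keep this runnable.
files = {"err": ["PBS: job killed: walltime 932 exceeded limit 900"]}
detected = scan(files, ERRORS)
```

Keeping the search string and the metadata expression as data means a new scheduler only needs a new dictionary, not new parsing logic.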

error_definitions
static extract_metadata(lines, meta_filter)[source]
parse()[source]

Parse for the occurrences of all errors defined in ERRORS.

parse_single(errmsg)[source]

Parse the provided files for the corresponding strings.

class CorrectorProtocolApplication[source]

Bases: object

Abstract class to define the protocol / interface for correction operators. The client code (qadapters / submission script generator method / …) should implement these methods.

decrease_mem()[source]

Method to decrease the memory used in the calculation. It is called when a calculation seems to have crashed due to insufficient memory.

Returns True if the memory could be decreased, False otherwise.

name
speed_up()[source]

Method to speed up the calculation. It is called when a calculation seems to have exceeded its time limit.

Returns True if the calculation could be sped up, False otherwise.

class CorrectorProtocolScheduler[source]

Bases: object

Abstract class to define the protocol / interface for correction operators. The client code (qadapters / submission script generator method / …) should implement these methods.

exclude_nodes(nodes)[source]

Method to exclude certain nodes from being used in the calculation. It is called when a calculation seems to have crashed due to a hardware failure at the specified nodes.

nodes: list of node numbers that were found to cause problems

Returns True if the nodes could be excluded, False otherwise.

increase_cpus()[source]

Method to increase the number of CPUs used in the calculation. It is called when a calculation seems to have crashed because time or memory limits were exceeded.

Returns True if the number of CPUs could be increased, False otherwise.

increase_mem()[source]

Method to increase the memory used in the calculation. It is called when a calculation seems to have crashed due to insufficient memory.

Returns True if the memory could be increased, False otherwise.

increase_time()[source]

Method to increase the time for the calculation. It is called when a calculation seems to have crashed due to a time limit.

Returns True if the time could be increased, False otherwise.
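A client implementing this protocol might look roughly like the sketch below. SketchSchedulerCorrector, its constructor arguments, the growth factor and the hard caps are all assumptions chosen for illustration; only the method names and the True/False return convention come from the protocol described above.

```python
class SketchSchedulerCorrector:
    """Hypothetical client that adjusts job resources within fixed caps."""

    def __init__(self, time_s, mem_mb, max_time_s, max_mem_mb):
        self.time_s = time_s
        self.mem_mb = mem_mb
        self.max_time_s = max_time_s
        self.max_mem_mb = max_mem_mb
        self.excluded_nodes = set()

    def increase_time(self, factor=1.5):
        """Grow the time limit; report False if the cap would be exceeded."""
        new_time = int(self.time_s * factor)
        if new_time > self.max_time_s:
            return False
        self.time_s = new_time
        return True

    def increase_mem(self, factor=1.5):
        """Grow the memory request; report False if the cap would be exceeded."""
        new_mem = int(self.mem_mb * factor)
        if new_mem > self.max_mem_mb:
            return False
        self.mem_mb = new_mem
        return True

    def exclude_nodes(self, nodes):
        """Remember nodes that should not be used for the next submission."""
        self.excluded_nodes.update(nodes)
        return True


job = SketchSchedulerCorrector(time_s=900, mem_mb=2000, max_time_s=1500, max_mem_mb=8000)
first_try = job.increase_time()   # 900 s -> 1350 s, within the cap
second_try = job.increase_time()  # 1350 s -> 2025 s would exceed the cap
job.exclude_nodes([12, 17])
```

Returning False instead of raising lets the caller fall through to the next proposed correction when one resource cannot be raised any further.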

name
class DiskError(errmsg, meta_data)[source]

Errors involving problems writing to disk.

class FullQueueError(errmsg, meta_data)[source]

Errors occurring at submission. Too many jobs in the queue / total cpus / ...

class MasterProcessMemoryCancelError(errmsg, meta_data)[source]

Error due to exceeding the memory limit for the job on the master node.

class MemoryCancelError(errmsg, meta_data)[source]
Error due to exceeding the memory limit for the job.
.limit will return a list of limits that were broken, None if it could not be determined.
application_adapter_solutions
limit
scheduler_adapter_solutions
class NodeFailureError(errmsg, meta_data)[source]
Error due to the hardware failure of a specific node.
.nodes will return a list of problematic nodes, None if it could not be determined.
nodes
scheduler_adapter_solutions
class PBSErrorParser(err_file, out_file=None, run_err_file=None, batch_err_file=None)[source]
Implementation for the PBS scheduler
PBS: job killed: walltime 932 exceeded limit 900
PBS: job killed: walltime 46 exceeded limit 30
PBS: job killed: vmem 2085244kb exceeded limit 1945600kb
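The three messages above share one shape, so a single regular expression can pull out the resource, the value used and the limit. The sketch below is an illustration only: PBS_KILL and parse_pbs_kill are hypothetical names, and the pattern is inferred from these three sample lines rather than taken from the parser's error_definitions.

```python
import re

# Pattern inferred from the three sample messages; an assumption, not the
# expression used by PBSErrorParser.
PBS_KILL = re.compile(r"PBS: job killed: (\w+) (\d+)(\w*) exceeded limit (\d+)(\w*)")


def parse_pbs_kill(line):
    """Return resource, used value, limit and unit, or None if the line does not match."""
    match = PBS_KILL.search(line)
    if match is None:
        return None
    resource, used, _used_unit, limit, limit_unit = match.groups()
    return {
        "resource": resource,
        "used": int(used),
        "limit": int(limit),
        "unit": limit_unit,  # empty for walltime, "kb" for vmem in the samples
    }


messages = [
    "PBS: job killed: walltime 932 exceeded limit 900",
    "PBS: job killed: walltime 46 exceeded limit 30",
    "PBS: job killed: vmem 2085244kb exceeded limit 1945600kb",
]
parsed = [parse_pbs_kill(message) for message in messages]
```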
error_definitions
class SlaveProcessMemoryCancelError(errmsg, meta_data)[source]

Error due to exceeding the memory limit for the job on a node different from the master.

class SlurmErrorParser(err_file, out_file=None, run_err_file=None, batch_err_file=None)[source]

Implementation of the error definitions for the Slurm scheduler

error_definitions
class SubmitError(errmsg, meta_data)[source]

Errors occurring at submission. The limits on the cluster may have changed.

class TimeCancelError(errmsg, meta_data)[source]
Error due to exceeding the time limit for the job.
.limit will return a list of limits that were broken, None if it could not be determined.
application_adapter_solutions
limit
scheduler_adapter_solutions
get_parser(scheduler, err_file, out_file=None, run_err_file=None, batch_err_file=None)[source]

Factory function to provide the parser for the specified scheduler. If the scheduler is not implemented, None is returned. The file arguments are strings giving the names of the out and err files:

err_file: stderr of the scheduler
out_file: stdout of the scheduler
run_err_file: stderr of the application
batch_err_file: stderr of the submission

Returns: None if scheduler is not supported.
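The factory behaviour described above, including the None return for an unsupported scheduler, can be sketched with a dictionary lookup. Everything here is a stand-in: SketchSlurmParser, SketchPBSParser, PARSERS and get_parser_sketch are hypothetical, and the scheduler keys "slurm" and "pbs" as well as the file names are assumptions.

```python
class SketchSlurmParser:
    """Stand-in parser that only stores the file names it was given."""

    def __init__(self, err_file, out_file=None, run_err_file=None, batch_err_file=None):
        self.files = {
            "err": err_file,
            "out": out_file,
            "run_err": run_err_file,
            "batch_err": batch_err_file,
        }


class SketchPBSParser(SketchSlurmParser):
    """Second stand-in so the lookup has more than one entry."""


# Assumed scheduler names; the real accepted values are not given in this page.
PARSERS = {"slurm": SketchSlurmParser, "pbs": SketchPBSParser}


def get_parser_sketch(scheduler, err_file, out_file=None, run_err_file=None, batch_err_file=None):
    """Return a parser instance, or None when the scheduler is not supported."""
    parser_class = PARSERS.get(scheduler)
    if parser_class is None:
        return None
    return parser_class(err_file, out_file, run_err_file, batch_err_file)


parser = get_parser_sketch("slurm", err_file="queue.err", out_file="queue.out")
unsupported = get_parser_sketch("sge", err_file="queue.err")
```

Because an unsupported scheduler yields None rather than an exception, calling code should check the result before calling parse() on it.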