pymatgen.analysis.functional_groups module

class FunctionalGroupExtractor(molecule, optimize=False)[source]

Bases: object

This class is used to algorithmically parse a molecule (represented by an instance of pymatgen.analysis.graphs.MoleculeGraph) and determine arbitrary functional groups.

Instantiation method for FunctionalGroupExtractor.

Parameters
  • molecule – Either a filename, a pymatgen.core.structure.Molecule object, or a pymatgen.analysis.graphs.MoleculeGraph object.

  • optimize – Default False. If True, then the input molecule will be modified, adding Hydrogens, performing a simple conformer search, etc.

categorize_functional_groups(groups)[source]

Determine classes of functional groups present in a set.

Parameters

groups – Set of functional groups.

Returns

dict containing representations of the groups, the indices of where the group occurs in the MoleculeGraph, and how many of each type of group there is.

get_all_functional_groups(elements=None, func_groups=None, catch_basic=True)[source]

Identify all functional groups (or all within a certain subset) in the molecule, combining the methods described above.

Parameters
  • elements – List of elements that will qualify a carbon as special (if only certain functional groups are of interest). Default None.

  • func_groups – List of strs representing the functional groups of interest. Default to None, meaning that all of the functional groups defined in this function will be sought.

  • catch_basic – bool. If True, use get_basic_functional_groups and other methods

Returns

list of sets of ints, representing groups of connected atoms

get_basic_functional_groups(func_groups=None)[source]

Identify functional groups that cannot be identified by the Ertl method of get_special_carbon and get_heteroatoms, such as benzene rings, methyl groups, and ethyl groups.

TODO: Think of other functional groups that are important enough to be added (ex: do we need ethyl, butyl, propyl?)

Parameters

func_groups – List of strs representing the functional groups of interest. Default to None, meaning that all of the functional groups defined in this function will be sought.

Returns

list of sets of ints, representing groups of connected atoms

get_heteroatoms(elements=None)[source]

Identify non-H, non-C atoms in the MoleculeGraph, returning a list of their node indices.

Parameters

elements – List of elements to identify (if only certain functional groups are of interest).

Returns

set of ints representing node indices

get_special_carbon(elements=None)[source]

Identify Carbon atoms in the MoleculeGraph that fit the characteristics defined Ertl (2017), returning a list of their node indices.

The conditions for marking carbon atoms are (quoted from Ertl):

“- atoms connected by non-aromatic double or triple bond to any heteroatom - atoms in nonaromatic carbon–carbon double or triple bonds - acetal carbons, i.e. sp3 carbons connected to two or more oxygens, nitrogens or sulfurs; these O, N or S atoms must have only single bonds - all atoms in oxirane, aziridine and thiirane rings”

Parameters

elements – List of elements that will qualify a carbon as special (if only certain functional groups are of interest). Default None.

Returns

set of ints representing node indices

Take a list of marked “interesting” atoms (heteroatoms, special carbons) and attempt to connect them, returning a list of disjoint groups of special atoms (and their connected hydrogens).

Parameters

atoms – set of marked “interesting” atoms, presumably identified using other functions in this class.

Returns

list of sets of ints, representing groups of connected atoms