The OELibraryGen
was designed to give programmers a high degree
of control when applying chemical transformations. It was also
designed for efficiency. Potentially costly preprocessing is performed
a single time before transformations can be carried out. The relative
setup cost of a OELibraryGen
instance may be high, and the
memory use large as preprocessed reactants are stored in memory.
Subsequent generation of products,however, is very efficient because
setup costs are paid in advance. The OELibraryGen
class serves
a dual purpose of managing sets of preprocessed starting materials,
and storing a list of chemical transform operations defined by a
reaction molecule.
Chemical transform operations are carried out on starting materials.
Starting materials provide most of the virtual matter that goes into
making virtual product molecules. The OELibraryGen
class
provides an interface to associate starting materials with reactant
patterns using the OELibraryGen::SetStartingMaterial
and
OELibraryGen::AddStartingMaterial
methods. These methods
associate starting materials to reactant patterns using the index
(reactant number) of the pattern. Reactant patterns are numbered
starting at zero for the lowest atom index and all atoms that are a
members of the same connected component. The next reactant pattern
begins with the next lowest atom index that is not a member of the
first component. In a SMIRKS pattern the first reactant (reactant
number zero) is the furthest reactant on the left. Disconnected
reactant patterns may be grouped into a single component using
component level grouping in SMIRKS denoted by parentheses.
Once a reaction has been defined, and starting materials have been
associated with each of the reactant patterns, chemical
transformations can be applied to combinations of starting materials.
To achieve a chemically reasonable output attention should be given to
the mode of valence (or hydrogen count) correction that matches the
reaction. The OELibraryGen
class has three possible modes of
valence correction: explicit hydrogen, implicit hydrogen, and
automatic. The default mode for valence correction and SMIRKS
interpretation is to emulate the Daylight Reaction Toolkit. Hydrogen
counts are adjusted using explicit hydrogens in SMIRKS patterns.
Reactions are carried out using explicit hydrogens, and valence
correction occurs when explicit hydrogens are added or deleted as
defined by a reaction. The following example demonstrates strict
SMIRKS and explicit hydrogen handling.
#!/usr/bin/env python # ch28-2.py from openeye.oechem import * libgen = OELibraryGen("[O:1]=[C:2][Cl:3].[N:4][H:5]>>[O:1]=[C:2][N:4]") mol = OEGraphMol() OEParseSmiles(mol, "CC(=O)Cl") libgen.SetStartingMaterial(mol, 0) mol.Clear() OEParseSmiles(mol, "NCC") libgen.SetStartingMaterial(mol, 1) for product in libgen.GetProducts(): smi = OECreateCanSmiString(product) print smi
In the amide bond forming reaction a hydrogen atom attached to the nitrogen in the amine pattern is explicitly deleted when forming the product. When executed, the example generates two products in total. Each product corresponds to the equivalent protons attached to the amine. If a unique set of products is desired, canonical smiles strings may be stored for verification that products generated are indeed unique.
The following demonstrates how the same basic reaction given in the previous example can be carried out using the implicit hydrogen correction mode. Notice that no explicit hydrogens appear in the reaction. Instead, the SMARTS implicit hydrogen count operator appears on the right hand side of the reaction and is used to assign the implicit hydrogen count of the product nitrogen.
#!/usr/bin/env python # ch28-3.py from openeye.oechem import * libgen = OELibraryGen("[O:1]=[C:2][Cl:3].[N:4]>>[O:1]=[C:2][Nh1:4]") libgen.SetExplicitHydrogens(False) mol = OEGraphMol() OEParseSmiles(mol, "CC(=O)Cl") libgen.SetStartingMaterial(mol, 0) mol.Clear() OEParseSmiles(mol, "NCC") libgen.SetStartingMaterial(mol, 1) for product in libgen.GetProducts(): smi = OECreateCanSmiString(product) print smi
The reaction is written to work with implicit hydrogens (using the
lowercase 'h' primitive), and the OELibraryGen
instance is set
to work in implicit hydrogen mode using the
OELibraryGen::SetExplicitHydrogens
method.
The final example demonstrates automatic valence correction. In
implicit hydrogen mode (set using the
OELibraryGen::SetExplicitHydrogens
method) automatic valence
correction attempts to add or subtract implicit hydrogens in order to
retain the valence state observed in the starting materials. Before
chemical transformations commence, the valence state for each reacting
atom is recorded. After the transform operations are complete the
implicit hydrogen count is adjusted to match the beginning state of
the reacting atoms. Changes in formal charge are taken into account
during the valence correction.
#!/usr/bin/env python # ch28-4.py from openeye.oechem import * libgen = OELibraryGen("[O:1]=[C:2][Cl:3].[N:4]>>[O:1]=[C:2][N:4]") libgen.SetExplicitHydrogens(False) libgen.SetValenceCorrection(True) mol = OEGraphMol() OEParseSmiles(mol, "CC(=O)Cl") libgen.SetStartingMaterial(mol, 0) mol.Clear() OEParseSmiles(mol, "NCC") libgen.SetStartingMaterial(mol, 1) for product in libgen.GetProducts(): smi = OECreateCanSmiString(product) print smi
In general, automatic valence correction is a convenience that allows straightforward reactions to be written in simplified manner and reduces the onus of valence state bookkeeping. Reactions that alter the preferred valence state of an atom, oxidation for example, may not be automatically correctable.