Package CHEM :: Package Common :: Module ChemicalDetail :: Class ChemicalDetail

Class ChemicalDetail



Various utility methods for retrieving details for some chemical from the database, usually keyed by the canonical SMILES string. See individual methods for details.

Instance Methods
 
__init__(self)
Constructor
 
chemicalDetails(self, searchColumn, searchValue)
Find an item in the chemical table based on a specified searchColumn and value.
 
annotationDictsBySmiles(self, smiles)
Find an item in the chemical table based on the given SMILES string.
 
separateAnnotationDictsByMetadata(self, annotationDicts, metadataColumn)
Given a list of dictionaries (RowItemModels) representing Annotation rows from the annotationDictsBySmiles method, see if any of the rows match items in the annotation_metadata table whose specified metadataColumn is True.
 
annotatedChemicalMols(self, smilesList, idInfoOnly=False)
Given one or a list of chemical can_smiles values, return an iterator over OEMolBase objects representing the chemicals.
 
retrieveIsomersBySmiles(self, smiles, targetOEOS=None)
Find an item in the chemical table based on the given SMILES string.
 
__applySDTags(self, isomerMol, isomerModel)
Given an isomerMol OEMolBase object, add SD annotation tags on it based on the contents of the isomerModel RowItemModel (dictionary) representing column values from the database.
Static Methods
 
__parseSDFString(mol, sdfString)
Given an SDF string, parse it into the given OEMolBase molecule object as if the string represented an SDF file.
Class Variables
  ANNOTATION_COLS = ["can_smiles", "abbrev", "website", "externa...
Method Details

chemicalDetails(self, searchColumn, searchValue)

 

Find an item in the chemical table based on a specified searchColumn and value. Only the first matching chemical is returned, so callers will typically want to search by a unique identifier such as "can_smiles" or "chemical_id".

Returns a RowItemModel dictionary representing all of the data found in that row. In addition, the number of isomer3d records under the chemical is counted and stored in an entry named count_isomer3d.
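The lookup described above can be sketched roughly as follows. This is only an illustration, not the class's actual implementation: it uses an in-memory sqlite3 database, and the table layout (chemical_id and can_smiles columns, an isomer3d table keyed by chemical_id), the function name chemical_details, and the set of allowed search columns are all assumptions made for the example.

```python
import sqlite3

# Hypothetical whitelist; column names cannot be bound as SQL parameters.
ALLOWED_COLUMNS = {"chemical_id", "can_smiles"}


def chemical_details(conn, search_column, search_value):
    """Return the first matching chemical row as a dict, plus count_isomer3d."""
    if search_column not in ALLOWED_COLUMNS:
        raise ValueError("unsupported search column: %s" % search_column)
    conn.row_factory = sqlite3.Row
    row = conn.execute(
        "SELECT * FROM chemical WHERE %s = ? ORDER BY chemical_id LIMIT 1"
        % search_column,
        (search_value,),
    ).fetchone()
    if row is None:
        return None
    details = dict(row)
    # Count the 3D isomer records stored under this chemical.
    details["count_isomer3d"] = conn.execute(
        "SELECT COUNT(*) FROM isomer3d WHERE chemical_id = ?",
        (details["chemical_id"],),
    ).fetchone()[0]
    return details


# Assumed, simplified schema and sample data for demonstration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chemical (chemical_id INTEGER PRIMARY KEY, can_smiles TEXT);
CREATE TABLE isomer3d (isomer3d_id INTEGER PRIMARY KEY, chemical_id INTEGER);
INSERT INTO chemical VALUES (1, 'CCO'), (2, 'c1ccccc1');
INSERT INTO isomer3d VALUES (10, 1), (11, 1);
""")

ethanol = chemical_details(conn, "can_smiles", "CCO")
```

Searching by a non-unique column would still return a single row here, which is why a unique identifier is the safer key.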

annotationDictsBySmiles(self, smiles)

 
Find an item in the chemical table based on the given SMILES string.
Then trace through the whole DB schema
(chemical - mixturecomponent - chemicalmix - source2chemicalmix - source - annotation)
to find all of the chemical mixes and sources this chemical belongs to.
Furthermore, find the names and values of all annotations applied to those
chemical mixes, ordered by the source that specified the annotation.

The annotation lookup is an outer join.  That is, if a chemical exists under some
source but no annotations exist for it, a row is still reported for that
source, with blank annotation values.

The results should be a list of RowItemModel dictionaries, at least containing
the following "columns" in each model object.

    chemicalmix.can_smiles
    source.abbrev
    source2chemicalmix.external_id
    annotation.name
    annotation.svalue
    annotation.fvalue
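A rough sketch of the outer-join behavior, using sqlite3 in memory. The join keys, the column types, and the assumption that annotation rows carry both a source_id and a chemicalmix_id are guesses made for illustration; the real schema may differ, and the function name annotation_dicts_by_smiles is hypothetical.

```python
import sqlite3

# Assumed, simplified schema following the table chain named in the text.
SCHEMA = """
CREATE TABLE chemical (chemical_id INTEGER PRIMARY KEY, can_smiles TEXT);
CREATE TABLE chemicalmix (chemicalmix_id INTEGER PRIMARY KEY, can_smiles TEXT);
CREATE TABLE mixturecomponent (chemical_id INTEGER, chemicalmix_id INTEGER);
CREATE TABLE source (source_id INTEGER PRIMARY KEY, abbrev TEXT);
CREATE TABLE source2chemicalmix (source_id INTEGER, chemicalmix_id INTEGER,
                                 external_id TEXT);
CREATE TABLE annotation (source_id INTEGER, chemicalmix_id INTEGER,
                         name TEXT, svalue TEXT, fvalue REAL);
INSERT INTO chemical VALUES (1, 'CCO');
INSERT INTO chemicalmix VALUES (1, 'CCO');
INSERT INTO mixturecomponent VALUES (1, 1);
INSERT INTO source VALUES (1, 'ACME'), (2, 'ZINC');
INSERT INTO source2chemicalmix VALUES (1, 1, 'A-001'), (2, 1, 'Z-9');
INSERT INTO annotation VALUES (1, 1, 'logP', NULL, -0.31);
"""

QUERY = """
SELECT cm.can_smiles AS can_smiles, s.abbrev AS abbrev,
       s2c.external_id AS external_id,
       a.name AS name, a.svalue AS svalue, a.fvalue AS fvalue
FROM chemical c
JOIN mixturecomponent mc ON mc.chemical_id = c.chemical_id
JOIN chemicalmix cm ON cm.chemicalmix_id = mc.chemicalmix_id
JOIN source2chemicalmix s2c ON s2c.chemicalmix_id = cm.chemicalmix_id
JOIN source s ON s.source_id = s2c.source_id
-- Outer join: keep the source row even when it has no annotations.
LEFT OUTER JOIN annotation a
  ON a.source_id = s.source_id AND a.chemicalmix_id = cm.chemicalmix_id
WHERE c.can_smiles = ?
ORDER BY s.abbrev, a.name
"""


def annotation_dicts_by_smiles(conn, smiles):
    """Return one dict per (source, annotation) pair for the given SMILES."""
    conn.row_factory = sqlite3.Row
    return [dict(row) for row in conn.execute(QUERY, (smiles,))]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
rows = annotation_dicts_by_smiles(conn, "CCO")
```

In this sample data the second source has no annotations, so its row comes back with name, svalue, and fvalue set to None rather than being dropped.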

separateAnnotationDictsByMetadata(self, annotationDicts, metadataColumn)

 

Given a list of dictionaries (RowItemModels) representing Annotation rows from the annotationDictsBySmiles method, see if any of the rows match items in the annotation_metadata table whose specified metadataColumn is True.

If so, remove those items from the original list, and return them in a new, separate list of annotation dictionary models.
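The remove-and-return behavior can be illustrated with plain dictionaries. In this sketch the database lookup is replaced by a precomputed set of annotation names, flagged_names, standing in for the annotation_metadata rows whose metadataColumn is True; that set and the function name are assumptions for the example.

```python
def separate_by_metadata(annotation_dicts, flagged_names):
    """Move rows whose "name" is in flagged_names out of the list.

    The input list is modified in place; the removed rows are returned
    as a new list.
    """
    matched = [row for row in annotation_dicts
               if row.get("name") in flagged_names]
    # Slice assignment mutates the caller's list rather than rebinding it.
    annotation_dicts[:] = [row for row in annotation_dicts
                           if row.get("name") not in flagged_names]
    return matched


annotations = [
    {"name": "logP", "fvalue": -0.31},
    {"name": "vendor_url", "svalue": "http://example.org"},
    {"name": "mw", "fvalue": 46.07},
]
links = separate_by_metadata(annotations, {"vendor_url"})
```

After the call, annotations holds only the unflagged rows and links holds the rows that were split off.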

annotatedChemicalMols(self, smilesList, idInfoOnly=False)

 
Given one or a list of chemical can_smiles values, return an iterator over OEMolBase
objects representing the chemicals.  As an added benefit, the
annotationDictsBySmiles method is run for each chemical and the resulting values are
recorded as SD tags on the molecule object.  If the molecule is subsequently output in
SDF format, the caller can view these tags in the file output.

Note that the iterator yields the same molecule object every time; it is cleared
and rewritten at each iteration.  If you need a separate copy for each chemical,
you must make one yourself, for example with "copy = OEGraphMol(mol)".
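The consequence of reusing one object can be seen without the OpenEye toolkit. The sketch below uses a stand-in class, FakeMol, and a generator, iter_mols, both invented for illustration; copy.deepcopy plays the role that OEGraphMol(mol) would play for real molecules.

```python
import copy


class FakeMol:
    """Stand-in for a molecule object; not part of any real toolkit."""

    def __init__(self):
        self.title = ""
        self.tags = {}

    def clear(self):
        self.title = ""
        self.tags = {}


def iter_mols(smiles_list):
    """Yield the same object each time, cleared and refilled per item."""
    mol = FakeMol()
    for smiles in smiles_list:
        mol.clear()
        mol.title = smiles
        yield mol


# Collecting the yielded objects directly keeps references to ONE object,
# which ends up holding the data of the last item only.
shared = list(iter_mols(["CCO", "CCN"]))

# Copying inside the loop preserves each item's state.
copies = [copy.deepcopy(mol) for mol in iter_mols(["CCO", "CCN"])]
```

Processing each molecule inside the loop body (for example, writing it out immediately) avoids the need for copies altogether.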

Since a chemical can belong to multiple chemical mixes, the situation is a little
sloppy.  It may be preferable to count a mix only when the chemical is its
"first largest component" or something similar.  In the meantime, only the
sources and external IDs the chemical appears under are recorded.

idInfoOnly: If set to True, just load the found chemical_ids and molecule titles.
    Don't spend time searching for 3D structures or source information.

retrieveIsomersBySmiles(self, smiles, targetOEOS=None)

 

Find an item in the chemical table based on the given SMILES string, then find the corresponding records in the isomer3d table, in particular any SDF 3D molecular data files. If any are found, return an oemolistream of OEMolBase objects representing the contents of these SD files.

If the smiles provided is actually a list, all SDFs under those chemical SMILES are returned.

Furthermore, include SD annotation tags for:
  • All primary chemical annotations / columns (except mixturecomponent_id)
  • Isomer #
  • Isomeric SMILES
If a targetOEOS (oemolostream) is provided, the output is written directly there instead of being managed in memory, and no oemolistream is returned. This is intended for huge queries (e.g., the whole database) that should be processed as a stream of data.
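The choice between returning an in-memory stream and writing to a caller-supplied target can be sketched with ordinary text streams. This is a simplified analogy using io.StringIO rather than oemolistream/oemolostream, and the function name write_records is hypothetical.

```python
import io


def write_records(records, target=None):
    """Write records to target if given; otherwise return a readable buffer."""
    if target is not None:
        # Streaming mode: nothing is accumulated here, nothing is returned.
        for record in records:
            target.write(record)
        return None
    # In-memory mode: collect everything, then rewind for the caller to read.
    buffer = io.StringIO()
    for record in records:
        buffer.write(record)
    buffer.seek(0)
    return buffer


# In-memory use: read the result back from the returned stream.
in_memory = write_records(["rec1\n", "rec2\n"])

# Streaming use: records may come from a generator and go straight out.
sink = io.StringIO()
result = write_records((r for r in ["rec1\n", "rec2\n"]), target=sink)
```

With a target supplied, memory use stays bounded by a single record, which is the point of the option for whole-database queries.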

__parseSDFString(mol, sdfString)
Static Method

 
Given an SDF string, parse it into the given OEMolBase molecule object as if the string represented an SDF file. If the string contains multiple molecules, only the first is parsed.
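The "first molecule only" rule can be illustrated at the text level. SDF records are terminated by a line containing "$$$$", so the first record is everything up to and including the first such line. The helper below, first_sdf_record, is a hypothetical sketch that only slices the string; it does not parse atoms or bonds, and the sample records are abbreviated placeholders rather than valid molfiles.

```python
def first_sdf_record(sdf_string):
    """Return the text of the first record in a (possibly multi-record) SDF."""
    lines = []
    for line in sdf_string.splitlines():
        lines.append(line)
        if line.strip() == "$$$$":
            # Record delimiter reached; ignore any later records.
            break
    return "\n".join(lines) + "\n"


# Placeholder content: real records would contain full molfile blocks.
two_records = (
    "mol1\n"
    "M  END\n"
    "$$$$\n"
    "mol2\n"
    "M  END\n"
    "$$$$\n"
)
first = first_sdf_record(two_records)
```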

Class Variable Details

ANNOTATION_COLS

Value:
["can_smiles", "abbrev", "website", "external_id", "name", "svalue", "fvalue"]