8.4 Reading Multi-conformer Molecules

Molecule streams, which were introduced above, can read both single-conformer and multi-conformer molecules from any file format. Many of the file formats supported by OEChem are inherently single-conformer formats (SDF and MOL2, for example). However, a common practice is to store multiple conformers in these files as separate connection tables. OEChem provides a mechanism for recombining these separate conformers into a single, multi-conformer OEMol. Note that this does not apply to file formats where conformers are stored together. For example, all molecules in an old OEBinary file are multi-conformer molecules. The new OEBinary format (.oeb) stores single-conformer and multi-conformer molecules explicitly, so the file itself determines how conformers are handled. Additionally, file formats that have no notion of conformers (e.g. SMILES files) are unaffected by this feature.

In early versions of OEChem, the default behavior when reading into an OEMol was to attempt to combine conformers of the same molecule into a single OEMol. This is no longer the default; the behavior is now controlled by the programmer.

oemolistreams have a method, SetConfTest, that sets a functor to be used to compare the graphs of incoming molecules to determine whether to combine them. These functors are instances of OEConfTest. Several predefined versions include:

OEDefaultConfTest
Never combine connection tables into multi-conformer molecules.

OEIsomericConfTest
This implementation of OEConfTest combines subsequent connection tables into a multi-conformer molecule if they:
  1. Have the same title (optional)
  2. Have the same number of atoms and bonds in the same order
  3. Have identical properties on every atom and bond when compared, in order, with its counterpart in the subsequent connection table
  4. Have the same atom and bond stereochemistry

No changes to the connection table are made.

The constructor for OEIsomericConfTest has a default argument for whether or not to compare titles. If the constructor is called with no arguments or with the argument true, the titles will be required to be the same. Otherwise, the titles will not be compared. In the latter instance, each conformer will have the individual title of its original connection table and the multi-conformer molecule will reflect the title of the active conformer.
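
The effect of the title argument can be pictured with a small model that does not use OEChem. TitledRecord and TitleAwareTest are hypothetical names invented for this sketch, and a plain string stands in for the connection table; only the default-true title flag is taken from the description above.

```cpp
#include <cassert>
#include <string>

// Hypothetical record: a title plus a string standing in for the graph.
struct TitledRecord {
  std::string title;
  std::string graph;
};

// Hypothetical model of a conformer test with an optional title check.
// The flag defaults to true, mirroring the default described in the text.
class TitleAwareTest {
public:
  explicit TitleAwareTest(bool compareTitles = true)
      : compareTitles_(compareTitles) {}

  // Returns true when the two records may be merged as conformers.
  bool Same(const TitledRecord &a, const TitledRecord &b) const {
    if (compareTitles_ && a.title != b.title) return false;
    return a.graph == b.graph;
  }

private:
  bool compareTitles_;
};

// Usage: two records with the same graph but different titles.
void CheckTitleFlag() {
  TitledRecord r1 = {"acetsali_1", "CC(=O)Oc1ccccc1C(=O)O"};
  TitledRecord r2 = {"acetsali_2", "CC(=O)Oc1ccccc1C(=O)O"};
  assert(!TitleAwareTest().Same(r1, r2));      // default: titles must match
  assert(TitleAwareTest(false).Same(r1, r2));  // titles ignored
}
```

With the flag left at its default, per-conformer suffixes in the titles keep otherwise identical records apart; passing false removes that obstacle.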

OEAbsoluteConfTest
This implementation of OEConfTest combines subsequent connection tables into a multi-conformer molecule if they:
  1. Have the same title (optional)
  2. Have the same number of atoms and bonds in the same order
  3. Have identical properties on every atom and bond when compared, in order, with its counterpart in the subsequent connection table

This conformer test sets all fully specified isomeric values to UNDEFINED.

The constructor for OEAbsoluteConfTest has a default argument for whether or not to compare titles. If the constructor is called with no arguments or with the argument true, the titles will be required to be the same. Otherwise, the titles will not be compared. In the latter instance, each conformer will have the individual title of its original connection table and the multi-conformer molecule will reflect the title of the active conformer.
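
The difference between an isomeric and an absolute comparison can be sketched without OEChem. Everything below (AtomModel, SameIsomeric, SameAbsolute, ClearStereo) is a hypothetical model that reduces a connection table to an atomic number and a stereo label per atom; it is an illustration under those assumptions, not how OEChem stores or compares molecules.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stereo labels for the model.
enum StereoLabel { StereoUndefined, StereoR, StereoS };

struct AtomModel {
  int atomicNum;
  StereoLabel stereo;
};

typedef std::vector<AtomModel> GraphModel;

// Isomeric-style comparison: element and stereo label must both match.
bool SameIsomeric(const GraphModel &a, const GraphModel &b) {
  if (a.size() != b.size()) return false;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i].atomicNum != b[i].atomicNum || a[i].stereo != b[i].stereo)
      return false;
  return true;
}

// Absolute-style comparison: stereo labels are ignored.
bool SameAbsolute(const GraphModel &a, const GraphModel &b) {
  if (a.size() != b.size()) return false;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i].atomicNum != b[i].atomicNum) return false;
  return true;
}

// Models the side effect of the absolute-style test: every stereo
// label on the stored graph is reset to StereoUndefined.
void ClearStereo(GraphModel &g) {
  for (std::size_t i = 0; i < g.size(); ++i) g[i].stereo = StereoUndefined;
}

// Usage: two graphs that differ only in one stereo label.
void CheckEnantiomers() {
  GraphModel r = {{6, StereoR}, {8, StereoUndefined}};
  GraphModel s = {{6, StereoS}, {8, StereoUndefined}};
  assert(!SameIsomeric(r, s));  // stereo differs, so no merge
  assert(SameAbsolute(r, s));   // stereo ignored, so merge allowed
  ClearStereo(r);
  assert(r[0].stereo == StereoUndefined);
}
```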

OEAbsCanonicalConfTest
This implementation of OEConfTest combines subsequent connection tables into a multi-conformer molecule if they:
  1. Have the same absolute (non-isomeric) graph

This conformer test puts all of the molecules in their canonical atom order. In addition, all fully specified isomeric values are set to UNDEFINED.
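
The way a conformer test drives the grouping of subsequent connection tables can be pictured with a small self-contained model. The sketch below does not use OEChem: Record, ConfTestModel, and GroupConformers are hypothetical names invented for illustration. It assumes that each incoming record is compared with the first record of the molecule currently being built, and that a rejected record starts a new molecule.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one connection table read from a file.
struct Record {
  std::string title;
  std::vector<std::string> atoms;  // element symbols in file order
};

// Hypothetical stand-in for a conformer test: decides whether two
// records may become conformers of the same molecule.
struct ConfTestModel {
  virtual ~ConfTestModel() {}
  virtual bool Same(const Record &a, const Record &b) const = 0;
};

// Never combines anything (models a "never combine" default).
struct NeverCombine : ConfTestModel {
  bool Same(const Record &, const Record &) const override { return false; }
};

// Combines records whose atom lists match in the same order.
struct SameAtomOrder : ConfTestModel {
  bool Same(const Record &a, const Record &b) const override {
    return a.atoms == b.atoms;
  }
};

// Walks the records in file order and returns, for each resulting
// molecule, the number of conformers merged into it.
std::vector<std::size_t> GroupConformers(const std::vector<Record> &recs,
                                         const ConfTestModel &test) {
  std::vector<std::size_t> counts;
  std::size_t first = 0;  // first record of the molecule being built
  std::size_t total = 0;
  for (std::size_t i = 0; i < recs.size(); ++i) {
    if (i > 0 && test.Same(recs[first], recs[i])) {
      ++counts.back();      // consecutive match: one more conformer
    } else {
      counts.push_back(1);  // mismatch: start a new molecule
      first = i;
    }
    ++total;
  }
  assert(total == recs.size());  // every record lands in exactly one molecule
  return counts;
}
```

In this model only consecutive records are merged, so a matching record that appears after an unrelated one starts a new molecule rather than rejoining the earlier one.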

The following example will attempt to read multi-conformer molecules from an input file based on OEAbsoluteConfTest. Note that creating the instance of OEAbsoluteConfTest with the non-default constructor argument (0 or false) allows conformers to be combined when they have different titles. This is very useful when dealing with files created by programs that modify molecule titles to indicate the conformer number (e.g. acetsali_1, acetsali_2, acetsali_3). The resulting multi-conformer molecule will have the title associated with the first connection table read from the file.

#include <iostream>

#include "openeye.h"
#include "oesystem.h"
#include "oechem.h"

using namespace std;
using namespace OESystem;
using namespace OEChem;

int main()
{
  OEIter<OEMCMolBase> mol;
  oemolistream ims;

  // Pass false so that titles are not compared when combining conformers.
  ims.SetConfTest(OEAbsoluteConfTest(false));

  // Each item is one multi-conformer molecule assembled by the stream.
  for (mol=ims.GetMCMolBases(); mol; ++mol)
  {
    cout << mol->GetTitle() << " has " << mol->NumConfs()
         << " conformers" << endl;
  }
  return 0;
}