12.5 Identifying Connected Components

To help split a molecule into its discrete connected components, for example to separate a parent compound from its salt or a ligand from a protein, OEChem provides the function OEDetermineComponents. This function assigns an integer component index, starting from one, to each disconnected part of the OEMolBase; the numbering of the parts is arbitrary. The result is a mapping from each atom's index, as returned by GetIdx, to its component index. Unused atom indices are mapped to zero. The function also returns the total number of components found, i.e. the maximum component index stored in the mapping.
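To make the shape of that mapping concrete, the following is a minimal pure-Python sketch of component labeling on a toy graph. It is not OEChem's implementation: the function label_components, its parameters, and the toy input are hypothetical names chosen for illustration, and the breadth-first traversal is simply one common way to find connected components. It does follow the conventions described above: component indices start from one, unused atom indices stay at zero, and the returned count equals the largest index stored.

```python
from collections import deque

def label_components(max_idx, atom_indices, bonds):
    """Hypothetical helper: return (count, parts) for a toy molecule graph.

    max_idx      -- size of the parts list (assumed large enough for any atom index)
    atom_indices -- the atom indices actually in use
    bonds        -- (begin_idx, end_idx) pairs connecting atoms
    """
    # Build an adjacency list from the bond pairs.
    neighbors = {idx: [] for idx in atom_indices}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    parts = [0] * max_idx  # unused indices stay mapped to zero
    count = 0
    for start in atom_indices:
        if parts[start]:
            continue  # already labeled by an earlier traversal
        count += 1
        parts[start] = count
        queue = deque([start])
        # Breadth-first walk over everything reachable from start.
        while queue:
            current = queue.popleft()
            for nbr in neighbors[current]:
                if not parts[nbr]:
                    parts[nbr] = count
                    queue.append(nbr)
    return count, parts

# Toy input: atoms 0-1-2 are bonded, index 3 is unused, atom 4 is isolated.
count, parts = label_components(5, [0, 1, 2, 4], [(0, 1), (1, 2)])
print(count, parts)
```

In this sketch the parts are numbered in the order their first atom is visited; OEDetermineComponents makes no such promise, so code should not rely on any particular numbering.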

The following short example shows how to use OEDetermineComponents.

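from openeye.oechem import *

def MyReportParts(mol):
    # count is the number of components; parts maps atom index -> component index
    count, parts = OEDetermineComponents(mol)

    print("The molecule has %d components" % count)

    # Report the component index assigned to each atom
    for atom in mol.GetAtoms():
        print("atom %d is in part %d" % (atom.GetIdx(), parts[atom.GetIdx()]))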
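
Once the count and the mapping are available, a common next step is to collect the atom indices belonging to each component. The sketch below uses plain Python only; group_atoms_by_part is a hypothetical helper, and the sample count, parts, and atom indices are made-up values standing in for what OEDetermineComponents and GetIdx would supply for a real molecule.

```python
def group_atoms_by_part(count, parts, atom_indices):
    """Hypothetical helper: map each component index to its atom indices."""
    # Component indices run from 1 to count inclusive.
    groups = {part: [] for part in range(1, count + 1)}
    for idx in atom_indices:
        groups[parts[idx]].append(idx)
    return groups

# Made-up sample data: two components, atom index 3 unused (mapped to zero).
count, parts = 2, [1, 1, 1, 0, 2]
groups = group_atoms_by_part(count, parts, [0, 1, 2, 4])
for part in sorted(groups):
    print("part %d: atoms %s" % (part, groups[part]))
```

With a real molecule, the list of atom indices could be built as [atom.GetIdx() for atom in mol.GetAtoms()], so that only indices actually in use are looked up in the mapping.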