5.3.5 SD Data Example

The following example shows how to use the tagged data methods to set, iterate over, test for, and delete SD data on a molecule.

#!/usr/bin/env python
# ch5-2.py
from openeye.oechem import *
import sys

mol = OEGraphMol()
OEParseSmiles(mol, "c1ccccc1")
mol.SetTitle("benzene")

# now set some tagged data
OESetSDData(mol, 'color', 'brown')
OESetSDData(mol, 'size', 'small')
OESetSDData(mol, 'natoms', str(mol.NumAtoms()))

# loop over data and print it out
for dp in OEGetSDDataPairs(mol):
    sys.stdout.write('%s : %s\n' % (dp.GetTag(), dp.GetValue()))

# check for existence of a field, then delete it
if OEHasSDData(mol, 'color'):
    OEDeleteSDData(mol, 'color')

# one last loop shows no 'color' field
for dp in OEGetSDDataPairs(mol):
    sys.stdout.write('%s : %s\n' % (dp.GetTag(), dp.GetValue()))

Note that SD tagged data is specific to MDL's SD file format. Data added to a molecule in this way is written out only to SD files and OEBinary files. Likewise, the SD data fields are filled only when reading an SD file that contains tagged data, or an OEBinary file that was previously written with this data.
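To make the format dependence more concrete, the following is a rough sketch, in plain Python without OEChem, of how tag/value pairs are typically laid out in the data section of an SD record. The helper name format_sd_data is hypothetical, and the layout shown (a header line with the tag in angle brackets, the value, then a blank line, with $$$$ ending the record) follows the general MDL convention; it is not meant to reproduce the exact output of any particular toolkit.

```python
def format_sd_data(pairs):
    """Render (tag, value) pairs as SD-style data items (illustrative only)."""
    lines = []
    for tag, value in pairs:
        lines.append('> <%s>' % tag)   # data header line carrying the tag
        lines.append(str(value))       # the data value itself
        lines.append('')               # a blank line ends each data item
    lines.append('$$$$')               # record terminator
    return '\n'.join(lines) + '\n'


# Hypothetical usage with the tags from the example above
text = format_sd_data([('color', 'brown'), ('size', 'small')])
```

A format without a comparable data section has nowhere to carry these pairs, which is why the tagged data does not survive a round trip through it.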

Two more examples deal specifically with tagged data. sdf2csv.py takes an SD file as input and writes a comma-delimited (.csv) file suitable for importing into Excel or other spreadsheet programs. The other, mergecsv.py, takes a CSV file and adds its data as tags to the molecules in an input stream. This simple script assumes that the first column holds molecule titles matching the titles found in the incoming molecule file, and that the first row holds the names to be used as tags.
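The CSV layout described above can be sketched with the standard csv module alone. This is not the code of sdf2csv.py or mergecsv.py; it is a minimal illustration of the stated assumptions (first column is the title, first row holds the tag names). The function names read_tag_table and write_tag_table, and the 'title' label used for the first header cell, are hypothetical.

```python
import csv
import io


def read_tag_table(csv_text):
    """Map each title (first column) to a dict of {tag: value}."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)            # first row: column names
    tags = header[1:]                # every column after the title is a tag
    table = {}
    for row in reader:
        if not row:                  # skip blank lines
            continue
        title, values = row[0], row[1:]
        table[title] = dict(zip(tags, values))
    return table


def write_tag_table(records):
    """Write (title, {tag: value}) records as CSV text, one row per molecule."""
    tags = []
    for _, data in records:          # collect tags in first-seen order
        for tag in data:
            if tag not in tags:
                tags.append(tag)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator='\n')
    writer.writerow(['title'] + tags)   # 'title' label is an assumption
    for title, data in records:
        # molecules lacking a tag get an empty cell
        writer.writerow([title] + [data.get(tag, '') for tag in tags])
    return out.getvalue()


example = "title,color,size\nbenzene,brown,small\n"
table = read_tag_table(example)
```

In a merge step, each incoming molecule's title (from GetTitle()) would be looked up in such a table and every tag/value pair attached with OESetSDData, as in the example earlier in this section. In the other direction, the records would be gathered from OEGetSDDataPairs for each molecule.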