3.5 Compressed Molecule Input and Output

For any of the molecular file formats supported by OEChem, it is often convenient to read and write compressed files or strings. Molecule streams support gzipped input and output via the zlib library. When a filename used to open a stream ends in ".gz", the stream recognizes the suffix and reads or writes the data in gzip-compressed form. This mechanism does not interfere with format perception: the format is still determined from the rest of the filename. For instance, "fn.sdf.gz" is recognized as a gzipped file in MDL's SD format.
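
The suffix handling described above can be pictured with a small standalone sketch. It is not OEChem's implementation and does not use the OEChem API; the names PerceivedName and PerceiveName are hypothetical and exist only for this illustration. The sketch assumes that a trailing ".gz" is removed first and that the remaining extension is what identifies the molecular format.

```cpp
#include <string>

// Hypothetical result type for this illustration only (not part of OEChem).
struct PerceivedName
{
  bool gzipped;          // true if the filename ends in ".gz"
  std::string extension; // extension left after removing ".gz", e.g. "sdf"
};

// Hypothetical helper: split a filename into a compression flag and the
// extension that would be used for format perception.
PerceivedName PerceiveName(const std::string &filename)
{
  PerceivedName result{false, ""};
  std::string base = filename;

  // Step 1: detect and strip a trailing ".gz" suffix.
  const std::string gz = ".gz";
  if (base.size() >= gz.size() &&
      base.compare(base.size() - gz.size(), gz.size(), gz) == 0)
  {
    result.gzipped = true;
    base.erase(base.size() - gz.size());
  }

  // Step 2: take the text after the last '.' in the final path component.
  const std::string::size_type slash = base.find_last_of("/\\");
  const std::string::size_type dot = base.rfind('.');
  if (dot != std::string::npos &&
      (slash == std::string::npos || dot > slash))
    result.extension = base.substr(dot + 1);

  return result;
}
```

Under these assumptions, PerceiveName("fn.sdf.gz") reports a gzipped file with the extension "sdf", while PerceiveName("fn.sdf") reports the same extension without compression.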

The following example demonstrates the use of compressed input and output:

#include "oechem.h"
#include <iostream>

using namespace OEChem;
using namespace OESystem;
using namespace std;

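int main()
{
  OEMol mol;
  // The ".gz" suffix requests gzip compression on both streams.
  oemolistream ims("input.sdf.gz");
  oemolostream oms("output.oeb.gz");

  // Report failure with a nonzero exit status.
  if (!ims)
  {
    cerr << "Error: Unable to read input.sdf.gz" << endl;
    return 1;
  }
  if (!oms)
  {
    cerr << "Error: Unable to create output.oeb.gz" << endl;
    return 1;
  }

  // Copy every molecule from the input stream to the output stream.
  while (OEReadMolecule(ims,mol))
    OEWriteMolecule(oms,mol);
  return 0;
}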

The example above converts all of the molecules in a gzipped SD format file into a gzipped OEBinary version 2 format file.