Using the methods outlined above, it is possible to allow the stream format to be controlled from the command line. OEChem's oemolstreams control the format by interpreting the input and output file names.
The following is a simple example of using command-line arguments to allow OEChem programs to support many file formats at run-time.
#include "oechem.h"
#include <iostream>

using namespace OEChem;
using namespace OESystem;
using namespace std;

int main(int argc,char *argv[])
{
  // Expect exactly two arguments: the input and output file names.
  if(argc != 3) return 1;

  // The stream formats are deduced from the file names.
  oemolistream ims(argv[1]);
  oemolostream oms(argv[2]);

  if (!ims)
  {
    cerr << "Error: Unable to read " << argv[1] << endl;
    return 1;
  }
  if (!oms)
  {
    cerr << "Error: Unable to create " << argv[2] << endl;
    return 1;
  }

  // Copy every molecule from the input stream to the output stream.
  OEMol mol;
  while (OEReadMolecule(ims,mol))
    OEWriteMolecule(oms,mol);

  return 0;
}
The example above allows a user to specify the input and output files and formats from the command line.
For instance, if the above listing is a program called convert:
prompt>convert file1.sdf file1.smi
will convert file1.sdf from MDL's SD format to Daylight's SMILES format.
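The idea of choosing a format from a file name can be sketched without OEChem. The code below is only an illustration under stated assumptions: MolFormat, ExtensionOf and FormatFromFileName are hypothetical names, not OEChem API, and the extension table is limited to the three formats used in the examples of this section. It is not how oemolstreams are implemented.

```cpp
#include <cassert>
#include <string>

// Hypothetical format tags for this sketch; these are not OEChem constants.
enum class MolFormat { Unknown, Smiles, Sdf, Mol2 };

// Returns the text after the last '.' in name, or "" if name has no '.'.
std::string ExtensionOf(const std::string& name) {
    const std::string::size_type dot = name.rfind('.');
    if (dot == std::string::npos) return "";
    return name.substr(dot + 1);
}

// Maps a file name to a format tag using only its extension.
// The comparison is case-sensitive and covers an illustrative subset.
MolFormat FormatFromFileName(const std::string& name) {
    const std::string ext = ExtensionOf(name);
    if (ext == "smi")  return MolFormat::Smiles;
    if (ext == "sdf")  return MolFormat::Sdf;
    if (ext == "mol2") return MolFormat::Mol2;
    return MolFormat::Unknown;
}

// Self-check that doubles as a usage example.
void CheckFormatFromFileName() {
    assert(FormatFromFileName("file1.sdf") == MolFormat::Sdf);
    assert(FormatFromFileName("file1.smi") == MolFormat::Smiles);
    assert(FormatFromFileName("notes") == MolFormat::Unknown);
}
```

With a mapping of this kind, the two command-line arguments alone are enough to decide how to parse the input and how to write the output.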
A first extension of this idea allows access to cin and cout via the "-" filename.
For instance:
prompt>convert file2.mol2 -
This command will read file2.mol2 in MOL2 format and write the molecules to cout in SMILES, the default format.
Thus, if you have another program, GetFromDatabase, which gets molecules from a database and writes them in SMILES format, you can chain it with any OEChem program. Using your operating system's redirection commands (e.g. the Unix pipe "|" or redirect ">"), you can move molecules directly from GetFromDatabase to convert without a temporary file.
prompt>GetFromDatabase | convert - file3.sdf
This convert command will take the SMILES format output from GetFromDatabase and generate an SD format file.
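For readers who want to see how a "-" file name can stand in for the standard streams, the following is a minimal sketch using only the C++ standard library. OpenInput and OpenOutput are hypothetical helpers written for this illustration; they are not part of OEChem, and oemolstreams may handle "-" quite differently internally.

```cpp
#include <cassert>
#include <fstream>
#include <iostream>
#include <string>

// Returns std::cin when name is "-"; otherwise opens name with file and
// returns file. The caller owns file and must keep it alive while reading.
std::istream& OpenInput(const std::string& name, std::ifstream& file) {
    if (name == "-") return std::cin;
    file.open(name);
    return file;
}

// Returns std::cout when name is "-"; otherwise opens name with file and
// returns file. The caller owns file and must keep it alive while writing.
std::ostream& OpenOutput(const std::string& name, std::ofstream& file) {
    if (name == "-") return std::cout;
    file.open(name);
    return file;
}

// Self-check that doubles as a usage example.
void CheckDashSelectsStandardStreams() {
    std::ifstream in_file;
    std::ofstream out_file;
    assert(&OpenInput("-", in_file) == &std::cin);
    assert(&OpenOutput("-", out_file) == &std::cout);
    // No file is opened when the standard streams are selected.
    assert(!in_file.is_open() && !out_file.is_open());
}
```

Because both helpers return a reference to a base stream class, the code that reads or writes does not need to know whether it is talking to a file or to a pipe.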
However, to make this concept of using cin and cout for piping data really useful, one needs to be able to control the format of cin and cout similarly to the way it would be controlled for temporary files. To facilitate this, oemolstreams interpret filenames which are ONLY format extensions to indicate format control for cin and cout.
The following example shows the use of file extensions as filenames:
#include "oechem.h"
#include <iostream>

using namespace OEChem;
using namespace OESystem;
using namespace std;

int main()
{
  OEMol mol;

  // A name that is only an extension selects cin or cout in that format.
  oemolistream ims(".sdf");
  oemolostream oms(".mol2");

  if (ims)
  {
    if (oms)
    {
      while (OEReadMolecule(ims,mol))
        OEWriteMolecule(oms,mol);
    }
    else cerr << "Error: Unable to write MOL2 format to cout" << endl;
  }
  else cerr << "Error: Unable to read SD format from cin" << endl;

  return 0;
}
In the example above, the input oemolstream is cin and the format is set to SDF. The output oemolstream is cout and the format is MOL2. This is exactly equivalent to listing 4.4. However, this method is extensible to format control of cin and cout from the command line. Note: this prevents you from naming files ".mol2", ".sdf", etc.
Now, using our program convert from listing 4.7 above:
prompt>convert .smi .mol2
This command opens cin with SMILES format and opens cout with MOL2 format.
Now we have complete format control of cin and cout from the command line.
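The naming rules described so far ("-" for the standard streams in the default format, an extension-only name for the standard streams in a chosen format, anything else for a file) can be summarized in a short sketch. StreamTarget, IsKnownExtension and InterpretStreamName are hypothetical names introduced here, and the list of known extensions is an illustrative subset; none of this is OEChem's actual implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical result of interpreting a stream name.
struct StreamTarget {
    bool use_std_stream;    // true: cin or cout; false: a file on disk
    std::string extension;  // extension without the dot; "" if none was given
};

// Illustrative subset of format extensions; a real toolkit knows many more.
bool IsKnownExtension(const std::string& ext) {
    return ext == "smi" || ext == "sdf" || ext == "mol2";
}

StreamTarget InterpretStreamName(const std::string& name) {
    // "-" selects the standard stream and leaves the format at its default.
    if (name == "-") return {true, ""};

    const std::string::size_type dot = name.rfind('.');
    if (dot == std::string::npos) return {false, ""};  // file, no extension

    const std::string ext = name.substr(dot + 1);
    // A name consisting only of a known extension selects the standard
    // stream in that format, which is why no file can be named ".sdf".
    if (dot == 0 && IsKnownExtension(ext)) return {true, ext};
    return {false, ext};
}

// Self-check that doubles as a usage example.
void CheckInterpretStreamName() {
    assert(InterpretStreamName("-").use_std_stream);
    assert(InterpretStreamName(".mol2").use_std_stream);
    assert(InterpretStreamName(".mol2").extension == "mol2");
    assert(!InterpretStreamName("file3.sdf").use_std_stream);
}
```

In this sketch an unrecognized dot-name is still treated as an ordinary file, so only names that match a supported format extension are reserved.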
If we have a program GenerateStructures, which only writes MOL2 format, and another program GenerateData, which only reads SD format, we can use them from the command line with any OEChem program which uses command-line arguments for file specification.
prompt> GenerateStructures | MyOEChemProgram .mol2 .sd | GenerateData
This command demonstrates how any OEChem program with command-line file specification can be used to pipe formatted input and output.