The standard way of processing each member of a set or collection in OEChem is to use an iterator. Iterators are a common abstraction (or design pattern) in object-oriented programming that hides how the collection/container is implemented from the user. Hence a set of atoms could be implemented internally as an array, a linked list, a hash table or any similar data structure, but its behavior to the programmer is independent of the actual implementation. An iterator can be thought of as a current position indicator.
OEChem iterators make use of C++'s template mechanism. The use of
templates allows the functionality of an iterator to be specified
(implemented) independently of the type of the collection being
iterated over. An iterator over a type T has the type OEIter<T>.
Hence, an iterator over the atoms of a molecule (represented by
OEAtomBase) has type OEIter<OEAtomBase> and an iterator over the
bonds of a molecule has type OEIter<OEBondBase>.
The three most common operations of an OEIter are assignment, testing
and increment. These three iterator methods allow loops over OEChem
iterators to resemble conventional for loops in high-level
programming languages.
Assignment specifies which collection/container the iterator is
intended to loop over, testing determines whether the iterator has
seen all of the items, and increment advances the iterator to the next
position.
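The three operations can be sketched with a small stand-in class. SketchIter and SumItems below are hypothetical names invented for this illustration, and the class walks a std::vector rather than an OEChem container, so it should be read as a rough model of the loop shape and not as OEChem's implementation. In OEChem itself the corresponding loop is commonly written along the lines of for (OEIter<OEAtomBase> atom = mol.GetAtoms(); atom; ++atom), where mol is assumed to be a molecule object.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an OEIter-style iterator over a std::vector.
// It only illustrates the three operations; it is not OEChem's code.
template <typename T>
class SketchIter {
public:
    SketchIter() : items_(nullptr), pos_(0) {}
    explicit SketchIter(const std::vector<T>& items)
        : items_(&items), pos_(0) {}

    // Testing: true while unvisited items remain.
    explicit operator bool() const {
        return items_ != nullptr && pos_ < items_->size();
    }

    // Increment: advance to the next position (prefix form only).
    SketchIter& operator++() { ++pos_; return *this; }

    // Access the item at the current position.
    const T& operator*() const {
        assert(static_cast<bool>(*this));  // never dereference a finished iterator
        return (*items_)[pos_];
    }

private:
    const std::vector<T>* items_;  // collection being walked (not owned)
    std::size_t pos_;              // the "current position indicator"
};

// Sums a collection using assignment, testing and increment in a for loop.
int SumItems(const std::vector<int>& values) {
    int total = 0;
    SketchIter<int> it;
    for (it = SketchIter<int>(values);  // assignment: choose the collection
         it;                            // testing: any items left?
         ++it) {                        // increment: move to the next item
        total += *it;
    }
    return total;
}
```

Because the class is a template, the same loop shape would work for any element type, which mirrors the point made above about OEIter<T>.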
One possible source of confusion is that most functions and methods
that return an iterator actually return a result of type
OEIterBase<T> rather than OEIter<T>. The template class OEIterBase<T>
is an internal abstraction used by OEChem, and should be treated as
an opaque type by the user. Suffice it to say that values of type
OEIterBase<T> can be assigned to variables of type OEIter<T> created
by the user.
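One way to picture this relationship is an abstract base class that a container returns and a user-facing handle that accepts it. Everything in the sketch below (SketchIterBase, VectorIterImpl, OwningIter, SketchMolecule, GetAtomNames, JoinAtomNames) is a hypothetical name, and the choice to return a heap-allocated object whose ownership passes to the handle is an assumption made for the illustration, not a statement about OEChem's internals.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical opaque base type: users never call it directly.
template <typename T>
class SketchIterBase {
public:
    virtual ~SketchIterBase() = default;
    virtual bool Valid() const = 0;
    virtual void Next() = 0;
    virtual const T& Current() const = 0;
};

// One concrete traversal, hidden behind the base type.
template <typename T>
class VectorIterImpl : public SketchIterBase<T> {
public:
    explicit VectorIterImpl(const std::vector<T>& items)
        : items_(items), pos_(0) {}
    bool Valid() const override { return pos_ < items_.size(); }
    void Next() override { ++pos_; }
    const T& Current() const override { return items_[pos_]; }
private:
    const std::vector<T>& items_;
    std::size_t pos_;
};

// User-facing handle: values of the base type can be assigned to it.
template <typename T>
class OwningIter {
public:
    OwningIter() = default;
    // Assumption: the handle takes ownership of the object it is given.
    OwningIter(SketchIterBase<T>* impl) : impl_(impl) {}
    OwningIter& operator=(SketchIterBase<T>* impl) {
        impl_.reset(impl);
        return *this;
    }
    explicit operator bool() const { return impl_ && impl_->Valid(); }
    OwningIter& operator++() { impl_->Next(); return *this; }
    const T& operator*() const {
        assert(impl_ && impl_->Valid());
        return impl_->Current();
    }
private:
    std::unique_ptr<SketchIterBase<T>> impl_;
};

// Hypothetical container whose accessor returns the opaque base type.
struct SketchMolecule {
    std::vector<std::string> atom_names;
    SketchIterBase<std::string>* GetAtomNames() const {
        return new VectorIterImpl<std::string>(atom_names);
    }
};

// The caller only ever names the handle type, never the base type.
std::string JoinAtomNames(const SketchMolecule& mol) {
    std::string joined;
    OwningIter<std::string> name;
    for (name = mol.GetAtomNames(); name; ++name) {
        if (!joined.empty()) joined += ",";
        joined += *name;
    }
    return joined;
}
```

The design point is that JoinAtomNames never mentions SketchIterBase or VectorIterImpl, so the container is free to change how it stores or walks its items.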
A second minor point is that OEChem iterators only support the prefix
++ operator, and not the suffix ++ operator. This means that to
advance the iterator, users must write ++i and not i++. This is
actually a performance issue, since in C and C++ the operator i++
must make a copy of its argument. This is to support the syntax
j = i++ where j is assigned the value of i before the increment. This
copying may potentially be expensive, and must be performed even if
the value is not assigned. For primitive types such as integers, most
C/C++ compilers can determine that the value is not used and optimize
i++ to ++i. Alas, for C++ classes, where the two operators are
separate user-defined functions and i++ and ++i could do totally
different things, most compilers are unable to perform this
optimization; hence ++i is the preferred idiom. Even if OEChem
changed the semantics of i++ to perform the same thing as ++i and
return the value after the increment, the i++ form would be
marginally less efficient (requiring an "invisible" integer argument
to be passed to the operator). Hence OpenEye's policy is to only
implement the "correct" behavior and hope that users of OEChem will
adopt ++i even for integer loops as good coding style.
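The cost difference between the two forms can be made visible with a class that counts its own copies. CountingCursor and the two helper functions below are hypothetical names used only for this sketch. The postfix operator is written in the conventional C++ way, taking the unused int argument mentioned above and returning the old value.

```cpp
#include <cassert>

// Hypothetical cursor that counts how often it is copied.
struct CountingCursor {
    int position = 0;
    static int copies;

    CountingCursor() = default;
    CountingCursor(const CountingCursor& other) : position(other.position) {
        ++copies;
    }
    CountingCursor& operator=(const CountingCursor&) = default;

    // Prefix ++i: modify in place and return a reference; no copy needed.
    CountingCursor& operator++() {
        ++position;
        return *this;
    }

    // Postfix i++: the unused int parameter only selects this overload.
    // The old value must be saved so that it can be returned.
    CountingCursor operator++(int) {
        CountingCursor old(*this);  // at least one copy per call
        ++position;
        return old;                 // may cost a further copy if not elided
    }
};

int CountingCursor::copies = 0;

// Number of copies made while advancing `steps` times with ++c.
int CopiesMadeByPrefix(int steps) {
    CountingCursor::copies = 0;
    CountingCursor c;
    for (int i = 0; i < steps; ++i) ++c;
    assert(c.position == steps);
    return CountingCursor::copies;
}

// Number of copies made while advancing `steps` times with c++.
int CopiesMadeByPostfix(int steps) {
    CountingCursor::copies = 0;
    CountingCursor c;
    for (int i = 0; i < steps; ++i) c++;
    assert(c.position == steps);
    return CountingCursor::copies;
}
```

Here the copy constructor has a side effect (it updates the counter), so the copy inside the postfix operator cannot be optimized away even though its result is discarded. For a cheap cursor like this one the cost is trivial, but for an iterator that carries more state it may not be.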
Finally, the template OEIter is defined in the OESystem namespace
rather than the OEChem namespace. This is because iterators (like
random number generators) are not chemistry specific, and the use of
two namespaces makes this explicit. It does, however, mean an extra
using namespace OESystem; directive in our examples.
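The effect of the two namespaces can be sketched as follows. SketchSystem and SketchChem are hypothetical stand-ins for OESystem and OEChem, and Iter, Atom and CountHeavyAtoms are likewise invented for the illustration. The only point is that the generic iterator and the chemistry type live in different namespaces, so code that wants to use both unqualified needs two using directives.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for OESystem: general-purpose utilities, no chemistry.
namespace SketchSystem {
template <typename T>
class Iter {
public:
    explicit Iter(const std::vector<T>& items) : items_(&items), pos_(0) {}
    explicit operator bool() const { return pos_ < items_->size(); }
    Iter& operator++() { ++pos_; return *this; }
    const T* operator->() const {
        assert(pos_ < items_->size());  // must still point at an item
        return &(*items_)[pos_];
    }
private:
    const std::vector<T>* items_;
    std::size_t pos_;
};
}  // namespace SketchSystem

// Stand-in for OEChem: chemistry-specific types.
namespace SketchChem {
struct Atom {
    int atomic_number;
};
}  // namespace SketchChem

using namespace SketchChem;
using namespace SketchSystem;  // the extra directive needed for Iter

// Counts atoms heavier than hydrogen, using both namespaces unqualified.
int CountHeavyAtoms(const std::vector<Atom>& atoms) {
    int heavy = 0;
    for (Iter<Atom> atom(atoms); atom; ++atom) {
        if (atom->atomic_number > 1) ++heavy;
    }
    return heavy;
}
```

Without the second directive, the loop would have to spell the type as SketchSystem::Iter<Atom>.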