17.2 Using Query Molecules

An alternative way to describe a substructure query is with a query molecule. Using a molecule (or a piece of one) along with graph modifiers, you can define very flexible queries in a manner subtly different from SMARTS. Loading a query molecule is almost the same as loading an OEGraphMol: a query molecule can be passed directly to the OERead*() functions, or it can be created from an existing OEGraphMol.

# create an OEGraphMol
mol = OEGraphMol()
OEParseSmiles(mol, "c1ccccc1")

# create a query molecule
qmol = OEQMol(mol)

Once a query molecule is created, and before it can be used in a substructure search, we have to define the level of matching between the graph in the query molecule and any target molecule. We do this with the BuildExpressions() method of the OEQMol. The method takes two arguments: the first describes how atoms match and the second describes how bonds match. The following lists show the switches that are OR'd together to give the desired level of matching. All of the constants are defined in the OEExprOpts namespace in C++, hence the 'OEExprOpts_' prefix.

Modifiers of atom expressions:

OEExprOpts_Mass
OEExprOpts_HCount
OEExprOpts_ImplicitHCount
OEExprOpts_FormalCharge
OEExprOpts_StrictFormalCharge
OEExprOpts_Degree
OEExprOpts_HvyDegree
OEExprOpts_Valence
OEExprOpts_Hybridization
OEExprOpts_AtomicNumber

Extra modifiers used with AtomicNumber:

OEExprOpts_EqMetal
OEExprOpts_EqHalogen
OEExprOpts_EqON
OEExprOpts_EqONS
OEExprOpts_EqPS
OEExprOpts_EqAromatic
OEExprOpts_EqCHalogen
OEExprOpts_EqCAliphaticON
OEExprOpts_EqCPSAcidRoot
OEExprOpts_EqKetoneSulfoneRoot

Modifiers of bond expressions:

OEExprOpts_BondOrder

Extra modifiers used with BondOrder:

OEExprOpts_EqSingleDouble
OEExprOpts_EqDoubleTriple
OEExprOpts_EqNotAromatic

Modifiers of both atom and bond expressions:

OEExprOpts_Aromaticity
OEExprOpts_RingMember
OEExprOpts_Chiral

There are some pre-defined expressions:

OEExprOpts_DefaultAtoms = AtomicNumber | Aromaticity | FormalCharge
OEExprOpts_DefaultBonds = BondOrder | Aromaticity

OEExprOpts_ExactAtoms =
  AtomicNumber|Aromaticity|StrictFormalCharge|Degree|HCount|Chiral|Mass|RingMember
OEExprOpts_ExactBonds = BondOrder|Aromaticity|RingMember|Chiral

OEExprOpts_AutomorphAtoms = AtomicNumber|Aromaticity|Degree|Chiral|HCount
OEExprOpts_AutomorphBonds = Aromaticity
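
The way these switches combine can be sketched in plain Python without the toolkit. This is only an illustration of OR'ing bit flags into a mask: the names mirror a few of the atom switches above, but the numeric values are hypothetical and are not the real OEExprOpts_ values, and the enabled() helper is invented for this sketch.

```python
# Hypothetical bit values for illustration only; the real OEExprOpts_
# constants are defined by the toolkit and will differ.
ATOM_SWITCHES = {
    'AtomicNumber': 1 << 0,
    'Aromaticity':  1 << 1,
    'FormalCharge': 1 << 2,
    'Degree':       1 << 3,
    'RingMember':   1 << 4,
}

# Same composition as OEExprOpts_DefaultAtoms in the list above.
default_atoms = (ATOM_SWITCHES['AtomicNumber']
                 | ATOM_SWITCHES['Aromaticity']
                 | ATOM_SWITCHES['FormalCharge'])


def enabled(mask):
    """Return the names of the switches whose bit is set in mask."""
    return [name for name, bit in ATOM_SWITCHES.items() if mask & bit]


print(enabled(default_atoms))
# Adding a switch is just another OR; a mask of 0 enables nothing.
print(enabled(default_atoms | ATOM_SWITCHES['RingMember']))
print(enabled(0))
```

A mask of 0 has no switches set, which is why passing 0 to BuildExpressions() places no constraint on the corresponding atoms or bonds.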

A few examples are perhaps the best way to describe how query molecules work. We start with a simple example that performs uncolored graph matching from an input molecule. Note that in this example, an OEQMol is created from an existing molecule. Then, before using it to create an OESubSearch instance, the user must call the BuildExpressions method with two integers. The first controls how atoms are matched; in this case, '0' means every atom matches any atom. The second controls how bonds are matched; here too, '0' means every bond matches any bond. An OEQMol constructed this way therefore performs a graph-only match against any target. Since we constructed the qmol from benzene, the subsequent OESubSearch will match any six-membered ring.

# ch17-3.py
from openeye.oechem import *

# query (benzene) and target (cyclohexane)
m1 = OEMol()
m2 = OEMol()

OEParseSmiles(m1, 'c1ccccc1')
OEParseSmiles(m2, 'C1CCCCC1')

# build a query molecule whose atoms and bonds match anything
qmol = OEQMol(m1)
qmol.BuildExpressions(0, 0)
pat = OESubSearch(qmol)

if pat.SingleMatch(m2):
    print('got a match')
else:
    print('darn, no match')
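
To make the idea of a graph-only match concrete, here is a minimal pure-Python sketch of a backtracking subgraph search. It is not how OEChem implements OESubSearch; the ring(), chain(), and single_match() helpers and the (element, aromatic) atom representation are assumptions made for this illustration. The two boolean arguments stand in, very roughly, for passing zero versus non-zero switches to BuildExpressions().

```python
def ring(n, element, aromatic):
    """Build an n-membered ring as (atoms, bonds)."""
    atoms = [(element, aromatic)] * n
    bonds = {frozenset((i, (i + 1) % n)): aromatic for i in range(n)}
    return atoms, bonds


def chain(n, element):
    """Build an unbranched, non-aromatic chain of n atoms."""
    atoms = [(element, False)] * n
    bonds = {frozenset((i, i + 1)): False for i in range(n - 1)}
    return atoms, bonds


def single_match(query, target, compare_atoms, compare_bonds):
    """Return True if query maps onto some subgraph of target."""
    q_atoms, q_bonds = query
    t_atoms, t_bonds = target

    def extend(mapping):
        # mapping[i] is the target atom assigned to query atom i
        qi = len(mapping)
        if qi == len(q_atoms):
            return True
        for ti in range(len(t_atoms)):
            if ti in mapping:
                continue
            if compare_atoms and q_atoms[qi] != t_atoms[ti]:
                continue
            ok = True
            for qj, tj in enumerate(mapping):
                q_key = frozenset((qi, qj))
                if q_key not in q_bonds:
                    continue  # no query bond, so nothing to check
                t_key = frozenset((ti, tj))
                if t_key not in t_bonds:
                    ok = False  # query bond has no counterpart
                    break
                if compare_bonds and q_bonds[q_key] != t_bonds[t_key]:
                    ok = False  # bond types differ
                    break
            if ok and extend(mapping + [ti]):
                return True
        return False

    return extend([])


benzene = ring(6, 'C', True)
cyclohexane = ring(6, 'C', False)
hexane = chain(6, 'C')

# Graph-only: the benzene skeleton fits any six-membered ring.
print(single_match(benzene, cyclohexane, False, False))
# Comparing atom and bond types: aromatic vs. aliphatic no longer matches.
print(single_match(benzene, cyclohexane, True, True))
# Graph-only still needs the ring closure, so an open chain fails.
print(single_match(benzene, hexane, False, False))
```

With both comparisons switched off, only connectivity matters, which mirrors the behavior of the (0, 0) example above.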

The best way to understand what the switches to BuildExpressions do is to try the above example, changing the molecules and using different switch combinations. To get a match much like a SMARTS search, change the call in the above code to 'qmol.BuildExpressions(OEExprOpts_DefaultAtoms, OEExprOpts_DefaultBonds)'. Since an OEQMol is used to create an OESubSearch object, once that object exists the same mechanism as in Figure 17-2 can be used to extract the matching atoms and bonds and to restrict the results to unique matches only.