Extension of the simple SpectrumKernel. It computes the simple
SpectrumKernel score for every value of k (the length of the substrings
to compare), from 2 up to the length of the strings being compared, and
adds up these component scores.

Optionally, the component scores can be weighted by a weight factor so
that longer substrings contribute more: each step up in substring length
is worth another weight factor more towards the final score. With the
default weight factor of 1, the result is a simple linear sum of the
component scores.
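
The summation described above can be sketched as follows. This is a minimal illustration, not the class's actual code: the function names are hypothetical, the weight for length k is assumed to be weightFactor raised to the power k (the description only says longer substrings are worth "another weight factor more"), and k is capped at the length of the shorter string, since longer substrings cannot occur in both.

```python
from collections import Counter


def kmer_counts(text, k):
    """Count every k-length substring of text."""
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


def weighted_spectrum_similarity(s1, s2, weight_factor=1):
    """Sum the k-spectrum scores for k = 2 .. min(len(s1), len(s2)).

    Assumption: the weight for length k is weight_factor ** k.
    """
    total = 0
    for k in range(2, min(len(s1), len(s2)) + 1):
        counts1 = kmer_counts(s1, k)
        counts2 = kmer_counts(s2, k)
        # Spectrum score for this k: dot product of the two count vectors.
        score_k = sum(n * counts2[kmer] for kmer, n in counts1.items())
        total += (weight_factor ** k) * score_k
    return total


print(weighted_spectrum_similarity("abab", "abab"))                   # 8
print(weighted_spectrum_similarity("abab", "abab", weight_factor=2))  # 52
```

With a weight factor of 1 every component score counts equally (5 + 2 + 1 = 8 here); with a factor of 2 under the assumed scheme the same components are scaled by 4, 8 and 16.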

__init__(self, weightFactor=1, normalize=False)
Constructor. weightFactor is the weight factor described above
(default 1); normalize defaults to False.

similarity(self, obj1, obj2)
Primary abstract method: given two objects, returns an appropriate
non-negative similarity score between the two.

buildFeatureDictionary(self, aString)
Creates a dictionary keyed by all the k-mers (k-length substrings)
of aString, with values equal to the number of times each k-mer
appears in aString.
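
As an illustration of such a dictionary, here is a small sketch for a single value of k. The function name and the explicit k parameter are hypothetical (the documented method takes only aString), so this shows the idea of k-mer counting, not the method's actual signature.

```python
from collections import Counter


def build_feature_dictionary(a_string, k):
    """Map each k-mer of a_string to the number of times it occurs."""
    kmers = (a_string[i:i + k] for i in range(len(a_string) - k + 1))
    return dict(Counter(kmers))


print(build_feature_dictionary("banana", 2))
# {'ba': 1, 'an': 2, 'na': 2}
```

If k is longer than the string, no k-mers exist and the dictionary is empty.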

weightCalc(self, stringLen)
Determines the weight that a string of length stringLen (int)
should be given.
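
The exact weighting formula is not stated here. One scheme consistent with the class description (a weight factor of 1 giving equal weights) is an exponential one, sketched below; the function name, the extra weight_factor parameter, and the formula itself are assumptions for illustration.

```python
def weight_calc(string_len, weight_factor=1):
    """Hypothetical weighting: multiply by weight_factor once per character."""
    return weight_factor ** string_len


for factor in (1, 2):
    print(factor, [weight_calc(k, factor) for k in range(2, 6)])
# 1 [1, 1, 1, 1]
# 2 [4, 8, 16, 32]
```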

Inherited from BaseKernel.BaseKernel:
dictionaryDotProduct, dictionaryEuclideanDistanceSquared,
ensureListCapacity, getFeatureDictionary, normalizeFeatureDictionary,
outputMatrix, prepareFeatureDictionaryList