algo
Class AlgoClassifier
java.lang.Object
|
+--algo.AlgoClassifier
-
Direct Known Subclasses:
-
GiniClassificator
-
public abstract class AlgoClassifier
-
extends java.lang.Object
Base class for the algorithmic classifiers. All implemented classifiers
inherit from this abstract class. A subclass that overrides only the abstract
methods 'calculateAttribute()',
'sortCategories(CategoryArray cat, Dataset[] sets, int nAttributIndex)'
and 'getName()' uses the built-in cost-complexity pruning (Breiman et al.:
CART). A different pruning technique can be implemented by overriding the
methods 'initPruning()' and 'optimizeTree()'.
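The shape of a minimal subclass might look like the sketch below. The types Node, Dataset, CategoryArray and the reduced AlgoClassifier declared here are stand-ins written only so the example compiles on its own; in particular the 'nCategories' field and the class name IdentityOrderClassifier are assumptions and not part of the documented API.

```java
import java.util.Vector;

// Hypothetical stand-ins so the sketch compiles on its own. The real PBC
// classes of the same names have more members than shown here.
class Node {}
class Dataset {}
class CategoryArray {
    final int nCategories; // assumed field: number of categories of the attribute
    CategoryArray(int nCategories) { this.nCategories = nCategories; }
}

// Reduced stand-in for algo.AlgoClassifier: only the three abstract methods.
@SuppressWarnings("rawtypes")
abstract class AlgoClassifier {
    public abstract void calculateAttribute(Node n, Vector vAttributes);
    public abstract int[] sortCategories(CategoryArray cat, Dataset[] sets, int nAttributIndex);
    public abstract String getName();
}

// A subclass that overrides only the three abstract methods.
@SuppressWarnings("rawtypes")
class IdentityOrderClassifier extends AlgoClassifier {
    @Override
    public void calculateAttribute(Node n, Vector vAttributes) {
        // A real classifier would evaluate the candidate attributes here and
        // record the best split for the node; left empty in this sketch.
    }

    @Override
    public int[] sortCategories(CategoryArray cat, Dataset[] sets, int nAttributIndex) {
        // Keeps the categories in their original order: 0, 1, 2, ...
        int[] order = new int[cat.nCategories];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        return order;
    }

    @Override
    public String getName() {
        return "Identity order (sketch)";
    }
}
```

Because 'initPruning()' and 'optimizeTree()' are not overridden, a subclass of this shape would rely on the pruning inherited from the base class.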
-
Since:
-
PBC 2.1
-
Author:
-
Mihael Ankerst
Field Summary |
protected
float |
m_fPurity
The maximum
purity-threshold. |
protected
int |
m_nSetNumber
The threshold
for the number of records in a node. |
protected
boolean |
m_proposeSplit
Flag indicating
whether 'Propose Split' has been invoked. |
protected
Pruner |
m_pruner
|
Method Summary |
abstract
void |
calculateAttribute(Node
n, java.util.Vector vAttributes)
Computes the
best attribute and the best split points for the given node and the given
set of attributes. |
void |
classifyLevels(Step
s, int nLevels)
Constructs a
tree with 'nLevels' levels for the data corresponding to 'Step s'. |
void |
classifyTree(Step
s)
Constructs a
tree for the data corresponding to 'Step s'. |
abstract
java.lang.String |
getName()
Returns the
name of the classifier that appears in the Combobox for selection. |
void |
initPruning(AlgoClassifier
classifier, Dataset[] validationDataset)
Initializes
the pruning method which is based on a separate validation set. |
void |
initPruning(AlgoClassifier
classifier, int nFoldNumber)
Initializes
the pruning method which is based on cross-validation. |
Step |
optimizeTree(Step
s)
Prunes the tree
which has 'Step s' as a root node. |
Step |
optimizeTree(Step
s, JStatusBar bar)
Prunes the tree
which has 'Step s' as a root node. |
void |
setParameters(float
fPurity, int nSetNumber)
Parameter setting
for forward pruning. |
void |
setProposeSplitFlag(boolean
ps)
|
abstract
int[] |
sortCategories(CategoryArray
cat, Dataset[] sets, int nAttributIndex)
Sorts the categories
of a categorical attribute. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify,
notifyAll, toString, wait, wait, wait |
m_pruner
protected Pruner m_pruner
m_fPurity
protected float m_fPurity
-
The maximum purity-threshold. During decision tree construction, this threshold
can be used to assign the majority class to a node whose purity exceeds
the threshold.
m_nSetNumber
protected int m_nSetNumber
-
The threshold for the number of records in a node. During decision tree
construction, this threshold can be used to assign the majority class to
the node, if there are fewer than 'm_nSetNumber' associated records.
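The two forward-pruning thresholds can be illustrated with a small sketch. It assumes that purity means the fraction of records in the node that belong to the majority class; the page does not define purity, and the class and method names below are hypothetical.

```java
class ForwardPruningSketch {

    // Returns true if the node should be turned into a leaf that is labelled
    // with its majority class. classCounts[k] is the number of records of
    // class k in the node.
    static boolean shouldBecomeLeaf(int[] classCounts, float fPurity, int nSetNumber) {
        int total = 0;
        int majority = 0;
        for (int count : classCounts) {
            total += count;
            majority = Math.max(majority, count);
        }
        if (total < nSetNumber) {
            return true; // fewer than nSetNumber records (also covers an empty node)
        }
        if (total == 0) {
            return true; // guard against division by zero when nSetNumber <= 0
        }
        // Assumption: purity is the majority-class fraction.
        float purity = (float) majority / total;
        return purity > fPurity;
    }

    // Index of the class with the most records; ties go to the lowest index.
    static int majorityClass(int[] classCounts) {
        int best = 0;
        for (int k = 1; k < classCounts.length; k++) {
            if (classCounts[k] > classCounts[best]) {
                best = k;
            }
        }
        return best;
    }
}
```

With fPurity = 0.9 and nSetNumber = 10, a node holding 95 records of one class and 5 of another would stop growing, while a 60/40 node would still be split.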
m_proposeSplit
protected boolean m_proposeSplit
-
Flag indicating whether 'Propose Split' has been invoked.
AlgoClassifier
public AlgoClassifier()
setParameters
public final void setParameters(float fPurity,
int nSetNumber)
-
Parameter setting for forward pruning.
-
Parameters:
-
fPurity - purity threshold
-
nSetNumber - Threshold for the number of records
setProposeSplitFlag
public final void setProposeSplitFlag(boolean ps)
initPruning
public void initPruning(AlgoClassifier classifier,
int nFoldNumber)
-
Initializes the pruning method which is based on cross-validation.
-
Parameters:
-
classifier - classifier which was used to build the tree.
-
nFoldNumber - Number of folds for cross-validation.
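How the records are divided into folds is not described on this page. One common scheme, shown here purely as an assumption, assigns record i to fold i mod nFoldNumber, so that the fold sizes differ by at most one. The class and method names are hypothetical.

```java
class FoldSketch {

    // Returns an array in which entry i is the fold (0 .. nFoldNumber-1)
    // that record i belongs to. Each fold serves once as the held-out part.
    static int[] assignFolds(int nRecords, int nFoldNumber) {
        if (nFoldNumber < 2 || nFoldNumber > nRecords) {
            throw new IllegalArgumentException("need 2 <= nFoldNumber <= nRecords");
        }
        int[] fold = new int[nRecords];
        for (int i = 0; i < nRecords; i++) {
            fold[i] = i % nFoldNumber;
        }
        return fold;
    }

    // Number of records assigned to the given fold.
    static int foldSize(int[] fold, int which) {
        int size = 0;
        for (int f : fold) {
            if (f == which) {
                size++;
            }
        }
        return size;
    }
}
```

In practice the records would usually be shuffled before the assignment so that the folds do not mirror the order of the input file.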
initPruning
public void initPruning(AlgoClassifier classifier,
Dataset[] validationDataset)
-
Initializes the pruning method which is based on a separate validation
set.
-
Parameters:
-
classifier - classifier which was used to build the tree.
-
validationDataset - Separate validation set on which the pruning is based.
optimizeTree
public Step optimizeTree(Step s)
-
Prunes the tree which has 'Step s' as a root node.
-
Parameters:
-
s - Root of the tree.
-
Returns:
-
The pruned tree.
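The class description names cost-complexity pruning (Breiman et al.: CART) as the default technique. Its central quantity is the link strength g(t) = (R(t) - R(T_t)) / (|leaves(T_t)| - 1) of an inner node t, where R(t) is the error if t were a leaf and R(T_t) is the error of the subtree rooted at t; the node with the smallest g(t) is pruned first. The sketch below computes this quantity from plain numbers. It is not the implementation behind 'optimizeTree()', and all names in it are hypothetical.

```java
class CostComplexitySketch {

    // Link strength g(t) of an inner node in CART cost-complexity pruning.
    static double linkStrength(double nodeError, double subtreeError, int subtreeLeaves) {
        if (subtreeLeaves < 2) {
            throw new IllegalArgumentException("an inner node has at least two leaves below it");
        }
        return (nodeError - subtreeError) / (subtreeLeaves - 1);
    }

    // Index of the inner node with the smallest link strength (the "weakest
    // link"), which is the next candidate to be collapsed into a leaf.
    static int weakestLink(double[] nodeError, double[] subtreeError, int[] subtreeLeaves) {
        int weakest = 0;
        double smallest = Double.POSITIVE_INFINITY;
        for (int t = 0; t < nodeError.length; t++) {
            double g = linkStrength(nodeError[t], subtreeError[t], subtreeLeaves[t]);
            if (g < smallest) {
                smallest = g;
                weakest = t;
            }
        }
        return weakest;
    }
}
```

Repeating the weakest-link step yields a nested sequence of subtrees, from which the one with the lowest error on the validation data or across the cross-validation folds is selected.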
optimizeTree
public Step optimizeTree(Step s,
JStatusBar bar)
-
Prunes the tree which has 'Step s' as a root node. The progress of
the computation is shown in the JStatusBar.
-
Parameters:
-
s - Root of the tree.
-
bar - Status bar that shows the progress of the computation.
sortCategories
public abstract int[] sortCategories(CategoryArray cat,
Dataset[] sets,
int nAttributIndex)
-
Sorts the categories of a categorical attribute. This method is also invoked
before the data visualization to determine the order in which the categories
are visualized. This category order must be the same as the one which is
used to calculate/represent the split points.
-
Parameters:
-
cat - Array with the indices of the categories.
-
sets - Original data records.
-
nAttributIndex - Index of the attribute for which the sorting
should be computed.
-
Returns:
-
Array with the sorted indices of the categories.
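The ordering criterion is left to the subclass. One possible criterion, shown here as an assumption, orders the categories by the fraction of their records that belong to a reference class. The sketch works on a plain count matrix instead of CategoryArray and Dataset, whose members are not documented on this page; the class and method names are hypothetical.

```java
class CategoryOrderSketch {

    // counts[c][k] is the number of records with category c and class k.
    // Returns the category indices sorted by ascending fraction of refClass.
    static int[] sortByClassFraction(int[][] counts, int refClass) {
        int n = counts.length;
        final double[] fraction = new double[n];
        for (int c = 0; c < n; c++) {
            int total = 0;
            for (int k : counts[c]) {
                total += k;
            }
            fraction[c] = (total == 0) ? 0.0 : (double) counts[c][refClass] / total;
        }

        // Sort boxed indices so that a comparator can be used; the sort is
        // stable, so categories with equal fractions keep their index order.
        Integer[] order = new Integer[n];
        for (int c = 0; c < n; c++) {
            order[c] = c;
        }
        java.util.Arrays.sort(order,
                (Integer a, Integer b) -> Double.compare(fraction[a], fraction[b]));

        int[] result = new int[n];
        for (int i = 0; i < n; i++) {
            result[i] = order[i];
        }
        return result;
    }
}
```

Whatever criterion is chosen, the same order has to be used for the visualization and for the split points, as stated above.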
calculateAttribute
public abstract void calculateAttribute(Node n,
java.util.Vector vAttributes)
-
Computes the best attribute and the best split points for the given node
and the given set of attributes. If the set of attributes is empty, all
attributes are considered.
-
Parameters:
-
n - Node, for which the split is calculated.
-
vAttributes - The set of candidate attributes for which the best
split is computed.
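The split criterion is chosen by the subclass. The known subclass GiniClassificator suggests the Gini index, so the sketch below searches the best binary split point of a single numeric attribute by minimizing the weighted Gini index of the two partitions. It works on plain arrays rather than on Node, is not taken from the PBC sources, and uses hypothetical names throughout.

```java
class GiniSplitSketch {

    // Gini index 1 - sum(p_k^2) of a class distribution given as counts.
    static double gini(int[] counts) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        if (total == 0) {
            return 0.0;
        }
        double sumSquares = 0.0;
        for (int c : counts) {
            double p = (double) c / total;
            sumSquares += p * p;
        }
        return 1.0 - sumSquares;
    }

    // values must be sorted ascending; labels[i] is the class of values[i].
    // Returns the midpoint between two neighbouring values that minimizes the
    // weighted Gini index, or NaN if all values are equal.
    static double bestSplitPoint(double[] values, int[] labels, int nClasses) {
        int n = values.length;
        int[] left = new int[nClasses];
        int[] right = new int[nClasses];
        for (int label : labels) {
            right[label]++;
        }
        double bestGini = Double.POSITIVE_INFINITY;
        double bestPoint = Double.NaN;
        for (int i = 0; i < n - 1; i++) {
            // Move record i from the right partition to the left one.
            left[labels[i]]++;
            right[labels[i]]--;
            if (values[i] == values[i + 1]) {
                continue; // no split point between equal values
            }
            double weighted = ((i + 1) * gini(left) + (n - i - 1) * gini(right)) / n;
            if (weighted < bestGini) {
                bestGini = weighted;
                bestPoint = (values[i] + values[i + 1]) / 2.0;
            }
        }
        return bestPoint;
    }
}
```

A full implementation would repeat this search for every candidate attribute in 'vAttributes' and keep the attribute whose best split has the lowest weighted Gini index.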
getName
public abstract java.lang.String getName()
-
Returns the name of the classifier that appears in the Combobox for selection.
classifyTree
public final void classifyTree(Step s)
-
Constructs a tree for the data corresponding to 'Step s'.
-
Parameters:
-
s - Step which becomes the root node of the tree.
classifyLevels
public final void classifyLevels(Step s,
int nLevels)
-
Constructs a tree with 'nLevels' levels for the data corresponding to 'Step
s'.
-
Parameters:
-
s - Root of the tree.
-
nLevels - Number of levels.