The BinaryNumericSplit class implements the numeric feature splitting strategy devised by Gama, Rocha, and Medas.
Public Types
typedef BinaryNumericSplitInfo< ObservationType > | SplitInfo |
The splitting information required by the BinaryNumericSplit.
Public Member Functions
BinaryNumericSplit (const size_t numClasses=0)
Create the BinaryNumericSplit object with the given number of classes.
BinaryNumericSplit (const size_t numClasses, const BinaryNumericSplit &other)
Create the BinaryNumericSplit object with the given number of classes, using information from the given other split for other parameters.
void | EvaluateFitnessFunction (double &bestFitness, double &secondBestFitness) |
Given the points seen so far, evaluate the fitness function, returning the best possible gain of a binary split.
size_t | MajorityClass () const |
The majority class of the points seen so far.
double | MajorityProbability () const |
The probability of the majority class given the points seen so far.
size_t | NumChildren () const |
template < typename Archive >
void | serialize (Archive &ar, const uint32_t) |
Serialize the object.
void | Split (arma::Col< size_t > &childMajorities, SplitInfo &splitInfo) |
Given that a split should happen, return the majority classes of the (two) children and an initialized SplitInfo object.
void | Train (ObservationType value, const size_t label) |
Train on the given value with the given label.
The BinaryNumericSplit class implements the numeric feature splitting strategy devised by Gama, Rocha, and Medas in the following paper:
This splitting procedure builds a binary tree on the points it has seen so far; EvaluateFitnessFunction() then returns the best possible split in O(n) time, where n is the number of samples seen so far. Every split of this type produces exactly two children: one for values less than the split point, and one for values greater than or equal to the split point. The Train() function should take O(1) time.
FitnessFunction | Fitness function to use for calculating gain. |
ObservationType | Type of observation used by this dimension. |
Definition at line 47 of file binary_numeric_split.hpp.
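The bookkeeping described above can be pictured with a small standalone sketch. It is not mlpack's implementation: the struct name SketchNumericSplit and its members are hypothetical, observations are assumed to be double, and labels are assumed to lie in [0, numClasses). Note that inserting into a std::multimap is logarithmic in its size, so this sketch does not reproduce the constant-time Train() cost stated above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for the state a binary numeric split might keep.
// This is an illustration only, not the mlpack class.
struct SketchNumericSplit
{
  // Observed values in sorted order, each mapped to its label.
  std::multimap<double, std::size_t> sorted;
  // Number of points seen for each class.
  std::vector<std::size_t> classCounts;

  explicit SketchNumericSplit(const std::size_t numClasses) :
      classCounts(numClasses, 0) { }

  // Record one labeled observation.
  void Train(const double value, const std::size_t label)
  {
    assert(label < classCounts.size());
    sorted.insert({value, label});
    ++classCounts[label];
  }

  // Most frequent class so far; ties go to the lowest class index.
  std::size_t MajorityClass() const
  {
    return static_cast<std::size_t>(
        std::max_element(classCounts.begin(), classCounts.end()) -
        classCounts.begin());
  }

  // Fraction of the points seen so far that belong to the majority class.
  double MajorityProbability() const
  {
    if (sorted.empty())
      return 0.0;
    return static_cast<double>(classCounts[MajorityClass()]) /
        static_cast<double>(sorted.size());
  }
};
```

Keeping the values sorted at insertion time is what allows a later evaluation pass to visit candidate split points in order with a single traversal.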
typedef BinaryNumericSplitInfo<ObservationType> SplitInfo
The splitting information required by the BinaryNumericSplit.
Definition at line 51 of file binary_numeric_split.hpp.
BinaryNumericSplit(const size_t numClasses = 0)
Create the BinaryNumericSplit object with the given number of classes.
numClasses | Number of classes in dataset. |
BinaryNumericSplit(const size_t numClasses, const BinaryNumericSplit<FitnessFunction, ObservationType>& other)
Create the BinaryNumericSplit object with the given number of classes, using information from the given other split for other parameters.
In this case, there are no other parameters, but this function is required by the HoeffdingTree class.
void EvaluateFitnessFunction(double& bestFitness, double& secondBestFitness)
Given the points seen so far, evaluate the fitness function, returning the best possible gain of a binary split.
Note that this takes O(n) time, where n is the number of points seen so far, so it can be slow once many points have been seen.
The fitness value of the best possible split will be stored in bestFitness, and the fitness value of the second best possible split will be stored in secondBestFitness.
bestFitness | Fitness function value for best possible split. |
secondBestFitness | Fitness function value for second best possible split. |
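A linear-time evaluation of this kind can be sketched as a single sweep over the sorted points. The code below is an assumption-laden illustration rather than mlpack's code: SketchEvaluate and GiniImpurity are hypothetical names, Gini gain stands in for the FitnessFunction template parameter, observations are assumed to be double, and both outputs start at 0 (the gain of not splitting).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Gini impurity of a set described by per-class counts and their total.
inline double GiniImpurity(const std::vector<std::size_t>& counts,
                           const std::size_t total)
{
  if (total == 0)
    return 0.0;
  double impurity = 1.0;
  for (const std::size_t c : counts)
  {
    const double p = static_cast<double>(c) / static_cast<double>(total);
    impurity -= p * p;
  }
  return impurity;
}

// Sweep the sorted points once and report the two largest gains found.
// Hypothetical helper; labels are assumed to lie in [0, numClasses).
inline void SketchEvaluate(const std::multimap<double, std::size_t>& sorted,
                           const std::size_t numClasses,
                           double& bestFitness,
                           double& secondBestFitness)
{
  bestFitness = 0.0;
  secondBestFitness = 0.0;
  const std::size_t n = sorted.size();
  if (n < 2)
    return;

  // Start with every point in the right child.
  std::vector<std::size_t> left(numClasses, 0), right(numClasses, 0);
  for (const auto& point : sorted)
  {
    assert(point.second < numClasses);
    ++right[point.second];
  }
  const double parentImpurity = GiniImpurity(right, n);

  std::size_t numLeft = 0;
  auto it = sorted.begin();
  while (it != sorted.end())
  {
    // Move one point from the right child to the left child.
    const double value = it->first;
    ++left[it->second];
    --right[it->second];
    ++numLeft;
    ++it;

    // A split point only fits between two distinct values.
    if (it == sorted.end() || it->first == value)
      continue;

    const double weighted =
        (static_cast<double>(numLeft) * GiniImpurity(left, numLeft) +
         static_cast<double>(n - numLeft) * GiniImpurity(right, n - numLeft)) /
        static_cast<double>(n);
    const double gain = parentImpurity - weighted;

    if (gain > bestFitness)
    {
      secondBestFitness = bestFitness;
      bestFitness = gain;
    }
    else if (gain > secondBestFitness)
    {
      secondBestFitness = gain;
    }
  }
}
```

Because the class counts on each side are updated incrementally, each candidate costs time proportional to the number of classes rather than to n, which is what keeps the whole pass linear in the number of points for a fixed number of classes.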
size_t MajorityClass() const
The majority class of the points seen so far.
Referenced by BinaryNumericSplit< FitnessFunction, ObservationType >::NumChildren().
double MajorityProbability() const
The probability of the majority class given the points seen so far.
Referenced by BinaryNumericSplit< FitnessFunction, ObservationType >::NumChildren().
inline size_t NumChildren() const
Definition at line 93 of file binary_numeric_split.hpp.
References BinaryNumericSplit< FitnessFunction, ObservationType >::MajorityClass(), BinaryNumericSplit< FitnessFunction, ObservationType >::MajorityProbability(), BinaryNumericSplit< FitnessFunction, ObservationType >::serialize(), and BinaryNumericSplit< FitnessFunction, ObservationType >::Split().
void serialize(Archive& ar, const uint32_t)
Serialize the object.
Referenced by BinaryNumericSplit< FitnessFunction, ObservationType >::NumChildren().
void Split(arma::Col<size_t>& childMajorities, SplitInfo& splitInfo)
Given that a split should happen, return the majority classes of the (two) children and an initialized SplitInfo object.
childMajorities | Majority classes of the children after the split. |
splitInfo | Split information. |
Referenced by BinaryNumericSplit< FitnessFunction, ObservationType >::NumChildren().
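To make the notion of child majorities concrete, here is a minimal sketch that computes them for a given split point. It is illustrative only: SketchChildMajorities is a hypothetical helper, a std::vector stands in for arma::Col< size_t >, observations are assumed to be double, and the split point is passed in directly instead of being carried by a SplitInfo object.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Returns two entries: the majority class of the child holding values less
// than splitPoint, then the majority class of the child holding values
// greater than or equal to splitPoint. Ties go to the lowest class index.
inline std::vector<std::size_t> SketchChildMajorities(
    const std::multimap<double, std::size_t>& sorted,
    const std::size_t numClasses,
    const double splitPoint)
{
  std::vector<std::size_t> left(numClasses, 0), right(numClasses, 0);
  for (const auto& point : sorted)
  {
    assert(point.second < numClasses);
    if (point.first < splitPoint)
      ++left[point.second];
    else
      ++right[point.second];
  }

  // Index of the largest count in a vector of per-class counts.
  const auto majority = [](const std::vector<std::size_t>& counts)
  {
    return static_cast<std::size_t>(
        std::max_element(counts.begin(), counts.end()) - counts.begin());
  };

  return { majority(left), majority(right) };
}
```

The two-way partition mirrors the convention stated in the class description: values below the split point go to one child, and all other values go to the second child.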
void Train(ObservationType value, const size_t label)
Train on the given value with the given label.
value | The value to train on. |
label | The label to train on. |