The AllCategoricalSplit is a splitting function that will split categorical features into many children: one child for each category.
Classes | |
class | AuxiliarySplitInfo |
Static Public Member Functions | |
template<typename ElemType> | |
static size_t | CalculateDirection (const ElemType &point, const double &splitInfo, const AuxiliarySplitInfo &) |
Calculate the direction a point should percolate to. | |
static size_t | NumChildren (const double &splitInfo, const AuxiliarySplitInfo &) |
Return the number of children in the split. | |
template<bool UseWeights, typename VecType, typename LabelsType, typename WeightVecType> | |
static double | SplitIfBetter (const double bestGain, const VecType &data, const size_t numCategories, const LabelsType &labels, const size_t numClasses, const WeightVecType &weights, const size_t minimumLeafSize, const double minimumGainSplit, arma::vec &splitInfo, AuxiliarySplitInfo &aux) |
Check if we can split a node (classification overload). | |
template<bool UseWeights, typename VecType, typename ResponsesType, typename WeightVecType> | |
static double | SplitIfBetter (const double bestGain, const VecType &data, const size_t numCategories, const ResponsesType &responses, const WeightVecType &weights, const size_t minimumLeafSize, const double minimumGainSplit, double &splitInfo, AuxiliarySplitInfo &aux, FitnessFunction &fitnessFunction) |
Check if we can split a node (regression overload). | |
The AllCategoricalSplit is a splitting function that will split categorical features into many children: one child for each category.
This is a generic splitting strategy and can be used for both regression and classification trees.
FitnessFunction | Fitness function to evaluate gain with. |
Definition at line 30 of file all_categorical_split.hpp.
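To illustrate what "one child for each category" means, here is a minimal standalone sketch. It does not use the library; `PartitionByCategory` is a hypothetical helper, and it assumes categories are encoded as the integers 0 to numCategories - 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the documented class: group point indices
// by category so that child k receives every point whose category is k.
std::vector<std::vector<std::size_t>> PartitionByCategory(
    const std::vector<std::size_t>& categories,
    const std::size_t numCategories)
{
  std::vector<std::vector<std::size_t>> children(numCategories);
  for (std::size_t i = 0; i < categories.size(); ++i)
  {
    // Assumption: categories are encoded as 0, 1, ..., numCategories - 1.
    assert(categories[i] < numCategories);
    children[categories[i]].push_back(i);
  }
  return children;
}
```

Unlike a binary split, the number of children here always equals the number of categories, so a category that no point takes still produces an (empty) child.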
CalculateDirection() | static |
Calculate the direction a point should percolate to.
point | The point to use. |
splitInfo | Auxiliary information for the split. |
aux | Auxiliary information for the split (unused; the parameter is unnamed in the signature). |
NumChildren() | static |
Return the number of children in the split.
splitInfo | Auxiliary information for the split. |
aux | Auxiliary information for the split (unused; the parameter is unnamed in the signature). |
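The two accessors above can be pictured with the following standalone sketch. It is not the library's code: `DirectionSketch` and `NumChildrenSketch` are hypothetical names, and the sketch assumes that categories are stored as non-negative integer values in a double and that splitInfo holds the number of children, as described for SplitIfBetter.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for CalculateDirection(): with one child per
// category, the child index is simply the point's category value.
std::size_t DirectionSketch(const double point)
{
  // Assumption: the category is a non-negative integer stored as a double.
  assert(point >= 0.0);
  return static_cast<std::size_t>(point);
}

// Hypothetical stand-in for NumChildren(): assumes splitInfo stores the
// number of children recorded when the split was made.
std::size_t NumChildrenSketch(const double splitInfo)
{
  return static_cast<std::size_t>(splitInfo);
}
```

Under these assumptions, a valid direction is always strictly smaller than the number of children.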
SplitIfBetter() [1/2] | static |
Check if we can split a node.
If we can split a node in a way that improves on 'bestGain', then we return the improved gain. Otherwise we return the value 'bestGain'. If a split is made, then splitInfo and aux may be modified. For this particular split type, aux will be empty and splitInfo will store the number of children of the node.
This overload is used only for classification.
bestGain | Best gain seen so far (we'll only split if we find gain better than this). |
data | The values of the data points in the single dimension being checked for a split. |
numCategories | Number of categories in the categorical data. |
labels | Labels for each point. |
numClasses | Number of classes in the dataset. |
weights | Weights associated with labels. |
minimumLeafSize | Minimum number of points in a leaf node for splitting. |
minimumGainSplit | Minimum gain required for a split to be made. |
splitInfo | Stores split information on a successful split. |
aux | Auxiliary split information, which may be modified on a successful split. |
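The contract described above (return an improved gain or fall back to bestGain) can be sketched as follows. This is an unweighted, standalone illustration and not the library's implementation: `NegGini` and `SplitIfBetterSketch` are hypothetical names, the fitness function is assumed to be a negated Gini impurity (higher is better, 0 is a pure node), and splitInfo is simplified to a single double.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fitness: negated Gini impurity from per-class counts.
double NegGini(const std::vector<std::size_t>& classCounts,
               const std::size_t total)
{
  if (total == 0)
    return 0.0;
  double impurity = 1.0;
  for (const std::size_t c : classCounts)
  {
    const double p = double(c) / double(total);
    impurity -= p * p;
  }
  return -impurity;
}

// Hypothetical, unweighted sketch of the classification split check.
double SplitIfBetterSketch(const double bestGain,
                           const std::vector<std::size_t>& data,
                           const std::size_t numCategories,
                           const std::vector<std::size_t>& labels,
                           const std::size_t numClasses,
                           const std::size_t minimumLeafSize,
                           const double minimumGainSplit,
                           double& splitInfo)
{
  assert(data.size() == labels.size());
  if (data.empty())
    return bestGain;

  // counts[k][c]: number of points with category k and class c.
  std::vector<std::vector<std::size_t>> counts(
      numCategories, std::vector<std::size_t>(numClasses, 0));
  std::vector<std::size_t> childSizes(numCategories, 0);
  for (std::size_t i = 0; i < data.size(); ++i)
  {
    assert(data[i] < numCategories && labels[i] < numClasses);
    ++counts[data[i]][labels[i]];
    ++childSizes[data[i]];
  }

  // Every child must hold at least minimumLeafSize points.
  for (const std::size_t size : childSizes)
    if (size < minimumLeafSize)
      return bestGain;

  // Overall gain: child gains weighted by the fraction of points they hold.
  double gain = 0.0;
  for (std::size_t k = 0; k < numCategories; ++k)
  {
    gain += (double(childSizes[k]) / double(data.size())) *
        NegGini(counts[k], childSizes[k]);
  }

  if (gain > bestGain + minimumGainSplit)
  {
    splitInfo = double(numCategories); // One child per category.
    return gain;
  }
  return bestGain;
}
```

Because every category becomes a child, there is no threshold to search for: the only decision is whether the single candidate split is good enough.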
SplitIfBetter() [2/2] | static |
Check if we can split a node.
If we can split a node in a way that improves on 'bestGain', then we return the improved gain. Otherwise we return the value 'bestGain'. If a split is made, then splitInfo and aux may be modified. For this particular split type, aux will be empty and splitInfo will store the number of children of the node.
This overload is used only for regression.
bestGain | Best gain seen so far (we'll only split if we find gain better than this). |
data | The values of the data points in the single dimension being checked for a split. |
numCategories | Number of categories in the categorical data. |
responses | Responses for each point. |
weights | Weights associated with responses. |
minimumLeafSize | Minimum number of points in a leaf node for splitting. |
minimumGainSplit | Minimum gain required for a split to be made. |
splitInfo | Stores split information on a successful split. |
aux | Auxiliary split information, which may be modified on a successful split. |
fitnessFunction | The FitnessFunction object instance. It is used to evaluate the gain for the split. |
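The regression overload differs mainly in that the gain comes from a fitness function object passed by the caller. The standalone sketch below illustrates that idea without weights; it is not the library's implementation. `NegMSE` and `RegressionSplitSketch` are hypothetical names, and the fitness object is assumed to be callable on the responses of one child and to return a value where higher is better.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fitness object: negated mean squared error around the mean.
struct NegMSE
{
  double operator()(const std::vector<double>& responses) const
  {
    if (responses.empty())
      return 0.0;
    double mean = 0.0;
    for (const double r : responses)
      mean += r;
    mean /= double(responses.size());
    double mse = 0.0;
    for (const double r : responses)
      mse += (r - mean) * (r - mean);
    return -mse / double(responses.size());
  }
};

// Hypothetical, unweighted sketch of the regression split check.
template<typename Fitness>
double RegressionSplitSketch(const double bestGain,
                             const std::vector<std::size_t>& data,
                             const std::size_t numCategories,
                             const std::vector<double>& responses,
                             const std::size_t minimumLeafSize,
                             const double minimumGainSplit,
                             double& splitInfo,
                             Fitness& fitnessFunction)
{
  assert(data.size() == responses.size());
  if (data.empty())
    return bestGain;

  // Collect the responses that would fall into each child.
  std::vector<std::vector<double>> children(numCategories);
  for (std::size_t i = 0; i < data.size(); ++i)
  {
    assert(data[i] < numCategories);
    children[data[i]].push_back(responses[i]);
  }

  double gain = 0.0;
  for (const std::vector<double>& child : children)
  {
    // Every child must hold at least minimumLeafSize points.
    if (child.size() < minimumLeafSize)
      return bestGain;
    gain += (double(child.size()) / double(data.size())) *
        fitnessFunction(child);
  }

  if (gain > bestGain + minimumGainSplit)
  {
    splitInfo = double(numCategories); // One child per category.
    return gain;
  }
  return bestGain;
}
```

Passing the fitness object by reference, as the documented signature does, lets the caller supply any gain measure without changing the split logic.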