AllCategoricalSplit< FitnessFunction > Class Template Reference

The AllCategoricalSplit is a splitting function that will split categorical features into many children: one child for each category.

Classes

class  AuxiliarySplitInfo
 

Static Public Member Functions

template<typename ElemType >
static size_t CalculateDirection (const ElemType &point, const double &splitInfo, const AuxiliarySplitInfo &)
 Calculate the direction a point should percolate to.

 
static size_t NumChildren (const double &splitInfo, const AuxiliarySplitInfo &)
 Return the number of children in the split.

 
template<bool UseWeights, typename VecType , typename LabelsType , typename WeightVecType >
static double SplitIfBetter (const double bestGain, const VecType &data, const size_t numCategories, const LabelsType &labels, const size_t numClasses, const WeightVecType &weights, const size_t minimumLeafSize, const double minimumGainSplit, arma::vec &splitInfo, AuxiliarySplitInfo &aux)
 Check if we can split a node (classification overload).

 
template<bool UseWeights, typename VecType , typename ResponsesType , typename WeightVecType >
static double SplitIfBetter (const double bestGain, const VecType &data, const size_t numCategories, const ResponsesType &responses, const WeightVecType &weights, const size_t minimumLeafSize, const double minimumGainSplit, double &splitInfo, AuxiliarySplitInfo &aux, FitnessFunction &fitnessFunction)
 Check if we can split a node (regression overload).

 

Detailed Description


template<typename FitnessFunction >

class mlpack::tree::AllCategoricalSplit< FitnessFunction >

The AllCategoricalSplit is a splitting function that will split categorical features into many children: one child for each category.

This is a generic splitting strategy and can be used for both regression and classification trees.

Template Parameters
FitnessFunction: Fitness function to evaluate gain with.

Definition at line 30 of file all_categorical_split.hpp.

Member Function Documentation

◆ CalculateDirection()

static size_t CalculateDirection (const ElemType &point, const double &splitInfo, const AuxiliarySplitInfo &)

Calculate the direction a point should percolate to.

Parameters
point: The point to use.
splitInfo: Split information; for this split type it stores the number of children of the node.
aux: Auxiliary information for the split (unused; the parameter is unnamed in the signature).

◆ NumChildren()

static size_t NumChildren (const double &splitInfo, const AuxiliarySplitInfo &)

Return the number of children in the split.

Parameters
splitInfo: Split information; for this split type it stores the number of children of the node.
aux: Auxiliary information for the split (unused; the parameter is unnamed in the signature).
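
As a rough illustration of how CalculateDirection() and NumChildren() relate to splitInfo, the following standalone sketch does not use mlpack at all. The names AuxInfoSketch, CalculateDirectionSketch, and NumChildrenSketch are hypothetical, and the sketch assumes that the category value of a point is used directly as the index of the child it descends into, which this page does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for AuxiliarySplitInfo; this split type keeps no
// auxiliary state, so the struct is empty.
struct AuxInfoSketch { };

// Assumption: the category stored in `point` is itself the child index.
// splitInfo and aux are accepted only to mirror the documented signature.
template<typename ElemType>
std::size_t CalculateDirectionSketch(const ElemType& point,
                                     const double& /* splitInfo */,
                                     const AuxInfoSketch& /* aux */)
{
  assert(point >= 0); // Categories are expected to be non-negative indices.
  return static_cast<std::size_t>(point);
}

// splitInfo stores the number of children (one child per category).
std::size_t NumChildrenSketch(const double& splitInfo,
                              const AuxInfoSketch& /* aux */)
{
  return static_cast<std::size_t>(splitInfo);
}
```

Under these assumptions, a node split on a feature with four categories would have splitInfo equal to 4, and a point whose category is 2 would descend into child 2.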

◆ SplitIfBetter() [1/2]

static double SplitIfBetter (const double bestGain, const VecType &data, const size_t numCategories, const LabelsType &labels, const size_t numClasses, const WeightVecType &weights, const size_t minimumLeafSize, const double minimumGainSplit, arma::vec &splitInfo, AuxiliarySplitInfo &aux)

Check if we can split a node.

If we can split a node in a way that improves on 'bestGain', then we return the improved gain. Otherwise we return the value 'bestGain'. If a split is made, then splitInfo and aux may be modified. For this particular split type, aux will be empty and splitInfo will store the number of children of the node.

This overload is used only for classification.

Parameters
bestGain: Best gain seen so far (we'll only split if we find gain better than this).
data: The values of the dimension to check for a split in, one per point.
numCategories: Number of categories in the categorical data.
labels: Labels for each point.
numClasses: Number of classes in the dataset.
weights: Weights associated with labels.
minimumLeafSize: Minimum number of points that each leaf node must contain for the split to be made.
minimumGainSplit: Minimum gain required for the node to split.
splitInfo: Stores split information on a successful split.
aux: Auxiliary split information, which may be modified on a successful split.
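
The behaviour described above can be sketched without mlpack or Armadillo. This is a minimal illustration, not the library's implementation: GiniGainSketch and SplitIfBetterSketch are hypothetical names, weights are ignored, splitInfo is a plain double rather than an arma::vec, and the overall gain is assumed to be the size-weighted average of the per-child gains.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fitness: negative Gini impurity, so 0 is a pure node and
// more negative values are more mixed.
double GiniGainSketch(const std::vector<double>& classCounts, double total)
{
  if (total <= 0.0)
    return 0.0;
  double impurity = 1.0;
  for (double c : classCounts)
  {
    const double p = c / total;
    impurity -= p * p;
  }
  return -impurity;
}

// data[i] is the category of point i; labels[i] is its class.
double SplitIfBetterSketch(const double bestGain,
                           const std::vector<std::size_t>& data,
                           const std::size_t numCategories,
                           const std::vector<std::size_t>& labels,
                           const std::size_t numClasses,
                           const std::size_t minimumLeafSize,
                           const double minimumGainSplit,
                           double& splitInfo)
{
  assert(data.size() == labels.size());
  if (data.empty())
    return bestGain;

  // Count points per (category, class) pair and per category.
  std::vector<std::vector<double>> counts(
      numCategories, std::vector<double>(numClasses, 0.0));
  std::vector<std::size_t> childSizes(numCategories, 0);
  for (std::size_t i = 0; i < data.size(); ++i)
  {
    counts[data[i]][labels[i]] += 1.0;
    ++childSizes[data[i]];
  }

  // Every child (one per category) must reach the minimum leaf size.
  for (std::size_t size : childSizes)
    if (size < minimumLeafSize)
      return bestGain;

  // Assumed aggregation: size-weighted average of the child gains.
  double overallGain = 0.0;
  for (std::size_t k = 0; k < numCategories; ++k)
  {
    const double fraction = double(childSizes[k]) / double(data.size());
    overallGain += fraction * GiniGainSketch(counts[k], double(childSizes[k]));
  }

  if (overallGain > bestGain + minimumGainSplit)
  {
    splitInfo = static_cast<double>(numCategories); // Number of children.
    return overallGain;
  }
  return bestGain; // No improvement; splitInfo is left untouched.
}
```

In this sketch, a call that returns a value different from bestGain signals that a split was made and that splitInfo now holds the number of children.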

◆ SplitIfBetter() [2/2]

static double SplitIfBetter (const double bestGain, const VecType &data, const size_t numCategories, const ResponsesType &responses, const WeightVecType &weights, const size_t minimumLeafSize, const double minimumGainSplit, double &splitInfo, AuxiliarySplitInfo &aux, FitnessFunction &fitnessFunction)

Check if we can split a node.

If we can split a node in a way that improves on 'bestGain', then we return the improved gain. Otherwise we return the value 'bestGain'. If a split is made, then splitInfo and aux may be modified. For this particular split type, aux will be empty and splitInfo will store the number of children of the node.

This overload is used only for regression.

Parameters
bestGain: Best gain seen so far (we'll only split if we find gain better than this).
data: The values of the dimension to check for a split in, one per point.
numCategories: Number of categories in the categorical data.
responses: Responses for each point.
weights: Weights associated with responses.
minimumLeafSize: Minimum number of points that each leaf node must contain for the split to be made.
minimumGainSplit: Minimum gain required for the node to split.
splitInfo: Stores split information on a successful split.
aux: Auxiliary split information, which may be modified on a successful split.
fitnessFunction: The FitnessFunction object instance. It is used to evaluate the gain for the split.
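
The regression overload can be illustrated in the same spirit. The sketch below is standalone and hypothetical: NegativeVarianceSketch stands in for a FitnessFunction and its Evaluate() signature is an assumption rather than mlpack's interface, weights are ignored, and the overall gain is again assumed to be the size-weighted average of the per-child gains.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fitness function object: negative variance of the responses,
// so 0 means all responses in the node are identical.
struct NegativeVarianceSketch
{
  double Evaluate(const std::vector<double>& r) const
  {
    if (r.empty())
      return 0.0;
    double mean = 0.0;
    for (double v : r)
      mean += v;
    mean /= static_cast<double>(r.size());
    double sumSq = 0.0;
    for (double v : r)
      sumSq += (v - mean) * (v - mean);
    return -sumSq / static_cast<double>(r.size());
  }
};

// data[i] is the category of point i; responses[i] is its target value.
template<typename FitnessFunction>
double RegressionSplitIfBetterSketch(const double bestGain,
                                     const std::vector<std::size_t>& data,
                                     const std::size_t numCategories,
                                     const std::vector<double>& responses,
                                     const std::size_t minimumLeafSize,
                                     const double minimumGainSplit,
                                     double& splitInfo,
                                     FitnessFunction& fitnessFunction)
{
  assert(data.size() == responses.size());
  if (data.empty())
    return bestGain;

  // Group the responses by category: one prospective child per category.
  std::vector<std::vector<double>> children(numCategories);
  for (std::size_t i = 0; i < data.size(); ++i)
    children[data[i]].push_back(responses[i]);

  // Every child must reach the minimum leaf size.
  for (const std::vector<double>& child : children)
    if (child.size() < minimumLeafSize)
      return bestGain;

  // Assumed aggregation: size-weighted average of the child gains.
  double overallGain = 0.0;
  for (const std::vector<double>& child : children)
  {
    const double fraction = double(child.size()) / double(data.size());
    overallGain += fraction * fitnessFunction.Evaluate(child);
  }

  if (overallGain > bestGain + minimumGainSplit)
  {
    splitInfo = static_cast<double>(numCategories); // Number of children.
    return overallGain;
  }
  return bestGain; // No improvement; splitInfo is left untouched.
}
```

Passing the fitness function as an object, as the documented signature does, lets the caller supply a stateful evaluator; the sketch only needs it to expose an Evaluate() member.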

The documentation for this class was generated from the following file: