RandomBinaryNumericSplit< FitnessFunction > Class Template Reference

The RandomBinaryNumericSplit is a splitting function for decision trees that will split based on a randomly selected point between the minimum and maximum value of the numerical dimension.

Classes

class  AuxiliarySplitInfo
 

Static Public Member Functions

template<typename ElemType >
static size_t CalculateDirection (const ElemType &point, const double &splitInfo, const AuxiliarySplitInfo &)
 Given a point, calculate which child it should go to (left or right).

 
static size_t NumChildren (const double &, const AuxiliarySplitInfo &)
 Returns 2, since the binary split always has two children.

 
template<bool UseWeights, typename VecType , typename WeightVecType >
static double SplitIfBetter (const double bestGain, const VecType &data, const arma::Row< size_t > &labels, const size_t numClasses, const WeightVecType &weights, const size_t minimumLeafSize, const double minimumGainSplit, arma::vec &splitInfo, AuxiliarySplitInfo &aux, const bool splitIfBetterGain=false)
 Check if we can split a node (classification overload).

 
template<bool UseWeights, typename VecType , typename WeightVecType >
static double SplitIfBetter (const double bestGain, const VecType &data, const arma::rowvec &responses, const WeightVecType &weights, const size_t minimumLeafSize, const double minimumGainSplit, double &splitInfo, AuxiliarySplitInfo &aux, FitnessFunction &fitnessFunction, const bool splitIfBetterGain=false)
 Check if we can split a node (regression overload).

 

Detailed Description


template<typename FitnessFunction >

class mlpack::tree::RandomBinaryNumericSplit< FitnessFunction >

The RandomBinaryNumericSplit is a splitting function for decision trees that will split based on a randomly selected point between the minimum and maximum value of the numerical dimension.
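
As a rough illustration of the idea described above (not the library's implementation), a split point can be drawn uniformly between the minimum and maximum of one dimension. RandomSplitPoint is a hypothetical helper, and std::vector stands in for the Armadillo vector types; the uniform distribution is an assumption, since this page only says the point is randomly selected.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper (not part of mlpack): draw a split point at random
// between the smallest and largest value observed in one dimension.
double RandomSplitPoint(const std::vector<double>& dimension,
                        std::mt19937& rng)
{
  assert(!dimension.empty());

  // Find the range of the dimension in a single pass.
  const auto range = std::minmax_element(dimension.begin(), dimension.end());
  const double minValue = *range.first;
  const double maxValue = *range.second;

  // A constant dimension offers only one possible split value.
  if (minValue == maxValue)
    return minValue;

  // Assumption: the split point is drawn uniformly from the range.
  std::uniform_real_distribution<double> dist(minValue, maxValue);
  return dist(rng);
}
```

Calling RandomSplitPoint(values, rng) with a seeded std::mt19937 gives a reproducible candidate that always lies within the observed range of values.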

Template Parameters
FitnessFunction  Fitness function to use to calculate gain.

Definition at line 28 of file random_binary_numeric_split.hpp.

Member Function Documentation

◆ CalculateDirection()

static size_t CalculateDirection ( const ElemType &  point,
const double &  splitInfo,
const AuxiliarySplitInfo &  
)
static

Given a point, calculate which child it should go to (left or right).

Parameters
point  Point to calculate direction of.
splitInfo  Auxiliary information for the split.
aux  Auxiliary information for the split (unused).
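
The left/right decision can be sketched as a single comparison against the split value. This is a minimal stand-in under stated assumptions: Direction and AssignChildren are hypothetical names, child 0 is taken to mean "left" and child 1 "right", and points equal to the split value are assumed to go left.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CalculateDirection(): returns 0 for the left
// child and 1 for the right child.
// Assumption: a point equal to the split value goes left.
template<typename ElemType>
std::size_t Direction(const ElemType& point, const double& splitInfo)
{
  return (point <= splitInfo) ? 0 : 1;
}

// Hypothetical helper: compute the child index of every point in one
// dimension for a given split value.
std::vector<std::size_t> AssignChildren(const std::vector<double>& points,
                                        const double splitInfo)
{
  std::vector<std::size_t> children;
  children.reserve(points.size());
  for (const double p : points)
  {
    const std::size_t child = Direction(p, splitInfo);
    assert(child < 2); // A binary split has exactly two children.
    children.push_back(child);
  }
  return children;
}
```

The result of AssignChildren() always contains only the indices 0 and 1, which matches the fixed child count reported by NumChildren().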

Referenced by RandomBinaryNumericSplit< FitnessFunction >::NumChildren().

◆ NumChildren()

static size_t NumChildren ( const double &  ,
const AuxiliarySplitInfo &  
)
inline static

Returns 2, since the binary split always has two children.

Parameters
splitInfo  Auxiliary information for the split (unused).
aux  Auxiliary split information (unused).

Definition at line 136 of file random_binary_numeric_split.hpp.

References RandomBinaryNumericSplit< FitnessFunction >::CalculateDirection().

◆ SplitIfBetter() [1/2]

static double SplitIfBetter ( const double  bestGain,
const VecType &  data,
const arma::Row< size_t > &  labels,
const size_t  numClasses,
const WeightVecType &  weights,
const size_t  minimumLeafSize,
const double  minimumGainSplit,
arma::vec &  splitInfo,
AuxiliarySplitInfo &  aux,
const bool  splitIfBetterGain = false 
)
static

Check if we can split a node.

If we can split a node in a way that improves on 'bestGain', then we return the improved gain. Otherwise we return the value 'bestGain'. If a split is made, then splitInfo and aux may be modified.

This overload is used only for classification tasks.

@article{10.1007/s10994-006-6226-1,
author = {Geurts, Pierre and Ernst, Damien and Wehenkel, Louis},
title = {Extremely Randomized Trees},
year = {2006},
issue_date = {April 2006},
publisher = {Kluwer Academic Publishers},
address = {USA},
volume = {63},
number = {1},
issn = {0885-6125},
url = {https://doi.org/10.1007/s10994-006-6226-1},
doi = {10.1007/s10994-006-6226-1},
journal = {Mach. Learn.},
month = apr,
pages = {3–42},
numpages = {40},
}
Parameters
bestGain  Best gain seen so far (we'll only split if we find gain better than this).
data  The dimension of data points to check for a split in.
labels  Labels for each point.
numClasses  Number of classes in the dataset.
weights  Weights associated with labels.
minimumLeafSize  Minimum number of points in a leaf node for splitting.
minimumGainSplit  Minimum gain required to make a split.
splitInfo  Stores split information on a successful split.
aux  Auxiliary split information, which may be modified on a successful split.
splitIfBetterGain  When set to true, it will split only when gain is better than the current best gain. Otherwise, it always makes a split regardless of gain.
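
The contract described above (return the improved gain and fill splitInfo on success, otherwise return bestGain unchanged) can be sketched without mlpack as follows. Everything here is illustrative: SplitIfBetterSketch and NegGini are hypothetical names, weights are omitted, the candidate split value is passed in rather than drawn at random, and the gain is assumed to be the negative Gini impurity. The library's actual return conventions and checks may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical gain: negative Gini impurity, so 0 is a pure node and
// values closer to 0 are better.
double NegGini(const std::vector<std::size_t>& counts, const std::size_t total)
{
  if (total == 0)
    return 0.0;
  double impurity = 1.0;
  for (const std::size_t c : counts)
  {
    const double f = static_cast<double>(c) / total;
    impurity -= f * f;
  }
  return -impurity;
}

// Sketch of the documented contract for the classification overload.
double SplitIfBetterSketch(const double bestGain,
                           const std::vector<double>& data,
                           const std::vector<std::size_t>& labels,
                           const std::size_t numClasses,
                           const std::size_t minimumLeafSize,
                           const double minimumGainSplit,
                           const double candidate, // assumed drawn at random
                           double& splitInfo,
                           const bool splitIfBetterGain = false)
{
  assert(data.size() == labels.size());

  // Count the class labels that fall on each side of the candidate.
  std::vector<std::size_t> left(numClasses, 0), right(numClasses, 0);
  std::size_t nLeft = 0, nRight = 0;
  for (std::size_t i = 0; i < data.size(); ++i)
  {
    assert(labels[i] < numClasses);
    if (data[i] <= candidate) { ++left[labels[i]]; ++nLeft; }
    else { ++right[labels[i]]; ++nRight; }
  }

  // Reject splits that leave a child smaller than the minimum leaf size
  // (at least one point per child).
  const std::size_t minimum = std::max<std::size_t>(minimumLeafSize, 1);
  if (nLeft < minimum || nRight < minimum)
    return bestGain;

  // Gain of the split: size-weighted gain of the two children.
  const double n = static_cast<double>(data.size());
  const double gain = (nLeft / n) * NegGini(left, nLeft) +
                      (nRight / n) * NegGini(right, nRight);

  // With splitIfBetterGain set, keep the split only if it is good enough.
  if (splitIfBetterGain && gain <= bestGain + minimumGainSplit)
    return bestGain;

  splitInfo = candidate; // Record the split on success.
  return gain;
}
```

With splitIfBetterGain left at false, the sketch accepts any candidate that satisfies the leaf-size constraint, which mirrors the "always makes a split regardless of gain" behaviour described for that parameter.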

◆ SplitIfBetter() [2/2]

static double SplitIfBetter ( const double  bestGain,
const VecType &  data,
const arma::rowvec &  responses,
const WeightVecType &  weights,
const size_t  minimumLeafSize,
const double  minimumGainSplit,
double &  splitInfo,
AuxiliarySplitInfo &  aux,
FitnessFunction &  fitnessFunction,
const bool  splitIfBetterGain = false 
)
static

Check if we can split a node.

If we can split a node in a way that improves on 'bestGain', then we return the improved gain. Otherwise we return the value 'bestGain'. If a split is made, then splitInfo and aux may be modified.

This overload is used only for regression tasks.

Parameters
bestGain  Best gain seen so far (we'll only split if we find gain better than this).
data  The dimension of data points to check for a split in.
responses  Responses for each point.
weights  Weights associated with responses.
minimumLeafSize  Minimum number of points in a leaf node for splitting.
minimumGainSplit  Minimum gain required to make a split.
splitInfo  Stores split information on a successful split.
aux  Auxiliary split information, which may be modified on a successful split.
fitnessFunction  The FitnessFunction object instance. It is used to evaluate the gain for the split.
splitIfBetterGain  When set to true, it will split only when gain is better than the current best gain. Otherwise, it always makes a split regardless of gain.
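
For the regression overload, the role of the fitnessFunction argument can be sketched in the same spirit. NegMSE and RegressionSplitSketch are hypothetical, the Evaluate() signature is an assumption made for this example rather than the interface mlpack expects, weights are omitted, and the candidate split value is passed in instead of being drawn at random.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fitness function: negative mean squared error around the
// mean response, so 0 is a perfect fit and values closer to 0 are better.
struct NegMSE
{
  double Evaluate(const std::vector<double>& responses) const
  {
    if (responses.empty())
      return 0.0;
    double mean = 0.0;
    for (const double r : responses) mean += r;
    mean /= responses.size();
    double sse = 0.0;
    for (const double r : responses) sse += (r - mean) * (r - mean);
    return -sse / responses.size();
  }
};

// Sketch of the documented contract for the regression overload.
template<typename FitnessFunction>
double RegressionSplitSketch(const double bestGain,
                             const std::vector<double>& data,
                             const std::vector<double>& responses,
                             const std::size_t minimumLeafSize,
                             const double minimumGainSplit,
                             const double candidate, // assumed drawn at random
                             double& splitInfo,
                             FitnessFunction& fitnessFunction,
                             const bool splitIfBetterGain = false)
{
  assert(data.size() == responses.size());

  // Partition the responses by which side of the candidate each point is on.
  std::vector<double> left, right;
  for (std::size_t i = 0; i < data.size(); ++i)
    (data[i] <= candidate ? left : right).push_back(responses[i]);

  // Each child must hold at least the minimum number of points.
  const std::size_t minimum = std::max<std::size_t>(minimumLeafSize, 1);
  if (left.size() < minimum || right.size() < minimum)
    return bestGain;

  // The fitness function object scores each child; weight by child size.
  const double n = static_cast<double>(data.size());
  const double gain = (left.size() / n) * fitnessFunction.Evaluate(left) +
                      (right.size() / n) * fitnessFunction.Evaluate(right);

  if (splitIfBetterGain && gain <= bestGain + minimumGainSplit)
    return bestGain;

  splitInfo = candidate; // Record the split on success.
  return gain;
}
```

Passing the fitness function as an object rather than calling it statically lets it carry state between calls, which is one plausible reason the regression overload takes an instance.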

The documentation for this class was generated from the following file: