mlpack::tree Namespace Reference

Trees and tree-building procedures. More...

Namespaces

 enumerate
 
 split
 

Classes

class  AllCategoricalSplit
 The AllCategoricalSplit is a splitting function that will split categorical features into many children: one child for each category. More...

 
class  AllDimensionSelect
 This dimension selection policy allows any dimension to be selected for splitting. More...

 
class  AxisParallelProjVector
 AxisParallelProjVector defines an axis-parallel projection vector. More...

 
class  BestBinaryNumericSplit
 The BestBinaryNumericSplit is a splitting function for decision trees that will exhaustively search a numeric dimension for the best binary split. More...

 
class  BinaryNumericSplit
 The BinaryNumericSplit class implements the numeric feature splitting strategy devised by Gama, Rocha, and Medas in the following paper: More...

 
class  BinaryNumericSplitInfo
 
class  BinarySpaceTree
 A binary space partitioning tree, such as a KD-tree or a ball tree. More...

 
class  CategoricalSplitInfo
 
class  CompareCosineNode
 
class  CosineTree
 
class  CoverTree
 A cover tree is a tree specifically designed to speed up nearest-neighbor computation in high-dimensional spaces. More...

 
class  DecisionTree
 This class implements a generic decision tree learner. More...

 
class  DecisionTreeRegressor
 This class implements a generic decision tree learner. More...

 
class  DiscreteHilbertValue
 The DiscreteHilbertValue class stores Hilbert values for all of the points in a RectangleTree node, and calculates Hilbert values for new points. More...

 
class  EmptyStatistic
 Empty statistic if you are not interested in storing statistics in your tree. More...

 
class  ExampleTree
 This is not an actual space tree but instead an example tree that exists to show and document all the functions that mlpack trees must implement. More...

 
class  FirstPointIsRoot
 This class is meant to be used as a choice for the policy class RootPointPolicy of the CoverTree class. More...

 
class  GiniGain
 The Gini gain, a measure of set purity usable as a fitness function (FitnessFunction) for decision trees. More...

 
class  GiniImpurity
 
class  GreedySingleTreeTraverser
 
struct  HasOptimizedBinarySplitForms
 
class  HilbertRTreeAuxiliaryInformation
 
class  HilbertRTreeDescentHeuristic
 This class chooses the best child of a node in a Hilbert R tree when inserting a new point. More...

 
class  HilbertRTreeSplit
 The splitting procedure for the Hilbert R tree. More...

 
class  HoeffdingCategoricalSplit
 This is the standard Hoeffding-bound categorical feature proposed in the paper below: More...

 
class  HoeffdingInformationGain
 
class  HoeffdingNumericSplit
 The HoeffdingNumericSplit class implements the numeric feature splitting strategy alluded to by Domingos and Hulten in the following paper: More...

 
class  HoeffdingTree
 The HoeffdingTree object represents all of the necessary information for a Hoeffding-bound-based decision tree. More...

 
class  HoeffdingTreeModel
 This class is a serializable Hoeffding tree model that can hold four different types of Hoeffding trees. More...

 
class  HyperplaneBase
 HyperplaneBase defines a splitting hyperplane based on a projection vector and projection value. More...

 
class  InformationGain
 The standard information gain criterion, used for calculating gain in decision trees. More...

 
struct  IsSpillTree
 
struct  IsSpillTree< tree::SpillTree< MetricType, StatisticType, MatType, HyperplaneType, SplitType > >
 
class  MADGain
 The MAD (Mean absolute deviation) gain, is a measure of set purity based on the deviation of dependent values present in the node. More...

 
class  MeanSpaceSplit
 
class  MeanSplit
 A binary space partitioning tree node is split into its left and right child. More...

 
class  MidpointSpaceSplit
 
class  MidpointSplit
 A binary space partitioning tree node is split into its left and right child. More...

 
class  MinimalCoverageSweep
 The MinimalCoverageSweep class finds a partition along which we can split a node according to the coverage of two resulting nodes. More...

 
class  MinimalSplitsNumberSweep
 The MinimalSplitsNumberSweep class finds a partition along which we can split a node according to the number of required splits of the node. More...

 
class  MSEGain
 The MSE (Mean squared error) gain, is a measure of set purity based on the variance of response values present in the node. More...

 
class  MultipleRandomDimensionSelect
 This dimension selection policy allows the selection from a few random dimensions. More...

 
class  NoAuxiliaryInformation
 
class  NumericSplitInfo
 
class  Octree
 
class  ProjVector
 ProjVector defines a general projection vector (not necessarily axis-parallel). More...

 
struct  QueueFrame
 
class  RandomBinaryNumericSplit
 The RandomBinaryNumericSplit is a splitting function for decision trees that will split based on a randomly selected point between the minimum and maximum value of the numerical dimension. More...

 
class  RandomDimensionSelect
 This dimension selection policy only selects one single random dimension. More...

 
class  RandomForest
 The RandomForest class provides an implementation of random forests, described in Breiman's seminal paper: More...

 
class  RectangleTree
 A rectangle type tree tree, such as an R-tree or X-tree. More...

 
class  RPlusPlusTreeAuxiliaryInformation
 
class  RPlusPlusTreeDescentHeuristic
 
class  RPlusPlusTreeSplitPolicy
 The RPlusPlusTreeSplitPolicy helps to determine the subtree into which we should insert a child of an intermediate node that is being split. More...

 
class  RPlusTreeDescentHeuristic
 
class  RPlusTreeSplit
 The RPlusTreeSplit class performs the split process of a node on overflow. More...

 
class  RPlusTreeSplitPolicy
 The RPlusPlusTreeSplitPolicy helps to determine the subtree into which we should insert a child of an intermediate node that is being split. More...

 
class  RPTreeMaxSplit
 This class splits a node by a random hyperplane. More...

 
class  RPTreeMeanSplit
 This class splits a binary space tree. More...

 
class  RStarTreeDescentHeuristic
 When descending a RectangleTree to insert a point, we need to have a way to choose a child node when the point isn't enclosed by any of them. More...

 
class  RStarTreeSplit
 A Rectangle Tree has new points inserted at the bottom. More...

 
class  RTreeDescentHeuristic
 When descending a RectangleTree to insert a point, we need to have a way to choose a child node when the point isn't enclosed by any of them. More...

 
class  RTreeSplit
 A Rectangle Tree has new points inserted at the bottom. More...

 
class  SpaceSplit
 
class  SpillTree
 A hybrid spill tree is a variant of binary space trees in which the children of a node can "spill over" each other, and contain shared datapoints. More...

 
class  TraversalInfo
 The TraversalInfo class holds traversal information which is used in dual-tree (and single-tree) traversals. More...

 
class  TreeTraits
 The TreeTraits class provides compile-time information on the characteristics of a given tree type. More...

 
class  TreeTraits< BinarySpaceTree< MetricType, StatisticType, MatType, bound::BallBound, SplitType > >
 This is a specialization of the TreeType class to the BallTree tree type. More...

 
class  TreeTraits< BinarySpaceTree< MetricType, StatisticType, MatType, bound::CellBound, SplitType > >
 This is a specialization of the TreeType class to the UBTree tree type. More...

 
class  TreeTraits< BinarySpaceTree< MetricType, StatisticType, MatType, bound::HollowBallBound, SplitType > >
 This is a specialization of the TreeType class to an arbitrary tree with HollowBallBound (currently only the vantage point tree is supported). More...

 
class  TreeTraits< BinarySpaceTree< MetricType, StatisticType, MatType, BoundType, RPTreeMaxSplit > >
 This is a specialization of the TreeType class to the max-split random projection tree. More...

 
class  TreeTraits< BinarySpaceTree< MetricType, StatisticType, MatType, BoundType, RPTreeMeanSplit > >
 This is a specialization of the TreeType class to the mean-split random projection tree. More...

 
class  TreeTraits< BinarySpaceTree< MetricType, StatisticType, MatType, BoundType, SplitType > >
 This is a specialization of the TreeTraits class to the BinarySpaceTree tree type. More...

 
class  TreeTraits< CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > >
 The specialization of the TreeTraits class for the CoverTree tree type. More...

 
class  TreeTraits< Octree< MetricType, StatisticType, MatType > >
 This is a specialization of the TreeTraits class to the Octree tree type. More...

 
class  TreeTraits< RectangleTree< MetricType, StatisticType, MatType, RPlusTreeSplit< SplitPolicyType, SweepType >, DescentType, AuxiliaryInformationType > >
 Since the R+/R++ tree can not have overlapping children, we should define traits for the R+/R++ tree. More...

 
class  TreeTraits< RectangleTree< MetricType, StatisticType, MatType, SplitType, DescentType, AuxiliaryInformationType > >
 This is a specialization of the TreeType class to the RectangleTree tree type. More...

 
class  TreeTraits< SpillTree< MetricType, StatisticType, MatType, HyperplaneType, SplitType > >
 This is a specialization of the TreeType class to the SpillTree tree type. More...

 
class  UBTreeSplit
 Split a node into two parts according to the median address of points contained in the node. More...

 
class  VantagePointSplit
 The class splits a binary space partitioning tree node according to the median distance to the vantage point. More...

 
class  XTreeAuxiliaryInformation
 The XTreeAuxiliaryInformation class provides information specific to X trees for each node in a RectangleTree. More...

 
class  XTreeSplit
 A Rectangle Tree has new points inserted at the bottom. More...

 

Typedefs

template
<
typename
MetricType
>
using AxisOrthogonalHyperplane = HyperplaneBase< bound::HRectBound< MetricType >, AxisParallelProjVector >
 AxisOrthogonalHyperplane represents a hyperplane orthogonal to an axis. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using BallTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::BallBound, MidpointSplit >
 A midpoint-split ball tree. More...

 
template
<
typename
FitnessFunction
>
using BinaryDoubleNumericSplit = BinaryNumericSplit< FitnessFunction, double >
 
typedef boost::heap::priority_queue< CosineTree *, boost::heap::compare< CompareCosineNode > > CosineNodeQueue
 
template
<
typename
FitnessFunction
=
GiniGain
,
template
<
typename
>
class
NumericSplitType
=
BestBinaryNumericSplit
,
template
<
typename
>
class
CategoricalSplitType
=
AllCategoricalSplit
,
typename
DimensionSelectType
=
AllDimensionSelect
>
using DecisionStump = DecisionTree< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectType, false >
 Convenience typedef for decision stumps (single level decision trees). More...

 
template
<
typename
TreeType
>
using DiscreteHilbertRTreeAuxiliaryInformation = HilbertRTreeAuxiliaryInformation< TreeType, DiscreteHilbertValue >
 The Hilbert R-tree, a variant of the R tree with an ordering along the Hilbert curve. More...

 
template
<
typename
FitnessFunction
=
GiniGain
,
typename
DimensionSelectionType
=
MultipleRandomDimensionSelect
,
template
<
typename
>
class
CategoricalSplitType
=
AllCategoricalSplit
>
using ExtraTrees = RandomForest< FitnessFunction, DimensionSelectionType, RandomBinaryNumericSplit, CategoricalSplitType, false >
 Convenience typedef for Extra Trees. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using HilbertRTree = RectangleTree< MetricType, StatisticType, MatType, HilbertRTreeSplit< 2 >, HilbertRTreeDescentHeuristic, DiscreteHilbertRTreeAuxiliaryInformation >
 
template
<
typename
FitnessFunction
>
using HoeffdingDoubleNumericSplit = HoeffdingNumericSplit< FitnessFunction, double >
 Convenience typedef. More...

 
typedef StreamingDecisionTree< HoeffdingTree<> > HoeffdingTreeType
 
template
<
typename
MetricType
>
using Hyperplane = HyperplaneBase< bound::BallBound< MetricType >, ProjVector >
 Hyperplane represents a general hyperplane (not necessarily axis-orthogonal). More...

 
typedef DecisionTree< InformationGain, BestBinaryNumericSplit, AllCategoricalSplit, AllDimensionSelect, true > ID3DecisionStump
 Convenience typedef for ID3 decision stumps (single level decision trees made with the ID3 algorithm). More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using KDTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::HRectBound, MidpointSplit >
 The standard midpoint-split kd-tree. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using MaxRPTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::HRectBound, RPTreeMaxSplit >
 A max-split random projection tree. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using MeanSplitBallTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::BallBound, MeanSplit >
 A mean-split ball tree. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using MeanSplitKDTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::HRectBound, MeanSplit >
 A mean-split kd-tree. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using MeanSPTree = SpillTree< MetricType, StatisticType, MatType, AxisOrthogonalHyperplane, MeanSpaceSplit >
 A mean-split hybrid spill tree. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using NonOrtMeanSPTree = SpillTree< MetricType, StatisticType, MatType, Hyperplane, MeanSpaceSplit >
 A mean-split hybrid spill tree considering general splitting hyperplanes (not necessarily axis-orthogonal). More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using NonOrtSPTree = SpillTree< MetricType, StatisticType, MatType, Hyperplane, MidpointSpaceSplit >
 A hybrid spill tree considering general splitting hyperplanes (not necessarily axis-orthogonal). More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using RPlusPlusTree = RectangleTree< MetricType, StatisticType, MatType, RPlusTreeSplit< RPlusPlusTreeSplitPolicy, MinimalSplitsNumberSweep >, RPlusPlusTreeDescentHeuristic, RPlusPlusTreeAuxiliaryInformation >
 The R++ tree, a variant of the R+ tree with maximum buonding rectangles. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using RPlusTree = RectangleTree< MetricType, StatisticType, MatType, RPlusTreeSplit< RPlusTreeSplitPolicy, MinimalCoverageSweep >, RPlusTreeDescentHeuristic, NoAuxiliaryInformation >
 The R+ tree, a variant of the R tree that avoids overlapping rectangles. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using RPTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::HRectBound, RPTreeMeanSplit >
 A mean-split random projection tree. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using RStarTree = RectangleTree< MetricType, StatisticType, MatType, RStarTreeSplit, RStarTreeDescentHeuristic, NoAuxiliaryInformation >
 The R*-tree, a more recent variant of the R tree. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using RTree = RectangleTree< MetricType, StatisticType, MatType, RTreeSplit, RTreeDescentHeuristic, NoAuxiliaryInformation >
 An implementation of the R tree that satisfies the TreeType policy API. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using SPTree = SpillTree< MetricType, StatisticType, MatType, AxisOrthogonalHyperplane, MidpointSpaceSplit >
 The hybrid spill tree. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using StandardCoverTree = CoverTree< MetricType, StatisticType, MatType, FirstPointIsRoot >
 The standard cover tree, as detailed in the original cover tree paper: More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using UBTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::CellBound, UBTreeSplit >
 The Universal B-tree. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using VPTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::HollowBallBound, VPTreeSplit >
 
template
<
typename
BoundType
,
typename
MatType
=
arma::mat
>
using VPTreeSplit = VantagePointSplit< BoundType, MatType, 100 >
 The vantage point tree (which is also called the metric tree. More...

 
template
<
typename
MetricType
,
typename
StatisticType
,
typename
MatType
>
using XTree = RectangleTree< MetricType, StatisticType, MatType, XTreeSplit, RTreeDescentHeuristic, XTreeAuxiliaryInformation >
 The X-tree, a variant of the R tree with supernodes. More...

 

Functions

template<bool UseWeights, typename MatType , typename LabelsType , typename WeightsType >
void Bootstrap (const MatType &dataset, const LabelsType &labels, const WeightsType &weights, MatType &bootstrapDataset, LabelsType &bootstrapLabels, WeightsType &bootstrapWeights)
 Given a dataset, create another dataset via bootstrap sampling, with labels. More...

 
template
<
class
TreeType
,
class
Walker
>
void EnumerateTree (TreeType *tree, Walker &walker)
 Traverses all nodes of the tree, including the inner ones. More...

 
 HAS_MEM_FUNC (BinaryGains, HasBinaryGains)
 

Variables

const double MAX_OVERLAP = 0.2
 The X-tree paper says that a maximum allowable overlap of 20% works well. More...

 

Detailed Description

Trees and tree-building procedures.

Typedef Documentation

◆ AxisOrthogonalHyperplane

AxisOrthogonalHyperplane represents a hyperplane orthogonal to an axis.

Definition at line 145 of file hyperplane.hpp.

◆ BallTree

using BallTree = BinarySpaceTree<MetricType, StatisticType, MatType, bound::BallBound, MidpointSplit>

A midpoint-split ball tree.

This tree holds its points only in the leaves, similar to the KDTree and MeanSplitKDTree. However, the bounding shape of each node is a ball, not a hyper-rectangle. This can make the ball tree advantageous in some higher-dimensional situations and for some datasets. The tree construction algorithm here is the same as Omohundro's 'K-d construction algorithm', except the splitting value is the midpoint, not the median. This can result in trees that better reflect the data, although they may be unbalanced.

@techreport{omohundro1989five,
author={S.M. Omohundro},
title={Five balltree construction algorithms},
year={1989},
institution={University of California, Berkeley International Computer
Science Institute Technical Reports},
number={TR-89-063}
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, KDTree, MeanSplitBallTree

Definition at line 112 of file typedef.hpp.

◆ BinaryDoubleNumericSplit

using BinaryDoubleNumericSplit = BinaryNumericSplit<FitnessFunction, double>

Definition at line 128 of file binary_numeric_split.hpp.

◆ CosineNodeQueue

typedef boost::heap::priority_queue<CosineTree*, boost::heap::compare<CompareCosineNode> > CosineNodeQueue

Definition at line 23 of file cosine_tree.hpp.

◆ DecisionStump

using DecisionStump = DecisionTree<FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectType, false>

Convenience typedef for decision stumps (single level decision trees).

Definition at line 585 of file decision_tree.hpp.

◆ DiscreteHilbertRTreeAuxiliaryInformation

The Hilbert R-tree, a variant of the R tree with an ordering along the Hilbert curve.

This template typedef satisfies the TreeType policy API.

@inproceedings{kamel1994r,
author = {Kamel, Ibrahim and Faloutsos, Christos},
title = {Hilbert R-tree: An Improved R-tree Using Fractals},
booktitle = {Proceedings of the 20th International Conference on Very Large Data Bases},
series = {VLDB '94},
year = {1994},
isbn = {1-55860-153-8},
pages = {500--509},
numpages = {10},
url = {http://dl.acm.org/citation.cfm?id=645920.673001},
acmid = {673001},
publisher = {Morgan Kaufmann Publishers Inc.},
address = {San Francisco, CA, USA}
}
See also
The TreeType policy in mlpack, RTree, DiscreteHilbertRTree

Definition at line 128 of file typedef.hpp.

◆ ExtraTrees

using ExtraTrees = RandomForest<FitnessFunction, DimensionSelectionType, RandomBinaryNumericSplit, CategoricalSplitType, false>

Convenience typedef for Extra Trees.

(Extremely Randomized Trees Forest)

@article{10.1007/s10994-006-6226-1,
author = {Geurts, Pierre and Ernst, Damien and Wehenkel, Louis},
title = {Extremely Randomized Trees},
year = {2006},
issue_date = {April 2006},
publisher = {Kluwer Academic Publishers},
address = {USA},
volume = {63},
number = {1},
issn = {0885-6125},
url = {https://doi.org/10.1007/s10994-006-6226-1},
doi = {10.1007/s10994-006-6226-1},
journal = {Mach. Learn.},
month = apr,
pages = {3–42},
numpages = {40},
}

Definition at line 443 of file random_forest.hpp.

◆ HilbertRTree

Definition at line 136 of file typedef.hpp.

◆ HoeffdingDoubleNumericSplit

using HoeffdingDoubleNumericSplit = HoeffdingNumericSplit<FitnessFunction, double>

Convenience typedef.

Definition at line 148 of file hoeffding_numeric_split.hpp.

◆ HoeffdingTreeType

typedef StreamingDecisionTree<HoeffdingTree<> > HoeffdingTreeType

Definition at line 21 of file typedef.hpp.

◆ Hyperplane

Hyperplane represents a general hyperplane (not necessarily axis-orthogonal).

Definition at line 151 of file hyperplane.hpp.

◆ ID3DecisionStump

Convenience typedef for ID3 decision stumps (single level decision trees made with the ID3 algorithm).

Definition at line 595 of file decision_tree.hpp.

◆ KDTree

using KDTree = BinarySpaceTree<MetricType, StatisticType, MatType, bound::HRectBound, MidpointSplit>

The standard midpoint-split kd-tree.

This is not the original formulation by Bentley but instead the later formulation by Deng and Moore, which only holds points in the leaves of the tree. When recursively splitting nodes, the KDTree class select the dimension with maximum variance to split on, and picks the midpoint of the range in that dimension as the value on which to split nodes.

For more information, see the following papers.

@article{bentley1975multidimensional,
title={Multidimensional binary search trees used for associative searching},
author={Bentley, J.L.},
journal={Communications of the ACM},
volume={18},
number={9},
pages={509--517},
year={1975},
publisher={ACM}
}
@inproceedings{deng1995multiresolution,
title={Multiresolution instance-based learning},
author={Deng, K. and Moore, A.W.},
booktitle={Proceedings of the 1995 International Joint Conference on AI
(IJCAI-95)},
pages={1233--1239},
year={1995}
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, MeanSplitKDTree

Definition at line 63 of file typedef.hpp.

◆ MaxRPTree

using MaxRPTree = BinarySpaceTree<MetricType, StatisticType, MatType, bound::HRectBound, RPTreeMaxSplit>

A max-split random projection tree.

When recursively splitting nodes, the MaxSplitRPTree class selects a random hyperplane and splits a node by the hyperplane. The tree holds points in leaf nodes. In contrast to the k-d tree, children of a MaxSplitRPTree node may overlap.

@inproceedings{dasgupta2008,
author = {Dasgupta, Sanjoy and Freund, Yoav},
title = {Random Projection Trees and Low Dimensional Manifolds},
booktitle = {Proceedings of the Fortieth Annual ACM Symposium on Theory of
Computing},
series = {STOC '08},
year = {2008},
pages = {537--546},
numpages = {10},
publisher = {ACM},
address = {New York, NY, USA},
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, BallTree, MeanSplitKDTree

Definition at line 232 of file typedef.hpp.

◆ MeanSplitBallTree

using MeanSplitBallTree = BinarySpaceTree<MetricType, StatisticType, MatType, bound::BallBound, MeanSplit>

A mean-split ball tree.

This tree, like the BallTree, holds its points only in the leaves. The tree construction algorithm here is the same as Omohundro's 'K-dc onstruction algorithm', except the splitting value is the mean, not the median. This can result in trees that better reflect the data, although they may be unbalanced.

@techreport{omohundro1989five,
author={S.M. Omohundro},
title={Five balltree construction algorithms},
year={1989},
institution={University of California, Berkeley International Computer
Science Institute Technical Reports},
number={TR-89-063}
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, BallTree, MeanSplitKDTree

Definition at line 141 of file typedef.hpp.

◆ MeanSplitKDTree

using MeanSplitKDTree = BinarySpaceTree<MetricType, StatisticType, MatType, bound::HRectBound, MeanSplit>

A mean-split kd-tree.

This is the same as the KDTree, but this particular implementation will use the mean of the data in the split dimension as the value on which to split, instead of the midpoint. This can sometimes give better performance, but it is not always clear which type of tree is best.

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, KDTree

Definition at line 80 of file typedef.hpp.

◆ MeanSPTree

using MeanSPTree = SpillTree<MetricType, StatisticType, MatType, AxisOrthogonalHyperplane, MeanSpaceSplit>

A mean-split hybrid spill tree.

This is the same as the SPTree, but this particular implementation will use the mean of the data in the split dimension as the value on which to split, instead of the midpoint. This can sometimes give better performance, but it is not always clear which type of tree is best.

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, SpillTree, SPTree

Definition at line 80 of file typedef.hpp.

◆ NonOrtMeanSPTree

using NonOrtMeanSPTree = SpillTree<MetricType, StatisticType, MatType, Hyperplane, MeanSpaceSplit>

A mean-split hybrid spill tree considering general splitting hyperplanes (not necessarily axis-orthogonal).

This is the same as the NonOrtSPTree, but this particular implementation will use the mean of the data in the split projection as the value on which to split, instead of the midpoint. This can sometimes give better performance, but it is not always clear which type of tree is best.

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, SpillTree, MeanSPTree, NonOrtSPTree

Definition at line 119 of file typedef.hpp.

◆ NonOrtSPTree

using NonOrtSPTree = SpillTree<MetricType, StatisticType, MatType, Hyperplane, MidpointSpaceSplit>

A hybrid spill tree considering general splitting hyperplanes (not necessarily axis-orthogonal).

This particular implementation will consider the midpoint of the projection of the data in the vector determined by the farthest pair of points. This can sometimes give better performance, but generally it doesn't because it takes O(d) to calculate the projection of the query point when deciding which node to traverse, while when using a axis-orthogonal hyperplane, as SPTree does, we can do it in O(1).

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, SpillTree, SPTree

Definition at line 100 of file typedef.hpp.

◆ RPlusPlusTree

The R++ tree, a variant of the R+ tree with maximum buonding rectangles.

This template typedef satisfies the TreeType policy API.

@inproceedings{sumak2014r,
author = {{\v{S}}um{\'a}k, Martin and Gursk{\'y}, Peter},
title = {R++-Tree: An Efficient Spatial Access Method for Highly Redundant
Point Data},
booktitle = {New Trends in Databases and Information Systems: 17th East
European Conference on Advances in Databases and Information Systems},
year = {2014},
isbn = {978-3-319-01863-8},
pages = {37--44},
publisher = {Springer International Publishing},
}
See also
The TreeType policy in mlpack, RTree, RTree, RPlusTree, RPlusPlusTree

Definition at line 197 of file typedef.hpp.

◆ RPlusTree

The R+ tree, a variant of the R tree that avoids overlapping rectangles.

The implementation is modified from the original paper implementation. This template typedef satisfies the TreeType policy API.

@inproceedings{sellis1987r,
author = {Sellis, Timos K. and Roussopoulos, Nick and Faloutsos, Christos},
title = {The R+-Tree: A Dynamic Index for Multi-Dimensional Objects},
booktitle = {Proceedings of the 13th International Conference on Very
Large Data Bases},
series = {VLDB '87},
year = {1987},
isbn = {0-934613-46-X},
pages = {507--518},
numpages = {12},
publisher = {Morgan Kaufmann Publishers Inc.},
address = {San Francisco, CA, USA},
}
See also
The TreeType policy in mlpack, RTree, RTree, RPlusTree

Definition at line 168 of file typedef.hpp.

◆ RPTree

using RPTree = BinarySpaceTree<MetricType, StatisticType, MatType, bound::HRectBound, RPTreeMeanSplit>

A mean-split random projection tree.

When recursively splitting nodes, the RPTree class may perform one of two different kinds of split. Depending on the diameter and the average distance between points, the node may be split by a random hyperplane or according to the distance from the mean point. The tree holds points in leaf nodes. In contrast to the k-d tree, children of a MaxSplitRPTree node may overlap.

@inproceedings{dasgupta2008,
author = {Dasgupta, Sanjoy and Freund, Yoav},
title = {Random Projection Trees and Low Dimensional Manifolds},
booktitle = {Proceedings of the Fortieth Annual ACM Symposium on Theory of
Computing},
series = {STOC '08},
year = {2008},
pages = {537--546},
numpages = {10},
publisher = {ACM},
address = {New York, NY, USA},
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, BallTree, MeanSplitKDTree

Definition at line 266 of file typedef.hpp.

◆ RStarTree

The R*-tree, a more recent variant of the R tree.

This template typedef satisfies the TreeType policy API.

@inproceedings{beckmann1990r,
title={The R*-tree: an efficient and robust access method for points and
rectangles},
author={Beckmann, N. and Kriegel, H.-P. and Schneider, R. and Seeger, B.},
booktitle={Proceedings of the 1990 ACM SIGMOD International Conference on
Management of Data (SIGMOD '90)},
volume={19},
number={2},
year={1990},
publisher={ACM}
}
See also
The TreeType policy in mlpack, RTree

Definition at line 75 of file typedef.hpp.

◆ RTree

using RTree = RectangleTree<MetricType, StatisticType, MatType, RTreeSplit, RTreeDescentHeuristic, NoAuxiliaryInformation>

An implementation of the R tree that satisfies the TreeType policy API.

This is the same R-tree structure as proposed by Guttman:

@inproceedings{guttman1984r,
title={R-trees: a dynamic index structure for spatial searching},
author={Guttman, A.},
booktitle={Proceedings of the 1984 ACM SIGMOD International Conference on
Management of Data (SIGMOD '84)},
volume={14},
number={2},
year={1984},
publisher={ACM}
}
See also
The TreeType policy in mlpack, RStarTree

Definition at line 47 of file typedef.hpp.

◆ SPTree

using SPTree = SpillTree<MetricType, StatisticType, MatType, AxisOrthogonalHyperplane, MidpointSpaceSplit>

The hybrid spill tree.

It is a variant of metric-trees in which the children of a node can "spill over" onto each other, and contain shared datapoints.

When recursively splitting nodes, the SPTree class select the dimension with maximum width to split on, and picks the midpoint of the range in that dimension as the value on which to split nodes.

In each case a "overlapping buffer" is defined, included points at a distance less than tau from the decision boundary defined by the midpoint.

For each node, we first split the points considering the overlapping buffer. If either of its children contains more than rho fraction of the total points we undo the overlapping splitting. Instead a conventional partition is used. In this way, we can ensure that each split reduces the number of points of a node by at least a constant factor.

For more information, see the following paper.

@inproceedings{
author = {Ting Liu, Andrew W. Moore, Alexander Gray and Ke Yang},
title = {An Investigation of Practical Approximate Nearest Neighbor
Algorithms},
booktitle = {Advances in Neural Information Processing Systems 17},
year = {2005},
pages = {825--832}
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, SpillTree, MeanSPTree

Definition at line 62 of file typedef.hpp.

◆ StandardCoverTree

using StandardCoverTree = CoverTree<MetricType, StatisticType, MatType, FirstPointIsRoot>

The standard cover tree, as detailed in the original cover tree paper:

@inproceedings{
author={Beygelzimer, A. and Kakade, S. and Langford, J.},
title={Cover trees for nearest neighbor},
booktitle={Proceedings of the 23rd International Conference on Machine
Learning (ICML 2006)},
pages={97--104},
year={2006}
}

This template typedef satisfies the requirements of the TreeType API.

See also
The TreeType policy in mlpack, CoverTree

Definition at line 42 of file typedef.hpp.

◆ UBTree

using UBTree = BinarySpaceTree<MetricType, StatisticType, MatType, bound::CellBound, UBTreeSplit>

The Universal B-tree.

When recursively splitting nodes, the class calculates addresses of all points and splits each node according to the median address. Children may overlap since the implementation of a tighter bound requires a lot of arithmetic operations. In order to get a tighter bound increase the CellBound::maxNumBounds constant.

@inproceedings{bayer1997,
author = {Bayer, Rudolf},
title = {The Universal B-Tree for Multidimensional Indexing: General
Concepts},
booktitle = {Proceedings of the International Conference on Worldwide
Computing and Its Applications},
series = {WWCA '97},
year = {1997},
isbn = {3-540-63343-X},
pages = {198--209},
numpages = {12},
publisher = {Springer-Verlag},
address = {London, UK, UK},
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, BallTree, MeanSplitKDTree

Definition at line 301 of file typedef.hpp.

◆ VPTree

using VPTree = BinarySpaceTree<MetricType, StatisticType, MatType, bound::HollowBallBound, VPTreeSplit>

Definition at line 199 of file typedef.hpp.

◆ VPTreeSplit

using VPTreeSplit = VantagePointSplit<BoundType, MatType, 100>

The vantage point tree (which is also called the metric tree.

Vantage point trees and metric trees were invented independently by Yianilos an Uhlmann) is a kind of the binary space tree. When recursively splitting nodes, the VPTree class selects the vantage point and splits the node according to the distance to this point. Thus, points that are closer to the vantage point form the inner subtree. Other points form the outer subtree. The vantage point is contained in the first (inner) node.

This implementation differs from the original algorithms. Namely, vantage points are not contained in intermediate nodes. The tree has points only in the leaves of the tree.

For more information, see the following papers.

@inproceedings{yianilos1993vptrees,
author = {Yianilos, Peter N.},
title = {Data Structures and Algorithms for Nearest Neighbor Search in
General Metric Spaces},
booktitle = {Proceedings of the Fourth Annual ACM-SIAM Symposium on
Discrete Algorithms},
series = {SODA '93},
year = {1993},
isbn = {0-89871-313-7},
pages = {311--321},
numpages = {11},
publisher = {Society for Industrial and Applied Mathematics},
address = {Philadelphia, PA, USA}
}
@article{uhlmann1991metrictrees,
author = {Jeffrey K. Uhlmann},
title = {Satisfying general proximity / similarity queries with metric
trees},
journal = {Information Processing Letters},
volume = {40},
number = {4},
pages = {175 - 179},
year = {1991},
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, VantagePointTree, VPTree

Definition at line 192 of file typedef.hpp.

◆ XTree

using XTree = RectangleTree<MetricType, StatisticType, MatType, XTreeSplit, RTreeDescentHeuristic, XTreeAuxiliaryInformation>

The X-tree, a variant of the R tree with supernodes.

This template typedef satisfies the TreeType policy API.

@inproceedings{berchtold1996r,
title = {The X-Tree: An Index Structure for High--Dimensional Data},
author = {Berchtold, Stefan and Keim, Daniel A. and Kriegel, Hans-Peter},
booktitle = {Proc. 22th Int. Conf. on Very Large Databases (VLDB'96), Bombay, India},
editor = {Vijayaraman, T. and Buchmann, Alex and Mohan, C. and Sarda, N.},
pages = {28--39},
year = {1996},
publisher = {Morgan Kaufmann}
}
See also
The TreeType policy in mlpack, RTree, RStarTree

Definition at line 101 of file typedef.hpp.

Function Documentation

◆ Bootstrap()

void mlpack::tree::Bootstrap ( const MatType &  dataset,
const LabelsType &  labels,
const WeightsType &  weights,
MatType &  bootstrapDataset,
LabelsType &  bootstrapLabels,
WeightsType &  bootstrapWeights 
)

Given a dataset, create another dataset via bootstrap sampling, with labels.

Definition at line 26 of file bootstrap.hpp.

◆ EnumerateTree()

void mlpack::tree::EnumerateTree ( TreeType *  tree,
Walker &  walker 
)
inline

Traverses all nodes of the tree, including the inner ones.

On each node two methods of the enumer are called:

Enter(TreeType* node, TreeType* parent); Leave(TreeType* node, TreeType* parent);

Parameters
treeThe tree to traverse.
walkerAn instance of custom class, receiver of the enumeration.

Definition at line 56 of file enumerate_tree.hpp.

References mlpack::tree::enumerate::EnumerateTreeImpl().

◆ HAS_MEM_FUNC()

mlpack::tree::HAS_MEM_FUNC ( BinaryGains  ,
HasBinaryGains   
)

Variable Documentation

◆ MAX_OVERLAP

const double MAX_OVERLAP = 0.2

The X-tree paper says that a maximum allowable overlap of 20% works well.

This code should eventually be refactored so as to avoid polluting mlpack::tree with this random double.

Definition at line 29 of file x_tree_split.hpp.