The simple Naive Bayes classifier.
Public Types
typedef ModelMatType::elem_type | ElemType |
Public Member Functions
template < typename MatType >
NaiveBayesClassifier (const MatType &data, const arma::Row< size_t > &labels, const size_t numClasses, const bool incrementalVariance=false, const double epsilon=1e-10)
Initializes the classifier as per the input and then trains it by calculating the sample mean and variances.
NaiveBayesClassifier (const size_t dimensionality=0, const size_t numClasses=0, const double epsilon=1e-10)
Initialize the Naive Bayes classifier without performing training.
template < typename VecType >
size_t | Classify (const VecType &point) const |
Classify the given point, using the trained NaiveBayesClassifier model.
template < typename VecType , typename ProbabilitiesVecType >
void | Classify (const VecType &point, size_t &prediction, ProbabilitiesVecType &probabilities) const |
Classify the given point using the trained NaiveBayesClassifier model and also return estimates of the probability for each class in the given vector.
template < typename MatType >
void | Classify (const MatType &data, arma::Row< size_t > &predictions) const |
Classify the given points using the trained NaiveBayesClassifier model.
template < typename MatType , typename ProbabilitiesMatType >
void | Classify (const MatType &data, arma::Row< size_t > &predictions, ProbabilitiesMatType &probabilities) const |
Classify the given points using the trained NaiveBayesClassifier model and also return estimates of the probabilities for each class in the given matrix.
const ModelMatType & | Means () const |
Get the sample means for each class.
ModelMatType & | Means () |
Modify the sample means for each class.
const ModelMatType & | Probabilities () const |
Get the prior probabilities for each class.
ModelMatType & | Probabilities () |
Modify the prior probabilities for each class.
template < typename Archive >
void | serialize (Archive &ar, const uint32_t) |
Serialize the classifier.
template < typename MatType >
void | Train (const MatType &data, const arma::Row< size_t > &labels, const size_t numClasses, const bool incremental=true) |
Train the Naive Bayes classifier on the given dataset.
template < typename VecType >
void | Train (const VecType &point, const size_t label) |
Train the Naive Bayes classifier on the given point.
const ModelMatType & | Variances () const |
Get the sample variances for each class.
ModelMatType & | Variances () |
Modify the sample variances for each class.
The simple Naive Bayes classifier.
This class trains on the data by calculating the sample mean and variance of each feature for each label, along with the class probabilities. The class labels are assumed to be non-negative integers (starting with 0) and are passed to the constructor as a separate labels vector with one entry per data point.
Mathematically, it computes P(X_i = x_i | Y = y_j) for each feature X_i for each of the labels y_j. Along with this, it also computes the class probabilities P(Y = y_j).
For classifying a data point (x_1, x_2, ..., x_n), it computes the following: arg max_y(P(Y = y)*P(X_1 = x_1 | Y = y) * ... * P(X_n = x_n | Y = y))
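The arg max above is usually evaluated in log space, so that a product of many small probabilities does not underflow. The following is a minimal, self-contained sketch of that idea, not mlpack's implementation. It assumes each P(X_i = x_i | Y = y) is a Gaussian density built from the per-class mean and variance, and the names `GaussianClassModel`, `LogGaussian`, and `ClassifySketch` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical model layout for this sketch: one entry per class, holding the
// per-feature means and variances plus the class prior P(Y = y).
struct GaussianClassModel
{
  std::vector<double> means;
  std::vector<double> variances;
  double prior;
};

// Log of the Gaussian density N(x; mean, variance). Using a Gaussian here is
// an assumption of this sketch.
inline double LogGaussian(const double x, const double mean,
                          const double variance)
{
  const double pi = 3.14159265358979323846;
  const double diff = x - mean;
  return -0.5 * std::log(2.0 * pi * variance) -
      (diff * diff) / (2.0 * variance);
}

// Returns arg max_y [ log P(Y = y) + sum_i log P(X_i = x_i | Y = y) ].
inline std::size_t ClassifySketch(const std::vector<GaussianClassModel>& model,
                                  const std::vector<double>& point)
{
  assert(!model.empty());
  std::size_t best = 0;
  double bestScore = 0.0;
  for (std::size_t c = 0; c < model.size(); ++c)
  {
    assert(model[c].means.size() == point.size());
    assert(model[c].variances.size() == point.size());

    // Start from the log prior and add one log likelihood per feature.
    double score = std::log(model[c].prior);
    for (std::size_t i = 0; i < point.size(); ++i)
      score += LogGaussian(point[i], model[c].means[i], model[c].variances[i]);

    if (c == 0 || score > bestScore)
    {
      best = c;
      bestScore = score;
    }
  }
  return best;
}
```

Because the logarithm is monotonic, the class that maximizes the summed log terms is the same class that maximizes the original product.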
Example use:
The ModelMatType template parameter specifies the internal matrix type that NaiveBayesClassifier uses to hold the means, variances, and class probabilities that make up the Naive Bayes model. This can be arma::mat, arma::fmat, or any other Armadillo (or Armadillo-compatible) object. Because ModelMatType may differ from the type of the data the model is trained on, training is possible with subviews, sparse matrices, or other matrix types, while the model is still stored internally as a ModelMatType.
ModelMatType | Internal matrix type to use to store the model. |
Definition at line 58 of file naive_bayes_classifier.hpp.
typedef ModelMatType::elem_type ElemType |
Definition at line 62 of file naive_bayes_classifier.hpp.
NaiveBayesClassifier | ( | const MatType & | data, |
const arma::Row< size_t > & | labels, | ||
const size_t | numClasses, | ||
const bool | incrementalVariance = false, | ||
const double | epsilon = 1e-10 | ||
) |
Initializes the classifier from the given inputs and then trains it by calculating the sample means and variances.
Example use:
data | Training data points. |
labels | Labels corresponding to training data points. |
numClasses | Number of classes in this classifier. |
incrementalVariance | If true, an incremental algorithm is used to calculate the variance; this can prevent loss of precision in some cases, but will be somewhat slower to calculate. |
epsilon | Small value to prevent log of zero. |
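To illustrate what an incremental variance calculation can look like, the sketch below compares a two-pass sample variance with a one-pass Welford-style update, and shows one way a small epsilon keeps the log of a variance finite. This is an assumption-laden illustration: whether mlpack uses exactly these updates, or applies epsilon in exactly this way, is not stated in this page, and the function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Two-pass sample variance: compute the mean first, then the squared
// deviations from it.
inline double TwoPassVariance(const std::vector<double>& x)
{
  if (x.size() < 2)
    return 0.0;
  double mean = 0.0;
  for (double v : x)
    mean += v;
  mean /= x.size();

  double sumSq = 0.0;
  for (double v : x)
    sumSq += (v - mean) * (v - mean);
  return sumSq / (x.size() - 1);
}

// Incremental (Welford-style) sample variance: the running mean and the
// running sum of squared deviations are updated one value at a time.
inline double WelfordVariance(const std::vector<double>& x)
{
  double mean = 0.0;
  double m2 = 0.0;
  std::size_t n = 0;
  for (double v : x)
  {
    ++n;
    const double delta = v - mean;
    mean += delta / n;
    m2 += delta * (v - mean);
  }
  return (n < 2) ? 0.0 : m2 / (n - 1);
}

// Hypothetical helper: adding epsilon before taking the log keeps the result
// finite even when a feature is constant within a class (variance zero).
inline double SafeLogVariance(const double variance,
                              const double epsilon = 1e-10)
{
  return std::log(variance + epsilon);
}
```

Both variance functions return the same value up to rounding; the incremental form does more arithmetic per value, which is consistent with the note above that it is somewhat slower.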
NaiveBayesClassifier | ( | const size_t | dimensionality = 0, |
const size_t | numClasses = 0, | ||
const double | epsilon = 1e-10 | ||
) |
Initialize the Naive Bayes classifier without performing training.
All of the parameters of the model will be initialized to zero. Call Train() before calling Classify(); otherwise, the results may be meaningless.
size_t Classify | ( | const VecType & | point | ) | const |
Classify the given point, using the trained NaiveBayesClassifier model.
The predicted label is returned.
point | Point to classify. |
void Classify | ( | const VecType & | point, |
size_t & | prediction, | ||
ProbabilitiesVecType & | probabilities | ||
) | const |
Classify the given point using the trained NaiveBayesClassifier model and also return estimates of the probability for each class in the given vector.
point | Point to classify. |
prediction | This will be set to the predicted class of the point. |
probabilities | This will be filled with class probabilities for the point. |
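One common way to turn per-class log scores into class probability estimates is to exponentiate them relative to the largest score and normalize. The sketch below shows that step in isolation; it is an illustration under the assumption that the scores are log joint likelihoods, and `NormalizeLogScores` is a hypothetical name rather than part of this class.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Converts per-class log scores, log(P(Y = y) * prod_i P(X_i = x_i | Y = y)),
// into probabilities that sum to one.
inline std::vector<double> NormalizeLogScores(
    const std::vector<double>& logScores)
{
  assert(!logScores.empty());

  // Subtracting the maximum before exponentiating avoids underflow when all
  // scores are very negative.
  const double maxScore =
      *std::max_element(logScores.begin(), logScores.end());

  std::vector<double> probabilities(logScores.size());
  double total = 0.0;
  for (std::size_t c = 0; c < logScores.size(); ++c)
  {
    probabilities[c] = std::exp(logScores[c] - maxScore);
    total += probabilities[c];
  }
  for (double& p : probabilities)
    p /= total;
  return probabilities;
}
```

The predicted class is then the index of the largest probability, which matches the arg max of the unnormalized scores.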
void Classify | ( | const MatType & | data, |
arma::Row< size_t > & | predictions | ||
) | const |
Classify the given points using the trained NaiveBayesClassifier model.
The predicted labels for each point are stored in the given vector.
data | List of data points. |
predictions | Vector that class predictions will be placed into. |
void Classify | ( | const MatType & | data, |
arma::Row< size_t > & | predictions, | ||
ProbabilitiesMatType & | probabilities | ||
) | const |
Classify the given points using the trained NaiveBayesClassifier model and also return estimates of the probabilities for each class in the given matrix.
The predicted labels for each point are stored in the given vector.
data | Set of points to classify. |
predictions | This will be filled with predictions for each point. |
probabilities | This will be filled with class probabilities for each point. Each row represents a point. |
MatType | Type of data to be classified. |
ProbabilitiesMatType | Type to store output probabilities in. |
const ModelMatType & Means () const | inline |
Get the sample means for each class.
Definition at line 203 of file naive_bayes_classifier.hpp.
ModelMatType & Means () | inline |
Modify the sample means for each class.
Definition at line 205 of file naive_bayes_classifier.hpp.
const ModelMatType & Probabilities () const | inline |
Get the prior probabilities for each class.
Definition at line 213 of file naive_bayes_classifier.hpp.
ModelMatType & Probabilities () | inline |
Modify the prior probabilities for each class.
Definition at line 215 of file naive_bayes_classifier.hpp.
References NaiveBayesClassifier< ModelMatType >::serialize().
void serialize | ( | Archive & | ar, |
const uint32_t | |||
) |
Serialize the classifier.
Referenced by NaiveBayesClassifier< ModelMatType >::Probabilities().
void Train | ( | const MatType & | data, |
const arma::Row< size_t > & | labels, | ||
const size_t | numClasses, | ||
const bool | incremental = true | ||
) |
Train the Naive Bayes classifier on the given dataset.
If the incremental algorithm is used, the current model is used as a starting point (this is the default). If the incremental algorithm is not used, then the current model is ignored and the new model will be trained only on the given data. Note that even if the incremental algorithm is not used, the data must have the same dimensionality and number of classes that the model was initialized with. If you want to change the dimensionality or number of classes, either re-initialize or call Means(), Variances(), and Probabilities() individually to set them to the right size.
data | The dataset to train on. |
labels | The labels for the dataset. |
numClasses | The number of classes in the dataset. |
incremental | Whether or not to use the incremental algorithm for training. |
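The non-incremental case described above, where the current model is ignored and everything is recomputed from the given data, can be sketched with two passes over the dataset. This is a simplified standalone illustration using plain `std::vector` containers; `BatchModelSketch` and `TrainBatchSketch` are hypothetical names, and the layout (one inner vector per data point) is an assumption of the sketch, not the Armadillo layout used by this class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for this sketch; indices are [class][feature].
struct BatchModelSketch
{
  std::vector<std::vector<double>> means;
  std::vector<std::vector<double>> variances;
  std::vector<double> priors;
};

// Recomputes the whole model from scratch; no earlier state is reused.
inline BatchModelSketch TrainBatchSketch(
    const std::vector<std::vector<double>>& points,
    const std::vector<std::size_t>& labels,
    const std::size_t numClasses)
{
  assert(!points.empty());
  assert(points.size() == labels.size());
  const std::size_t dims = points[0].size();

  BatchModelSketch m;
  m.means.assign(numClasses, std::vector<double>(dims, 0.0));
  m.variances.assign(numClasses, std::vector<double>(dims, 0.0));
  m.priors.assign(numClasses, 0.0);
  std::vector<std::size_t> counts(numClasses, 0);

  // First pass: per-class counts and feature sums.
  for (std::size_t j = 0; j < points.size(); ++j)
  {
    assert(labels[j] < numClasses);
    assert(points[j].size() == dims);
    ++counts[labels[j]];
    for (std::size_t i = 0; i < dims; ++i)
      m.means[labels[j]][i] += points[j][i];
  }
  for (std::size_t c = 0; c < numClasses; ++c)
  {
    m.priors[c] = static_cast<double>(counts[c]) / points.size();
    if (counts[c] > 0)
    {
      for (std::size_t i = 0; i < dims; ++i)
        m.means[c][i] /= counts[c];
    }
  }

  // Second pass: squared deviations from the per-class means.
  for (std::size_t j = 0; j < points.size(); ++j)
  {
    for (std::size_t i = 0; i < dims; ++i)
    {
      const double d = points[j][i] - m.means[labels[j]][i];
      m.variances[labels[j]][i] += d * d;
    }
  }
  for (std::size_t c = 0; c < numClasses; ++c)
  {
    if (counts[c] > 1)
    {
      for (std::size_t i = 0; i < dims; ++i)
        m.variances[c][i] /= (counts[c] - 1);
    }
  }
  return m;
}
```

For example, training on the one-dimensional points 1 and 3 with label 0 and the points 10 and 14 with label 1 gives class means 2 and 12, sample variances 2 and 8, and equal priors of 0.5.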
void Train | ( | const VecType & | point, |
const size_t | label | ||
) |
Train the Naive Bayes classifier on the given point.
This will use the incremental algorithm for updating the model parameters. The point must have the same dimensionality as the existing model parameters.
point | Data point to train on. |
label | Label of data point. |
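A single-point update of this kind can be sketched by keeping, for each class, a count, a running mean, and a running sum of squared deviations. The code below is a hedged illustration of that bookkeeping, not the library's code; `RunningClassStats`, `IncrementalModelSketch`, and their members are hypothetical, and the Welford-style update is an assumed choice of incremental algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical running state for one class.
struct RunningClassStats
{
  std::size_t count = 0;
  std::vector<double> means;
  std::vector<double> m2;  // Running sums of squared deviations.
};

struct IncrementalModelSketch
{
  std::vector<RunningClassStats> classes;
  std::size_t total = 0;

  IncrementalModelSketch(const std::size_t dimensionality,
                         const std::size_t numClasses) :
      classes(numClasses)
  {
    for (RunningClassStats& c : classes)
    {
      c.means.assign(dimensionality, 0.0);
      c.m2.assign(dimensionality, 0.0);
    }
  }

  // Fold one labeled point into the statistics of its class.
  void TrainPoint(const std::vector<double>& point, const std::size_t label)
  {
    assert(label < classes.size());
    RunningClassStats& c = classes[label];
    assert(point.size() == c.means.size());

    ++c.count;
    ++total;
    for (std::size_t i = 0; i < point.size(); ++i)
    {
      const double delta = point[i] - c.means[i];
      c.means[i] += delta / c.count;
      c.m2[i] += delta * (point[i] - c.means[i]);
    }
  }

  // Fraction of all points seen so far that carried this label.
  double Prior(const std::size_t label) const
  {
    return (total == 0) ? 0.0 :
        static_cast<double>(classes[label].count) / total;
  }

  // Sample variance of one feature within one class.
  double Variance(const std::size_t label, const std::size_t feature) const
  {
    const RunningClassStats& c = classes[label];
    return (c.count < 2) ? 0.0 : c.m2[feature] / (c.count - 1);
  }
};
```

In this sketch the dimensionality is fixed at construction, which mirrors the requirement above that each new point match the dimensionality of the existing model parameters.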
const ModelMatType & Variances () const | inline |
Get the sample variances for each class.
Definition at line 208 of file naive_bayes_classifier.hpp.
ModelMatType & Variances () | inline |
Modify the sample variances for each class.
Definition at line 210 of file naive_bayes_classifier.hpp.