Simple Sample mlpack Programs

Introduction

This page contains several simple mlpack examples, in increasing order of complexity. If you compile from the command line, make sure that your compiler is in C++11 mode. With modern gcc and clang, this should already be the default.

Note
The command-line programs such as knn_main.cpp and logistic_regression_main.cpp in the directory src/mlpack/methods/ cannot easily be compiled by hand; the same is true for the individual tests in src/mlpack/tests/. Instead, compile them with CMake by running, e.g., make mlpack_knn or make mlpack_test; see Building mlpack From Source. However, any program that uses mlpack (and is not part of the library itself) can be compiled easily with g++ or clang from the command line.

Covariance Computation

This simple program computes the covariance of a data matrix ("data.csv"), assuming that the data is already centered, and saves the result to a file ("cov.csv").

// Includes all relevant components of mlpack.
#include <mlpack/core.hpp>
// Convenience.
using namespace mlpack;
int main()
{
  // First, load the data.
  arma::mat data;
  // Use data::Load() which transposes the matrix.
  data::Load("data.csv", data, true);
  // Now compute the covariance. We assume that the data is already centered.
  // Remember, because the matrix is column-major, the covariance operation is
  // transposed.
  arma::mat cov = data * trans(data) / data.n_cols;
  // Save the output.
  data::Save("cov.csv", cov, true);
}
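
To make the arithmetic in the program above concrete, here is a dependency-free sketch that uses neither mlpack nor Armadillo. It stores the data with one row per dimension and one column per point (the layout of the loaded matrix), performs the centering step that the program assumes has already been done, and then forms data * trans(data) / n_cols. The names Matrix, centerRows, and covariance are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type: data[d][p] is dimension d of point p, mirroring
// the layout of the loaded matrix (each column is one point).
using Matrix = std::vector<std::vector<double>>;

// Subtract the mean of each dimension (row) so that the data is centered.
Matrix centerRows(Matrix data)
{
  for (auto& row : data)
  {
    if (row.empty())
      continue;
    double mean = 0.0;
    for (double v : row)
      mean += v;
    mean /= static_cast<double>(row.size());
    for (double& v : row)
      v -= mean;
  }
  return data;
}

// Compute data * trans(data) / (number of points) for already-centered data.
// The result is a (dimensions x dimensions) matrix.
Matrix covariance(const Matrix& centered)
{
  const std::size_t dims = centered.size();
  const std::size_t points = (dims == 0) ? 0 : centered[0].size();
  Matrix cov(dims, std::vector<double>(dims, 0.0));
  if (points == 0)
    return cov;
  for (std::size_t i = 0; i < dims; ++i)
  {
    for (std::size_t j = 0; j < dims; ++j)
    {
      double sum = 0.0;
      for (std::size_t p = 0; p < points; ++p)
        sum += centered[i][p] * centered[j][p];
      cov[i][j] = sum / static_cast<double>(points);
    }
  }
  return cov;
}
```

Like the program above, this sketch divides by the number of points rather than by the number of points minus one. Calling covariance(centerRows(data)) on uncentered input gives the same result the program would give if the file had been centered beforehand.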

Nearest Neighbor

This simple program uses the mlpack::neighbor::NeighborSearch object to find the nearest neighbor of each point in a dataset using the L1 metric, and then prints the index of each neighbor and its distance to stdout.

#include <mlpack/core.hpp>
#include <mlpack/methods/neighbor_search/neighbor_search.hpp>
using namespace mlpack;
using namespace mlpack::neighbor; // NeighborSearch and NearestNeighborSort
using namespace mlpack::metric; // ManhattanDistance
int main()
{
  // Load the data from data.csv (hard-coded). Use IO if you want simple
  // command-line parameter handling.
  arma::mat data;
  data::Load("data.csv", data, true);
  // Use templates to specify that we want a NeighborSearch object which uses
  // the Manhattan distance.
  NeighborSearch<NearestNeighborSort, ManhattanDistance> nn(data);
  // Create the object we will store the nearest neighbors in.
  arma::Mat<size_t> neighbors;
  arma::mat distances; // We need to store the distance too.
  // Compute the neighbors.
  nn.Search(1, neighbors, distances);
  // Write each neighbor and distance to stdout.
  for (size_t i = 0; i < neighbors.n_elem; ++i)
  {
    std::cout << "Nearest neighbor of point " << i << " is point "
        << neighbors[i] << " and the distance is " << distances[i] << ".\n";
  }
}
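
As a rough illustration of what the search above computes, the following dependency-free sketch finds each point's nearest neighbor under the Manhattan (L1) distance by brute force. It does not use mlpack and is not meant to describe how NeighborSearch is implemented. The names Point, manhattanDistance, and nearestNeighbors are hypothetical. The sketch assumes that all points have the same dimensionality and that a point is never reported as its own neighbor.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical helper type: one point is a vector of coordinates.
using Point = std::vector<double>;

// Manhattan (L1) distance: the sum of absolute coordinate differences.
// Assumes a and b have the same number of coordinates.
double manhattanDistance(const Point& a, const Point& b)
{
  double sum = 0.0;
  for (std::size_t d = 0; d < a.size(); ++d)
    sum += std::fabs(a[d] - b[d]);
  return sum;
}

// For each point, return (index of nearest other point, distance to it).
// With fewer than two points there is no neighbor, so the entry keeps the
// point's own index and an infinite distance.
std::vector<std::pair<std::size_t, double>> nearestNeighbors(
    const std::vector<Point>& points)
{
  std::vector<std::pair<std::size_t, double>> result;
  result.reserve(points.size());
  for (std::size_t i = 0; i < points.size(); ++i)
  {
    std::size_t best = i;
    double bestDistance = std::numeric_limits<double>::infinity();
    for (std::size_t j = 0; j < points.size(); ++j)
    {
      if (j == i)
        continue; // Skip the point itself.
      const double distance = manhattanDistance(points[i], points[j]);
      if (distance < bestDistance)
      {
        bestDistance = distance;
        best = j;
      }
    }
    result.emplace_back(best, bestDistance);
  }
  return result;
}
```

This brute-force loop compares every pair of points, so its cost grows quadratically with the number of points. It is useful mainly as a reference for checking results on small inputs.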

Other Examples

For more complex examples, it is useful to refer to the main executables, found in src/mlpack/methods/. A few are listed below.

  • methods/neighbor_search/knn_main.cpp
  • methods/neighbor_search/kfn_main.cpp
  • methods/emst/emst_main.cpp
  • methods/radical/radical_main.cpp
  • methods/nca/nca_main.cpp
  • methods/naive_bayes/nbc_main.cpp
  • methods/pca/pca_main.cpp
  • methods/lars/lars_main.cpp
  • methods/linear_regression/linear_regression_main.cpp
  • methods/gmm/gmm_main.cpp
  • methods/kmeans/kmeans_main.cpp