An implementation of Sparse Coding with Dictionary Learning that achieves sparsity via an l1-norm regularizer on the codes (LASSO) or an (l1+l2)-norm regularizer on the codes (the Elastic Net).
Public Member Functions
template < typename DictionaryInitializer = DataDependentRandomInitializer >
SparseCoding (const arma::mat &data, const size_t atoms, const double lambda1, const double lambda2=0, const size_t maxIterations=0, const double objTolerance=0.01, const double newtonTolerance=1e-6, const DictionaryInitializer &initializer=DictionaryInitializer())
Set the parameters to SparseCoding.
SparseCoding (const size_t atoms=0, const double lambda1=0, const double lambda2=0, const size_t maxIterations=0, const double objTolerance=0.01, const double newtonTolerance=1e-6)
Set the parameters to SparseCoding.
size_t | Atoms () const |
Access the number of atoms.
size_t & | Atoms () |
Modify the number of atoms.
const arma::mat & | Dictionary () const |
Access the dictionary.
arma::mat & | Dictionary () |
Modify the dictionary.
void | Encode (const arma::mat &data, arma::mat &codes) |
Sparse code each point in the given dataset via LARS, using the current dictionary and store the encoded data in the codes matrix.
double | Lambda1 () const |
Access the L1 regularization term.
double & | Lambda1 () |
Modify the L1 regularization term.
double | Lambda2 () const |
Access the L2 regularization term.
double & | Lambda2 () |
Modify the L2 regularization term.
size_t | MaxIterations () const |
Get the maximum number of iterations.
size_t & | MaxIterations () |
Modify the maximum number of iterations.
double | NewtonTolerance () const |
Get the tolerance for Newton's method (dictionary optimization step).
double & | NewtonTolerance () |
Modify the tolerance for Newton's method (dictionary optimization step).
double | Objective (const arma::mat &data, const arma::mat &codes) const |
Compute the objective function.
double | ObjTolerance () const |
Get the objective tolerance.
double & | ObjTolerance () |
Modify the objective tolerance.
double | OptimizeDictionary (const arma::mat &data, const arma::mat &codes, const arma::uvec &adjacencies) |
Learn dictionary via Newton method based on Lagrange dual.
void | ProjectDictionary () |
Project each atom of the dictionary back onto the unit ball, if necessary.
template < typename Archive >
void | serialize (Archive &ar, const uint32_t) |
Serialize the sparse coding model.
template < typename DictionaryInitializer = DataDependentRandomInitializer >
double | Train (const arma::mat &data, const DictionaryInitializer &initializer=DictionaryInitializer()) |
Train the sparse coding model on the given dataset.
An implementation of Sparse Coding with Dictionary Learning that achieves sparsity via an l1-norm regularizer on the codes (LASSO) or an (l1+l2)-norm regularizer on the codes (the Elastic Net).
Let d be the number of dimensions in the original space, m the number of training points, and k the number of atoms in the dictionary (the dimension of the learned feature space). The training data X is a d-by-m matrix where each column is a point and each row is a dimension. The dictionary D is a d-by-k matrix, and the sparse codes matrix Z is a k-by-m matrix. This program seeks to minimize the objective:
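\[ \min_{D,Z} \frac{1}{2} \| X - D Z \|_F^2 + \lambda_1 \sum_{i=1}^{m} \| Z_i \|_1 + \frac{1}{2} \lambda_2 \sum_{i=1}^{m} \| Z_i \|_2^2 \]
subject to \( \| D_j \|_2 \le 1 \) for \( 1 \le j \le k \), where typically \( \lambda_1 > 0 \) and \( \lambda_2 = 0 \). Here \( Z_i \) is the i-th column of Z and \( D_j \) is the j-th column (atom) of D.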
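To make the objective concrete, here is a minimal sketch that evaluates it on small dense matrices. It assumes the objective has the form 0.5 * ||X - D Z||_F^2 + lambda1 * (sum of absolute values of Z) + 0.5 * lambda2 * (sum of squares of Z). The `Matrix` alias and the function name `SparseCodingObjective` are hypothetical stand-ins used only for illustration; they are not part of the library, which works on `arma::mat`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for arma::mat: a dense matrix stored as a list of rows.
using Matrix = std::vector<std::vector<double>>;

// Evaluate 0.5 * ||X - D Z||_F^2 + lambda1 * sum |Z_ij| + 0.5 * lambda2 * sum Z_ij^2
// for X (d-by-m), D (d-by-k) and Z (k-by-m).
double SparseCodingObjective(const Matrix& X, const Matrix& D, const Matrix& Z,
                             const double lambda1, const double lambda2)
{
  const std::size_t d = X.size();
  const std::size_t m = X.empty() ? 0 : X[0].size();
  const std::size_t k = Z.size();
  assert(D.size() == d);
  for (const auto& row : Z)
    assert(row.size() == m);

  // Squared Frobenius norm of the reconstruction residual X - D Z.
  double residual = 0.0;
  for (std::size_t r = 0; r < d; ++r)
  {
    assert(D[r].size() == k && X[r].size() == m);
    for (std::size_t c = 0; c < m; ++c)
    {
      double reconstruction = 0.0;
      for (std::size_t a = 0; a < k; ++a)
        reconstruction += D[r][a] * Z[a][c];
      const double diff = X[r][c] - reconstruction;
      residual += diff * diff;
    }
  }

  // Summing column norms over all columns equals summing over all entries.
  double l1 = 0.0, l2 = 0.0;
  for (const auto& row : Z)
    for (const double z : row)
    {
      l1 += std::abs(z);
      l2 += z * z;
    }

  return 0.5 * residual + lambda1 * l1 + 0.5 * lambda2 * l2;
}
```

With lambda2 = 0 the last term vanishes and the expression reduces to the LASSO objective.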
This problem is solved by an algorithm that alternates between a dictionary learning step and a sparse coding step. The dictionary learning step updates the dictionary D using a Newton method based on the Lagrange dual (see the paper below for details). The sparse coding step involves solving a large number of sparse linear regression problems; this can be done efficiently using LARS, an algorithm that can solve the LASSO or the Elastic Net (papers below).
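Here are those papers:
Honglak Lee, Alexis Battle, Rajat Raina, and Andrew Y. Ng. "Efficient sparse coding algorithms." Advances in Neural Information Processing Systems 19, 2007.
B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. "Least angle regression." The Annals of Statistics 32(2):407-499, 2004.
H. Zou and T. Hastie. "Regularization and variable selection via the elastic net." Journal of the Royal Statistical Society, Series B 67(2):301-320, 2005.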
Note that the implementation here does not use the feature-sign search algorithm from Honglak Lee's paper, but instead the LARS algorithm suggested in that paper.
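As a sketch of what "a Newton method based on the Lagrange dual" refers to, the dictionary step for fixed codes Z can be summarized as below, following the derivation in Lee et al. The dual variables \( \mu_j \) and the matrix \( M \) are notation chosen here for illustration (to avoid a clash with \( \lambda_1, \lambda_2 \)). The factor 1/2 and the regularization terms are dropped because, for fixed Z, they do not change the minimizing D.

```latex
% Dictionary step for fixed Z: minimize ||X - DZ||_F^2 subject to ||D_j||_2^2 <= 1.
\begin{align*}
\mathcal{L}(D, \mu)
  &= \operatorname{tr}\!\big( (X - DZ)^T (X - DZ) \big)
     + \sum_{j=1}^{k} \mu_j \big( \| D_j \|_2^2 - 1 \big),
     \qquad \mu_j \ge 0, \\
% Setting the gradient with respect to D to zero, with M = diag(mu):
D(\mu) &= X Z^T \big( Z Z^T + M \big)^{-1},
     \qquad M = \operatorname{diag}(\mu_1, \dots, \mu_k), \\
% Substituting D(mu) back gives the dual function, maximized over mu >= 0:
g(\mu) &= \operatorname{tr}\!\big( X^T X
     - X Z^T ( Z Z^T + M )^{-1} ( X Z^T )^T - M \big).
\end{align*}
```

The dual has only k variables (one per atom), which is why maximizing \( g(\mu) \) with Newton's method and then recovering D from \( D(\mu) \) can be cheaper than optimizing the d-by-k dictionary directly.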
When Train() is called, the dictionary is initialized using the DictionaryInitializer class given as a template parameter. Possible choices include RandomInitializer, which provides an entirely random dictionary; DataDependentRandomInitializer, which provides a random dictionary based loosely on characteristics of the dataset; and NothingInitializer, which does not initialize the dictionary, in which case the user should set the dictionary beforehand using the Dictionary() mutator method.
Once a dictionary is trained with Train(), another matrix may be encoded with the Encode() function.
DictionaryInitializer | The class to use to initialize the dictionary; must have a 'void Initialize(const arma::mat& data, arma::mat& dictionary)' function. |
Definition at line 115 of file sparse_coding.hpp.
template < typename DictionaryInitializer = DataDependentRandomInitializer >
SparseCoding (const arma::mat & data,
              const size_t atoms,
              const double lambda1,
              const double lambda2 = 0,
              const size_t maxIterations = 0,
              const double objTolerance = 0.01,
              const double newtonTolerance = 1e-6,
              const DictionaryInitializer & initializer = DictionaryInitializer())
Set the parameters to SparseCoding.
lambda2 defaults to 0. This constructor will train the model. If that is not desired, call the other constructor that does not take a data matrix. This constructor will also initialize the dictionary using the given DictionaryInitializer before training.
If you want to initialize the dictionary to a custom matrix, consider either writing your own DictionaryInitializer class (with a void Initialize(const arma::mat& data, arma::mat& dictionary) function), or calling the constructor that does not take a data matrix, then calling Dictionary() to set the dictionary matrix to a matrix of your choosing, and then calling Train() with NothingInitializer (i.e. Train<NothingInitializer>(data)).
data | Data matrix. |
atoms | Number of atoms in dictionary. |
lambda1 | Regularization parameter for l1-norm penalty. |
lambda2 | Regularization parameter for l2-norm penalty. |
maxIterations | Maximum number of iterations to run algorithm. If 0, the algorithm will run until convergence (or forever). |
objTolerance | Tolerance for objective function. When an iteration of the algorithm produces an improvement smaller than this, the algorithm will terminate. |
newtonTolerance | Tolerance for the Newton's method dictionary optimization step. |
initializer | The initializer to use. |
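The interaction between maxIterations and objTolerance can be sketched as a generic stopping loop. This is an illustration under assumptions, not the library's code: `IterateUntilConverged` and the `step` callback (which stands for one sparse coding pass plus one dictionary update, returning the new objective value) are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Run `step` until the objective improves by less than objTolerance or
// maxIterations passes have been made. maxIterations == 0 means there is no
// iteration limit, so the loop only ends once the tolerance test succeeds.
// Returns the number of passes executed.
std::size_t IterateUntilConverged(const std::function<double()>& step,
                                  double lastObjective,
                                  const std::size_t maxIterations,
                                  const double objTolerance)
{
  assert(objTolerance >= 0.0);
  std::size_t iterations = 0;
  while (maxIterations == 0 || iterations < maxIterations)
  {
    const double objective = step();
    ++iterations;

    // Improvement is the decrease in the objective over this pass.
    const double improvement = lastObjective - objective;
    lastObjective = objective;
    if (improvement < objTolerance)
      break;
  }
  return iterations;
}
```

For example, if each pass halves an objective that starts at 8 and objTolerance is 0.6, the loop stops after the fourth pass, whose improvement (0.5) is the first one below the tolerance.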
SparseCoding (const size_t atoms = 0,
              const double lambda1 = 0,
              const double lambda2 = 0,
              const size_t maxIterations = 0,
              const double objTolerance = 0.01,
              const double newtonTolerance = 1e-6)
Set the parameters to SparseCoding.
lambda2 defaults to 0. This constructor will not train the model, and a subsequent call to Train() will be required before the model can encode points with Encode().
atoms | Number of atoms in dictionary. |
lambda1 | Regularization parameter for l1-norm penalty. |
lambda2 | Regularization parameter for l2-norm penalty. |
maxIterations | Maximum number of iterations to run algorithm. If 0, the algorithm will run until convergence (or forever). |
objTolerance | Tolerance for objective function. When an iteration of the algorithm produces an improvement smaller than this, the algorithm will terminate. |
newtonTolerance | Tolerance for the Newton's method dictionary optimization step. |
inline size_t Atoms () const
Access the number of atoms.
Definition at line 228 of file sparse_coding.hpp.
inline size_t & Atoms ()
Modify the number of atoms.
Definition at line 230 of file sparse_coding.hpp.
inline const arma::mat & Dictionary () const
Access the dictionary.
Definition at line 223 of file sparse_coding.hpp.
inline arma::mat & Dictionary ()
Modify the dictionary.
Definition at line 225 of file sparse_coding.hpp.
void Encode (const arma::mat & data,
             arma::mat & codes)
Sparse code each point in the given dataset via LARS, using the current dictionary and store the encoded data in the codes matrix.
data | Input data matrix to be encoded. |
codes | Output codes matrix. |
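The per-point problem that Encode() solves has a closed form in one special case, which can help build intuition for the effect of lambda1 and lambda2. The sketch below is not LARS and not the library's implementation: it assumes the atoms are orthonormal, in which case the minimizer of 0.5 * ||x - D z||^2 + lambda1 * ||z||_1 + 0.5 * lambda2 * ||z||^2 is obtained by soft-thresholding each correlation between an atom and x, then dividing by (1 + lambda2). The names `SoftThreshold` and `EncodeOrthonormal` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Soft-thresholding operator: shrink c toward zero by lambda, clamping at zero.
double SoftThreshold(const double c, const double lambda)
{
  const double magnitude = std::abs(c) - lambda;
  return (magnitude > 0.0) ? std::copysign(magnitude, c) : 0.0;
}

// Encode one point x against a dictionary given as a list of atoms (columns).
// ASSUMPTION: the atoms are orthonormal; for a general dictionary this closed
// form does not hold and an algorithm such as LARS is needed instead.
std::vector<double> EncodeOrthonormal(
    const std::vector<std::vector<double>>& atoms,
    const std::vector<double>& x,
    const double lambda1,
    const double lambda2)
{
  assert(lambda1 >= 0.0 && lambda2 >= 0.0);
  std::vector<double> code;
  code.reserve(atoms.size());
  for (const auto& atom : atoms)
  {
    assert(atom.size() == x.size());
    double correlation = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
      correlation += atom[i] * x[i];
    code.push_back(SoftThreshold(correlation, lambda1) / (1.0 + lambda2));
  }
  return code;
}
```

Any coefficient whose correlation has magnitude at most lambda1 becomes exactly zero, which is where the sparsity of the codes comes from; lambda2 only shrinks the surviving coefficients.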
inline double Lambda1 () const
Access the L1 regularization term.
Definition at line 233 of file sparse_coding.hpp.
inline double & Lambda1 ()
Modify the L1 regularization term.
Definition at line 235 of file sparse_coding.hpp.
inline double Lambda2 () const
Access the L2 regularization term.
Definition at line 238 of file sparse_coding.hpp.
inline double & Lambda2 ()
Modify the L2 regularization term.
Definition at line 240 of file sparse_coding.hpp.
inline size_t MaxIterations () const
Get the maximum number of iterations.
Definition at line 243 of file sparse_coding.hpp.
inline size_t & MaxIterations ()
Modify the maximum number of iterations.
Definition at line 245 of file sparse_coding.hpp.
inline double NewtonTolerance () const
Get the tolerance for Newton's method (dictionary optimization step).
Definition at line 253 of file sparse_coding.hpp.
inline double & NewtonTolerance ()
Modify the tolerance for Newton's method (dictionary optimization step).
Definition at line 255 of file sparse_coding.hpp.
References SparseCoding::serialize().
double Objective (const arma::mat & data,
                  const arma::mat & codes) const
Compute the objective function.
inline double ObjTolerance () const
Get the objective tolerance.
Definition at line 248 of file sparse_coding.hpp.
inline double & ObjTolerance ()
Modify the objective tolerance.
Definition at line 250 of file sparse_coding.hpp.
double OptimizeDictionary (const arma::mat & data,
                           const arma::mat & codes,
                           const arma::uvec & adjacencies)
Learn dictionary via Newton method based on Lagrange dual.
data | Data matrix. |
codes | Matrix of codes. |
adjacencies | Indices of entries (unrolled column by column) of the coding matrix Z that are non-zero (the adjacency matrix for the bipartite graph of points and atoms). |
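The "unrolled column by column" indexing of the adjacencies parameter can be illustrated with a small sketch. It assumes column-major linear indices, where the entry in row r and column c of a k-by-m codes matrix has index c * k + r (the convention Armadillo uses for linear indexing). The `Matrix` alias and the function name `NonZeroColumnMajorIndices` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for arma::mat: the k-by-m codes matrix as a list of rows.
using Matrix = std::vector<std::vector<double>>;

// Return the column-major linear indices (col * k + row) of the non-zero
// entries of the codes matrix, in increasing order.
std::vector<std::size_t> NonZeroColumnMajorIndices(const Matrix& codes)
{
  std::vector<std::size_t> indices;
  const std::size_t k = codes.size();
  if (k == 0)
    return indices;

  const std::size_t m = codes[0].size();
  for (std::size_t col = 0; col < m; ++col)
  {
    for (std::size_t row = 0; row < k; ++row)
    {
      assert(codes[row].size() == m);
      if (codes[row][col] != 0.0)
        indices.push_back(col * k + row);
    }
  }
  return indices;
}
```

Each index identifies one (atom, point) pair in which the atom is actually used to encode the point, which is the bipartite-graph reading given in the parameter description.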
void ProjectDictionary ()
Project each atom of the dictionary back onto the unit ball, if necessary.
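A minimal sketch of this projection, assuming "unit ball" means the Euclidean unit ball and "if necessary" means only atoms with norm greater than 1 are rescaled. The `Matrix` alias and the function name `ProjectAtomsOntoUnitBall` are hypothetical; any numerical tolerance the library may apply around the threshold is not modeled here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for arma::mat: the d-by-k dictionary as a list of rows.
using Matrix = std::vector<std::vector<double>>;

// Rescale every column (atom) whose Euclidean norm exceeds 1 to unit norm.
// Atoms already inside the unit ball are left unchanged.
void ProjectAtomsOntoUnitBall(Matrix& dictionary)
{
  if (dictionary.empty())
    return;

  const std::size_t atoms = dictionary[0].size();
  for (std::size_t j = 0; j < atoms; ++j)
  {
    double squaredNorm = 0.0;
    for (const auto& row : dictionary)
    {
      assert(row.size() == atoms);
      squaredNorm += row[j] * row[j];
    }

    const double norm = std::sqrt(squaredNorm);
    if (norm > 1.0)
    {
      for (auto& row : dictionary)
        row[j] /= norm;
    }
  }
}
```

For instance, an atom (3, 4) has norm 5 and is rescaled to (0.6, 0.8), while an atom (0.3, 0.4) has norm 0.5 and is left as it is.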
template < typename Archive >
void serialize (Archive & ar,
                const uint32_t)
Serialize the sparse coding model.
Referenced by SparseCoding::NewtonTolerance().
template < typename DictionaryInitializer = DataDependentRandomInitializer >
double Train (const arma::mat & data,
              const DictionaryInitializer & initializer = DictionaryInitializer())
Train the sparse coding model on the given dataset.