The Silhouette Score is a clustering performance metric that represents the quality of the resulting clusters.
Static Public Member Functions
static double | MeanDistanceFromCluster (const arma::colvec &distances, const arma::Row< size_t > &labels, const size_t &label, const bool &sameCluster=false)
Find mean distance of element from a given cluster.
template < typename DataType , typename Metric >
static double | Overall (const DataType &X, const arma::Row< size_t > &labels, const Metric &metric)
Find the overall silhouette score.
template < typename DataType >
static arma::rowvec | SamplesScore (const DataType &distances, const arma::Row< size_t > &labels)
Find the individual silhouette scores for precomputed dissimilarities.
template < typename DataType , typename Metric >
static arma::rowvec | SamplesScore (const DataType &X, const arma::Row< size_t > &labels, const Metric &metric)
Find silhouette score of all individual elements.
Static Public Attributes
static const bool | NeedsMinimization = false
Information for hyper-parameter tuning code.
The Silhouette Score is a clustering performance metric that represents the quality of the resulting clusters.
It provides an indication of goodness of fit, considering the inter-cluster and intra-cluster dissimilarities, and therefore a measure of how well unseen samples are likely to be predicted by the model. The Silhouette Score depends on the metric used to calculate the dissimilarities. The best possible score is $s(i) = 1$. Smaller values indicate poorer clustering. Values near zero indicate overlapping clusters, and negative values occur when an element has likely been assigned to the wrong cluster. For an element $i$, $a(i)$ is the within-cluster average dissimilarity and $b(i)$ is the minimum, over all other clusters, of the average dissimilarity to that cluster. The Silhouette Score of a sample is calculated by
$$ s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}} $$
The overall Silhouette Score is the mean of the individual silhouette scores.
Definition at line 40 of file silhouette_score.hpp.
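As a small illustration of the per-sample score described above: with a the within-cluster average dissimilarity and b the minimum average dissimilarity to any other cluster, the standard silhouette definition is s = (b - a) / max(a, b). The helper below is a sketch only; the name `SilhouetteOf` and the handling of the degenerate case a = b = 0 are assumptions made for this example, not part of the documented class.

```cpp
#include <algorithm>
#include <cassert>

// Silhouette score of one sample from its two average dissimilarities.
//   a: mean dissimilarity to the other members of the sample's own cluster.
//   b: smallest mean dissimilarity to any other cluster.
// Hypothetical helper for illustration; not part of the documented class.
double SilhouetteOf(const double a, const double b)
{
  assert(a >= 0.0 && b >= 0.0);
  const double denom = std::max(a, b);
  // Assumption: the degenerate case a == b == 0 is scored as 0.
  return (denom == 0.0) ? 0.0 : (b - a) / denom;
}
```

For instance, a = 1 and b = 3 give (3 - 1) / 3, roughly 0.667, while swapping the two values gives roughly -0.667, which matches the reading of negative scores as likely mislabelled elements.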
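static double MeanDistanceFromCluster (const arma::colvec &distances, const arma::Row< size_t > &labels, const size_t &label, const bool &sameCluster=false)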
Find mean distance of element from a given cluster.
distances | colvec containing distances from other elements. |
labels | Labels assigned to data by clustering. |
label | Label of the target cluster. |
sameCluster | True if the mean distance is calculated from the element's own cluster. |
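The parameters above suggest a computation along the following lines. This is a sketch using `std::vector` rather than Armadillo types; the name `MeanDistanceSketch` is hypothetical, and the treatment of `sameCluster` (leaving the element itself out of the count) and of empty or singleton clusters are assumptions for illustration, not a statement of what the library does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of the mean distance from one element to the members of a cluster.
// distances[j] is the distance from the element to point j.
// Assumption: when sameCluster is true, the element itself is among the
// cluster members with distance 0, so it is left out of the count.
double MeanDistanceSketch(const std::vector<double>& distances,
                          const std::vector<std::size_t>& labels,
                          const std::size_t label,
                          const bool sameCluster = false)
{
  assert(distances.size() == labels.size());
  double sum = 0.0;
  std::size_t count = 0;
  for (std::size_t j = 0; j < labels.size(); ++j)
  {
    if (labels[j] == label)
    {
      sum += distances[j];
      ++count;
    }
  }
  if (sameCluster)
  {
    // A singleton cluster has no other members; 0 is used by convention here.
    if (count <= 1)
      return 0.0;
    --count;
  }
  // Assumption: an empty target cluster yields 0 instead of dividing by 0.
  return (count == 0) ? 0.0 : sum / static_cast<double>(count);
}
```

With distances {0, 2, 4, 6} and labels {0, 0, 1, 1}, the mean distance to cluster 0 with `sameCluster` set is 2, and the mean distance to cluster 1 is 5.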
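template < typename DataType , typename Metric >
static double Overall (const DataType &X, const arma::Row< size_t > &labels, const Metric &metric)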
Find the overall silhouette score.
X | Column-major data used for clustering. |
labels | Labels assigned to data by clustering. |
metric | Metric to be used to calculate dissimilarity. |
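Since the overall score is described as the mean of the individual silhouette scores, the final step can be sketched as a plain average. The function name `OverallFromSamples` is hypothetical, and the per-sample scores are assumed to have been computed already.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Overall silhouette score as the arithmetic mean of per-sample scores.
// Hypothetical helper; the input scores are assumed to be precomputed.
double OverallFromSamples(const std::vector<double>& sampleScores)
{
  assert(!sampleScores.empty());
  const double total =
      std::accumulate(sampleScores.begin(), sampleScores.end(), 0.0);
  return total / static_cast<double>(sampleScores.size());
}
```

For example, per-sample scores of 0.5, 1.0, -0.5 and 1.0 average to 0.5: a single poorly placed sample lowers the overall score without dominating it.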
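template < typename DataType >
static arma::rowvec SamplesScore (const DataType &distances, const arma::Row< size_t > &labels)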
Find the individual silhouette scores for precomputed dissimilarities.
distances | Square matrix containing distances between data points. |
labels | Labels assigned to data by clustering. |
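To make the precomputed-distance variant concrete, the sketch below computes per-sample silhouette scores from a square distance matrix using only the standard library. It is an illustration of the technique rather than the library's code: the name `SamplesScoreSketch`, the assumption that labels run from 0 to k-1, and the conventions for singleton clusters and for a single-cluster labelling are all choices made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Per-sample silhouette scores from a square distance matrix.
// Assumptions: labels are 0..k-1, dist has a zero diagonal, and a sample
// that is alone in its cluster gets a(i) = 0.
std::vector<double> SamplesScoreSketch(const Matrix& dist,
                                       const std::vector<std::size_t>& labels)
{
  const std::size_t n = labels.size();
  assert(dist.size() == n);
  const std::size_t k =
      (n == 0) ? 0 : *std::max_element(labels.begin(), labels.end()) + 1;

  // Number of samples in each cluster.
  std::vector<std::size_t> sizes(k, 0);
  for (const std::size_t l : labels)
    ++sizes[l];

  const double inf = std::numeric_limits<double>::infinity();
  std::vector<double> scores(n, 0.0);
  for (std::size_t i = 0; i < n; ++i)
  {
    assert(dist[i].size() == n);
    // Sum of distances from sample i to the members of each cluster.
    std::vector<double> sums(k, 0.0);
    for (std::size_t j = 0; j < n; ++j)
      sums[labels[j]] += dist[i][j];

    const std::size_t own = labels[i];
    // a(i): mean distance to the other members of the sample's own cluster.
    const double a = (sizes[own] > 1) ? sums[own] / (sizes[own] - 1) : 0.0;

    // b(i): smallest mean distance to any other non-empty cluster.
    double b = inf;
    for (std::size_t c = 0; c < k; ++c)
      if (c != own && sizes[c] > 0)
        b = std::min(b, sums[c] / sizes[c]);

    // With a single cluster there is no b(i); the score stays at 0.
    if (b == inf)
      continue;
    const double denom = std::max(a, b);
    scores[i] = (denom > 0.0) ? (b - a) / denom : 0.0;
  }
  return scores;
}
```

For four points on a line at 0, 1, 10 and 11 with labels {0, 0, 1, 1}, the first sample has a = 1 and b = 10.5, giving a score of about 0.905, which reflects two well-separated clusters.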
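template < typename DataType , typename Metric >
static arma::rowvec SamplesScore (const DataType &X, const arma::Row< size_t > &labels, const Metric &metric)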
Find silhouette score of all individual elements (distances not precomputed).
X | Column-major data used for clustering. |
labels | Labels assigned to data by clustering. |
metric | Metric to be used to calculate dissimilarity. |
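When the distances are not precomputed, a natural first step is to build the pairwise distance matrix from the data with the supplied metric. The sketch below shows that step with standard-library types: each entry of `points` plays the role of one column of the column-major data, and `EuclideanSketch` and `PairwiseDistances` are hypothetical names introduced for this example. Whether the library proceeds this way internally is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Point = std::vector<double>;
using Matrix = std::vector<std::vector<double>>;

// Hypothetical Euclidean metric; any callable taking two points would do.
struct EuclideanSketch
{
  double operator()(const Point& p, const Point& q) const
  {
    assert(p.size() == q.size());
    double sq = 0.0;
    for (std::size_t d = 0; d < p.size(); ++d)
      sq += (p[d] - q[d]) * (p[d] - q[d]);
    return std::sqrt(sq);
  }
};

// Build the symmetric matrix of pairwise dissimilarities under `metric`.
// The diagonal is left at 0; each pair is evaluated once and mirrored.
template<typename Metric>
Matrix PairwiseDistances(const std::vector<Point>& points,
                         const Metric& metric)
{
  const std::size_t n = points.size();
  Matrix dist(n, std::vector<double>(n, 0.0));
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = i + 1; j < n; ++j)
      dist[i][j] = dist[j][i] = metric(points[i], points[j]);
  return dist;
}
```

Because the metric is a template parameter, swapping in a different dissimilarity (for example a Manhattan distance functor) changes the resulting scores without touching the rest of the computation, which is why the score depends on the metric chosen.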
static const bool NeedsMinimization = false
Information for hyper-parameter tuning code.
It indicates that we want to maximize the metric.
Definition at line 99 of file silhouette_score.hpp.
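A flag of this kind lets generic tuning code decide, at compile time, whether a larger or a smaller score counts as an improvement. The sketch below illustrates the idea with stand-in types; `SilhouetteLike`, `ErrorLike` and `IsBetter` are hypothetical names for this example and do not describe the actual tuner.

```cpp
#include <cassert>

// Hypothetical metric types that mimic the documented flag.
struct SilhouetteLike { static const bool NeedsMinimization = false; };
struct ErrorLike      { static const bool NeedsMinimization = true; };

// Returns true when `candidate` improves on `best` for the given metric:
// smaller is better if the metric needs minimization, larger otherwise.
template<typename Metric>
bool IsBetter(const double candidate, const double best)
{
  assert(candidate == candidate && best == best);  // Reject NaN inputs.
  return Metric::NeedsMinimization ? (candidate < best) : (candidate > best);
}
```

With a silhouette-style metric, `IsBetter<SilhouetteLike>(0.8, 0.5)` is true, since a higher score indicates better clustering; for an error-style metric the comparison is reversed.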