tslearn.clustering.silhouette_score
- tslearn.clustering.silhouette_score(X, labels, metric=None, sample_size=None, metric_params=None, n_jobs=None, verbose=0, random_state=None, **kwds)
Compute the mean Silhouette Coefficient of all samples (cf. [1] and [2]).
Read more in the scikit-learn documentation.
- Parameters:
- X : array, shape = [n_ts, n_ts] if metric == "precomputed", or [n_ts, sz, d] otherwise
Array of pairwise distances between time series, or a time series dataset.
- labels : array, shape = [n_ts]
Predicted labels for each time series.
- metric : string, callable or None (default: None)
The metric to use when calculating distance between time series. Should be one of {'dtw', 'softdtw', 'euclidean'}, a callable distance function, or None. If 'softdtw' is passed, a normalized version of Soft-DTW is used, defined as sdtw_(x, y) := sdtw(x, y) - 1/2 (sdtw(x, x) + sdtw(y, y)). If X is the distance array itself, use metric="precomputed". If None, dtw is used.
- sample_size : int or None (default: None)
The size of the sample to use when computing the Silhouette Coefficient on a random subset of the data. If sample_size is None, no sampling is used.
- metric_params : dict or None (default: None)
Parameter values for the chosen metric. For metrics that accept parallelization of the cross-distance matrix computations, the n_jobs key passed in metric_params is overridden by the n_jobs argument.
- n_jobs : int or None, optional (default: None)
The number of jobs to run in parallel for cross-distance matrix computations. Ignored if the cross-distance matrix cannot be computed using parallelization. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See scikit-learn's Glossary for more details.
- verbose : int (default: 0)
If nonzero, information about the inertia while learning the model and joblib progress messages are printed.
- random_state : int, RandomState instance or None, optional (default: None)
The generator used to randomly select a subset of samples. If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random. Used when sample_size is not None.
- **kwds : optional keyword parameters
Any further parameters are passed directly to the distance function, just as for the metric_params parameter.
- Returns:
- silhouette : float
Mean Silhouette Coefficient for all samples.
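To make the returned value concrete, the sketch below computes a mean Silhouette Coefficient from a precomputed pairwise-distance matrix, which is the kind of input expected with metric="precomputed". It follows the standard textbook definition (per-sample score (b - a) / max(a, b), averaged over samples) and is not tslearn's own implementation; the helper name mean_silhouette and the toy data are assumptions made for illustration.

```python
from collections import defaultdict


def mean_silhouette(dist, labels):
    """Mean Silhouette Coefficient from a square pairwise-distance matrix.

    `dist` is an n x n list of lists, `labels` a list of n cluster labels.
    Hypothetical helper for illustration; not part of tslearn.
    """
    n = len(labels)
    # Group sample indices by cluster label.
    clusters = defaultdict(list)
    for idx, lab in enumerate(labels):
        clusters[lab].append(idx)
    if not 2 <= len(clusters) <= n - 1:
        raise ValueError("need between 2 and n - 1 distinct labels")

    scores = []
    for i in range(n):
        own = clusters[labels[i]]
        if len(own) == 1:
            # Convention: a sample alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the sample's own cluster.
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / n


# Toy data: four points on a line, forming two tight groups.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in points] for p in points]

good_score = mean_silhouette(dist, [0, 0, 1, 1])  # labels match the groups
bad_score = mean_silhouette(dist, [0, 1, 0, 1])   # labels cut across the groups
```

With labels that match the two groups the score is close to 1, while labels that cut across the groups give a negative score. For time series, the same computation applies once the matrix holds pairwise distances such as DTW.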
References
[1]
Examples
>>> import numpy
>>> from tslearn.clustering import silhouette_score
>>> from tslearn.generators import random_walks
>>> from tslearn.metrics import cdist_dtw
>>> from tslearn.metrics import dtw
>>> numpy.random.seed(0)
>>> X = random_walks(n_ts=20, sz=16, d=1)
>>> labels = numpy.random.randint(2, size=20)
>>> silhouette_score(X, labels, metric="dtw")
0.13383800...
>>> silhouette_score(X, labels, metric="euclidean")
0.09126917...
>>> silhouette_score(X, labels, metric="softdtw")
0.17953934...
>>> silhouette_score(X, labels, metric="softdtw",
...                  metric_params={"gamma": 2.})
0.17591060...
>>> silhouette_score(cdist_dtw(X), labels,
...                  metric="precomputed")
0.13383800...
>>> silhouette_score(X, labels, metric=dtw)
0.13383800...
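The normalization applied when metric="softdtw" can be illustrated with a small pure-Python sketch. It assumes the usual Soft-DTW recursion with a squared Euclidean cost and a soft-min controlled by gamma, restricted to univariate series; the function names soft_min, sdtw and normalized_sdtw and the default gamma=1.0 are illustrative assumptions, not tslearn's API.

```python
import math


def soft_min(values, gamma):
    """Smoothed minimum: -gamma * log(sum(exp(-v / gamma)))."""
    finite = [v for v in values if v != math.inf]
    if not finite:
        return math.inf
    lo = min(finite)
    # Subtract the minimum before exponentiating for numerical stability.
    return lo - gamma * math.log(sum(math.exp(-(v - lo) / gamma) for v in finite))


def sdtw(x, y, gamma=1.0):
    """Soft-DTW between two univariate series (squared Euclidean cost)."""
    n, m = len(x), len(y)
    # r[i][j] holds the soft alignment cost of x[:i] against y[:j].
    r = [[math.inf] * (m + 1) for _ in range(n + 1)]
    r[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            r[i][j] = cost + soft_min(
                (r[i - 1][j], r[i][j - 1], r[i - 1][j - 1]), gamma
            )
    return r[n][m]


def normalized_sdtw(x, y, gamma=1.0):
    """sdtw(x, y) - 1/2 * (sdtw(x, x) + sdtw(y, y))."""
    return sdtw(x, y, gamma) - 0.5 * (sdtw(x, x, gamma) + sdtw(y, y, gamma))


x = [0.0, 1.0, 2.0, 1.0]
y = [0.0, 2.0, 1.0]

raw_self = sdtw(x, x)               # below zero: the soft-min undershoots the hard min
norm_self = normalized_sdtw(x, x)   # zero by construction
norm_xy = normalized_sdtw(x, y)
norm_yx = normalized_sdtw(y, x)     # same value up to rounding
```

Raw Soft-DTW of a series with itself is negative, so it does not behave like a dissimilarity on its own. Subtracting the two self-terms restores a zero value for identical series, which is the property the silhouette computation relies on.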