# tslearn.clustering.KShape

class tslearn.clustering.KShape(n_clusters=3, max_iter=100, tol=1e-06, n_init=1, verbose=False, random_state=None, init='random')

KShape clustering for time series.

KShape was originally presented in [1].

Parameters:

n_clusters : int (default: 3)
    Number of clusters to form.
max_iter : int (default: 100)
    Maximum number of iterations of the k-Shape algorithm.
tol : float (default: 1e-6)
    Inertia variation threshold. If inertia varies by less than this threshold between two consecutive iterations, the model is considered to have converged and the algorithm stops.
n_init : int (default: 1)
    Number of times the k-Shape algorithm will be run with different centroid seeds. The final result is the best output of the n_init consecutive runs in terms of inertia.
verbose : bool (default: False)
    Whether or not to print information about the inertia while learning the model.
random_state : integer or numpy.RandomState, optional
    Generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
init : {‘random’ or ndarray} (default: ‘random’)
    Method for initialization. ‘random’: choose k observations (rows) at random from data for the initial centroids. If an ndarray is passed, it should be of shape (n_clusters, ts_size, d) and gives the initial centers.

Attributes:

cluster_centers_ : numpy.ndarray of shape (n_clusters, sz, d)
    Centroids.
labels_ : numpy.ndarray of integers with shape (n_ts, )
    Labels of each point.
inertia_ : float
    Sum of distances of samples to their closest cluster center.
n_iter_ : int
    The number of iterations performed during fit.

Notes

This method requires a dataset of equal-sized time series.

References

 [1] J. Paparrizos & L. Gravano. k-Shape: Efficient and Accurate Clustering of Time Series. SIGMOD 2015. pp. 1855-1870.

Examples

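>>> from tslearn.generators import random_walks
>>> from tslearn.preprocessing import TimeSeriesScalerMeanVariance
>>> from tslearn.clustering import KShape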
>>> X = random_walks(n_ts=50, sz=32, d=1)
>>> X = TimeSeriesScalerMeanVariance(mu=0., std=1.).fit_transform(X)
>>> ks = KShape(n_clusters=3, n_init=1, random_state=0).fit(X)
>>> ks.cluster_centers_.shape
(3, 32, 1)


Methods

| Method | Description |
| --- | --- |
| fit(X[, y]) | Compute k-Shape clustering. |
| fit_predict(X[, y]) | Fit k-Shape clustering using X and then predict the closest cluster each time series in X belongs to. |
| from_hdf5(path) | Load model from an HDF5 file. |
| from_json(path) | Load model from a JSON file. |
| from_pickle(path) | Load model from a pickle file. |
| get_params([deep]) | Get parameters for this estimator. |
| predict(X) | Predict the closest cluster each time series in X belongs to. |
| set_params(**params) | Set the parameters of this estimator. |
| to_hdf5(path) | Save model to an HDF5 file. |
| to_json(path) | Save model to a JSON file. |
| to_pickle(path) | Save model to a pickle file. |

fit(X, y=None)

Compute k-Shape clustering.

Parameters:

X : array-like of shape=(n_ts, sz, d)
    Time series dataset.
y
    Ignored.

fit_predict(X, y=None)

Fit k-Shape clustering using X and then predict the closest cluster each time series in X belongs to.

It is more efficient to use this method than to sequentially call fit and predict.

Parameters:

X : array-like of shape=(n_ts, sz, d)
    Time series dataset to predict.
y
    Ignored.

Returns:

labels : array of shape=(n_ts, )
    Index of the cluster each sample belongs to.

classmethod from_hdf5(path)

Load model from an HDF5 file. Requires h5py (http://docs.h5py.org/).

Parameters:

path : str
    Full path to file.

Returns:

Model instance

classmethod from_json(path)

Load model from a JSON file.

Parameters:

path : str
    Full path to file.

Returns:

Model instance

classmethod from_pickle(path)

Load model from a pickle file.

Parameters:

path : str
    Full path to file.

Returns:

Model instance

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict
    Parameter names mapped to their values.

predict(X)

Predict the closest cluster each time series in X belongs to.

Parameters:

X : array-like of shape=(n_ts, sz, d)
    Time series dataset to predict.

Returns:

labels : array of shape=(n_ts, )
    Index of the cluster each sample belongs to.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict
    Estimator parameters.

Returns:

self : estimator instance
    Estimator instance.

to_hdf5(path)

Save model to an HDF5 file. Requires h5py (http://docs.h5py.org/).

Parameters:

path : str
    Full file path. File must not already exist.

Raises:

FileExistsError
    If a file with the same path already exists.

to_json(path)

Save model to a JSON file.

Parameters:

path : str
    Full file path.

to_pickle(path)

Save model to a pickle file.

Parameters:

path : str
    Full file path.