KShape

class tslearn.clustering.KShape(n_clusters=3, max_iter=100, tol=1e-06, n_init=1, verbose=False, random_state=None, init='random')

KShape clustering for time series.

KShape was originally presented in [1].

Parameters:

n_clusters (int, default: 3): Number of clusters to form.
max_iter (int, default: 100): Maximum number of iterations of the k-Shape algorithm.
tol (float, default: 1e-6): Inertia variation threshold. If the inertia varies by less than this threshold between two consecutive iterations, the model is considered to have converged and the algorithm stops.
n_init (int, default: 1): Number of times the k-Shape algorithm will be run with different centroid seeds. The final result is the best output of the n_init consecutive runs in terms of inertia. Ignored if initialization is not random.
verbose (bool, default: False): Whether or not to print information about the inertia while learning the model.
random_state (integer or numpy.random.RandomState, optional): Generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
init ({‘random’ or ndarray}, default: ‘random’): Method for initialization. ‘random’: choose k observations (rows) at random from data for the initial centroids. If an ndarray is passed, it should be of shape (n_clusters, ts_size, d) and gives the initial centers.

Attributes:

cluster_centers_ (numpy.ndarray of shape (n_clusters, sz, d)): Centroids.
labels_ (numpy.ndarray of integers with shape (n_ts,)): Cluster label of each time series.
inertia_ (float): Sum of distances of samples to their closest cluster center.
n_iter_ (int): The number of iterations performed during fit.

Notes

This method requires a dataset of equal-sized time series.

References

[1] J. Paparrizos & L. Gravano. k-Shape: Efficient and Accurate Clustering of Time Series. SIGMOD 2015. pp. 1855-1870.

Examples

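>>> from tslearn.clustering import KShape
>>> from tslearn.generators import random_walks
>>> from tslearn.preprocessing import TimeSeriesScalerMeanVariance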
>>> X = random_walks(n_ts=50, sz=32, d=1)
>>> X = TimeSeriesScalerMeanVariance(mu=0., std=1.).fit_transform(X)
>>> ks = KShape(n_clusters=3, n_init=1, random_state=0).fit(X)
>>> ks.cluster_centers_.shape
(3, 32, 1)

Methods

`fit`(X[, y])	Compute k-Shape clustering.
`fit_predict`(X[, y])	Fit k-Shape clustering using X and then predict the closest cluster each time series in X belongs to.
`from_hdf5`(path)	Load model from an HDF5 file.
`from_json`(path)	Load model from a JSON file.
`from_pickle`(path)	Load model from a pickle file.
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`predict`(X)	Predict the closest cluster each time series in X belongs to.
`set_params`(**params)	Set the parameters of this estimator.
`to_hdf5`(path)	Save model to an HDF5 file.
`to_json`(path)	Save model to a JSON file.
`to_pickle`(path)	Save model to a pickle file.

fit(X, y=None)

Compute k-Shape clustering.

Parameters:

X (array-like of shape (n_ts, sz, d)): Time series dataset.
y: Ignored

fit_predict(X, y=None)

Fit k-Shape clustering using X and then predict the closest cluster each time series in X belongs to.

It is more efficient to use this method than to sequentially call fit and predict.

Parameters:

X (array-like of shape (n_ts, sz, d)): Time series dataset to predict.
y: Ignored

Returns:

labels (array of shape (n_ts,)): Index of the cluster each sample belongs to.

classmethod from_hdf5(path)

Load model from an HDF5 file. Requires h5py (http://docs.h5py.org/).

Parameters:

path (str): Full path to file.

Returns:

Model instance

classmethod from_json(path)

Load model from a JSON file.

Parameters:

path (str): Full path to file.

Returns:

Model instance

classmethod from_pickle(path)

Load model from a pickle file.

Parameters:

path (str): Full path to file.

Returns:

Model instance

get_metadata_routing()

Get metadata routing of this object.

See the scikit-learn User Guide on metadata routing for how the routing mechanism works.

Returns:

routing (MetadataRequest): A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True): If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict): Parameter names mapped to their values.

predict(X)

Predict the closest cluster each time series in X belongs to.

Parameters:

X (array-like of shape (n_ts, sz, d)): Time series dataset to predict.

Returns:

labels (array of shape (n_ts,)): Index of the cluster each sample belongs to.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict): Estimator parameters.

Returns:

self (estimator instance): Estimator instance.

to_hdf5(path)

Save model to an HDF5 file. Requires h5py (http://docs.h5py.org/).

Parameters:

path (str): Full file path. File must not already exist.

Raises:

FileExistsError: If a file with the same path already exists.

to_json(path)

Save model to a JSON file.

Parameters:

path (str): Full file path.

to_pickle(path)

Save model to a pickle file.

Parameters:

path (str): Full file path.

Examples using `tslearn.clustering.KShape`

KShape

Model Persistence
