Kernel k-means

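This example uses the Global Alignment Kernel (GAK) at the core of a kernel \(k\)-means algorithm to cluster time series. The model is fitted with 20 random initializations (n_init=20); because verbose=True, each initialization prints its inertia after every iteration until the value stops changing.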

[Figure: three stacked panels titled "Cluster 1", "Cluster 2" and "Cluster 3", each showing the time series assigned to that cluster.]

Out:

Init 1
80.948 --> 70.106 --> 66.011 --> 63.422 --> 59.720 --> 58.005 --> 57.563 --> 57.563 -->
Init 2
80.519 --> 70.023 --> 66.522 --> 65.914 --> 65.914 -->
Init 3
80.374 --> 67.064 --> 62.859 --> 62.220 --> 59.391 --> 59.391 -->
Init 4
77.700 --> 69.585 --> 67.474 --> 67.022 --> 66.104 --> 65.075 --> 63.516 --> 62.861 --> 62.410 --> 61.166 --> 59.759 --> 59.759 -->
Init 5
79.246 --> 66.190 --> 63.040 --> 63.040 -->
Init 6
78.590 --> 68.315 --> 66.321 --> 65.633 --> 63.898 --> 63.898 -->
Init 7
75.299 --> 63.203 --> 59.963 --> 57.563 --> 57.563 -->
Init 8
76.876 --> 67.042 --> 66.764 --> 66.764 -->
Init 9
81.317 --> 69.313 --> 63.927 --> 61.124 --> 59.391 --> 59.391 -->
Init 10
79.317 --> 72.390 --> 70.197 --> 70.218 --> 70.218 -->
Init 11
78.202 --> 66.888 --> 60.961 --> 57.946 --> 57.387 --> 57.387 -->
Init 12
78.194 --> 67.992 --> 65.263 --> 63.436 --> 61.177 --> 57.799 --> 57.387 --> 57.387 -->
Init 13
77.553 --> 64.028 --> 64.008 --> 64.008 -->
Init 14
77.853 --> 62.815 --> 57.799 --> 57.387 --> 57.387 -->
Init 15
81.746 --> 67.617 --> 63.332 --> 62.827 --> 62.234 --> 58.470 --> 57.387 --> 57.387 -->
Init 16
78.934 --> 69.153 --> 65.466 --> 63.619 --> 63.619 -->
Init 17
78.303 --> 65.546 --> 63.619 --> 63.619 -->
Init 18
77.760 --> 67.020 --> 66.729 --> 65.900 --> 65.900 -->
Init 19
79.795 --> 70.429 --> 69.098 --> 69.098 -->
Init 20
79.419 --> 67.908 --> 65.330 --> 63.388 --> 61.019 --> 58.186 --> 57.387 --> 57.387 -->

# Author: Romain Tavenard
# License: BSD 3 clause

import numpy
import matplotlib.pyplot as plt

from tslearn.clustering import GlobalAlignmentKernelKMeans
from tslearn.metrics import sigma_gak, cdist_gak
from tslearn.datasets import CachedDatasets
from tslearn.preprocessing import TimeSeriesScalerMeanVariance

seed = 0
numpy.random.seed(seed)
X_train, y_train, X_test, y_test = CachedDatasets().load_dataset("Trace")
X_train = X_train[y_train < 4]  # Keep first 3 classes
numpy.random.shuffle(X_train)
X_train = TimeSeriesScalerMeanVariance().fit_transform(X_train[:50])  # Keep only 50 time series
sz = X_train.shape[1]

# Kernel k-means with a GAK kernel; sigma is estimated from the data
gak_km = GlobalAlignmentKernelKMeans(n_clusters=3,
                                     sigma=sigma_gak(X_train),
                                     n_init=20,
                                     verbose=True,
                                     random_state=seed)
y_pred = gak_km.fit_predict(X_train)

plt.figure()
for yi in range(3):
    plt.subplot(3, 1, 1 + yi)
    for xx in X_train[y_pred == yi]:
        plt.plot(xx.ravel(), "k-")
    plt.xlim(0, sz)
    plt.ylim(-4, 4)
    plt.title("Cluster %d" % (yi + 1))

plt.tight_layout()
plt.show()
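
To make the clustering step less of a black box, here is a minimal sketch of a generic kernel k-means loop that works on a precomputed Gram matrix. It is not tslearn's implementation: the function names `centroid_distances` and `kernel_kmeans` are hypothetical, the initial labels are passed in explicitly instead of being drawn at random, and a plain linear kernel on toy 2-D points stands in for GAK so that the block runs with NumPy alone.

```python
import numpy as np


def centroid_distances(K, labels, n_clusters):
    """Squared feature-space distance from every sample to every centroid.

    K is an (n, n) Gram matrix and labels an (n,) array of cluster indices.
    Uses ||phi(x_i) - mu_k||^2 = K_ii - 2 * mean_j K_ij + mean_{j,l} K_jl,
    where j and l range over the members of cluster k.
    """
    n = K.shape[0]
    diag = np.diag(K)
    dist = np.full((n, n_clusters), np.inf)  # empty clusters stay at +inf
    for k in range(n_clusters):
        members = labels == k
        size = members.sum()
        if size == 0:
            continue
        cross = K[:, members].sum(axis=1) / size
        within = K[np.ix_(members, members)].sum() / size ** 2
        dist[:, k] = diag - 2.0 * cross + within
    return dist


def kernel_kmeans(K, init_labels, n_clusters, max_iter=50):
    """Reassign samples to their nearest centroid until labels stop changing."""
    labels = np.asarray(init_labels)
    for _ in range(max_iter):
        new_labels = centroid_distances(K, labels, n_clusters).argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    # Sum of squared distances to the assigned centroids (unnormalized)
    dist = centroid_distances(K, labels, n_clusters)
    inertia = dist[np.arange(len(labels)), labels].sum()
    return labels, inertia


# Toy data: two tight groups; a linear kernel stands in for GAK (assumption)
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
K = X @ X.T
labels, inertia = kernel_kmeans(K, init_labels=[0, 1, 0, 1], n_clusters=2)
print(labels, inertia)
```

The loop only ever touches the Gram matrix, never the raw series, which is why a time series kernel such as GAK can be substituted for the linear one. Under that reading, a matrix like `cdist_gak(X_train, sigma=sigma_gak(X_train))` could be passed as `K`, and repeating the loop from several random label assignments would play the role of `n_init`. The inertia computed here is unnormalized and may be scaled differently from the values tslearn prints.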

Total running time of the script: (0 minutes 12.605 seconds)
