tslearn.metrics.cdist_dtw

tslearn.metrics.cdist_dtw(dataset1, dataset2=None, global_constraint=None, sakoe_chiba_radius=None, itakura_max_slope=None, n_jobs=None, verbose=0, be=None)

Compute the cross-similarity matrix using the Dynamic Time Warping (DTW) similarity measure.

DTW is computed as the Euclidean distance between aligned time series, i.e., if \(\pi\) is the optimal alignment path:

\[DTW(X, Y) = \sqrt{\sum_{(i, j) \in \pi} \|X_{i} - Y_{j}\|^2}\]

Note that this formula is still valid for the multivariate case.

The time series need not have the same length, but they must have the same dimension. DTW was originally presented in [1] and is discussed in more detail in our dedicated user-guide page.
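
The formula above can be illustrated with a minimal pure-Python sketch of the classic dynamic-programming recursion. This is not tslearn's implementation: the name `dtw_sketch` is hypothetical, no global constraint is applied, and plain lists are used instead of arrays.

```python
import math

def dtw_sketch(x, y):
    """DTW score between two series given as lists of scalars or d-dim points."""
    # Promote scalars to 1-dimensional points so univariate input also works.
    x = [p if isinstance(p, (list, tuple)) else [p] for p in x]
    y = [p if isinstance(p, (list, tuple)) else [p] for p in y]
    n, m = len(x), len(y)
    # acc[i][j] = minimal sum of squared distances aligning x[:i] with y[:j].
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = sum((a - b) ** 2 for a, b in zip(x[i - 1], y[j - 1]))
            # Extend the cheapest of the three admissible predecessor cells.
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return math.sqrt(acc[n][m])

same_length = dtw_sketch([1, 2, 2, 3], [1.0, 2.0, 3.0, 4.0])
different_length = dtw_sketch([1, 2, 2, 3], [1, 2, 3])
```

The square root is taken once at the end, over the summed squared distances along the best path, which matches the formula. The second call shows that series of different lengths can be compared.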

Parameters:
dataset1 : array-like, shape=(n_ts1, sz1, d) or (n_ts1, sz1) or (sz1,)

A dataset of time series. If shape is (n_ts1, sz1), the dataset is composed of univariate time series. If shape is (sz1,), the dataset is composed of a single univariate time series.

dataset2 : None or array-like, shape=(n_ts2, sz2, d) or (n_ts2, sz2) or (sz2,) (default: None)

Another dataset of time series. If None, self-similarity of dataset1 is returned. If shape is (n_ts2, sz2), the dataset is composed of univariate time series. If shape is (sz2,), the dataset is composed of a single univariate time series.

global_constraint : {“itakura”, “sakoe_chiba”} or None (default: None)

Global constraint to restrict admissible paths for DTW.

sakoe_chiba_radius : int or None (default: None)

Radius to be used for Sakoe-Chiba band global constraint. If None and global_constraint is set to “sakoe_chiba”, a radius of 1 is used. If both sakoe_chiba_radius and itakura_max_slope are set, global_constraint is used to infer which constraint to use among the two. In this case, if global_constraint corresponds to no global constraint, a RuntimeWarning is raised and no global constraint is used.

itakura_max_slope : float or None (default: None)

Maximum slope for the Itakura parallelogram constraint. If None and global_constraint is set to “itakura”, a maximum slope of 2.0 is used. If both sakoe_chiba_radius and itakura_max_slope are set, global_constraint is used to infer which constraint to use among the two. In this case, if global_constraint corresponds to no global constraint, a RuntimeWarning is raised and no global constraint is used.

n_jobs : int or None, optional (default=None)

The number of jobs to run in parallel. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See scikit-learn’s Glossary for more details.

verbose : int, optional (default=0)

The verbosity level: if non-zero, progress messages are printed. Above 50, the output is sent to stdout. The frequency of the messages increases with the verbosity level. If it is more than 10, all iterations are reported. See the Glossary for more details.

be : Backend object or string or None

Backend. If be is an instance of the class NumPyBackend or the string “numpy”, the NumPy backend is used. If be is an instance of the class PyTorchBackend or the string “pytorch”, the PyTorch backend is used. If be is None, the backend is determined by the input arrays. See our dedicated user-guide page for more information.
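
To illustrate the Sakoe-Chiba constraint described under sakoe_chiba_radius, the sketch below restricts the alignment path to cells with |i - j| <= radius. It is a simplified illustration and makes assumptions: the series are univariate and of equal length, and the hypothetical function `banded_dtw_sketch` does not reproduce tslearn's own mask, which also handles series of different lengths and may differ in detail.

```python
import math

def banded_dtw_sketch(x, y, radius):
    """DTW restricted to the band |i - j| <= radius (equal-length, univariate)."""
    if len(x) != len(y):
        raise ValueError("this simplified sketch assumes equal-length series")
    n = len(x)
    # Cells outside the band keep an infinite cost and are never used.
    acc = [[math.inf] * (n + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        # Only visit columns inside the band around the diagonal.
        for j in range(max(1, i - radius), min(n, i + radius) + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return math.sqrt(acc[n][n])

x, y = [1, 2, 2, 3], [1, 2, 3, 4]
wide = banded_dtw_sketch(x, y, radius=3)    # band covers the whole matrix
narrow = banded_dtw_sketch(x, y, radius=0)  # only the diagonal is admissible
```

In this sketch, a radius of 0 leaves only the diagonal, so the result equals the Euclidean distance between the two series. Narrowing the band can only keep the score the same or increase it, since it removes candidate paths.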

Returns:
cdist : array-like, shape=(n_ts1, n_ts2)

Cross-similarity matrix.

See also

dtw

Get DTW similarity score

References

[1]

H. Sakoe, S. Chiba, “Dynamic programming algorithm optimization for spoken word recognition,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 26(1), pp. 43–49, 1978.

Examples

>>> from tslearn.metrics import cdist_dtw
>>> cdist_dtw([[1, 2, 2, 3], [1., 2., 3., 4.]])
array([[0., 1.],
       [1., 0.]])
>>> cdist_dtw([[1, 2, 2, 3], [1., 2., 3., 4.]], [[1, 2, 3], [2, 3, 4, 5]])
array([[0.        , 2.44948974],
       [1.        , 1.41421356]])
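
The layout of the returned matrix can be made explicit with a small pure-Python sketch that fills entry (i, j) with the DTW score between the i-th series of the first dataset and the j-th series of the second. The names `_dtw` and `cdist_sketch` are hypothetical, the helper covers only unconstrained univariate DTW, and nothing here reflects how tslearn parallelizes or selects a backend.

```python
import math

def _dtw(x, y):
    """Unconstrained DTW between two univariate series (plain lists)."""
    n, m = len(x), len(y)
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return math.sqrt(acc[n][m])

def cdist_sketch(dataset1, dataset2=None):
    """Return an n_ts1 x n_ts2 nested list of pairwise DTW scores."""
    if dataset2 is None:
        # Self-similarity: compare dataset1 with itself.
        dataset2 = dataset1
    return [[_dtw(a, b) for b in dataset2] for a in dataset1]

self_sim = cdist_sketch([[1, 2, 2, 3], [1.0, 2.0, 3.0, 4.0]])
cross = cdist_sketch([[1, 2, 2, 3], [1.0, 2.0, 3.0, 4.0]],
                     [[1, 2, 3], [2, 3, 4, 5]])
```

Here `self_sim` is symmetric with a zero diagonal, and `cross` has one row per series in the first dataset and one column per series in the second, which is the shape=(n_ts1, n_ts2) layout documented under Returns.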