tslearn.metrics.cdist_dtw

tslearn.metrics.cdist_dtw(dataset1, dataset2=None, global_constraint=None, sakoe_chiba_radius=None, itakura_max_slope=None, n_jobs=None, verbose=0)

Compute cross-similarity matrix using Dynamic Time Warping (DTW) similarity measure.

DTW is computed as the Euclidean distance between aligned time series, i.e., if \(\pi\) is the optimal alignment path:

\[DTW(X, Y) = \sqrt{\sum_{(i, j) \in \pi} \|X_{i} - Y_{j}\|^2}\]

Note that this formula is still valid for the multivariate case.

Time series are not required to have the same length, but they must have the same dimension. DTW was originally presented in [1] and is discussed in more detail in our dedicated user-guide page.
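The formula above can be illustrated with a short dynamic-programming sketch. This is a plain-Python illustration of the textbook recurrence, not tslearn's implementation; the name `dtw_distance` is hypothetical, and no global constraint is applied.

```python
import math


def dtw_distance(x, y):
    """Textbook DTW between two series, O(len(x) * len(y)) time.

    Each series is a list of scalars (univariate) or a list of
    equal-length lists (multivariate). The two lengths may differ.
    """
    # Promote scalars to 1-element vectors so both cases share one code path.
    xs = [p if isinstance(p, (list, tuple)) else [p] for p in x]
    ys = [p if isinstance(p, (list, tuple)) else [p] for p in y]
    n, m = len(xs), len(ys)
    inf = float("inf")
    # acc[i][j] = minimal sum of squared costs aligning xs[:i] with ys[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Squared Euclidean distance between the two aligned points.
            cost = sum((a - b) ** 2 for a, b in zip(xs[i - 1], ys[j - 1]))
            acc[i][j] = cost + min(acc[i - 1][j],      # advance in x only
                                   acc[i][j - 1],      # advance in y only
                                   acc[i - 1][j - 1])  # advance in both
    # The square root is taken once, over the whole optimal path.
    return math.sqrt(acc[n][m])
```

Under these assumptions, `dtw_distance([1, 2, 2, 3], [1., 2., 3., 4.])` evaluates to 1.0, which agrees with the off-diagonal entry in the first example of the Examples section.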

Parameters:
dataset1 : array-like

A dataset of time series

dataset2 : array-like (default: None)

Another dataset of time series. If None, self-similarity of dataset1 is returned.

global_constraint : {“itakura”, “sakoe_chiba”} or None (default: None)

Global constraint to restrict admissible paths for DTW.

sakoe_chiba_radius : int or None (default: None)

Radius to be used for Sakoe-Chiba band global constraint. If None and global_constraint is set to “sakoe_chiba”, a radius of 1 is used. If both sakoe_chiba_radius and itakura_max_slope are set, global_constraint is used to infer which constraint to use among the two. In this case, if global_constraint corresponds to no global constraint, a RuntimeWarning is raised and no global constraint is used.
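To give an intuition for what the radius does, here is a simplified sketch of band-constrained DTW for univariate series. It assumes the band keeps only cells with |i - j| <= radius, which is the usual definition for equal-length series; how tslearn builds its mask, in particular for series of different lengths, may differ. The name `dtw_sakoe_chiba` is hypothetical.

```python
import math


def dtw_sakoe_chiba(x, y, radius=1):
    """DTW over univariate series, restricted to cells with |i - j| <= radius.

    Simplified sketch: intended for series of equal length.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(i - j) > radius:
                continue  # cell lies outside the band and stays unreachable
            cost = (x[i - 1] - y[j - 1]) ** 2
            acc[i][j] = cost + min(acc[i - 1][j],
                                   acc[i][j - 1],
                                   acc[i - 1][j - 1])
    return math.sqrt(acc[n][m])
```

In this sketch, a radius of 0 forces a strictly diagonal alignment, and larger radii allow progressively more warping until the result matches unconstrained DTW.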

itakura_max_slope : float or None (default: None)

Maximum slope for the Itakura parallelogram constraint. If None and global_constraint is set to “itakura”, a maximum slope of 2. is used. If both sakoe_chiba_radius and itakura_max_slope are set, global_constraint is used to infer which constraint to use among the two. In this case, if global_constraint corresponds to no global constraint, a RuntimeWarning is raised and no global constraint is used.
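The interaction between `global_constraint`, `sakoe_chiba_radius` and `itakura_max_slope` described above can be summarised as a small decision function. This is only a sketch of the documented rules, not the library's code: `resolve_constraint` is a hypothetical helper, and the behaviour when exactly one of the two parameters is set while `global_constraint` is None is an assumption that the text above does not state.

```python
import warnings


def resolve_constraint(global_constraint=None,
                       sakoe_chiba_radius=None,
                       itakura_max_slope=None):
    """Return (constraint_name, parameter) following the documented rules."""
    if global_constraint == "sakoe_chiba":
        # Documented default: a radius of 1 when none is given.
        radius = 1 if sakoe_chiba_radius is None else sakoe_chiba_radius
        return ("sakoe_chiba", radius)
    if global_constraint == "itakura":
        # Documented default: a maximum slope of 2. when none is given.
        slope = 2.0 if itakura_max_slope is None else itakura_max_slope
        return ("itakura", slope)
    if global_constraint is not None:
        # Choice made for this sketch only: reject unknown names.
        raise ValueError("unknown global_constraint: %r" % (global_constraint,))
    # From here on, global_constraint is None.
    if sakoe_chiba_radius is not None and itakura_max_slope is not None:
        # Both parameters set but nothing to choose between them:
        # warn and fall back to unconstrained DTW.
        warnings.warn("global_constraint cannot be inferred; "
                      "no global constraint is used", RuntimeWarning)
        return (None, None)
    # ASSUMPTION (not stated in the text above): a single parameter set
    # on its own selects the corresponding constraint.
    if sakoe_chiba_radius is not None:
        return ("sakoe_chiba", sakoe_chiba_radius)
    if itakura_max_slope is not None:
        return ("itakura", itakura_max_slope)
    return (None, None)
```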

n_jobs : int or None, optional (default=None)

The number of jobs to run in parallel. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See scikit-learn’s Glossary for more details.

verbose : int, optional (default=0)

The verbosity level: if nonzero, progress messages are printed. Above 50, the output is sent to stdout. The frequency of the messages increases with the verbosity level. If it is more than 10, all iterations are reported. See the Glossary for more details.

Returns:
cdist : numpy.ndarray

Cross-similarity matrix
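As a rough illustration of the layout of the returned matrix, the sketch below fills entry (i, j) with the DTW score between the i-th series of the first dataset and the j-th series of the second. It is a plain-Python approximation for univariate series with no global constraint; `cross_dtw` and `_dtw` are hypothetical names, and the result is a nested list rather than a numpy.ndarray.

```python
import math


def _dtw(x, y):
    """Unconstrained DTW between two univariate series (lists of numbers)."""
    n, m = len(x), len(y)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            acc[i][j] = cost + min(acc[i - 1][j],
                                   acc[i][j - 1],
                                   acc[i - 1][j - 1])
    return math.sqrt(acc[n][m])


def cross_dtw(dataset1, dataset2=None):
    """Nested list of shape (len(dataset1), len(dataset2)) of DTW scores."""
    if dataset2 is None:
        # Self-similarity: compare dataset1 against itself.
        dataset2 = dataset1
    return [[_dtw(x, y) for y in dataset2] for x in dataset1]
```

When `dataset2` is omitted, the result of this sketch is square and symmetric with zeros on the diagonal, so a real implementation only needs to compute one triangle.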

See also

dtw
Get DTW similarity score

References

[1] H. Sakoe, S. Chiba, “Dynamic programming algorithm optimization for spoken word recognition,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 26(1), pp. 43–49, 1978.

Examples

>>> from tslearn.metrics import cdist_dtw
>>> cdist_dtw([[1, 2, 2, 3], [1., 2., 3., 4.]])
array([[0., 1.],
       [1., 0.]])
>>> cdist_dtw([[1, 2, 2, 3], [1., 2., 3., 4.]], [[1, 2, 3], [2, 3, 4, 5]])
array([[0.        , 2.44948974],
       [1.        , 1.41421356]])