Matrix Profile

The Matrix Profile, \(MP\), is a new time series computed from an input time series \(T\) and a subsequence length \(m\). Its \(i\)-th value, \(MP_i\), is the minimal distance from the query subsequence \(T_{i\rightarrow i+m}\) to any other subsequence of \(T\) [1]. Because the distance from the query subsequence to itself is zero, the region \(T_{i-\frac{m}{4}\rightarrow i+\frac{m}{4}}\) is treated as an exclusion zone and ignored when taking the minimum. To construct the Matrix Profile, a distance profile is calculated for each subsequence; this is similar to the distance calculation used to transform time series into their shapelet-transform space. The process is illustrated below:

[Figure: distance profile and Matrix Profile plot]

For each segment, the distances to all subsequences of the time series are calculated and the minimal distance that does not correspond to the original location of the segment (where the distance is zero) is returned.
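The procedure described above can be sketched in a few lines of pure Python. This is a minimal illustration and not the tslearn implementation: the function names are hypothetical, plain Euclidean distance without any normalization is assumed, and the exclusion zone half-width is assumed to be `m // 4` (the rounding is a choice made here).

```python
import math


def distance_profile(T, i, m):
    """Euclidean distance from the query T[i:i+m] to every length-m subsequence of T."""
    query = T[i:i + m]
    return [math.dist(query, T[j:j + m]) for j in range(len(T) - m + 1)]


def naive_matrix_profile(T, m):
    """For each subsequence, keep the smallest distance outside its exclusion zone."""
    n_sub = len(T) - m + 1
    excl = m // 4  # assumed half-width of the exclusion zone around i
    profile = []
    for i in range(n_sub):
        dists = distance_profile(T, i, m)
        # Drop the query's own location and its immediate neighbours.
        allowed = [d for j, d in enumerate(dists) if abs(j - i) > excl]
        profile.append(min(allowed) if allowed else math.inf)
    return profile
```

Each distance profile costs one pass over all subsequences, so this sketch is quadratic in the number of subsequences and is only suitable for short series.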

Implementation

The Matrix Profile implementation provided in tslearn uses numpy or wraps around STUMPY [2]. Three different versions are available:

  • numpy: a slow implementation

  • stump: a fast CPU version, which requires STUMPY to be installed

  • gpu_stump: the fastest version, which requires STUMPY to be installed and a GPU
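As a hedged sketch of how one of the three versions listed above might be selected: the helper names below are hypothetical, and the `MatrixProfile` class with its `subsequence_length` and `implementation` parameters is written from memory of the tslearn API, so check it against your installed version. Whether a usable GPU is present is not detected here; it is passed in by the caller.

```python
from importlib.util import find_spec


def choose_implementation(want_gpu=False):
    """Pick an implementation name depending on whether STUMPY is importable."""
    if find_spec("stumpy") is None:
        return "numpy"  # slow fallback that needs no extra dependency
    return "gpu_stump" if want_gpu else "stump"


def compute_matrix_profile(time_series, subsequence_length, want_gpu=False):
    """Compute the Matrix Profile of a single univariate series with tslearn."""
    # Imported lazily so choose_implementation() works without tslearn installed.
    from tslearn.matrix_profile import MatrixProfile

    mp = MatrixProfile(
        subsequence_length=subsequence_length,
        implementation=choose_implementation(want_gpu),
    )
    # fit_transform takes a dataset of series; keep the first series, first dimension.
    return mp.fit_transform([time_series])[0, :, 0]
```

Falling back to the numpy version keeps the code usable on machines without STUMPY, at the cost of speed.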

Possible Applications

The Matrix Profile supports many applications, which are well documented on the page created by the original authors [3]. These include motif and shapelet extraction, discord detection, and earthquake detection, among others.
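Two of these applications can be read directly off a computed Matrix Profile: the smallest value marks a subsequence with a very close match elsewhere in the series (a motif candidate), and the largest value marks the subsequence farthest from all others (a discord candidate). A minimal sketch follows, with a hypothetical helper name and made-up profile values used purely for illustration:

```python
def top_motif_and_discord(matrix_profile):
    """Return the indices of the smallest and largest Matrix Profile values."""
    indices = range(len(matrix_profile))
    motif_idx = min(indices, key=matrix_profile.__getitem__)
    discord_idx = max(indices, key=matrix_profile.__getitem__)
    return motif_idx, discord_idx


# Illustrative values only, not the output of a real computation.
example_profile = [2.0, 0.3, 1.1, 5.4, 0.9]
motif_idx, discord_idx = top_motif_and_discord(example_profile)
```

The returned indices are start positions of subsequences in the original series, so the motif itself is the window of length \(m\) beginning at `motif_idx`.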

Examples Involving Matrix Profile

Matrix Profile

Distance and Matrix Profiles

References