Matrix Profile
The Matrix Profile, \(MP\), is a new time series that can be calculated from an input time series \(T\) and a subsequence length \(m\). \(MP_i\) is the minimal distance from the query subsequence \(T_{i\rightarrow i+m}\) to any other subsequence in \(T\) [1]. Because the distance from the query subsequence to itself is zero, the region \(T_{i-\frac{m}{4}\rightarrow i+\frac{m}{4}}\) is treated as an exclusion zone and ignored when taking the minimum. To construct the Matrix Profile, a distance profile, similar to the distance calculation used to transform time series into their shapelet-transform space, is calculated for each subsequence.
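The construction described above can be sketched in a few lines of numpy. This is a minimal, quadratic-time illustration and not the library's implementation: the function name `naive_matrix_profile` is hypothetical, plain Euclidean distance without any rescaling of the subsequences is an assumption, and the exclusion zone half-width is taken to be \(\lceil m/4 \rceil\) positions on each side of the query.

```python
import numpy as np


def naive_matrix_profile(ts, m):
    """Brute-force matrix profile of a 1-D series (illustrative sketch).

    Assumptions: plain Euclidean distance, no rescaling of subsequences,
    exclusion zone of ceil(m / 4) positions on each side of the query.
    """
    ts = np.asarray(ts, dtype=float)
    n_sub = ts.shape[0] - m + 1
    # Stack every subsequence of length m into an array of shape (n_sub, m).
    subs = np.array([ts[i:i + m] for i in range(n_sub)])
    excl = int(np.ceil(m / 4))

    profile = np.empty(n_sub)
    for i in range(n_sub):
        # Distance profile: distance from subsequence i to every subsequence.
        dist = np.linalg.norm(subs - subs[i], axis=1)
        # Mask the exclusion zone so trivial matches are never selected.
        lo, hi = max(0, i - excl), min(n_sub, i + excl + 1)
        dist[lo:hi] = np.inf
        # MP_i is the smallest remaining distance.
        profile[i] = dist.min()
    return profile


series = [0., 1., 3., 2., 9., 1., 14., 15., 1., 2., 2., 10., 7.]
profile = naive_matrix_profile(series, m=4)
```

The resulting profile has one entry per subsequence, that is, `len(series) - m + 1` values. Low values mark subsequences that have a close match elsewhere in the series, while high values mark subsequences that are unlike anything else.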
Implementation
The Matrix Profile implementation provided in tslearn uses numpy or wraps around STUMPY [2]. Three different versions are available:
numpy: a slow implementation
stump: a fast CPU version, which requires STUMPY to be installed
gpu_stump: the fastest version, which requires STUMPY to be installed and a GPU
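As a rough usage sketch, the version can be selected through the `implementation` argument of `tslearn.matrix_profile.MatrixProfile`. The wrapper `tslearn_matrix_profile` below is a hypothetical helper written for this example, and it assumes a single univariate series as input.

```python
import numpy as np


def tslearn_matrix_profile(series, subsequence_length, implementation="numpy"):
    """Hypothetical helper: matrix profile of one univariate series.

    `implementation` is one of "numpy", "stump" or "gpu_stump"; the last
    two assume that STUMPY (and, for "gpu_stump", a GPU) is available.
    """
    # Imported inside the helper so that defining this sketch does not
    # itself require tslearn to be installed.
    from tslearn.matrix_profile import MatrixProfile

    mp = MatrixProfile(
        subsequence_length=subsequence_length,
        implementation=implementation,
        scale=True,
    )
    # tslearn expects a dataset of shape (n_ts, sz, d); here n_ts = d = 1.
    dataset = np.asarray(series, dtype=float).reshape(1, -1, 1)
    # The transformed dataset has shape (n_ts, sz - subsequence_length + 1, 1).
    return mp.fit_transform(dataset)[0, :, 0]
```

For instance, `tslearn_matrix_profile(series, 8)` would use the numpy version, and passing `implementation="stump"` would switch to the STUMPY-backed CPU version, provided STUMPY is installed.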
Possible Applications
The Matrix Profile supports many applications, which are well documented on the page created by the original authors [3]. These include motif and shapelet extraction, discord detection, and earthquake detection.
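Two of these applications can be read almost directly off a computed profile. Under the usual interpretation, the subsequence with the largest profile value is the top discord (it has no close match anywhere else), and the subsequence with the smallest value belongs to the best motif pair. The helper name `top_discord_and_motif` is hypothetical, and the profile values below are illustrative only.

```python
import numpy as np


def top_discord_and_motif(profile):
    """Return (discord_index, motif_index) for a 1-D matrix profile.

    Non-finite entries are ignored in both searches. Ties are resolved
    in favour of the first occurrence.
    """
    profile = np.asarray(profile, dtype=float)
    finite = np.isfinite(profile)
    # Largest nearest-neighbour distance: the most unusual subsequence.
    discord_idx = int(np.argmax(np.where(finite, profile, -np.inf)))
    # Smallest nearest-neighbour distance: one member of the best motif pair.
    motif_idx = int(np.argmin(np.where(finite, profile, np.inf)))
    return discord_idx, motif_idx


example_profile = [6.9, 1.4, 6.2, 7.9, 11.4, 13.6, 18.0, 14.0, 1.4, 6.2]
discord_idx, motif_idx = top_discord_and_motif(example_profile)
```

Each index refers to the start position of a subsequence of length \(m\) in the original series, so the discord or motif itself is recovered by slicing the series from that index over the next \(m\) points.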