7. Early Classification of Time Series

Early classification of time series is the task of classifying an incoming time series as early as possible. Deciding when to trigger the classification is itself part of the prediction process.

7.1. Early Classification Cost Function

Dachraoui et al. [1] introduce a composite loss function for early classification of time series that balances earliness and accuracy.

The cost function is of the following form:

\[\mathcal{L}(\mathbf{x}_{\rightarrow t}, y, t, \boldsymbol{\theta}) = \mathcal{L}_c(\mathbf{x}_{\rightarrow t}, y, \boldsymbol{\theta}) + \alpha t\]

where \(\mathcal{L}_c(\cdot,\cdot,\cdot)\) is a classification loss, \(t\) is the time at which the system triggers a decision, and \(\mathbf{x}_{\rightarrow t}\) is time series \(\mathbf{x}\) observed up to time \(t\). In this setting, \(\alpha\) drives the tradeoff between accuracy and earliness and is treated as a hyper-parameter of the method.
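As a minimal sketch of this composite cost, the snippet below uses the negative log-likelihood of the true class as the classification loss. That choice, the function name `early_classification_loss` and the toy probabilities are assumptions made for illustration only; they are not taken from [1] or from tslearn.

```python
import math


def early_classification_loss(probas, true_label, t, alpha):
    """Classification loss on the observed prefix plus a linear time penalty.

    Assumption: the classification loss is the negative log-likelihood of
    the true class. Any other classification loss could be plugged in.
    """
    classification_loss = -math.log(probas[true_label])
    return classification_loss + alpha * t


# Hypothetical class probabilities predicted from prefixes of length 10 and 40
loss_early = early_classification_loss([0.6, 0.4], true_label=0, t=10, alpha=0.01)
loss_late = early_classification_loss([0.9, 0.1], true_label=0, t=40, alpha=0.01)
```

With these toy numbers, waiting until \(t = 40\) lowers the total cost because the gain in confidence outweighs the extra time penalty. A larger \(\alpha\) would reverse that conclusion.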

The authors rely on (i) a clustering of the training time series and (ii) individual classifiers \(m_t(\cdot)\) trained at all possible timestamps, so as to be able to predict, at time \(t\), an expected cost for all future times \(t + \tau\) with \(\tau \geq 0\):

\[f_\tau(\mathbf{x}_{\rightarrow t}, y) = \sum_k \left[ P(C_k | \mathbf{x}_{\rightarrow t}) \sum_i \left( P(y=i | C_k) \left( \sum_{j \neq i} P_{t+\tau}(\hat{y} = j | y=i, C_k) \right) \right) \right] + \alpha (t + \tau)\]

where:

  • \(P(C_k | \mathbf{x}_{\rightarrow t})\) is a soft-assignment weight of \(\mathbf{x}_{\rightarrow t}\) to cluster \(C_k\);

  • \(P(y=i | C_k)\) is obtained from a contingency table that stores the number of training time series of each class in each cluster;

  • \(P_{t+\tau}(\hat{y} = j | y=i, C_k)\) is obtained from confusion matrices built at training time on the time series of cluster \(C_k\) using classifier \(m_{t+\tau}(\cdot)\).
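The three ingredients listed above can be combined as in the following sketch. It is an illustration in plain Python, not tslearn's implementation: the function name `expected_cost`, the nested-list layout and the toy values are assumptions. The time penalty is charged at the decision time \(t + \tau\), in line with the \(\alpha t\) term of the composite cost being evaluated at the time the decision is triggered.

```python
def expected_cost(cluster_probas, class_given_cluster, confusion, t, tau, alpha):
    """Expected cost of deciding at time t + tau for a prefix observed up to t.

    cluster_probas[k]         : P(C_k | x_{->t})
    class_given_cluster[k][i] : P(y=i | C_k)
    confusion[k][i][j]        : P_{t+tau}(y_hat=j | y=i, C_k)
    """
    misclassification = 0.0
    for k, p_cluster in enumerate(cluster_probas):
        for i, p_class in enumerate(class_given_cluster[k]):
            # Probability of predicting any class other than the true one
            p_error = sum(p for j, p in enumerate(confusion[k][i]) if j != i)
            misclassification += p_cluster * p_class * p_error
    # Assumption: the time penalty applies to the decision time t + tau
    return misclassification + alpha * (t + tau)


# Toy setting with 2 clusters and 2 classes (hypothetical values)
cluster_probas = [0.75, 0.25]
class_given_cluster = [[1.0, 0.0], [0.5, 0.5]]
confusion = [
    [[0.8, 0.2], [0.5, 0.5]],  # row-normalized confusion matrix, cluster 0
    [[0.6, 0.4], [0.2, 0.8]],  # row-normalized confusion matrix, cluster 1
]
cost = expected_cost(cluster_probas, class_given_cluster, confusion,
                     t=10, tau=5, alpha=0.01)
```

Here the expected misclassification term is 0.225 and the time penalty is 0.15, so `cost` is about 0.375.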

At test time, if a series has been observed up to time \(t\) and \(f_\tau(\mathbf{x}_{\rightarrow t}, y) \geq f_0(\mathbf{x}_{\rightarrow t}, y)\) holds for all positive \(\tau\), then a decision is made using classifier \(m_t(\cdot)\). Otherwise, the system waits for more observations.
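The triggering rule can be sketched as follows, assuming the expected costs \(f_0, f_1, \dots\) have already been computed for the current prefix. The function name `should_trigger` and the example values are hypothetical.

```python
def should_trigger(expected_costs):
    """Decide whether to classify now.

    expected_costs[tau] holds f_tau for tau = 0, 1, ..., up to the last
    timestamp available. A decision is triggered when no future time is
    expected to be strictly cheaper than the current one.
    """
    current = expected_costs[0]
    return all(future >= current for future in expected_costs[1:])


wait = should_trigger([0.40, 0.35, 0.45])      # waiting one step looks cheaper
decide = should_trigger([0.30, 0.35, 0.45])    # deciding now is at least as good
at_end = should_trigger([0.50])                # no future timestamp remains
```

In this sketch a decision is always triggered once the series has been fully observed, since there is no future cost left to compare against.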

Figure: Early classification. At test time, a prediction is made at the timestamp that optimizes the expected earliness-accuracy trade-off, which can therefore vary between time series.

To use this early classifier in tslearn, one can rely on the tslearn.early_classification.NonMyopicEarlyClassifier class:

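from tslearn.datasets import CachedDatasets
from tslearn.early_classification import NonMyopicEarlyClassifier

# Load a small dataset bundled with tslearn (any labeled dataset would do)
X_train, y_train, X_test, y_test = CachedDatasets().load_dataset("Trace")

early_clf = NonMyopicEarlyClassifier(n_clusters=3,
                                     cost_time_parameter=1e-3,
                                     lamb=1e2,
                                     random_state=0)
early_clf.fit(X_train, y_train)
# Predicted labels and the timestamps at which each decision was triggered
preds, times = early_clf.predict_class_and_earliness(X_test)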

where cost_time_parameter is the \(\alpha\) parameter presented above and lamb is a trade-off parameter for the soft-assignment of partial series to clusters \(P(C_k | \mathbf{x}_{\rightarrow t})\) (when lamb tends to infinity, the assignment tends to hard-assignment, and when lamb is set to 0, equal probabilities are obtained for all clusters).
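To illustrate the two limiting behaviours of lamb, the sketch below uses a softmax over negative distances between a partial series and the cluster centers. This weighting scheme is an assumption chosen because it reproduces both limits; it is not necessarily the exact formula used by tslearn, and the function name and distances are hypothetical.

```python
import math


def soft_assignment(distances, lamb):
    """Soft-assign a partial series to clusters from its distances to them.

    Assumption: weights are a softmax over -lamb * distance. Distances are
    shifted by their minimum for numerical stability, which leaves the
    normalized weights unchanged.
    """
    d_min = min(distances)
    weights = [math.exp(-lamb * (d - d_min)) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


distances = [1.0, 2.0, 3.0]  # hypothetical distances to 3 cluster centers
uniform = soft_assignment(distances, lamb=0.0)  # equal weight for all clusters
sharp = soft_assignment(distances, lamb=1e2)    # almost all weight on the nearest
```

With lamb set to 0 every cluster receives weight 1/3, while with lamb set to 1e2 the closest cluster receives almost all of the weight.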

7.2. Examples Involving Early Classification Estimators

Early Classification


7.3. References