Early Classification of Time Series

Early classification of time series is the task of classifying an incoming time series as early as possible; deciding when to trigger the classification is itself part of the prediction process.

Early Classification Cost Function

Dachraoui et al. [1] introduce a composite loss function for early classification of time series that balances earliness and accuracy.

The cost function is of the following form:

\[\mathcal{L}(\mathbf{x}_{\rightarrow t}, y, t, \boldsymbol{\theta}) = \mathcal{L}_c(\mathbf{x}_{\rightarrow t}, y, \boldsymbol{\theta}) + \alpha t\]

where \(\mathcal{L}_c(\cdot,\cdot,\cdot)\) is a classification loss and \(t\) is the time at which a decision is triggered by the system (\(\mathbf{x}_{\rightarrow t}\) is time series \(\mathbf{x}\) observed up to time \(t\)). In this setting, \(\alpha\) drives the tradeoff between accuracy and earliness and is treated as a hyper-parameter of the method.
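As a minimal sketch of this trade-off, the snippet below evaluates the composite cost for two decisions, assuming a 0/1 classification loss for \(\mathcal{L}_c\). The function name `composite_cost` and the numeric values are hypothetical and only serve the illustration.

```python
def composite_cost(predicted, true_label, t, alpha):
    """Composite cost: 0/1 classification loss plus a linear time penalty.

    The 0/1 loss is an assumption made for this sketch; any
    classification loss could play the role of L_c.
    """
    classification_loss = 0.0 if predicted == true_label else 1.0
    return classification_loss + alpha * t


# A correct decision taken late versus a wrong decision taken early.
late_but_right = composite_cost("a", "a", t=50, alpha=0.01)
early_but_wrong = composite_cost("b", "a", t=10, alpha=0.01)
```

With a larger \(\alpha\), the time penalty would weigh more heavily and waiting for a correct decision would become less attractive.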

The authors rely on (i) a clustering of the training time series and (ii) individual classifiers \(m_t(\cdot)\) trained at all possible timestamps, so as to be able to predict, at time \(t\), an expected cost for all future times \(t + \tau\) with \(\tau \geq 0\):

\[f_\tau(\mathbf{x}_{\rightarrow t}, y) = \sum_k \left[ P(C_k | \mathbf{x}_{\rightarrow t}) \sum_i \left( P(y=i | C_k) \left( \sum_{j \neq i} P_{t+\tau}(\hat{y} = j | y=i, C_k) \right) \right) \right] + \alpha (t + \tau)\]


where:

  • \(P(C_k | \mathbf{x}_{\rightarrow t})\) is a soft-assignment weight of \(\mathbf{x}_{\rightarrow t}\) to cluster \(C_k\);
  • \(P(y=i | C_k)\) is obtained from a contingency table that stores the number of training time series of each class in each cluster;
  • \(P_{t+\tau}(\hat{y} = j | y=i, C_k)\) is obtained from confusion matrices computed at training time on the time series of cluster \(C_k\), using classifier \(m_{t+\tau}(\cdot)\).
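The three quantities above can be combined as in the following sketch, which is not tslearn's implementation. It assumes the probabilities are already available as nested lists and that the time penalty is charged at the prospective decision time \(t + \tau\). The function name `expected_cost` and all numeric values are hypothetical.

```python
def expected_cost(cluster_probas, class_given_cluster, confusion, t, tau, alpha):
    """Expected cost of deciding at time t + tau, given a series seen up to t.

    cluster_probas[k]          ~ P(C_k | x_{->t})
    class_given_cluster[k][i]  ~ P(y=i | C_k)
    confusion[k][i][j]         ~ P_{t+tau}(y_hat=j | y=i, C_k)
    """
    misclassification = 0.0
    for k, p_cluster in enumerate(cluster_probas):
        for i, p_class in enumerate(class_given_cluster[k]):
            # Probability of predicting any class other than the true one.
            p_error = sum(p for j, p in enumerate(confusion[k][i]) if j != i)
            misclassification += p_cluster * p_class * p_error
    # Assumption: the time penalty uses the prospective decision time.
    return misclassification + alpha * (t + tau)


# Toy setting with 2 clusters and 2 classes (made-up numbers).
cluster_probas = [0.75, 0.25]
class_given_cluster = [[0.5, 0.5], [1.0, 0.0]]
confusion = [
    [[0.75, 0.25], [0.5, 0.5]],  # cluster 0: rows are true classes
    [[1.0, 0.0], [0.0, 1.0]],    # cluster 1: perfect classifier
]
cost = expected_cost(cluster_probas, class_given_cluster, confusion,
                     t=10, tau=5, alpha=0.01)
```

In this toy setting, all of the expected misclassification comes from cluster 0, since the classifier is assumed perfect on cluster 1.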

At test time, if a series is observed up to time \(t\) and if, for all positive \(\tau\), we have \(f_\tau(\mathbf{x}_{\rightarrow t}, y) \geq f_0(\mathbf{x}_{\rightarrow t}, y)\), then a decision is made using classifier \(m_t(\cdot)\). Otherwise, the system waits for more observations before deciding.
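This triggering rule can be sketched as follows, assuming the expected costs \(f_0, f_1, \dots\) have already been computed for the current timestamp and stored in a list indexed by \(\tau\). The helper name `should_trigger` is hypothetical.

```python
def should_trigger(expected_costs):
    """Return True if deciding now is expected to be at least as cheap
    as deciding at any future time.

    expected_costs[tau] holds f_tau for tau = 0, 1, 2, ...
    """
    cost_now = expected_costs[0]
    return all(cost >= cost_now for cost in expected_costs[1:])


# No future timestamp is expected to be cheaper: decide now.
decide_now = should_trigger([0.40, 0.50, 0.45])
# Waiting one more step is expected to be cheaper: keep observing.
keep_waiting = not should_trigger([0.40, 0.35, 0.50])
```

When the series is fully observed, the list contains only \(f_0\) and the rule triggers a decision, since there is no future timestamp left to wait for.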


Figure: Early classification. At test time, each prediction is made at the timestamp that optimizes the expected earliness-accuracy trade-off, so the decision time can vary from one time series to another.

To use this early classifier in tslearn, one can rely on the tslearn.early_classification.NonMyopicEarlyClassifier class:

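from tslearn.early_classification import NonMyopicEarlyClassifier

# X_train, y_train and X_test are assumed to be already loaded.
# The values of cost_time_parameter and lamb below are illustrative.
early_clf = NonMyopicEarlyClassifier(n_clusters=3,
                                     cost_time_parameter=1e-3,
                                     lamb=1e2)
early_clf.fit(X_train, y_train)
# Predicted labels and the timestamps at which each decision was triggered
preds, times = early_clf.predict_class_and_earliness(X_test)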

where cost_time_parameter is the \(\alpha\) parameter presented above and lamb is a trade-off parameter for the soft-assignment of partial series to clusters \(P(C_k | \mathbf{x}_{\rightarrow t})\) (when lamb tends to infinity, the assignment tends to hard-assignment, and when lamb is set to 0, equal probabilities are obtained for all clusters).
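The limiting behaviour of lamb can be illustrated with the sketch below. It uses a softmax over negative distances to the cluster centers as a stand-in weighting; this is an assumption chosen for the illustration and not necessarily the formula used in tslearn or in [1]. The function name `soft_assignment` and the distances are hypothetical.

```python
import math


def soft_assignment(distances, lamb):
    """Soft-assignment weights from distances to cluster centers.

    Assumed form for this sketch: weights proportional to
    exp(-lamb * distance). Subtracting the smallest distance first
    keeps the exponentials numerically stable without changing the result.
    """
    closest = min(distances)
    weights = [math.exp(-lamb * (d - closest)) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


distances = [1.0, 2.0, 6.0]  # made-up distances to 3 cluster centers
uniform = soft_assignment(distances, lamb=0.0)        # equal probabilities
nearly_hard = soft_assignment(distances, lamb=100.0)  # mass on closest cluster
```

Intermediate values of lamb spread the probability mass over several nearby clusters, which matters most when only a short prefix of the series has been observed.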


[1] A. Dachraoui, A. Bondu and A. Cornuejols. "Early classification of time series as a non myopic sequential decision making problem," ECML/PKDD 2015.