# Early Classification of Time Series

Early classification of time series is the task of classifying an incoming time series as early as possible; deciding when to trigger the classification is itself part of the prediction process.

## Early Classification Cost Function

Dachraoui et al. [1] introduce a composite loss function for early classification of time series that balances earliness and accuracy.

The cost function is of the following form:

$\mathcal{L}(\mathbf{x}_{\rightarrow t}, y, t, \boldsymbol{\theta}) = \mathcal{L}_c(\mathbf{x}_{\rightarrow t}, y, \boldsymbol{\theta}) + \alpha t$

where $$\mathcal{L}_c(\cdot,\cdot,\cdot)$$ is a classification loss and $$t$$ is the time at which a decision is triggered by the system ($$\mathbf{x}_{\rightarrow t}$$ is time series $$\mathbf{x}$$ observed up to time $$t$$). In this setting, $$\alpha$$ drives the trade-off between accuracy and earliness and is treated as a hyper-parameter of the method.
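As a minimal sketch of how $$\alpha$$ shapes this trade-off, the snippet below evaluates the composite cost for two hypothetical decision times. The function name `composite_cost` and all numeric values are illustrative assumptions and are not part of tslearn.

```python
def composite_cost(classification_loss, t, alpha):
    """Composite early-classification cost: L_c + alpha * t."""
    return classification_loss + alpha * t


# Made-up values: waiting longer lowers the classification loss
# but raises the time penalty.
alpha = 1e-3
early = composite_cost(classification_loss=0.30, t=20, alpha=alpha)
late = composite_cost(classification_loss=0.10, t=200, alpha=alpha)

# Same scenario with a time penalty twice as large.
early_2x = composite_cost(classification_loss=0.30, t=20, alpha=2 * alpha)
late_2x = composite_cost(classification_loss=0.10, t=200, alpha=2 * alpha)
```

With these made-up numbers, the later decision is slightly cheaper when `alpha` is `1e-3`, whereas doubling `alpha` makes the earlier decision preferable.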

The authors rely on (i) a clustering of the training time series and (ii) individual classifiers $$m_t(\cdot)$$ trained at all possible timestamps, so as to be able to predict, at time $$t$$, an expected cost for all future times $$t + \tau$$ with $$\tau \geq 0$$:

$f_\tau(\mathbf{x}_{\rightarrow t}, y) = \sum_k \left[ P(C_k | \mathbf{x}_{\rightarrow t}) \sum_i \left( P(y=i | C_k) \left( \sum_{j \neq i} P_{t+\tau}(\hat{y} = j | y=i, C_k) \right) \right) \right] + \alpha (t + \tau)$

where:

• $$P(C_k | \mathbf{x}_{\rightarrow t})$$ is a soft-assignment weight of $$\mathbf{x}_{\rightarrow t}$$ to cluster $$C_k$$;
• $$P(y=i | C_k)$$ is obtained from a contingency table that stores the number of training time series of each class in each cluster;
• $$P_{t+\tau}(\hat{y} = j | y=i, C_k)$$ is obtained from confusion matrices computed at training time on time series from cluster $$C_k$$ using classifier $$m_{t+\tau}(\cdot)$$.
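The three quantities listed above can be combined as in the following sketch, which uses plain nested lists. The function name `expected_cost`, the data layout, and the toy values are assumptions made for illustration; this is not tslearn's internal implementation. The sketch charges the time penalty at the candidate decision time $$t + \tau$$, following the cost function above, where the time term is $$\alpha$$ times the time at which the decision is triggered.

```python
def expected_cost(cluster_probas, class_priors, confusions, t, tau, alpha):
    """Expected cost of deciding at time t + tau.

    cluster_probas[k]    ~ P(C_k | x_{->t})
    class_priors[k][i]   ~ P(y=i | C_k)
    confusions[k][i][j]  ~ P_{t+tau}(y_hat=j | y=i, C_k)
    """
    misclassification = 0.0
    for k, p_cluster in enumerate(cluster_probas):
        for i, p_class in enumerate(class_priors[k]):
            # Probability of predicting any class other than the true one.
            p_error = sum(p for j, p in enumerate(confusions[k][i]) if j != i)
            misclassification += p_cluster * p_class * p_error
    return misclassification + alpha * (t + tau)


# Toy setting with 2 clusters and 2 classes (hypothetical values).
cluster_probas = [0.75, 0.25]
class_priors = [[0.5, 0.5], [1.0, 0.0]]
confusions = [
    [[0.8, 0.2], [0.4, 0.6]],  # cluster 0
    [[1.0, 0.0], [0.0, 1.0]],  # cluster 1
]
cost = expected_cost(cluster_probas, class_priors, confusions,
                     t=10, tau=5, alpha=1e-3)
```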

At test time, if a series is observed up to time $$t$$ and if, for all positive $$\tau$$, we have $$f_\tau(\mathbf{x}_{\rightarrow t}, y) \geq f_0(\mathbf{x}_{\rightarrow t}, y)$$, then a decision is made using classifier $$m_t(\cdot)$$.
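The triggering rule can be sketched as a simple comparison over precomputed expected costs. The helper `should_trigger` is a hypothetical name, and the cost values are made up for illustration.

```python
def should_trigger(expected_costs):
    """Return True if no future time is expected to be cheaper than now.

    expected_costs[tau] holds f_tau for tau = 0, 1, ..., up to the
    remaining horizon, all computed from the series observed up to time t.
    """
    f_0 = expected_costs[0]
    return all(f_tau >= f_0 for f_tau in expected_costs[1:])


decide_now = should_trigger([0.30, 0.31, 0.35])   # waiting never helps
keep_waiting = should_trigger([0.30, 0.25, 0.40])  # tau = 1 looks cheaper
at_last_step = should_trigger([0.30])              # no future time left
```

When only $$f_0$$ is available, that is at the last timestamp, the condition holds vacuously and a decision is made.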

Figure: Early classification. At test time, a prediction is made at the timestamp that optimizes the expected earliness-accuracy trade-off, so the decision time can vary between time series.

To use this early classifier in tslearn, one can rely on the tslearn.early_classification.NonMyopicEarlyClassifier class:

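    from tslearn.early_classification import NonMyopicEarlyClassifier

    # X_train, y_train and X_test are assumed to be loaded beforehand
    early_clf = NonMyopicEarlyClassifier(n_clusters=3,
                                         cost_time_parameter=1e-3,
                                         lamb=1e2,
                                         random_state=0)
    early_clf.fit(X_train, y_train)
    # Predicted labels and the timestamps at which decisions were triggered
    preds, times = early_clf.predict_class_and_earliness(X_test)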


where cost_time_parameter is the $$\alpha$$ parameter presented above and lamb is a trade-off parameter for the soft-assignment of partial series to clusters $$P(C_k | \mathbf{x}_{\rightarrow t})$$ (when lamb tends to infinity, the assignment tends to hard-assignment, and when lamb is set to 0, equal probabilities are obtained for all clusters).
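To illustrate the two limiting behaviours of lamb, the sketch below uses a softmax over negative distances to cluster centers. This functional form is an assumption chosen because it reproduces both limits described above; it is not necessarily the exact formula used inside tslearn, and `soft_assignment` is a hypothetical helper.

```python
import math


def soft_assignment(distances, lamb):
    """Soft cluster weights from distances (assumed softmax form)."""
    min_dist = min(distances)
    # Subtracting the minimum distance avoids underflow for large lamb
    # and leaves the normalized weights unchanged.
    weights = [math.exp(-lamb * (d - min_dist)) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical distances from a partial series to 3 cluster centers.
distances = [1.0, 2.0, 3.0]
uniform = soft_assignment(distances, lamb=0.0)  # equal weights
nearly_hard = soft_assignment(distances, lamb=1e2)  # closest cluster dominates
```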

## References

[1] A. Dachraoui, A. Bondu and A. Cornuejols. “Early classification of time series as a non myopic sequential decision making problem,” ECML/PKDD 2015.