# tslearn.early_classification.NonMyopicEarlyClassifier

class tslearn.early_classification.NonMyopicEarlyClassifier(n_clusters=2, base_classifier=None, min_t=1, lamb=1.0, cost_time_parameter=1.0, random_state=None)

Early Classification modelling for time series using the model presented in [1].

Parameters:

- n_clusters : int. Number of clusters to form.
- base_classifier : Estimator or None. Estimator (instance) to be cloned and used for classifications. If None, the chosen classifier is a 1NN with Euclidean metric.
- min_t : int. Earliest time at which a classification can be performed on a time series.
- lamb : float. Value of the hyperparameter lambda used, during the computation of the cost function, to evaluate the probability that a time series belongs to a cluster.
- cost_time_parameter : float. Parameter of the cost function of time. This function is of the form f(time) = time * cost_time_parameter.
- random_state : int. Random state of the base estimator.

Attributes:

- classifiers_ : dict. Contains all the classifiers trained for the model, that is, (maximum_time_stamp - min_t) elements.
- pyhatyck_ : array-like of shape (maximum_time_stamp - min_t, n_clusters, __n_classes, __n_classes). Contains the probabilities of being classified as class y_hat given class y and cluster ck for a trained classifier. The penultimate dimension of the array is associated with the true class of the series and the last dimension with the predicted class.
- pyck_ : array-like of shape (__n_classes, n_clusters). Contains the probabilities of being of true class y given a cluster ck.
- X_fit_dims : tuple. Shape of the training dataset.
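As a rough illustration of how the quantities above might fit together, the sketch below computes, for a series observed up to time t, the expected cost of deciding at each time t + tau, in the spirit of the cost described in [1]: an expected misclassification probability plus the time cost f(time) = time * cost_time_parameter. The function name `expected_costs`, its arguments, the unit misclassification cost, and the toy numbers are all assumptions made for illustration; this is not tslearn's internal code, which may differ in indexing and details.

```python
def expected_costs(cluster_probas, pyck, pyhatyck_future, t, cost_time_parameter):
    """Expected cost of deciding at time t + tau, for tau = 0, 1, ...

    cluster_probas[k]                 : P(c_k | series observed up to t)
    pyck[y][k]                        : P(y | c_k)
    pyhatyck_future[tau][k][y][y_hat] : P(y_hat | y, c_k) at time t + tau
    Assumption: every misclassification costs 1, correct predictions cost 0.
    """
    costs = []
    for tau, confusion in enumerate(pyhatyck_future):
        misclassification = 0.0
        for k, p_cluster in enumerate(cluster_probas):
            for y, row in enumerate(confusion[k]):
                # Probability of predicting any class other than the true one.
                p_error = sum(p for y_hat, p in enumerate(row) if y_hat != y)
                misclassification += p_cluster * pyck[y][k] * p_error
        costs.append(misclassification + cost_time_parameter * (t + tau))
    return costs


# Toy setting: 2 clusters, 2 classes, each cluster is pure in one class.
cluster_probas = [0.5, 0.5]
pyck = [[1.0, 0.0],   # P(y=0 | c_0), P(y=0 | c_1)
        [0.0, 1.0]]   # P(y=1 | c_0), P(y=1 | c_1)
random_guess = [[0.5, 0.5], [0.5, 0.5]]
perfect = [[1.0, 0.0], [0.0, 1.0]]
pyhatyck_future = [[random_guess, random_guess],  # classifier at time t
                   [perfect, perfect]]            # classifier at time t + 1

cheap_time = expected_costs(cluster_probas, pyck, pyhatyck_future, 1, 0.1)
costly_time = expected_costs(cluster_probas, pyck, pyhatyck_future, 1, 1.0)
# Waiting pays off when time is cheap, deciding now wins when time is costly.
wait_when_cheap = cheap_time.index(min(cheap_time)) > 0
decide_now_when_costly = costly_time.index(min(costly_time)) == 0
```

Under this reading, a prediction is triggered as soon as the minimum expected cost is reached at tau = 0, which is why a larger cost_time_parameter tends to produce earlier predictions.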

References

 [1] A. Dachraoui, A. Bondu & A. Cornuejols. Early classification of time series as a non myopic sequential decision making problem. ECML/PKDD 2015

Examples

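>>> from tslearn.utils import to_time_series_dataset
>>> from tslearn.early_classification import NonMyopicEarlyClassifier
>>> dataset = to_time_series_dataset([[1, 2, 3, 4, 5, 6],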
...                                   [1, 2, 3, 4, 5, 6],
...                                   [1, 2, 3, 4, 5, 6],
...                                   [1, 2, 3, 3, 2, 1],
...                                   [1, 2, 3, 3, 2, 1],
...                                   [1, 2, 3, 3, 2, 1],
...                                   [3, 2, 1, 1, 2, 3],
...                                   [3, 2, 1, 1, 2, 3]])
>>> y = [0, 0, 0, 1, 1, 1, 0, 0]
>>> model = NonMyopicEarlyClassifier(n_clusters=3, lamb=1000.,
...                                  cost_time_parameter=.1,
...                                  random_state=0)
>>> model.fit(dataset, y)  # doctest: +ELLIPSIS
NonMyopicEarlyClassifier(...)
>>> print(type(model.classifiers_))
<class 'dict'>
>>> print(model.pyck_)
[[0. 1. 1.]
 [1. 0. 0.]]
>>> preds, pred_times = model.predict_class_and_earliness(dataset)
>>> preds
array([0, 0, 0, 1, 1, 1, 0, 0])
>>> pred_times
array([4, 4, 4, 4, 4, 4, 1, 1])
>>> pred_probas, pred_times = model.predict_proba_and_earliness(dataset)
>>> pred_probas
array([[1., 0.],
       [1., 0.],
       [1., 0.],
       [0., 1.],
       [0., 1.],
       [0., 1.],
       [1., 0.],
       [1., 0.]])
>>> pred_times
array([4, 4, 4, 4, 4, 4, 1, 1])


Methods

- early_classification_cost(X, y): Compute early classification score.
- fit(X, y): Fit early classifier.
- get_cluster_probas(Xi): Compute cluster probability $$P(c_k | Xi)$$.
- get_params([deep]): Get parameters for this estimator.
- predict(X): Provide predicted class.
- predict_class_and_earliness(X): Provide predicted class as well as prediction timestamps.
- predict_proba(X): Probability estimates.
- predict_proba_and_earliness(X): Provide probability estimates as well as prediction timestamps.
- score(X, y[, sample_weight]): Return the mean accuracy on the given test data and labels.
- set_params(**params): Set the parameters of this estimator.

early_classification_cost(X, y)

Compute early classification score.

The score is computed as:

$1 - acc + \alpha \frac{1}{n} \sum_i t_i$

where $$\alpha$$ is the trade-off parameter (self.cost_time_parameter) and $$t_i$$ are prediction timestamps.

Parameters:

- X : array-like of shape (n_series, n_timestamps, n_features). Vector to be scored, where n_series is the number of time series, n_timestamps is the number of timestamps in the series and n_features is the number of features recorded at each timestamp.
- y : array-like, shape = (n_samples) or (n_samples, n_outputs). True labels for X.

Returns:

- float. Early classification cost (a positive number, the lower the better).

Examples

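>>> from tslearn.utils import to_time_series_dataset
>>> from tslearn.early_classification import NonMyopicEarlyClassifier
>>> dataset = to_time_series_dataset([[1, 2, 3, 4, 5, 6],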
...                                   [1, 2, 3, 4, 5, 6],
...                                   [1, 2, 3, 4, 5, 6],
...                                   [1, 2, 3, 3, 2, 1],
...                                   [1, 2, 3, 3, 2, 1],
...                                   [1, 2, 3, 3, 2, 1],
...                                   [3, 2, 1, 1, 2, 3],
...                                   [3, 2, 1, 1, 2, 3]])
>>> y = [0, 0, 0, 1, 1, 1, 0, 0]
>>> model = NonMyopicEarlyClassifier(n_clusters=3, lamb=1000.,
...                                  cost_time_parameter=.1,
...                                  random_state=0)
>>> model.fit(dataset, y)  # doctest: +ELLIPSIS
NonMyopicEarlyClassifier(...)
>>> preds, pred_times = model.predict_class_and_earliness(dataset)
>>> preds
array([0, 0, 0, 1, 1, 1, 0, 0])
>>> pred_times
array([4, 4, 4, 4, 4, 4, 1, 1])
>>> model.early_classification_cost(dataset, y)
0.325
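
The value 0.325 can be checked by hand from the score formula and the predictions shown in the example. The following is a minimal sketch in plain Python; the helper name `cost_from_predictions` is hypothetical and is not part of tslearn.

```python
def cost_from_predictions(y_true, y_pred, pred_times, cost_time_parameter):
    """Compute 1 - acc + alpha * mean(t_i) from already-made predictions."""
    n = len(y_true)
    accuracy = sum(int(t == p) for t, p in zip(y_true, y_pred)) / n
    mean_time = sum(pred_times) / n
    return 1.0 - accuracy + cost_time_parameter * mean_time


# Labels, predictions and prediction timestamps copied from the example.
y = [0, 0, 0, 1, 1, 1, 0, 0]
preds = [0, 0, 0, 1, 1, 1, 0, 0]
pred_times = [4, 4, 4, 4, 4, 4, 1, 1]

cost = cost_from_predictions(y, preds, pred_times, cost_time_parameter=0.1)
print(round(cost, 3))  # 0.325
```

Here the accuracy is 1, so the whole cost comes from the time term: the mean prediction timestamp is 26 / 8 = 3.25, and 0.1 * 3.25 = 0.325.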

fit(X, y)

Fit early classifier.

Parameters:

- X : array-like of shape (n_series, n_timestamps, n_features). Training data, where n_series is the number of time series, n_timestamps is the number of timestamps in the series and n_features is the number of features recorded at each timestamp.
- y : array-like of shape (n_samples,). Target values. Will be cast to X’s dtype if necessary.

Returns:

- self : returns an instance of self.

get_cluster_probas(Xi)

Compute cluster probability $$P(c_k | Xi)$$.

This quantity is computed using the following formula:

$P(c_k | Xi) = \frac{s_k(Xi)}{\sum_j s_j(Xi)}$

where

$s_k(Xi) = \frac{1}{1 + \exp(-\lambda \Delta_k(Xi))}$

with

$\Delta_k(Xi) = \frac{\bar{D} - d(Xi, c_k)}{\bar{D}}$

where $$\bar{D}$$ is the average of the distances between Xi and the cluster centers, and $$d(Xi, c_k)$$ is the distance between Xi and the center of cluster $$c_k$$.

Parameters:

- Xi : numpy array, shape (t, d). A time series observed up to time t.

Returns:

- probas : numpy array, shape (n_clusters, ). Cluster probabilities $$P(c_k | Xi)$$, one per cluster.

Examples

>>> from tslearn.utils import to_time_series, to_time_series_dataset
>>> from tslearn.early_classification import NonMyopicEarlyClassifier
>>> dataset = to_time_series_dataset([[1, 2, 3, 4, 5, 6],
...                                   [1, 2, 3, 4, 5, 6],
...                                   [1, 2, 3, 4, 5, 6],
...                                   [1, 2, 3, 3, 2, 1],
...                                   [1, 2, 3, 3, 2, 1],
...                                   [1, 2, 3, 3, 2, 1],
...                                   [3, 2, 1, 1, 2, 3],
...                                   [3, 2, 1, 1, 2, 3]])
>>> y = [0, 0, 0, 1, 1, 1, 0, 0]
>>> ts0 = to_time_series([1, 2])
>>> model = NonMyopicEarlyClassifier(n_clusters=3, lamb=0.,
...                                  random_state=0)
>>> probas = model.fit(dataset, y).get_cluster_probas(ts0)
>>> probas.shape
(3,)
>>> probas  # doctest: +ELLIPSIS
array([0.33..., 0.33..., 0.33...])
>>> model = NonMyopicEarlyClassifier(n_clusters=3, lamb=10000.,
...                                  random_state=0)
>>> probas = model.fit(dataset, y).get_cluster_probas(ts0)
>>> probas.shape
(3,)
>>> probas
array([0.5, 0.5, 0. ])
>>> ts1 = to_time_series([3, 2])
>>> model.get_cluster_probas(ts1)
array([0., 0., 1.])
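
To see how lamb shapes these probabilities, the formula can be evaluated directly on a list of distances to the cluster centers. The sketch below is plain Python under stated assumptions: the helper names `stable_sigmoid` and `cluster_probas_from_distances` and the distances `[0.0, 0.0, 3.0]` are hypothetical, the distance computation itself is left out, and the average distance is assumed to be non-zero.

```python
import math


def stable_sigmoid(x):
    """Evaluate 1 / (1 + exp(-x)) without overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def cluster_probas_from_distances(distances, lamb):
    """Apply the P(c_k | Xi) formula to distances d(Xi, c_k)."""
    mean_dist = sum(distances) / len(distances)  # D bar, assumed > 0
    deltas = [(mean_dist - d) / mean_dist for d in distances]
    scores = [stable_sigmoid(lamb * delta) for delta in deltas]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical distances: two clusters match the prefix, one is far away.
distances = [0.0, 0.0, 3.0]
uniform = cluster_probas_from_distances(distances, lamb=0.0)
sharp = cluster_probas_from_distances(distances, lamb=10000.0)
print(sharp)  # [0.5, 0.5, 0.0]
```

With lamb=0 every score equals 0.5, so the probabilities are uniform regardless of the distances. With a very large lamb, clusters closer than average share the probability mass equally and the others receive none, which mirrors the two regimes shown in the doctest above.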

get_params(deep=True)

Get parameters for this estimator.

Parameters:

- deep : bool, default=True. If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

- params : dict. Parameter names mapped to their values.

predict(X)

Provide predicted class.

Parameters:

- X : array-like of shape (n_series, n_timestamps, n_features). Vector to be scored, where n_series is the number of time series, n_timestamps is the number of timestamps in the series and n_features is the number of features recorded at each timestamp.

Returns:

- array, shape (n_samples,). Predicted classes.

predict_class_and_earliness(X)

Provide predicted class as well as prediction timestamps.

Prediction timestamps are the timestamps at which a prediction is made in the early classification setting.

Parameters:

- X : array-like of shape (n_series, n_timestamps, n_features). Vector to be scored, where n_series is the number of time series, n_timestamps is the number of timestamps in the series and n_features is the number of features recorded at each timestamp.

Returns:

- array, shape (n_samples,). Predicted classes.
- array-like of shape (n_series, ). Prediction timestamps.

predict_proba(X)

Probability estimates.

The returned estimates for all classes are ordered by the label of classes.

Parameters:

- X : array-like of shape (n_series, n_timestamps, n_features). Vector to be scored, where n_series is the number of time series, n_timestamps is the number of timestamps in the series and n_features is the number of features recorded at each timestamp.

Returns:

- array-like of shape (n_series, n_classes). Probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.

predict_proba_and_earliness(X)

Provide probability estimates as well as prediction timestamps.

Prediction timestamps are the timestamps at which a prediction is made in the early classification setting. The returned estimates for all classes are ordered by the label of classes.

Parameters:

- X : array-like of shape (n_series, n_timestamps, n_features). Vector to be scored, where n_series is the number of time series, n_timestamps is the number of timestamps in the series and n_features is the number of features recorded at each timestamp.

Returns:

- array-like of shape (n_series, n_classes). Probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.
- array-like of shape (n_series, ). Prediction timestamps.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:

- X : array-like of shape (n_samples, n_features). Test samples.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs). True labels for X.
- sample_weight : array-like of shape (n_samples,), default=None. Sample weights.

Returns:

- score : float. Mean accuracy of self.predict(X) wrt. y.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

- **params : dict. Estimator parameters.

Returns:

- self : estimator instance. Estimator instance.