TimeSeriesDBSCAN
- class tslearn.clustering.TimeSeriesDBSCAN(eps=0.5, min_ts=5, metric='dtw', metric_params=None, n_jobs=None)
DBSCAN clustering for time series.
- Parameters:
- eps: float (default: 0.5)
The maximum distance between two time series for one to be considered as in the neighborhood of the other.
- min_ts: int (default: 5)
The minimum number of time series (including the series itself) in a neighborhood for a time series to be considered a core point.
- metric: {‘dtw’, ‘ctw’, ‘frechet’, ‘euclidean’, ‘precomputed’} (default: ‘dtw’)
Metric used to measure similarity between time series.
- metric_params: dict (default: None)
Additional keyword arguments to pass to the metric function. For metrics that support parallel computation of the cross-distance matrix, any n_jobs key passed in metric_params is overridden by the n_jobs argument. Parameters that do not match the signature of the metric function are ignored.
- n_jobs: int or None (default: None)
The number of jobs to run in parallel for cross-distance matrix computations. Ignored if the cross-distance matrix cannot be computed using parallelization.
None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See scikit-learn’s Glossary for more details.
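How eps and min_ts interact can be illustrated with a small, library-free sketch. It assumes a precomputed pairwise distance matrix and a hypothetical helper named core_point_mask; it only mirrors the parameter descriptions above and is not tslearn's implementation. Treating a distance exactly equal to eps as "inside the neighborhood" is an assumption based on the "maximum distance" wording.

```python
def core_point_mask(dist, eps=0.5, min_ts=5):
    """Return a list of booleans: True where series i is a core point.

    `dist` is a square list of lists of pairwise distances (hypothetical
    input; any metric could have produced it).
    """
    mask = []
    for row in dist:
        # Each row contains the zero self-distance, so the series counts
        # itself as a member of its own neighborhood.
        n_neighbors = sum(1 for d in row if d <= eps)
        mask.append(n_neighbors >= min_ts)
    return mask


# Three mutually close series and one distant outlier.
dist = [
    [0.0, 0.2, 0.3, 5.0],
    [0.2, 0.0, 0.4, 5.0],
    [0.3, 0.4, 0.0, 5.0],
    [5.0, 5.0, 5.0, 0.0],
]
mask = core_point_mask(dist, eps=0.5, min_ts=3)
print(mask)
```

With these values the first three series each have three neighbors (themselves included) and qualify as core points, while the outlier does not.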
- Attributes:
- core_ts_indices_: numpy.ndarray of shape (n_core_ts)
Indices of core time series.
- components_: numpy.ndarray of shape (n_core_ts, sz, d)
Copy of each core time series found by training.
- labels_: numpy.ndarray of integers with shape (n_ts)
Labels of each time series. Noisy time series are given the label -1.
- n_features_in_: int
Number of features seen during training.
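The labels_ convention (one integer per time series, with -1 reserved for noise) makes it straightforward to recover cluster membership. The following is a minimal sketch using a plain Python list in place of the fitted attribute; group_by_label is a hypothetical helper, not part of tslearn.

```python
from collections import defaultdict


def group_by_label(labels):
    """Split dataset indices into clusters and noise.

    Returns a dict mapping each cluster label to the indices of its
    members, plus a list of indices labelled -1 (noise).
    """
    clusters = defaultdict(list)
    noise = []
    for idx, label in enumerate(labels):
        if label == -1:
            noise.append(idx)
        else:
            clusters[label].append(idx)
    return dict(clusters), noise


# Stand-in for a fitted model's labels_ attribute.
example_labels = [0, 0, 1, -1, 1, 0, -1]
clusters, noise = group_by_label(example_labels)
print(clusters, noise)
```

In practice the list would be replaced by list(model.labels_) for a fitted estimator.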
Notes
If metric is set to “euclidean”, the algorithm expects a dataset of equal-sized time series.
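A quick pre-check along the following lines can catch variable-length input before choosing the “euclidean” metric. This is only a sketch: has_equal_lengths is a hypothetical helper, and the dataset is assumed to be a sequence of time series that each support len().

```python
def has_equal_lengths(dataset):
    """Return True if every time series in `dataset` has the same length."""
    lengths = {len(ts) for ts in dataset}
    # An empty dataset or a single distinct length both count as equal-sized.
    return len(lengths) <= 1


equal_sized = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
variable_sized = [[1.0, 2.0, 3.0], [4.0, 5.0]]
print(has_equal_lengths(equal_sized), has_equal_lengths(variable_sized))
```

If the check fails, one of the other listed metrics may be a better fit than “euclidean”.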
Examples
>>> import numpy as np
>>> from tslearn.clustering import TimeSeriesDBSCAN
>>> from tslearn.generators import random_walk_blobs
>>> from tslearn.preprocessing import TimeSeriesScalerMeanVariance
>>> X, y = random_walk_blobs(n_ts_per_blob=20, sz=32, d=2, n_blobs=4, random_state=0)
>>> X = TimeSeriesScalerMeanVariance(mu=0., std=1.).fit_transform(X)
>>> db = TimeSeriesDBSCAN(eps=4, min_ts=3).fit(X)
>>> np.unique(db.labels_)  # Clusters and noise
array([-1,  0,  1,  2,  3])
>>> list(db.labels_).count(-1)  # Number of noisy elements
37
Methods
fit(X[, y]): Compute DBSCAN clustering.
fit_predict(X[, y]): Compute DBSCAN clustering.
from_hdf5(path): Load model from an HDF5 file.
from_json(path): Load model from a JSON file.
from_pickle(path): Load model from a pickle file.
get_metadata_routing(): Get metadata routing of this object.
get_params([deep]): Get parameters for this estimator.
set_params(**params): Set the parameters of this estimator.
to_hdf5(path): Save model to an HDF5 file.
to_json(path): Save model to a JSON file.
to_pickle(path): Save model to a pickle file.
- fit(X, y=None)
Compute DBSCAN clustering.
- Parameters:
- X: array-like of shape (n_ts, sz, d)
Time series dataset.
- y: Ignored
Not used, present here for API consistency by convention.
- Returns:
- TimeSeriesDBSCAN
The fitted estimator
- fit_predict(X, y=None)
Compute DBSCAN clustering.
- Parameters:
- X: array-like of shape (n_ts, sz, d)
Time series dataset.
- y: Ignored
Not used, present here for API consistency by convention.
- Returns:
- labels: array of shape (n_ts)
Index of the cluster each time series belongs to. Noisy time series are given the label -1.
- classmethod from_hdf5(path)
Load model from an HDF5 file. Requires h5py (http://docs.h5py.org/).
- Parameters:
- path: str
Full path to file.
- Returns:
- Model instance
- classmethod from_json(path)
Load model from a JSON file.
- Parameters:
- path: str
Full path to file.
- Returns:
- Model instance
- classmethod from_pickle(path)
Load model from a pickle file.
- Parameters:
- path: str
Full path to file.
- Returns:
- Model instance
- get_metadata_routing()
Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
- routing: MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep: bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params: dict
Parameter names mapped to their values.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params: dict
Estimator parameters.
- Returns:
- self: estimator instance
Estimator instance.
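The <component>__<parameter> naming scheme can be illustrated without any estimator at all. The sketch below uses a hypothetical helper, split_nested_param, to show how such a name separates into the component and the parameter it targets; the step name "dbscan" is likewise made up for the example.

```python
def split_nested_param(name):
    """Split 'component__parameter' into (component, parameter).

    Names without a double underscore refer to the estimator itself,
    which is signalled here by returning None as the component.
    """
    component, delimiter, parameter = name.partition("__")
    if not delimiter:
        return None, name
    return component, parameter


# "dbscan" is a hypothetical pipeline step name.
nested = split_nested_param("dbscan__eps")
plain = split_nested_param("min_ts")
print(nested, plain)
```

Only the first double underscore is split off, so deeper nesting stays in the parameter part and can be resolved by the component in turn.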
- to_hdf5(path)
Save model to an HDF5 file. Requires h5py (http://docs.h5py.org/).
- Parameters:
- path: str
Full file path. File must not already exist.
- Raises:
- FileExistsError
If a file with the same path already exists.
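Because saving fails when the target file already exists, callers may want to check the path first. The following sketch shows one such guard; save_if_absent and fake_save are hypothetical names, and fake_save merely stands in for a bound method such as model.to_hdf5 so that the example runs without h5py.

```python
import os
import tempfile


def save_if_absent(save_fn, path):
    """Call save_fn(path) only if nothing exists at path.

    Returns True if the save was attempted, False if it was skipped.
    """
    if os.path.exists(path):
        return False
    save_fn(path)
    return True


def fake_save(path):
    # Stand-in for model.to_hdf5: just creates a file at the given path.
    with open(path, "w") as handle:
        handle.write("placeholder")


with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "model.hdf5")
    first_attempt = save_if_absent(fake_save, target)   # file is created
    second_attempt = save_if_absent(fake_save, target)  # skipped, file exists

print(first_attempt, second_attempt)
```

An alternative is to call the save method directly and catch FileExistsError, which avoids the gap between the existence check and the write.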