# Getting started

This tutorial shows you how to format your first time series, import standard datasets, and manipulate them with dedicated machine learning algorithms.

## Time series format

First, let us look at tslearn's time series format. To do so, we will use the to_time_series utility from the tslearn.utils module:

>>> from tslearn.utils import to_time_series
>>> my_first_time_series = [1, 3, 4, 2]
>>> formatted_time_series = to_time_series(my_first_time_series)
>>> print(formatted_time_series.shape)
(4, 1)


In tslearn, a time series is nothing more than a two-dimensional numpy array with its first dimension corresponding to the time axis and the second one being the feature dimensionality (1 by default).
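To make the shape convention concrete, here is a minimal sketch in plain numpy. The helper as_time_series is hypothetical and only mimics the (time, features) layout described above; it is not tslearn's implementation of to_time_series.

```python
import numpy as np


def as_time_series(values):
    """Hypothetical helper: return a float array of shape (sz, d)."""
    arr = np.asarray(values, dtype=float)
    if arr.ndim == 1:
        # A flat list is read as a univariate series: one feature per time step.
        arr = arr.reshape(-1, 1)
    return arr


univariate = as_time_series([1, 3, 4, 2])
# Each inner pair is one time step with two features.
bivariate = as_time_series([[1, 10], [3, 30], [4, 40], [2, 20]])

print(univariate.shape)  # (4, 1)
print(bivariate.shape)   # (4, 2)
```

In both cases the first axis indexes time and the second axis indexes features.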

Then, if we want to manipulate sets of time series, we can cast them to three-dimensional arrays using to_time_series_dataset. If the time series in the set do not all have the same length, NaN values are appended to the shorter ones, and the resulting array has shape (n_ts, max_sz, d), where n_ts is the number of time series, max_sz is the length of the longest time series in the set, and d is the feature dimensionality.

>>> from tslearn.utils import to_time_series_dataset
>>> my_first_time_series = [1, 3, 4, 2]
>>> my_second_time_series = [1, 2, 4, 2]
>>> formatted_dataset = to_time_series_dataset([my_first_time_series, my_second_time_series])
>>> print(formatted_dataset.shape)
(2, 4, 1)
>>> my_third_time_series = [1, 2, 4, 2, 2]
>>> formatted_dataset = to_time_series_dataset([my_first_time_series,
...                                             my_second_time_series,
...                                             my_third_time_series])
>>> print(formatted_dataset.shape)
(3, 5, 1)
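
The NaN padding rule can be sketched in plain numpy as follows. The helper pad_to_dataset is hypothetical and handles univariate series only; it illustrates the resulting layout rather than how to_time_series_dataset is actually written.

```python
import numpy as np


def pad_to_dataset(series_list):
    """Hypothetical helper: stack 1-D series into (n_ts, max_sz, 1), padding with NaN."""
    max_sz = max(len(s) for s in series_list)
    # Start from an all-NaN array so shorter series stay NaN past their end.
    dataset = np.full((len(series_list), max_sz, 1), np.nan)
    for i, s in enumerate(series_list):
        dataset[i, :len(s), 0] = s
    return dataset


dataset = pad_to_dataset([[1, 3, 4, 2], [1, 2, 4, 2], [1, 2, 4, 2, 2]])
print(dataset.shape)  # (3, 5, 1)

# The first series has length 4, so its fifth time step is padding.
print(np.isnan(dataset[0, -1, 0]))  # True

# Original lengths can be recovered by counting the non-NaN time steps.
lengths = (~np.isnan(dataset[:, :, 0])).sum(axis=1)
print(lengths.tolist())  # [4, 4, 5]
```

Because the padding is NaN rather than zero, padded positions cannot be mistaken for real observations.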


## Importing standard time series datasets

If you want to experiment with standard time series datasets, have a look at the tslearn.datasets module.

>>> from tslearn.datasets import UCR_UEA_datasets
>>> X_train, y_train, X_test, y_test = UCR_UEA_datasets().load_dataset("TwoPatterns")
>>> print(X_train.shape)
(1000, 128, 1)
>>> print(y_train.shape)
(1000,)


Note that when working with time series datasets, it can be useful to rescale time series using tools from the tslearn.preprocessing module.
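
As an illustration of what such rescaling does, the sketch below standardises each time series to zero mean and unit variance along the time axis, using numpy only. The helper rescale_mean_variance is hypothetical; tslearn.preprocessing ships its own scalers (for example TimeSeriesScalerMeanVariance), which are what you would normally use.

```python
import numpy as np


def rescale_mean_variance(dataset, eps=1e-12):
    """Hypothetical helper: z-normalise each series of a (n_ts, sz, d) array along time."""
    # nanmean / nanstd ignore NaN padding in variable-length datasets.
    mean = np.nanmean(dataset, axis=1, keepdims=True)
    std = np.nanstd(dataset, axis=1, keepdims=True)
    # Avoid dividing by zero for constant series.
    std[std < eps] = 1.0
    return (dataset - mean) / std


X = np.array([[[1.0], [3.0], [4.0], [2.0]],
              [[10.0], [20.0], [40.0], [20.0]]])
X_scaled = rescale_mean_variance(X)

print(np.allclose(X_scaled.mean(axis=1), 0.0))  # True
print(np.allclose(X_scaled.std(axis=1), 1.0))   # True
```

After rescaling, the two series differ only in shape, not in amplitude or offset, which is often what shape-based methods expect.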

If you want to import other time series from text files, the expected format is:

• each line represents a single time series (time series in a dataset need not have the same length);
• in each line, modalities are separated by a | character (not needed if your data has only one modality);
• in each modality, observations are separated by a space character.

Here is an example of such a file storing two time series of dimension 2 (the first time series is of length 3 and the second one is of length 2).

1.0 0.0 2.5|3.0 2.0 1.0
1.0 2.0|4.333 2.12
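
To make the layout explicit, here is a small pure-Python sketch that parses the two lines above. The helper parse_time_series_line is hypothetical and only illustrates the format; it is not the loader provided by tslearn.

```python
def parse_time_series_line(line):
    """Hypothetical helper: parse one line into a list of sz observations of d floats each."""
    # Split modalities on "|", then observations on whitespace.
    modalities = [[float(v) for v in part.split()]
                  for part in line.strip().split("|")]
    if len({len(m) for m in modalities}) != 1:
        raise ValueError("all modalities of a time series must have the same length")
    # Transpose from (d, sz) to (sz, d) so that time is the first axis.
    return [list(observation) for observation in zip(*modalities)]


text = "1.0 0.0 2.5|3.0 2.0 1.0\n1.0 2.0|4.333 2.12\n"
dataset = [parse_time_series_line(line)
           for line in text.splitlines() if line.strip()]

print([len(ts) for ts in dataset])  # [3, 2]
print(dataset[1])                   # [[1.0, 4.333], [2.0, 2.12]]
```

Each parsed series is a list of time steps, and each time step holds one value per modality, matching the (time, features) convention used throughout this tutorial.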


To read from / write to this format, have a look at the tslearn.utils module:

>>> from tslearn.utils import save_timeseries_txt, load_timeseries_txt

>>> from tslearn.clustering import TimeSeriesKMeans