# Getting started

This tutorial shows you how to format your first time series, import standard datasets, and manipulate them with dedicated machine learning algorithms.

## Time series format

First, let us have a look at what the `tslearn` time series format is. To do so, we will use the `to_time_series` utility from `tslearn.utils`:

```
>>> from tslearn.utils import to_time_series
>>> my_first_time_series = [1, 3, 4, 2]
>>> formatted_time_series = to_time_series(my_first_time_series)
>>> print(formatted_time_series.shape)
(4, 1)
```

In `tslearn`, a time series is nothing more than a two-dimensional `numpy` array with its first dimension corresponding to the time axis and the second one being the feature dimensionality (1 by default).
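To make the axis convention concrete, here is a minimal sketch that builds arrays with this layout by hand, using plain `numpy` rather than `tslearn` itself. The variable names and values are illustrative assumptions.

```python
import numpy as np

# Univariate series: 4 time steps, 1 feature per time step.
univariate = np.array([1, 3, 4, 2], dtype=float).reshape(-1, 1)
print(univariate.shape)  # (4, 1)

# Multivariate series: 4 time steps, 2 features per time step.
multivariate = np.array([[1.0, 0.5],
                         [3.0, 0.1],
                         [4.0, 0.7],
                         [2.0, 0.3]])
print(multivariate.shape)  # (4, 2)

# Indexing the first axis selects a time step (all features at that time).
first_step = multivariate[0]
# Indexing the second axis selects one feature over the whole time span.
second_feature = multivariate[:, 1]
```

Keeping time on the first axis means that slicing `series[t]` always returns the full feature vector observed at time `t`.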

Then, if we want to manipulate sets of time series, we can cast them to three-dimensional arrays using `to_time_series_dataset`. If the time series in the set are not equal-sized, NaN values are appended to the shorter ones, and the shape of the resulting array is `(n_ts, max_sz, d)`, where `n_ts` is the number of time series, `max_sz` is the length of the longest time series in the set, and `d` is the feature dimensionality.

```
>>> from tslearn.utils import to_time_series_dataset
>>> my_first_time_series = [1, 3, 4, 2]
>>> my_second_time_series = [1, 2, 4, 2]
>>> formatted_dataset = to_time_series_dataset([my_first_time_series, my_second_time_series])
>>> print(formatted_dataset.shape)
(2, 4, 1)
>>> my_third_time_series = [1, 2, 4, 2, 2]
>>> formatted_dataset = to_time_series_dataset([my_first_time_series,
...                                             my_second_time_series,
...                                             my_third_time_series])
>>> print(formatted_dataset.shape)
(3, 5, 1)
```
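The NaN-padding convention described above can be sketched with plain `numpy`. This is an illustrative reimplementation for univariate series, not the code `tslearn` actually runs, and the helper name `pad_with_nan` is hypothetical.

```python
import numpy as np

def pad_with_nan(series_list):
    """Stack univariate series of unequal length into a (n_ts, max_sz, 1) array.

    Hypothetical helper: shorter series are padded at the end with NaN.
    """
    max_sz = max(len(s) for s in series_list)
    dataset = np.full((len(series_list), max_sz, 1), np.nan)
    for i, s in enumerate(series_list):
        dataset[i, :len(s), 0] = s
    return dataset

dataset = pad_with_nan([[1, 3, 4, 2], [1, 2, 4, 2], [1, 2, 4, 2, 2]])
print(dataset.shape)  # (3, 5, 1)

# The original lengths can be recovered by counting the time steps
# that are not entirely NaN.
lengths = (~np.isnan(dataset).all(axis=2)).sum(axis=1).tolist()
print(lengths)  # [4, 4, 5]
```

Because padding uses NaN rather than zero, padded time steps remain distinguishable from genuine observations.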

## Importing standard time series datasets

If you want to experiment with standard time series datasets, have a look at the `tslearn.datasets` module.

```
>>> from tslearn.datasets import UCR_UEA_datasets
>>> X_train, y_train, X_test, y_test = UCR_UEA_datasets().load_dataset("TwoPatterns")
>>> print(X_train.shape)
(1000, 128, 1)
>>> print(y_train.shape)
(1000,)
```

Note that when working with time series datasets, it can be useful to rescale time series using tools from the `tslearn.preprocessing` module.
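As an illustration of what such rescaling can look like, the sketch below standardizes each series to zero mean and unit variance along the time axis using plain `numpy`. The function `z_normalize` is a hypothetical stand-in written for this example; `tslearn.preprocessing` ships its own scalers (for instance `TimeSeriesScalerMeanVariance`), whose exact behaviour may differ from this sketch.

```python
import numpy as np

def z_normalize(dataset, eps=1e-12):
    """Rescale every series (and feature) to zero mean and unit variance.

    Hypothetical helper operating on arrays of shape (n_ts, sz, d).
    NaN-aware statistics are used so that padded values are ignored.
    """
    dataset = np.asarray(dataset, dtype=float)
    mean = np.nanmean(dataset, axis=1, keepdims=True)
    std = np.nanstd(dataset, axis=1, keepdims=True)
    std[std < eps] = 1.0  # leave constant series unscaled instead of dividing by zero
    return (dataset - mean) / std

# Two series with the same shape but very different amplitudes.
X = np.array([[[1.0], [3.0], [4.0], [2.0]],
              [[10.0], [30.0], [40.0], [20.0]]])
X_scaled = z_normalize(X)
print(X_scaled.shape)  # (2, 4, 1)
```

After rescaling, the two series above become identical, which is often what is wanted when only the shape of a series matters and not its amplitude.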

If you want to import other time series from text files, the expected format is:

- each line represents a single time series (time series in a dataset need not have the same length);
- in each line, modalities are separated by a `|` character (not needed if your data has only one modality);
- in each modality, observations are separated by a space character.

Here is an example of such a file storing two time series of dimension 2 (the first time series is of length 3 and the second one is of length 2).

```
1.0 0.0 2.5|3.0 2.0 1.0
1.0 2.0|4.333 2.12
```

To read from or write to this format, have a look at the `tslearn.utils` module:

```
>>> from tslearn.utils import save_time_series_txt, load_time_series_txt
>>> time_series_dataset = load_time_series_txt("path/to/your/file.txt")
>>> save_time_series_txt("path/to/another/file.txt", time_series_dataset)
```
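To make the text format described above concrete, here is a small hand-written parser using only the standard library. It is an illustrative sketch, not the implementation behind `load_time_series_txt`, and the function name `parse_time_series_text` is hypothetical.

```python
def parse_time_series_text(text):
    """Parse the '|'-separated text format into nested lists.

    Hypothetical helper. Returns one entry per series; each series is a
    list of time steps, and each time step is a list of d feature values.
    """
    dataset = []
    for line in text.strip().splitlines():
        # One list of observations per modality (feature dimension).
        modalities = [[float(v) for v in part.split()]
                      for part in line.split("|")]
        # Transpose so that the first axis is time and the second is the feature.
        # Note: zip stops at the shortest modality, so all modalities of a
        # series are assumed to have the same length.
        series = [list(step) for step in zip(*modalities)]
        dataset.append(series)
    return dataset

example = "1.0 0.0 2.5|3.0 2.0 1.0\n1.0 2.0|4.333 2.12\n"
parsed = parse_time_series_text(example)
print(len(parsed))     # 2
print(len(parsed[0]))  # 3
print(parsed[1])       # [[1.0, 4.333], [2.0, 2.12]]
```

The transposition step reflects the fact that the file stores one modality after another, while the in-memory format puts time on the first axis.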

## Playing with your data

Once your data is loaded and formatted according to `tslearn` standards, the next step is to feed it to machine learning models. Most `tslearn` models inherit from `scikit-learn` base classes, so interacting with them is very similar to interacting with a `scikit-learn` model, except that datasets are not two-dimensional arrays but `tslearn` time series datasets (i.e. three-dimensional arrays or lists of two-dimensional arrays).

```
>>> from tslearn.clustering import TimeSeriesKMeans
>>> km = TimeSeriesKMeans(n_clusters=3, metric="dtw")
>>> km.fit(X_train)
```

As seen above, one key parameter when applying machine learning methods to time series datasets is the metric to be used. You can learn more about it in the dedicated section of this documentation.
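To give an intuition of why the metric matters, the sketch below compares the Euclidean distance with a textbook Dynamic Time Warping (DTW) distance on two univariate series that contain the same peak at different times. This is a minimal pure-Python illustration under stated assumptions (squared differences as local cost, square root of the accumulated cost, no warping constraint); it is not the optimized implementation used by `tslearn`, and the function names are hypothetical.

```python
import math

def dtw_distance(x, y):
    """Textbook DTW between two univariate sequences (illustrative sketch)."""
    n, m = len(x), len(y)
    # acc[i][j] holds the minimal accumulated cost of aligning x[:i] with y[:j].
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            acc[i][j] = cost + min(acc[i - 1][j],      # repeat y[j - 1]
                                   acc[i][j - 1],      # repeat x[i - 1]
                                   acc[i - 1][j - 1])  # advance both series
    return math.sqrt(acc[n][m])

def euclidean_distance(x, y):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

peak_early = [0, 1, 0, 0]
peak_late = [0, 0, 1, 0]

print(euclidean_distance(peak_early, peak_late))  # 1.4142135623730951
print(dtw_distance(peak_early, peak_late))        # 0.0
```

The Euclidean distance compares values at matching time indices and therefore penalizes the shift, whereas DTW realigns the two series in time and treats them as identical. Which behaviour is preferable depends on whether temporal shifts are meaningful in your data.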