TSGM

Datasets

class UCRDataManager(path: str = '/home/docs/checkouts/readthedocs.org/user_builds/tsgm/envs/latest/lib/python3.8/site-packages/tsgm-0.0.7-py3.8.egg/tsgm/utils/../../data', ds: str = 'gunpoint')[source]

A manager for the UCR collection of time series datasets. If you find these datasets useful, please cite:

@misc{UCRArchive2018,
    title = {The UCR Time Series Classification Archive},
    author = {Dau, Hoang Anh and Keogh, Eamonn and Kamgar, Kaveh and Yeh, Chin-Chia Michael and Zhu, Yan and Gharghabi, Shaghayegh and Ratanamahatana, Chotirat Ann and Yanping and Hu, Bing and Begum, Nurjahan and Bagnall, Anthony and Mueen, Abdullah and Batista, Gustavo, and Hexagon-ML},
    year = {2018},
    month = {October},
    note = {\url{https://www.cs.ucr.edu/~eamonn/time_series_data_2018/}}
}

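Parameters:
  • path (str) – Directory that holds the stored UCR archive.

  • ds (str) – Name of the dataset to load. Defaults to 'gunpoint'.

Raises: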

ValueError – When there is no stored UCR archive, or the name of the dataset is incorrect.

default_path = '/home/docs/checkouts/readthedocs.org/user_builds/tsgm/envs/latest/lib/python3.8/site-packages/tsgm-0.0.7-py3.8.egg/tsgm/utils/../../data'
get() Tuple[TensorLike, TensorLike, TensorLike, TensorLike][source]

Returns a tuple containing training and testing data.

Returns:

A tuple (X_train, y_train, X_test, y_test).

Return type:

tuple[TensorLike, TensorLike, TensorLike, TensorLike]

get_classes_distribution() Dict[source]

Returns a dictionary with the fraction of occurrences for each class.

Returns:

A dictionary containing the fraction of occurrences for each class.

Return type:

dict[Any, float]
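
The class distribution described above can be sketched with the standard library alone. This is an illustrative sketch and not TSGM's implementation; the function name `classes_distribution` and the example labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def classes_distribution(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Return the fraction of occurrences of each class label (hypothetical helper)."""
    counts = Counter(labels)
    total = len(labels)  # an empty label list would raise ZeroDivisionError below
    return {label: count / total for label, count in counts.items()}


example_labels = [1, 1, 2, 1]  # made-up labels for illustration
distribution = classes_distribution(example_labels)
print(distribution)
```

The fractions sum to one, so the dictionary can be read directly as an empirical class prior.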

key = 'someone'
mirrors = ['https://www.cs.ucr.edu/~eamonn/time_series_data_2018/']
resources = [('UCRArchive_2018.zip', 0)]
summary() None[source]

Prints a summary of the dataset.

y_all: Collection[Hashable] | None
download_physionet2012() None[source]

Downloads the Physionet 2012 dataset files from the Physionet website and extracts them into the local folder ‘physionet2012’.

gen_sine_const_switch_dataset(N: int, T: int, D: int, max_value: int = 10, const: int = 0, frequency_switch: float = 0.1) Tuple[TensorLike, TensorLike][source]

Generates a dataset with alternating constant and sinusoidal sequences.

Parameters:
  • N (int) – Number of samples in the dataset.

  • T (int) – Length of each sequence in the dataset.

  • D (int) – Number of dimensions in each sequence.

  • max_value (int, optional) – Maximum value for amplitude and shift of the sinusoids. Defaults to 10.

  • const (int, optional) – Value indicating whether the sequence is constant or sinusoidal. Defaults to 0.

  • frequency_switch (float, optional) – Probability of switching between constant and sinusoidal sequences. Defaults to 0.1.

Returns:

Tuple containing input data (X) and target labels (y).

Return type:

tuple[numpy.ndarray, numpy.ndarray]

gen_sine_dataset(N: int, T: int, D: int, max_value: int = 10) ndarray[Any, dtype[ScalarType]][source]

Generates a dataset of sinusoidal waves with random parameters.

Parameters:
  • N (int) – Number of samples in the dataset.

  • T (int) – Length of each time series in the dataset.

  • D (int) – Number of dimensions (sinusoids) in each time series.

  • max_value (int, optional) – Maximum value for amplitude and shift of the sinusoids. Defaults to 10.

Returns:

Generated dataset with shape (N, T, D).

Return type:

numpy.ndarray
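
As a rough illustration of a dataset with the documented shape (N, T, D), the sketch below draws a random amplitude and shift per sample and dimension. The exact parametrization used by `gen_sine_dataset` is not stated here, so the formula, the `seed` argument, and the name `sine_dataset_sketch` are assumptions.

```python
import numpy as np


def sine_dataset_sketch(N: int, T: int, D: int, max_value: int = 10, seed: int = 0) -> np.ndarray:
    """Hypothetical generator: x[n, t, d] = amplitude[n, d] * sin(t - shift[n, d])."""
    rng = np.random.default_rng(seed)
    t = np.arange(T).reshape(1, T, 1)                      # time index, broadcast over N and D
    amplitude = rng.uniform(0, max_value, size=(N, 1, D))  # one amplitude per (sample, dimension)
    shift = rng.uniform(0, max_value, size=(N, 1, D))      # one phase shift per (sample, dimension)
    return amplitude * np.sin(t - shift)


X_sine = sine_dataset_sketch(N=5, T=20, D=3)
print(X_sine.shape)
```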

gen_sine_vs_const_dataset(N: int, T: int, D: int, max_value: int = 10, const: int = 0) Tuple[TensorLike, TensorLike][source]

Generates a dataset with alternating sinusoidal and constant sequences.

Parameters:
  • N (int) – Number of samples in the dataset.

  • T (int) – Length of each sequence in the dataset.

  • D (int) – Number of dimensions in each sequence.

  • max_value (int, optional) – Maximum value for amplitude and shift of the sinusoids. Defaults to 10.

  • const (int, optional) – Maximum value for the constant sequence. Defaults to 0.

Returns:

Tuple containing input data (X) and target labels (y).

Return type:

tuple[numpy.ndarray, numpy.ndarray]

get_covid_19() Tuple[TensorLike, Tuple, List][source]

Loads the Covid-19 dataset with additional graph information. The dataset is based on data from The New York Times, compiled from reports by state and local health agencies [1], and was adapted to the graph case in [2].

[1] The New York Times. (2021). Coronavirus (Covid-19) Data in the United States. Retrieved [Insert Date Here], from https://github.com/nytimes/covid-19-data.

[2] Alexander V. Nikitin, St John, Arno Solin, Samuel Kaski. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:10640-10660, 2022.

Returns:

A tuple of three elements. The first element is the time series data (n_nodes x n_timestamps x n_features); each timestamp consists of the number of deaths, cases, deaths normalized by the population, and cases normalized by the population. The second element is the graph tuple (nodes, edges). The third element is the order of states.

Return type:

tuple

get_eeg() Tuple[TensorLike, TensorLike][source]

Loads the EEG Eye State dataset.

This function downloads the EEG Eye State dataset from the UCI Machine Learning Repository and returns the input features (X) and target labels (y).

Returns:

A tuple containing the input features (X) and target labels (y).

Return type:

tuple[TensorLike, TensorLike]

get_energy_data() ndarray[Any, dtype[ScalarType]][source]

Retrieves the energy consumption dataset.

This function downloads and loads the energy consumption dataset from the UCI Machine Learning Repository. It returns the dataset as a NumPy array.

Returns:

Energy consumption dataset.

Return type:

numpy.ndarray

get_gp_samples_data(num_samples: int, max_time: int, covar_func: ~typing.Callable = <function _exponential_quadratic>) ndarray[Any, dtype[ScalarType]][source]

Generates samples from a Gaussian process.

This function generates samples from a Gaussian process using the specified covariance function. It returns the generated samples as a NumPy array.

Parameters:
  • num_samples (int) – Number of samples to generate.

  • max_time (int) – Maximum time value for the samples.

  • covar_func (Callable, optional) – Covariance function to use. Defaults to _exponential_quadratic.

Returns:

Generated samples from the Gaussian process.

Return type:

numpy.ndarray
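
Sampling from a Gaussian process on a time grid can be sketched as follows: build the covariance matrix, take its Cholesky factor, and multiply by standard normal draws. This is a minimal sketch under stated assumptions, not the library's code; `exp_quad`, its `lengthscale`, the integer time grid, the jitter value, and the `seed` argument are all hypothetical choices.

```python
import numpy as np


def exp_quad(t1: np.ndarray, t2: np.ndarray, lengthscale: float = 1.0) -> np.ndarray:
    """Exponentiated quadratic (RBF) covariance between two 1-D time grids."""
    diff = t1[:, None] - t2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


def gp_samples_sketch(num_samples: int, max_time: int, covar_func=exp_quad, seed: int = 0) -> np.ndarray:
    """Draw zero-mean GP samples on the grid 0, 1, ..., max_time - 1."""
    rng = np.random.default_rng(seed)
    t = np.arange(max_time, dtype=float)
    cov = covar_func(t, t) + 1e-6 * np.eye(max_time)  # small jitter for numerical stability
    chol = np.linalg.cholesky(cov)
    z = rng.standard_normal((max_time, num_samples))
    return (chol @ z).T                               # shape (num_samples, max_time)


gp_samples = gp_samples_sketch(num_samples=4, max_time=10)
print(gp_samples.shape)
```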

get_mauna_loa() Tuple[TensorLike, TensorLike][source]

Loads the Mauna Loa CO2 dataset.

This function loads the Mauna Loa CO2 dataset, which contains measurements of atmospheric CO2 concentrations at the Mauna Loa Observatory in Hawaii.

Returns:

A tuple containing the input data (X) and target labels (y).

Return type:

tuple[TensorLike, TensorLike]

get_mnist_data() Tuple[TensorLike, TensorLike, TensorLike, TensorLike][source]

Retrieves the MNIST dataset.

This function loads the MNIST dataset, which consists of 28x28 grayscale images of handwritten digits, and returns the training and testing data along with their corresponding labels.

Returns:

A tuple containing the training data, training labels, testing data, and testing labels.

Return type:

tuple[TensorLike, TensorLike, TensorLike, TensorLike]

get_physionet2012() Tuple[TensorLike, TensorLike, TensorLike, TensorLike, TensorLike, TensorLike][source]

Retrieves the Physionet 2012 dataset.

This function downloads and retrieves the Physionet 2012 dataset, which consists of physiological data and corresponding outcomes. It returns the training, testing, and validation datasets along with their labels.

Returns:

A tuple containing the training, testing, and validation datasets along with their labels. (train_X, train_y, test_X, test_y, val_X, val_y)

Return type:

tuple[TensorLike, TensorLike, TensorLike, TensorLike, TensorLike, TensorLike]

get_power_consumption() ndarray[Any, dtype[ScalarType]][source]

Retrieves the household power consumption dataset.

This function downloads and loads the household power consumption dataset from the UCI Machine Learning Repository. It returns the dataset as a NumPy array.

Returns:

Household power consumption dataset.

Return type:

numpy.ndarray

get_stock_data(stock_name: str) ndarray[Any, dtype[ScalarType]][source]

Downloads historical stock data for the specified stock ticker.

This function downloads historical stock data for the specified stock ticker using the Yahoo Finance API. It returns the stock data as a NumPy array with an additional axis representing the batch dimension.

Parameters:

stock_name (str) – Ticker symbol of the stock.

Returns:

Historical stock data.

Return type:

numpy.ndarray

Raises:

ValueError – If the provided stock ticker is invalid or no data is available.

get_synchronized_brainwave_dataset() Tuple[DataFrame, DataFrame][source]

Loads the EEG Synchronized Brainwave dataset.

This function downloads the EEG Synchronized Brainwave dataset from Dropbox and returns the input features (X) and target labels (y).

Returns:

A tuple containing the input features (X) and target labels (y).

Return type:

tuple[pd.DataFrame, pd.DataFrame]

load_arff(path: str) DataFrame[source]

Loads data from an ARFF (Attribute-Relation File Format) file.

This function reads data from an ARFF file located at the specified path and returns it as a pandas DataFrame.

Parameters:

path (str) – Path to the ARFF file.

Returns:

DataFrame containing the loaded data.

Return type:

pandas.DataFrame

split_dataset_into_objects(X: TensorLike, y: TensorLike, step: int = 10) Tuple[TensorLike, TensorLike][source]

Splits the dataset into objects of fixed length.

This function splits the input dataset into objects of fixed length along the first dimension, 0-padding if necessary.

Parameters:
  • X (TensorLike) – Input data.

  • y (TensorLike) – Target labels.

  • step (int, optional) – Length of each object. Defaults to 10.

Returns:

A tuple containing input data objects and corresponding target label objects.

Return type:

tuple[TensorLike, TensorLike]
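
One way to read "objects of fixed length along the first dimension, 0-padding if necessary" is sketched below. How the library groups labels per object is not documented here, so padding `y` with zeros in the same way as `X` is an assumption, as is the name `split_into_objects_sketch`.

```python
import numpy as np


def split_into_objects_sketch(X, y, step: int = 10):
    """Cut X and y into chunks of `step` rows, zero-padding the last chunk (assumed behavior)."""
    X, y = np.asarray(X), np.asarray(y)
    n_objects = -(-X.shape[0] // step)            # ceiling division
    pad = n_objects * step - X.shape[0]           # rows of zeros needed at the end
    X_pad = np.concatenate([X, np.zeros((pad,) + X.shape[1:], dtype=X.dtype)])
    y_pad = np.concatenate([y, np.zeros((pad,) + y.shape[1:], dtype=y.dtype)])
    return (X_pad.reshape((n_objects, step) + X.shape[1:]),
            y_pad.reshape((n_objects, step) + y.shape[1:]))


X_rows = np.arange(50).reshape(25, 2)   # 25 rows with 2 features each
y_rows = np.arange(25)
X_obj, y_obj = split_into_objects_sketch(X_rows, y_rows, step=10)
print(X_obj.shape, y_obj.shape)
```

With 25 rows and `step=10`, the last object holds 5 real rows followed by 5 rows of zeros. Note that zero-padded labels are indistinguishable from a genuine class 0.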

Augmentations

class BaseAugmenter(per_feature: bool)[source]
generate(X: TensorLike, y: TensorLike | None = None, n_samples: int = 1) TensorLike | Tuple[TensorLike, TensorLike][source]
class BaseCompose(augmentations: List[BaseAugmenter])[source]
class DTWBarycentricAveraging[source]
DTW Barycenter Averaging (DBA) [1] method estimated through the Expectation-Maximization algorithm [2], as in https://github.com/tslearn-team/tslearn/

References

generate(X: TensorLike, y: TensorLike | None = None, n_samples: int = 1, num_initial_samples: int | None = None, initial_timeseries: List[TensorLike] | None = None, initial_labels: List[int] | None = None, **kwargs) TensorLike | Tuple[TensorLike, TensorLike][source]

Parameters:
  • X (TensorLike) – The time series dataset.

  • y (TensorLike or None) – The classes.

  • n_samples (int) – Number of samples to generate (per class, if y is given).

  • num_initial_samples (int or None) – The number of time series to draw (per class) from the dataset before computing DTW_BA. If None, use the entire set (per class). Default is None.

  • initial_timeseries (array or None) – Initial time series to start from for the optimization process, with shape (original_size, d). If y is given, the shape of initial_timeseries is assumed to be (n_classes, original_size, d). Default is None.

  • initial_labels (array or None) – Labels for samples from initial_timeseries. Default is None.

Returns:

np.array of shape (n_samples, original_size, d) if y is None, or (n_classes * n_samples, original_size, d) otherwise, and np.array of labels (or None).

class GaussianNoise(per_feature: bool = True)[source]

Apply noise to the input time series.

Parameters:
  • variance ((float, float) or float) – Variance range for noise. If variance is a single float, the range will be (0, variance). Default: (10.0, 50.0).

  • mean (float) – Mean of the noise. Default: 0.

  • per_feature (bool) – If set to True, noise will be sampled for each feature independently. Otherwise, the noise will be sampled once for all features. Default: True.

generate(X: TensorLike, y: TensorLike | None = None, n_samples: int = 1, mean: float = 0, variance: float = 1.0) TensorLike | Tuple[TensorLike, TensorLike][source]

Generate synthetic data with Gaussian noise.

Parameters:
  • X (TensorLike) – Input data tensor of shape (n_data, n_timesteps, n_features).

  • y (Optional[TensorLike]) – Optional labels tensor. If provided, labels will also be returned.

  • n_samples (int) – Number of augmented samples to generate. Default is 1.

  • mean (float) – The mean of the noise. Default is 0.

  • variance (float) – The variance of the noise. Default is 1.0.

Returns:

Augmented data tensor of shape (n_samples, n_timesteps, n_features) and optionally augmented labels if ‘y’ is provided.

Return type:

Union[TensorLike, Tuple[TensorLike, TensorLike]]
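
The idea behind Gaussian-noise augmentation can be sketched in NumPy as below. This is not the library's implementation: drawing the base series with replacement, the `seed` argument, and reading `per_feature=False` as "one noise value per timestep shared by all features" are assumptions.

```python
import numpy as np


def gaussian_noise_sketch(X, n_samples=1, mean=0.0, variance=1.0, per_feature=True, seed=0):
    """Draw n_samples series from X (with replacement) and add Gaussian noise."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    base = X[rng.integers(0, X.shape[0], size=n_samples)]
    std = np.sqrt(variance)  # the documented parameter is a variance, not a standard deviation
    if per_feature:
        noise = rng.normal(mean, std, size=base.shape)
    else:
        # Assumed reading: one noise value per (sample, timestep), shared by all features.
        noise = rng.normal(mean, std, size=base.shape[:2] + (1,))
    return base + noise


X_zero = np.zeros((5, 10, 3))  # toy input of shape (n_data, n_timesteps, n_features)
X_noisy = gaussian_noise_sketch(X_zero, n_samples=4, variance=0.25)
X_shared = gaussian_noise_sketch(X_zero, n_samples=4, per_feature=False)
print(X_noisy.shape, X_shared.shape)
```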

class MagnitudeWarping[source]

Magnitude warping changes the magnitude of each sample by convolving the data window with a smooth curve varying around one. See https://dl.acm.org/doi/pdf/10.1145/3136755.3136817

generate(X: TensorLike, y: TensorLike | None = None, n_samples: int = 1, sigma: float = 0.2, n_knots: int = 4) TensorLike | Tuple[TensorLike, TensorLike][source]

Generates augmented samples via MagnitudeWarping for (X, y).

Parameters:
  • X (TensorLike) – Input data tensor of shape (n_data, n_timesteps, n_features).

  • y (Optional[TensorLike]) – Optional labels tensor. If provided, labels will also be returned.

  • n_samples (int) – Number of augmented samples to generate. Default is 1.

  • sigma (float) – Standard deviation for the random warping. Default is 0.2.

  • n_knots (int) – Number of knots used for warping curve. Default is 4.

Returns:

Augmented data tensor of shape (n_samples, n_timesteps, n_features) and optionally augmented labels if ‘y’ is provided.

Return type:

Union[TensorLike, Tuple[TensorLike, TensorLike]]

class Shuffle[source]

Shuffles time series features. Shuffling is beneficial when each feature corresponds to interchangeable sensors.

generate(X: TensorLike, y: TensorLike | None = None, n_samples: int = 1) TensorLike | Tuple[TensorLike, TensorLike][source]

Generate synthetic data using Shuffle strategy. Features are randomly shuffled to generate novel samples.

Parameters:
  • X (TensorLike) – Input data tensor of shape (n_data, n_timesteps, n_features).

  • y (Optional[TensorLike]) – Optional labels tensor. If provided, labels will also be returned.

  • n_samples (int) – Number of augmented samples to generate. Default is 1.

Returns:

Augmented data tensor of shape (n_samples, n_timesteps, n_features) and optionally augmented labels if ‘y’ is provided.

Return type:

Union[TensorLike, Tuple[TensorLike, TensorLike]]
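
Feature shuffling amounts to permuting the last axis of a series. The sketch below illustrates that idea only; the sampling of base series with replacement, the `seed` argument, and the name `shuffle_features_sketch` are assumptions and not taken from the library.

```python
import numpy as np


def shuffle_features_sketch(X, n_samples: int = 1, seed: int = 0) -> np.ndarray:
    """Create n_samples series by permuting the feature axis of randomly chosen inputs."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    chosen = rng.integers(0, X.shape[0], size=n_samples)
    out = np.empty((n_samples,) + X.shape[1:], dtype=X.dtype)
    for i, j in enumerate(chosen):
        perm = rng.permutation(X.shape[2])  # the same permutation is used at every timestep
        out[i] = X[j][:, perm]
    return out


X_single = np.arange(12).reshape(1, 4, 3)  # one series, 4 timesteps, 3 features
X_shuffled = shuffle_features_sketch(X_single, n_samples=3)
print(X_shuffled.shape)
```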

class SliceAndShuffle(per_feature: bool = False)[source]

Slice the time series in k pieces and create a new time series by shuffling.

Parameters:
  • per_feature (bool) – If set to True, each feature is sliced independently. Otherwise, all features are sliced in the same way. Default: False.

generate(X: TensorLike, y: TensorLike | None = None, n_samples: int = 1, n_segments: int = 2) TensorLike | Tuple[TensorLike, TensorLike][source]

Generate synthetic data using Slice-And-Shuffle strategy. Slices are randomly selected.

Parameters:
  • X (TensorLike) – Input data tensor of shape (n_data, n_timesteps, n_features).

  • y (Optional[TensorLike]) – Optional labels tensor. If provided, labels will also be returned.

  • n_segments (int) – The number of slices, default is 2.

  • n_samples (int) – Number of augmented samples to generate. Default is 1.

Returns:

Augmented data tensor of shape (n_samples, n_timesteps, n_features) and optionally augmented labels if ‘y’ is provided.

Return type:

Union[TensorLike, Tuple[TensorLike, TensorLike]]
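
For a single series of shape (n_timesteps, n_features), the slice-and-shuffle idea can be sketched as follows, with all features cut at the same positions. The random choice of cut points, the `seed` argument, and the name `slice_and_shuffle_sketch` are assumptions made for illustration.

```python
import numpy as np


def slice_and_shuffle_sketch(x, n_segments: int = 2, seed: int = 0) -> np.ndarray:
    """Cut one series into n_segments pieces along time and concatenate them in random order."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    n_timesteps = x.shape[0]
    # Pick distinct interior cut points so that every segment is non-empty.
    cuts = np.sort(rng.choice(np.arange(1, n_timesteps), size=n_segments - 1, replace=False))
    segments = np.split(x, cuts, axis=0)
    order = rng.permutation(len(segments))
    return np.concatenate([segments[i] for i in order], axis=0)


x_series = np.arange(20).reshape(10, 2)  # 10 timesteps, 2 features
x_mixed = slice_and_shuffle_sketch(x_series, n_segments=3)
print(x_mixed.shape)
```

The output contains exactly the same rows as the input, only in a different temporal order.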

class WindowWarping[source]

https://halshs.archives-ouvertes.fr/halshs-01357973/document

generate(X: TensorLike, y: TensorLike | None = None, window_ratio: float = 0.2, scales: Tuple = (0.25, 1.0), n_samples: int = 1) TensorLike | Tuple[TensorLike, TensorLike][source]

Generates augmented samples via WindowWarping for (X, y).

Parameters:
  • X (TensorLike) – Input data tensor of shape (n_data, n_timesteps, n_features).

  • y (Optional[TensorLike]) – Optional labels tensor. If provided, labels will also be returned.

  • window_ratio (float) – The ratio of the window size relative to the total number of timesteps. Default is 0.2.

  • scales (tuple) – A tuple specifying the scale range for warping. Default is (0.25, 1.0).

  • n_samples (int) – Number of augmented samples to generate. Default is 1.

Returns:

Augmented data tensor of shape (n_samples, n_timesteps, n_features) and optionally augmented labels if ‘y’ is provided.

Return type:

Union[TensorLike, Tuple[TensorLike, TensorLike]]

Metrics

class BaseDownstreamEvaluator[source]
evaluate(*args, **kwargs)[source]
class ConsistencyMetric(evaluators: List)[source]

Predictive consistency metric measures whether a set of evaluators yield consistent results on real and synthetic data.

Parameters:

evaluators (list) – A list of evaluators (each item should implement method .evaluate(D))

class DemographicParityMetric[source]

Measuring demographic parity between two datasets.

This metric assesses the difference in the distributions of a target variable among different groups in two datasets. By default, it uses the Kolmogorov-Smirnov statistic to quantify the maximum vertical deviation between the cumulative distribution functions of the target variable for the historical and synthetic data within each group.

Parameters:
  • d_hist (tsgm.dataset.DatasetOrTensor) – The historical input dataset or tensor.

  • groups_hist (TensorLike) – The group assignments for the historical data.

  • d_synth (tsgm.dataset.DatasetOrTensor) – The synthetic input dataset or tensor.

  • groups_synth (TensorLike) – The group assignments for the synthetic data.

  • metric (callable, optional) – The metric used to compare the target variable distributions within each group. Default is the Kolmogorov-Smirnov statistic.

Returns:

dict: A dictionary mapping each group to the computed demographic parity metric.

Example:
>>> metric = DemographicParityMetric()
>>> dataset_hist = tsgm.dataset.Dataset(...)
>>> dataset_synth = tsgm.dataset.Dataset(...)
>>> groups_hist = [0, 1, 0, 1, 1, 0]
>>> groups_synth = [1, 1, 0, 0, 0, 1]
>>> result = metric(dataset_hist, groups_hist, dataset_synth, groups_synth)
>>> print(result)
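
The per-group Kolmogorov-Smirnov comparison described above can be sketched on plain arrays. This is an illustrative sketch, not the library's code: it takes the target values directly instead of a `tsgm.dataset.Dataset`, and the helper names `ks_statistic` and `demographic_parity_sketch` are hypothetical.

```python
import numpy as np


def ks_statistic(a, b) -> float:
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = np.sort(np.asarray(a, dtype=float)), np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])  # the gap can only change at observed values
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


def demographic_parity_sketch(y_hist, groups_hist, y_synth, groups_synth) -> dict:
    """Map each group present in both datasets to the KS statistic of its target values."""
    y_hist, groups_hist = np.asarray(y_hist), np.asarray(groups_hist)
    y_synth, groups_synth = np.asarray(y_synth), np.asarray(groups_synth)
    result = {}
    for g in np.intersect1d(groups_hist, groups_synth):
        result[g.item()] = ks_statistic(y_hist[groups_hist == g], y_synth[groups_synth == g])
    return result


parity = demographic_parity_sketch(
    [0.1, 0.2, 0.3, 0.4], [0, 0, 1, 1],  # made-up historical targets and groups
    [0.1, 0.2, 5.0, 6.0], [0, 0, 1, 1],  # made-up synthetic targets and groups
)
print(parity)
```

A value of 0 means the two empirical distributions coincide within the group; a value of 1 means they do not overlap at all.
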
class DiscriminativeMetric[source]

The DiscriminativeMetric measures the discriminative performance of a model in distinguishing between synthetic and real datasets.

This metric evaluates a discriminative model by training it on a combination of synthetic and real datasets and assessing its performance on a test set.

Parameters:
  • d_hist (tsgm.dataset.DatasetOrTensor) – Real dataset.

  • d_syn (tsgm.dataset.DatasetOrTensor) – Synthetic dataset.

  • model (T.Callable) – Discriminative model to be evaluated.

  • test_size (T.Union[float, int]) – Proportion of the dataset to include in the test split or the absolute number of test samples.

  • n_epochs (int) – Number of training epochs for the model.

  • metric (T.Optional[T.Callable]) – Optional evaluation metric to use (default: accuracy).

  • random_seed (T.Optional[int]) – Optional random seed for reproducibility.

Returns:

Discriminative performance metric.

Return type:

float

Example:

>>> from my_module import DiscriminativeMetric, MyDiscriminativeModel
>>> import tsgm.dataset
>>> import numpy as np
>>> import sklearn
>>>
>>> # Create real and synthetic datasets
>>> real_dataset = tsgm.dataset.Dataset(...)  # Replace ... with appropriate arguments
>>> synthetic_dataset = tsgm.dataset.Dataset(...)  # Replace ... with appropriate arguments
>>>
>>> # Create a discriminative model
>>> model = MyDiscriminativeModel()  # Replace with the actual discriminative model class
>>>
>>> # Create and use the DiscriminativeMetric
>>> metric = DiscriminativeMetric()
>>> result = metric(real_dataset, synthetic_dataset, model, test_size=0.2, n_epochs=10)
>>> print(result)
class DistanceMetric(statistics: list, discrepancy: Callable)[source]

Metric that measures similarity between synthetic and real time series

Parameters:
  • statistics (list) – A list of summary statistics (callable)

  • discrepancy (Callable) – Discrepancy function, measures the distance between the vectors of summary statistics.

discrepancy(stats1: Tensor | ndarray[Any, dtype[ScalarType]], stats2: Tensor | ndarray[Any, dtype[ScalarType]]) float[source]
Parameters:
  • stats1 (tsgm.types.Tensor) – A vector of summary statistics.

  • stats2 (tsgm.types.Tensor) – A vector of summary statistics.

Returns:

the distance between two vectors calculated by self._discrepancy.

stats(X: Tensor | ndarray[Any, dtype[ScalarType]]) Tensor | ndarray[Any, dtype[ScalarType]][source]
Parameters:

X (tsgm.types.Tensor) – A time series dataset.

Returns:

a tensor with calculated summary statistics.

class DownstreamPerformanceMetric(evaluator: BaseDownstreamEvaluator)[source]

The downstream performance metric evaluates the performance of a model on a downstream task. It returns performance gains achieved with the addition of synthetic data.

Parameters:

evaluator (BaseDownstreamEvaluator) – An evaluator, should implement method .evaluate(D)

class EntropyMetric[source]

Calculates the spectral entropy of a dataset or tensor as a sum of individual entropies.

Args:

d (tsgm.dataset.DatasetOrTensor): The input dataset or tensor.

Returns:

float: The computed spectral entropy.

Example:
>>> metric = EntropyMetric()
>>> dataset = tsgm.dataset.Dataset(...)
>>> result = metric(dataset)
>>> print(result)
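
Spectral entropy "as a sum of individual entropies" can be sketched as below: each univariate series gets the Shannon entropy of its normalized power spectrum, and the results are summed over samples and features. The use of the natural logarithm, the real FFT, and the helper names are assumptions and not confirmed details of EntropyMetric.

```python
import numpy as np


def spectral_entropy(x) -> float:
    """Shannon entropy (natural log) of the normalized power spectrum of one series."""
    power = np.abs(np.fft.rfft(np.asarray(x, dtype=float))) ** 2
    total = power.sum()
    if total == 0:
        return 0.0  # an all-zero series carries no spectral content
    p = power / total
    p = p[p > 0]    # skip empty bins to avoid log(0)
    return float(-np.sum(p * np.log(p)))


def entropy_metric_sketch(X) -> float:
    """Sum of spectral entropies over all samples and features of X with shape (N, T, D)."""
    X = np.asarray(X, dtype=float)
    return sum(spectral_entropy(X[n, :, d]) for n in range(X.shape[0]) for d in range(X.shape[2]))


t_grid = np.arange(64)
sine_wave = np.sin(2 * np.pi * 4 * t_grid / 64)       # power concentrated in one frequency bin
white_noise = np.random.default_rng(0).standard_normal(64)
X_pair = np.stack([sine_wave, white_noise])[:, :, None]  # shape (2, 64, 1)
print(spectral_entropy(sine_wave), spectral_entropy(white_noise), entropy_metric_sketch(X_pair))
```

A pure sinusoid yields an entropy close to zero, while white noise spreads its power over many bins and scores much higher.
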
class MMDMetric(kernel: ~typing.Callable = <function exp_quad_kernel>)[source]

This metric calculates the maximum mean discrepancy (MMD) between real and synthetic samples.

Args:

d (tsgm.dataset.DatasetOrTensor): The input dataset or tensor.

Returns:

float: The computed MMD.

Example:
>>> metric = MMDMetric(kernel)
>>> dataset, synth_dataset = tsgm.dataset.Dataset(...), tsgm.dataset.Dataset(...)
>>> result = metric(dataset, synth_dataset)
>>> print(result)
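
The quantity behind this metric can be sketched with the biased (V-statistic) estimate of squared MMD under an exponentiated quadratic kernel. This is a minimal sketch, not the library's implementation: flattening each series into a vector, the `lengthscale` parameter, and the function names are assumptions.

```python
import numpy as np


def exp_quad_kernel_sketch(A: np.ndarray, B: np.ndarray, lengthscale: float = 1.0) -> np.ndarray:
    """Kernel matrix k(a, b) = exp(-||a - b||^2 / (2 * lengthscale^2)) for rows of A and B."""
    sq_dist = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * lengthscale ** 2))


def mmd_squared_sketch(X, Y, kernel=exp_quad_kernel_sketch) -> float:
    """Biased estimate of squared MMD between two sets of series."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    X = X.reshape(X.shape[0], -1)  # flatten (T, D) into one vector per series
    Y = Y.reshape(Y.shape[0], -1)
    return float(kernel(X, X).mean() + kernel(Y, Y).mean() - 2.0 * kernel(X, Y).mean())


real_batch = np.random.default_rng(0).normal(size=(20, 8, 2))
same_score = mmd_squared_sketch(real_batch, real_batch)
shifted_score = mmd_squared_sketch(real_batch, real_batch + 5.0)
print(same_score, shifted_score)
```

Identical sample sets give a value of zero, and the value grows as the two distributions move apart.
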
class Metric[source]
class PairwiseDistanceMetric[source]

Measures pairwise distances in a set of time series.

pairwise_euclidean_distances(ts: TensorLike) TensorLike[source]

Computes the pairwise Euclidean distances for a set of time series.

Parameters:

ts (numpy.ndarray) – A 2D array where each row represents a time series.

Returns:

A 2D array representing the pairwise Euclidean distance matrix.

Return type:

numpy.ndarray
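
A pairwise Euclidean distance matrix for row-wise time series can be computed with NumPy broadcasting. The sketch below illustrates the computation and is not necessarily how the method is implemented; the name `pairwise_euclidean_sketch` is hypothetical.

```python
import numpy as np


def pairwise_euclidean_sketch(ts) -> np.ndarray:
    """Return D with D[i, j] = Euclidean distance between rows i and j of ts."""
    ts = np.asarray(ts, dtype=float)
    diff = ts[:, None, :] - ts[None, :, :]  # shape (n, n, length)
    return np.sqrt(np.sum(diff ** 2, axis=-1))


distances = pairwise_euclidean_sketch([[0.0, 0.0], [3.0, 4.0]])
print(distances)
```

Broadcasting builds an (n, n, length) intermediate array, so memory use grows quadratically with the number of series.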

class PredictiveParityMetric[source]

Measuring predictive parity between two datasets.

This metric assesses the discrepancy in the predictive performance of a model among different groups in two datasets. By default, it uses precision to quantify the predictive performance of the model within each group.

Parameters:
  • y_true_hist (TensorLike) – The true target values for the historical data.

  • y_pred_hist (TensorLike) – The predicted target values for the historical data.

  • groups_hist (TensorLike) – The group assignments for the historical data.

  • y_true_synth (TensorLike) – The true target values for the synthetic data.

  • y_pred_synth (TensorLike) – The predicted target values for the synthetic data.

  • groups_synth (TensorLike) – The group assignments for the synthetic data.

  • metric (callable, optional) – The metric used to compare the predictive performance within each group. Default is precision score.

Returns:

dict: A dictionary mapping each group to the computed predictive parity metric.

Example:
>>> metric = PredictiveParityMetric()
>>> y_true_hist = [0, 1, 0, 1, 1, 0]
>>> y_pred_hist = [0, 1, 0, 0, 1, 1]
>>> groups_hist = [0, 1, 0, 1, 1, 0]
>>> y_true_synth = [1, 0, 1, 0, 0, 1]
>>> y_pred_synth = [1, 0, 1, 1, 0, 0]
>>> groups_synth = [1, 1, 0, 0, 0, 1]
>>> result = metric(y_true_hist, y_pred_hist, groups_hist, y_true_synth, y_pred_synth, groups_synth)
>>> print(result)
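
To make the per-group comparison concrete, the sketch below computes precision within each group for both datasets and reports the difference. Whether the library reports a difference, and with which sign, is not stated here, so that choice is an assumption, as are the helper names. The inputs are the lists from the example above.

```python
def precision(y_true, y_pred, positive=1) -> float:
    """Fraction of positive predictions that are correct (0.0 if nothing is predicted positive)."""
    hits = [t == positive for t, p in zip(y_true, y_pred) if p == positive]
    return sum(hits) / len(hits) if hits else 0.0


def precision_by_group(y_true, y_pred, groups) -> dict:
    """Precision computed separately on the rows of each group."""
    result = {}
    for g in sorted(set(groups)):
        rows = [(t, p) for t, p, gg in zip(y_true, y_pred, groups) if gg == g]
        result[g] = precision([t for t, _ in rows], [p for _, p in rows])
    return result


def predictive_parity_sketch(y_true_hist, y_pred_hist, groups_hist,
                             y_true_synth, y_pred_synth, groups_synth) -> dict:
    """Assumed definition: historical precision minus synthetic precision, per shared group."""
    hist = precision_by_group(y_true_hist, y_pred_hist, groups_hist)
    synth = precision_by_group(y_true_synth, y_pred_synth, groups_synth)
    return {g: hist[g] - synth[g] for g in hist if g in synth}


parity_gap = predictive_parity_sketch(
    [0, 1, 0, 1, 1, 0], [0, 1, 0, 0, 1, 1], [0, 1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 1], [1, 0, 1, 1, 0, 0], [1, 1, 0, 0, 0, 1],
)
print(parity_gap)
```
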
class PrivacyMembershipInferenceMetric(attacker: Any, metric: Callable | None = None)[source]

The metric measures the possibility of membership inference attacks.

Parameters:
  • attacker (Callable) – An attacker: a one-class classifier (OCC) that implements the .fit and .predict methods.

  • metric – Measures quality of attacker (precision by default)

class ShannonEntropyMetric[source]

Shannon Entropy calculated over the labels of a dataset. This index is a measure of diversity that accounts for categories present in a dataset.
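
As a sketch of the quantity involved, Shannon entropy over labels is H = -sum_i p_i log p_i, where p_i is the fraction of samples with label i. The helper below is hypothetical, and the use of the natural logarithm is an assumption; the library may use a different base.

```python
import math
from collections import Counter


def shannon_entropy_sketch(labels):
    """Shannon entropy (natural log, by assumption) of the empirical label distribution."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


balanced = shannon_entropy_sketch([0, 0, 1, 1])   # two equally likely classes
single = shannon_entropy_sketch([1, 1, 1])        # only one class present
```

A dataset with a single class has zero entropy, and entropy grows as the labels become more evenly spread across classes.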

GANs

class ConditionalGAN(*args, **kwargs)[source]

Conditional GAN implementation for labeled and temporally labeled time series.

Parameters:
  • discriminator (keras.Model) – A discriminator model which takes a time series as input and checks whether the sample is real or fake.

  • generator (keras.Model) – Takes as input a random noise vector of latent_dim length and returns a simulated time series.

  • latent_dim (int) – The size of the noise vector.

  • temporal (bool) – Indicates whether the time series is temporally labeled or not.

compile(d_optimizer: OptimizerV2, g_optimizer: OptimizerV2, loss_fn: Callable) None[source]

Compiles the generator and discriminator models.

Parameters:
  • d_optimizer (keras.optimizers.Optimizer) – An optimizer for the GAN’s discriminator.

  • g_optimizer (keras.optimizers.Optimizer) – An optimizer for the GAN’s generator.

  • loss_fn (keras.losses.Loss) – Loss function.

generate(labels: Tensor | ndarray[Any, dtype[ScalarType]]) Tensor | ndarray[Any, dtype[ScalarType]][source]

Generates new data from the model.

Parameters:

labels (tsgm.types.Tensor) – the labels to condition the generated samples on.

Returns:

generated samples

Return type:

tsgm.types.Tensor

property metrics: List[source]
Returns:

A list of metrics trackers (e.g., generator’s loss and discriminator’s loss).

Return type:

T.List

train_step(data: Tuple) Dict[str, float][source]

Performs a training step using a batch of data, stored in data.

Parameters:

data (tsgm.types.Tensor) – A batch of data in a format batch_size x seq_len x feat_dim

Returns:

A dictionary with generator (key “g_loss”) and discriminator (key “d_loss”) losses

Return type:

T.Dict[str, float]

class GAN(*args, **kwargs)[source]

GAN implementation for unlabeled time series.

Parameters:
  • discriminator (keras.Model) – A discriminator model which takes a time series as input and checks whether the sample is real or fake.

  • generator (keras.Model) – Takes as input a random noise vector of latent_dim length and returns a simulated time series.

  • latent_dim (int) – The size of the noise vector.

  • use_wgan (bool) – Use Wasserstein GAN with gradient penalty.

clone() GAN[source]

Clones the GAN object.

Returns:

An exact copy of the object.

Return type:

“GAN”

compile(d_optimizer: OptimizerV2, g_optimizer: OptimizerV2, loss_fn: Loss) None[source]

Compiles the generator and discriminator models.

Parameters:
  • d_optimizer (keras.optimizers.Optimizer) – An optimizer for the GAN’s discriminator.

  • g_optimizer (keras.optimizers.Optimizer) – An optimizer for the GAN’s generator.

  • loss_fn (keras.losses.Loss) – Loss function.

generate(num: int) Tensor | ndarray[Any, dtype[ScalarType]][source]

Generates new data from the model.

Parameters:

num (int) – the number of samples to be generated.

Returns:

Generated samples

Return type:

tsgm.types.Tensor

gradient_penalty(batch_size, real_samples, fake_samples)[source]
property metrics: List[source]
Returns:

A list of metrics trackers (e.g., generator’s loss and discriminator’s loss).

train_step(data: Tensor | ndarray[Any, dtype[ScalarType]]) Dict[str, float][source]

Performs a training step using a batch of data, stored in data.

Parameters:

data (tsgm.types.Tensor) – A batch of data in a format batch_size x seq_len x feat_dim

Returns:

A dictionary with generator (key “g_loss”) and discriminator (key “d_loss”) losses

Return type:

T.Dict[str, float]

wgan_discriminator_loss(real_sample, fake_sample)[source]
wgan_generator_loss(fake_sample)[source]
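
The three undocumented methods above correspond to the usual ingredients of a Wasserstein GAN with gradient penalty. The sketch below shows the standard textbook formulation on plain lists of critic scores; the function names, the penalty weight of 10, and the exact form are assumptions, not a transcription of the tsgm code.

```python
def _mean(values):
    return sum(values) / len(values)


def wgan_critic_loss_sketch(real_scores, fake_scores):
    """Critic (discriminator) loss: mean score on fakes minus mean score on reals."""
    return _mean(fake_scores) - _mean(real_scores)


def wgan_generator_loss_sketch(fake_scores):
    """Generator loss: negative mean critic score on generated samples."""
    return -_mean(fake_scores)


def gradient_penalty_sketch(grad_norms, weight=10.0):
    """Penalty pushing the critic's gradient norms (on interpolated samples) towards 1."""
    return weight * _mean([(g - 1.0) ** 2 for g in grad_norms])


critic_loss = wgan_critic_loss_sketch(real_scores=[1.0, 3.0], fake_scores=[-1.0, 1.0])
generator_loss = wgan_generator_loss_sketch(fake_scores=[-1.0, 1.0])
penalty = gradient_penalty_sketch(grad_norms=[1.0, 2.0])
```

In a real training step the gradient norms would come from automatic differentiation of the critic with respect to samples interpolated between real and generated data.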

VAEs

class BetaVAE(*args, **kwargs)[source]

beta-VAE implementation for unlabeled time series.

Parameters:
  • encoder (keras.Model) – An encoder model which takes a time series as input and maps it to a latent representation.

  • decoder (keras.Model) – Takes as input a random noise vector of latent_dim length and returns a simulated time-series.

  • latent_dim (int) – The size of the noise vector.

call(X: Tensor | ndarray[Any, dtype[ScalarType]]) Tensor | ndarray[Any, dtype[ScalarType]][source]

Encodes and decodes time series dataset X.

Parameters:

X (tsgm.types.Tensor) – The time series dataset to encode and decode.

Returns:

Generated samples

Return type:

tsgm.types.Tensor

generate(n: int) Tensor | ndarray[Any, dtype[ScalarType]][source]

Generates new data from the model.

Parameters:

n (int) – the number of samples to be generated.

Returns:

A tensor with generated samples.

Return type:

tsgm.types.Tensor

property metrics: List[source]
Returns:

A list of metrics trackers (e.g., total loss, reconstruction loss, and KL loss).

train_step(data: Tensor | ndarray[Any, dtype[ScalarType]]) Dict[source]

Performs a training step using a batch of data, stored in data.

Parameters:

data (tsgm.types.Tensor) – A batch of data in a format batch_size x seq_len x feat_dim

Returns:

A dict with losses

Return type:

T.Dict
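
For orientation, a beta-VAE objective is typically the reconstruction error plus beta times the KL divergence between the approximate posterior and a standard normal prior. The sketch below assumes a diagonal Gaussian posterior and a squared-error reconstruction term; the function names and dictionary keys are illustrative, and the library's exact loss may differ.

```python
import math


def kl_divergence_sketch(z_mean, z_log_var):
    """KL( N(mean, exp(log_var)) || N(0, 1) ) summed over latent dimensions."""
    return -0.5 * sum(
        1.0 + lv - m ** 2 - math.exp(lv) for m, lv in zip(z_mean, z_log_var)
    )


def beta_vae_loss_sketch(x, x_reconst, z_mean, z_log_var, beta=1.0):
    """Return the total loss together with its two components."""
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_reconst))
    kl = kl_divergence_sketch(z_mean, z_log_var)
    return {
        "loss": reconstruction + beta * kl,
        "reconstruction_loss": reconstruction,
        "kl_loss": kl,
    }


losses = beta_vae_loss_sketch(
    x=[1.0, 2.0], x_reconst=[1.0, 4.0], z_mean=[1.0], z_log_var=[0.0], beta=2.0
)
```

Setting beta above 1 weights the KL term more heavily, which trades reconstruction quality for a latent space closer to the prior.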

class cBetaVAE(*args, **kwargs)[source]
call(data: Tensor | ndarray[Any, dtype[ScalarType]]) Tensor | ndarray[Any, dtype[ScalarType]][source]

Encodes and decodes time series dataset data.

Parameters:

data (tsgm.types.Tensor) – The time series dataset to encode and decode.

Returns:

Generated samples

Return type:

tsgm.types.Tensor

generate(labels: Tensor | ndarray[Any, dtype[ScalarType]]) Tuple[Tensor | ndarray[Any, dtype[ScalarType]], Tensor | ndarray[Any, dtype[ScalarType]]][source]

Generates new data from the model.

Parameters:

labels (tsgm.types.Tensor) – the labels to condition the generated samples on.

Returns:

a tuple of synthetically generated data and labels.

Return type:

T.Tuple[tsgm.types.Tensor, tsgm.types.Tensor]

property metrics: List[source]

Returns the list of loss trackers: [loss, reconstruction_loss, kl_loss].

train_step(data: Tensor | ndarray[Any, dtype[ScalarType]]) Dict[str, float][source]

Performs a training step using a batch of data, stored in data.

Parameters:

data (tsgm.types.Tensor) – A batch of data in a format batch_size x seq_len x feat_dim

Returns:

A dict with losses

Return type:

T.Dict[str, float]

ABC

class ABCAlgorithm[source]

A base class for ABC algorithms.

sample_parameters(n_samples: int) List[source]
class RejectionSampler(simulator: ModelBasedSimulator, data: Dataset, statistics: List, epsilon: float, discrepancy: Callable, priors: Dict | None = None, **kwargs)[source]

Rejection sampling algorithm for approximate Bayesian computation.

Parameters:
  • simulator (class tsgm.simulator.ModelBasedSimulator) – A model based simulator

  • data (class tsgm.dataset.Dataset) – Historical dataset storage

  • statistics (list) – contains a list of summary statistics

  • epsilon (float) – tolerance of synthetically generated data to a set of summary statistics

  • discrepancy (Callable) – discrepancy measure function

  • priors – set of priors for each of the simulator parameters, defaults to DEFAULT_PRIOR

sample_parameters(n_samples: int) List[source]

Samples parameters from the rejection sampler.

Parameters:

n_samples – Number of samples

Returns:

A list of samples. Each sample is represented as a dict.

Return type:

T.List[T.Dict]
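
The general rejection-sampling scheme behind approximate Bayesian computation can be sketched in a few lines: draw parameters from the prior, simulate data, and keep the parameters only if the summary statistics of the simulated data fall within epsilon of the observed ones. Everything below (function names, the Gaussian toy simulator, the uniform prior) is a hypothetical stand-in, not the tsgm implementation.

```python
import random


def rejection_sampler_sketch(simulate, observed_stat, statistic, discrepancy,
                             sample_prior, epsilon, n_samples, rng):
    """Collect n_samples parameter dicts whose simulations match the observation."""
    accepted = []
    while len(accepted) < n_samples:
        params = sample_prior(rng)          # 1. draw candidate parameters
        simulated = simulate(params, rng)   # 2. simulate data under them
        # 3. accept if the summary statistic is close enough to the observed one
        if discrepancy(statistic(simulated), observed_stat) <= epsilon:
            accepted.append(params)
    return accepted


rng = random.Random(0)
samples = rejection_sampler_sketch(
    simulate=lambda p, r: [r.gauss(p["mu"], 1.0) for _ in range(50)],
    observed_stat=2.0,                                   # observed sample mean
    statistic=lambda xs: sum(xs) / len(xs),
    discrepancy=lambda a, b: abs(a - b),
    sample_prior=lambda r: {"mu": r.uniform(-5.0, 5.0)},
    epsilon=0.5,
    n_samples=20,
    rng=rng,
)
```

Smaller values of epsilon give samples closer to the true posterior at the cost of more rejected simulations.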

prior_samples(priors: Dict, params: List) Dict[source]

Generate prior samples for the specified parameters.

Parameters:
  • priors (T.Dict) – A dictionary containing probability distributions for each parameter. Keys are parameter names, and values are instances of probability distribution classes. If a parameter is not present in the dictionary, a default prior distribution is used.

  • params (T.List) – A list of parameter names for which prior samples are to be generated.

Returns:

A dictionary where keys are parameter names and values are samples drawn from their respective prior distributions.

Return type:

T.Dict

Example:

priors = {'mean': NormalDistribution(0, 1), 'std_dev': UniformDistribution(0, 2)}
params = ['mean', 'std_dev']
samples = prior_samples(priors, params)

STS

class STS(model: StructuralTimeSeries | None = None)[source]

Class for training and generating from a structural time series model.

Initializes a new instance of the STS class.

Parameters:

model (tfp.sts.StructuralTimeSeries or None) – Structural time series model to use. If None, the default model is used.

elbo_loss() float[source]

Returns the evidence lower bound (ELBO) loss from training.

Returns:

The value of the ELBO loss.

Return type:

float

generate(num_samples: int) Tensor | ndarray[Any, dtype[ScalarType]][source]

Generates samples from the trained model.

Parameters:

num_samples (int) – Number of samples to generate.

Returns:

Generated samples.

Return type:

tsgm.types.Tensor

train(ds: Dataset, num_variational_steps: int = 200, steps_forw: int = 10) None[source]

Trains the structural time series model.

Parameters:
  • ds (tsgm.dataset.Dataset) – Dataset containing time series data.

  • num_variational_steps (int) – Number of variational optimization steps, defaults to 200.

  • steps_forw (int) – Number of steps to forecast, defaults to 10.

Visualization

visualize_dataset(dataset: Dataset | Tensor | ndarray[Any, dtype[ScalarType]], obj_id: int = 0, palette: dict = {'gen': 'blue', 'hist': 'red'}, path: str = '/tmp/generated_data.pdf') None[source]

Visualizes a time series dataset with target values.

Parameters:

dataset (tsgm.dataset.DatasetOrTensor) – A time series dataset.

visualize_original_and_reconst_ts(original: Tensor | ndarray[Any, dtype[ScalarType]], reconst: Tensor | ndarray[Any, dtype[ScalarType]], num: int = 5, vmin: int = 0, vmax: int = 1) None[source]

Visualizes original and reconstructed time series data.

This function generates side-by-side visualizations of the original and reconstructed time series data. It randomly selects a specified number of samples from the input tensors original and reconst and displays them as images using imshow.

Parameters:
  • original (tsgm.types.Tensor) – Original time series data tensor.

  • reconst (tsgm.types.Tensor) – Reconstructed time series data tensor.

  • num (int, optional) – Number of samples to visualize, defaults to 5.

  • vmin (int, optional) – Minimum value for colormap normalization, defaults to 0.

  • vmax (int, optional) – Maximum value for colormap normalization, defaults to 1.

visualize_training_loss(loss_vector: Tensor | ndarray[Any, dtype[ScalarType]], labels: tuple = (), path: str = '/tmp/training_loss.pdf') None[source]

Plots training losses as a function of the epoch.

Parameters:
  • loss_vector (np.array) – Array of shape (number of metrics, number of epochs).

  • labels (tuple) – Names of the plotted metrics, as strings.

  • path (str) – Where to save the plot.

visualize_ts(ts: Tensor | ndarray[Any, dtype[ScalarType]], num: int = 5) None[source]

Visualizes time series tensor.

This function generates a plot to visualize time series data. It displays a specified number of time series from the input tensor.

Parameters:
  • ts (tsgm.types.Tensor) – The time series data tensor of shape (num_samples, num_timesteps, num_features).

  • num (int, optional) – The number of time series to display. Defaults to 5.

Raises:

AssertionError: If the input tensor does not have three dimensions.

Example:
>>> visualize_ts(time_series_tensor, num=10)
visualize_ts_lineplot(ts: Tensor | ndarray[Any, dtype[ScalarType]], ys: Tensor | ndarray[Any, dtype[ScalarType]] | None = None, num: int = 5, unite_features: bool = True, legend_fontsize: int = 12, tick_size: int = 10) None[source]

Visualizes time series data using line plots.

This function generates line plots to visualize the time series data. It randomly selects a specified number of samples from the input tensor ts and plots each sample as a line plot. If ys is provided, it can be either a 1D or 2D tensor representing the target variable(s), and the function will optionally overlay it on the line plot.

Parameters:
  • ts (tsgm.types.Tensor) – Input time series data tensor.

  • ys (tsgm.types.OptTensor, optional) – Optional target variable(s) tensor, defaults to None.

  • num (int, optional) – Number of samples to visualize, defaults to 5.

  • unite_features (bool, optional) – Whether to plot all features together or separately, defaults to True.

  • legend_fontsize (int, optional) – Font size of the legend, defaults to 12.

  • tick_size (int, optional) – Font size for y-axis ticks, defaults to 10.

visualize_tsne(X: Tensor | ndarray[Any, dtype[ScalarType]], y: Tensor | ndarray[Any, dtype[ScalarType]], X_gen: Tensor | ndarray[Any, dtype[ScalarType]], y_gen: Tensor | ndarray[Any, dtype[ScalarType]], path: str = '/tmp/tsne_embeddings.pdf', feature_averaging: bool = False, perplexity=30.0) None[source]

Visualizes t-SNE embeddings of real and synthetic data.

This function generates a scatter plot of t-SNE embeddings for real and synthetic data. Each data point is represented by a marker on the plot, and the colors of the markers correspond to the corresponding class labels of the data points.

Parameters:
  • X (tsgm.types.Tensor) – The original real data tensor of shape (num_samples, num_features).

  • y (tsgm.types.Tensor) – The labels of the original real data tensor of shape (num_samples,).

  • X_gen (tsgm.types.Tensor) – The generated synthetic data tensor of shape (num_samples, num_features).

  • y_gen (tsgm.types.Tensor) – The labels of the generated synthetic data tensor of shape (num_samples,).

  • path (str, optional) – The path to save the visualization as a PDF file. Defaults to “/tmp/tsne_embeddings.pdf”.

  • feature_averaging (bool, optional) – Whether to compute the average features for each class. Defaults to False.

visualize_tsne_unlabeled(X: Tensor | ndarray[Any, dtype[ScalarType]], X_gen: Tensor | ndarray[Any, dtype[ScalarType]], palette: dict = {'gen': 'blue', 'hist': 'red'}, alpha: float = 0.25, path: str = '/tmp/tsne_embeddings.pdf', fontsize: int = 20, markerscale: int = 3, markersize: int = 1, feature_averaging: bool = False, perplexity: float = 30.0) None[source]

Visualizes t-SNE embeddings of unlabeled data.

Parameters:
  • X (tsgm.types.Tensor) – The original data tensor of shape (num_samples, num_features).

  • X_gen (tsgm.types.Tensor) – The generated data tensor of shape (num_samples, num_features).

  • palette (dict, optional) – A dictionary mapping class labels to colors. Defaults to DEFAULT_PALETTE_TSNE.

  • alpha (float, optional) – The transparency level of the plotted points. Defaults to 0.25.

  • path (str, optional) – The path to save the visualization as a PDF file. Defaults to “/tmp/tsne_embeddings.pdf”.

  • fontsize (int, optional) – The font size of the class labels in the legend. Defaults to 20.

  • markerscale (int, optional) – The scaling factor for the size of the markers in the legend. Defaults to 3.

  • markersize (int, optional) – The size of the markers in the scatter plot. Defaults to 1.

  • feature_averaging (bool, optional) – Whether to compute the average features for each class. Defaults to False.

Monitors

class GANMonitor(num_samples: int, latent_dim: int, labels: Tensor | ndarray[Any, dtype[ScalarType]], save: bool = True, save_path: str | None = None, mode: str = 'clf')[source]

GANMonitor is a Keras callback for monitoring and visualizing generated samples during training.

Parameters:
  • num_samples (int) – The number of samples to generate and visualize.

  • latent_dim (int) – The dimensionality of the latent space.

  • labels (tsgm.types.Tensor) – Labels for the samples to generate.

  • save (bool) – Whether to save the generated samples. Defaults to True.

  • save_path (str) – The path to save the generated samples. Defaults to None.

  • mode (str) – Either ‘clf’ or ‘reg’. Defaults to ‘clf’.

Raises:

ValueError – If the mode is not one of [‘clf’, ‘reg’]

Note:

If save is True and save_path is not specified, the default save path is “/tmp/”.

Warning:

If save_path is specified but save is False, a warning is issued.

on_epoch_end(epoch: int, logs: Dict | None = None) None[source]

Callback function called at the end of each training epoch.

Parameters:
  • epoch (int) – Current epoch number.

  • logs (dict) – Dictionary containing the training loss values.

class VAEMonitor(num_samples: int = 6, latent_dim: int = 128, output_dim: int = 2, save: bool = True, save_path: str | None = None)[source]

VAEMonitor is a Keras callback for monitoring and visualizing generated samples from a Variational Autoencoder (VAE) during training.

Parameters:
  • num_samples (int) – The number of samples to generate and visualize. Defaults to 6.

  • latent_dim (int) – The dimensionality of the latent space. Defaults to 128.

  • output_dim (int) – The dimensionality of the output space. Defaults to 2.

  • save (bool) – Whether to save the generated samples. Defaults to True.

  • save_path (str) – The path to save the generated samples. Defaults to None.

Raises:

ValueError – If output_dim is less than or equal to 0.

Note:

If save is True and save_path is not specified, the default save path is “/tmp/”.

Warning:

If save_path is specified but save is False, a warning is issued.

on_epoch_end(epoch: int, logs: Dict | None = None) None[source]

Callback function called at the end of each training epoch.

Parameters:
  • epoch (int) – The current epoch number.

  • logs (dict) – Dictionary containing the training loss values.

Zoo

class Architecture[source]
abstract property arch_type[source]
class BaseClassificationArchitecture(seq_len: int, feat_dim: int, output_dim: int)[source]

Base class for classification architectures.

Parameters:
  • seq_len (int) – Length of input sequences.

  • feat_dim (int) – Dimensionality of input features.

  • output_dim (int) – Dimensionality of the output.

arch_type = 'downstream:classification'[source]
get() Dict[source]

Returns a dictionary containing the model.

Returns:

A dictionary containing the model.

Return type:

dict

property model: Model[source]

Property to access the underlying Keras model.

Returns:

The Keras model.

Return type:

keras.models.Model

class BaseDenoisingArchitecture(seq_len: int, feat_dim: int, n_filters: int = 64, n_conv_layers: int = 3, **kwargs)[source]

Base class for denoising architectures in DDPM (Denoising Diffusion Probabilistic Models, tsgm.models.ddpm).

Attributes:

  • arch_type – A string indicating the type of architecture, set to “ddpm:denoising”.

  • _seq_len – The length of the input sequences.

  • _feat_dim – The dimensionality of the input features.

  • _n_filters – The number of filters used in the convolutional layers.

  • _n_conv_layers – The number of convolutional layers in the model.

  • _model – The Keras model instance built using the _build_model method.

Initializes the BaseDenoisingArchitecture with the specified parameters.

Args:

  • seq_len (int) – The length of the input sequences.

  • feat_dim (int) – The dimensionality of the input features.

  • n_filters (int, optional) – The number of filters for convolutional layers. Default is 64.

  • n_conv_layers (int, optional) – The number of convolutional layers. Default is 3.

  • **kwargs – Additional keyword arguments to be passed to the parent class Architecture.

arch_type = 'ddpm:denoising'[source]
get() Dict[source]

Returns a dictionary containing the model.

Returns:

A dictionary containing the model.

Return type:

dict

property model: Model[source]

Provides access to the Keras model instance.

Returns:

keras.models.Model: The Keras model instance built by _build_model.

class BaseGANArchitecture[source]

Base class for defining architectures of Generative Adversarial Networks (GANs).

property discriminator: Model[source]

Property for accessing the discriminator model.

Returns:

The discriminator model.

Return type:

keras.models.Model

Raises:

NotImplementedError – If the discriminator model is not found.

property generator: Model[source]

Property for accessing the generator model.

Returns:

The generator model.

Return type:

keras.models.Model

Raises:

NotImplementedError – If the generator model is not implemented.

get() Dict[source]

Retrieves both discriminator and generator models as a dictionary.

Returns:

A dictionary containing discriminator and generator models.

Return type:

dict

Raises:

NotImplementedError – If either discriminator or generator models are not implemented.

class BaseVAEArchitecture[source]

Base class for defining architectures of Variational Autoencoders (VAEs).

property decoder: Model[source]

Property for accessing the decoder model.

Returns:

The decoder model.

Return type:

keras.models.Model

Raises:

NotImplementedError – If the decoder model is not implemented.

property encoder: Model[source]

Property for accessing the encoder model.

Returns:

The encoder model.

Return type:

keras.models.Model

Raises:

NotImplementedError – If the encoder model is not implemented.

get() Dict[source]

Retrieves both encoder and decoder models as a dictionary.

Returns:

A dictionary containing encoder and decoder models.

Return type:

dict

Raises:

NotImplementedError – If either encoder or decoder models are not implemented.

class BasicRecurrentArchitecture(hidden_dim: int, output_dim: int, n_layers: int, network_type: str, name: str = 'Sequential')[source]

Base class for recurrent neural network architectures.

Inherits from Architecture.

Parameters:
  • hidden_dim (int) – The number of units (e.g. 24).

  • output_dim (int) – The number of output units (e.g. 1).

  • n_layers (int) – The number of layers (e.g. 3).

  • network_type (str) – One of ‘gru’, ‘lstm’, or ‘lstmLN’.

  • name (str) – Model name (default is “Sequential”).

arch_type = 'rnn_architecture'[source]
build(activation: str = 'sigmoid', return_sequences: bool = True) Model[source]

Builds the recurrent neural network model.

Parameters:
  • activation (str) – Activation function for the output layer (default is ‘sigmoid’).

  • return_sequences (bool) – Whether to return the full sequence of outputs (default is True).

Returns:

The built Keras model.

Return type:

keras.models.Model

class BlockClfArchitecture(seq_len: int, feat_dim: int, output_dim: int, blocks: list)[source]

Architecture for classification using a sequence of blocks.

Inherits from BaseClassificationArchitecture.

Initializes the BlockClfArchitecture.

Parameters:
  • seq_len (int) – Length of input sequences.

  • feat_dim (int) – Dimensionality of input features.

  • output_dim (int) – Dimensionality of the output.

  • blocks (list) – List of blocks used in the architecture.

arch_type = 'downstream:classification'[source]
class ConvnArchitecture(seq_len: int, feat_dim: int, output_dim: int, n_conv_blocks: int = 1)[source]

Convolutional neural network architecture for classification. Inherits from BaseClassificationArchitecture.

Initializes the convolutional neural network architecture.

Parameters:
  • seq_len (int) – Length of input sequences.

  • feat_dim (int) – Dimensionality of input features.

  • output_dim (int) – Dimensionality of the output.

  • n_conv_blocks (int, optional) – Number of convolutional blocks to use (default is 1).

class ConvnLSTMnArchitecture(seq_len: int, feat_dim: int, output_dim: int, n_conv_lstm_blocks: int = 1)[source]

Initializes the ConvnLSTMnArchitecture.

Parameters:
  • seq_len (int) – Length of input sequences.

  • feat_dim (int) – Dimensionality of input features.

  • output_dim (int) – Dimensionality of the output.

  • n_conv_lstm_blocks (int, optional) – Number of convolutional LSTM blocks to use (default is 1).

class DDPMConvDenoiser(**kwargs)[source]

A convolutional denoising model for DDPM.

This class defines a convolutional neural network architecture used as a denoiser in DDPM. It predicts the noise added to the input samples during the diffusion process.

Attributes:

arch_type: A string indicating the architecture type, set to “ddpm:denoiser”.

Initializes the DDPMConvDenoiser model with additional parameters.

Args:

**kwargs: Additional keyword arguments to be passed to the parent class.

arch_type = 'ddpm:denoiser'[source]
class Sampling(*args, **kwargs)[source]

Custom Keras layer for sampling from a latent space.

This layer samples from a latent space using the reparameterization trick during training. It takes as input the mean and log variance of the latent distribution and generates samples by adding random noise scaled by the standard deviation to the mean.

call(inputs: Tuple[Tensor | ndarray[Any, dtype[ScalarType]], Tensor | ndarray[Any, dtype[ScalarType]]]) Tensor | ndarray[Any, dtype[ScalarType]][source]

Generates samples from a latent space.

Parameters:

inputs (tuple[tsgm.types.Tensor, tsgm.types.Tensor]) – Tuple containing mean and log variance tensors of the latent distribution.

Returns:

Sampled latent vector.

Return type:

tsgm.types.Tensor
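
The reparameterization trick described above amounts to z = mean + exp(0.5 * log_var) * epsilon, with epsilon drawn from a standard normal distribution. A minimal pure-Python sketch follows; the function name is hypothetical, and the actual layer operates on batched tensors.

```python
import math
import random


def sample_latent_sketch(z_mean, z_log_var, rng):
    """Draw one latent vector: mean plus standard deviation times standard normal noise."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(z_mean, z_log_var)
    ]


rng = random.Random(0)
# Unit variance (log_var = 0): samples scatter around the mean.
z = sample_latent_sketch([1.0, -2.0], [0.0, 0.0], rng)
# Extremely small variance: the standard deviation underflows to 0, so z equals the mean.
collapsed = sample_latent_sketch([1.0, -2.0], [-2000.0, -2000.0], rng)
```

Because the randomness is isolated in epsilon, gradients can flow through the mean and log variance during training.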

class TimeEmbedding(*args, **kwargs)[source]
call(inputs: Tensor | ndarray[Any, dtype[ScalarType]]) Tensor | ndarray[Any, dtype[ScalarType]][source]

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Args:
inputs: Input tensor, or dict/list/tuple of input tensors.

The first positional inputs argument is subject to special rules:

  • inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

  • NumPy array or Python scalar values in inputs get cast as tensors.

  • Keras mask metadata is only collected from inputs.

  • Layers are built (build(input_shape) method) using shape info from inputs only.

  • input_spec compatibility is only checked against inputs.

  • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

  • The SavedModel input specification is generated using inputs only.

  • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

*args: Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

**kwargs: Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

  • training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

  • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns:

A tensor or list/tuple of tensors.

class TransformerClfArchitecture(seq_len: int, feat_dim: int, num_heads: int = 2, ff_dim: int = 64, n_blocks: int = 1, dropout_rate=0.5, output_dim: int = 2)[source]

Base class for transformer architectures.

Inherits from BaseClassificationArchitecture.

Initializes the TransformerClfArchitecture.

Parameters:
  • seq_len (int) – Length of input sequences.

  • feat_dim (int) – Dimensionality of input features.

  • num_heads (int) – Number of attention heads (default is 2).

  • ff_dim (int) – Feed forward dimension in the attention block (default is 64).

  • dropout_rate (float, optional) – Dropout probability (default is 0.5).

  • n_blocks (int, optional) – Number of transformer blocks (default is 1).

  • output_dim (int, optional) – Dimensionality of the output, i.e. the number of classes (default is 2).

arch_type = 'downstream:classification'[source]
transformer_block(inputs)[source]
class VAE_CONV5Architecture(seq_len: int, feat_dim: int, latent_dim: int)[source]

This class defines the architecture for a Variational Autoencoder (VAE) with Convolutional Layers.

Initializes the VAE_CONV5Architecture.

Parameters:
  • seq_len (int) – Length of input sequences.

  • feat_dim (int) – Dimensionality of input features.

  • latent_dim (int) – Dimensionality of latent space.

arch_type = 'vae:unconditional'[source]
class WaveGANArchitecture(seq_len: int, feat_dim: int = 64, latent_dim: int = 32, output_dim: int = 1, kernel_size: int = 32, phase_rad: int = 2, use_batchnorm: bool = False)[source]

WaveGAN architecture, from https://arxiv.org/abs/1802.04208

Inherits from BaseGANArchitecture.

Initializes the WaveGANArchitecture.

Parameters:
  • seq_len (int) – Length of input sequences.

  • feat_dim (int) – Dimensionality of input features.

  • latent_dim (int) – Dimensionality of the latent space.

  • output_dim (int) – Dimensionality of the output.

  • kernel_size (int, optional) – Sizes of convolutions

  • phase_rad (int, optional) – Phase shuffle radius for wavegan (default is 2)

  • use_batchnorm (bool, optional) – Whether to use batchnorm (default is False)

arch_type = 'gan:raw'[source]
class Zoo(*arg, **kwargs)[source]

A collection of architectures. It supports the Python dict API.

Initializes the Zoo.

summary() None[source]

Prints a summary of architectures in the Zoo.

class cGAN_Conv4Architecture(seq_len: int, feat_dim: int, latent_dim: int, output_dim: int)[source]

Architecture for Conditional Generative Adversarial Network (cGAN) with Convolutional Layers.

Initializes the cGAN_Conv4Architecture.

Parameters:
  • seq_len (int) – Length of input sequence.

  • feat_dim (int) – Dimensionality of input features.

  • latent_dim (int) – Dimensionality of latent space.

  • output_dim (int) – Dimensionality of output.

arch_type = 'gan:conditional'[source]
class cGAN_LSTMConv3Architecture(seq_len: int, feat_dim: int, latent_dim: int, output_dim: int)[source]

Architecture for Conditional Generative Adversarial Network (cGAN) with LSTM and Convolutional Layers.

Initializes the cGAN_LSTMConv3Architecture.

Parameters:
  • seq_len (int) – Length of input sequence.

  • feat_dim (int) – Dimensionality of input features.

  • latent_dim (int) – Dimensionality of latent space.

  • output_dim (int) – Dimensionality of output.

arch_type = 'gan:conditional'[source]
class cGAN_LSTMnArchitecture(seq_len: int, feat_dim: int, latent_dim: int, output_dim: int, n_blocks: int = 1, output_activation: str = 'tanh')[source]

Conditional Generative Adversarial Network (cGAN) with LSTM-based architecture.

Inherits from BaseGANArchitecture.

Initializes the cGAN_LSTMnArchitecture.

Parameters:
  • seq_len (int) – Length of input sequences.

  • feat_dim (int) – Dimensionality of input features.

  • latent_dim (int) – Dimensionality of the latent space.

  • output_dim (int) – Dimensionality of the output.

  • n_blocks (int, optional) – Number of LSTM blocks in the architecture (default is 1).

  • output_activation (str, optional) – Activation function for the output layer (default is “tanh”).

arch_type = 'gan:conditional'[source]
class cVAE_CONV5Architecture(seq_len: int, feat_dim: int, latent_dim: int, output_dim: int = 2)[source]
arch_type = 'vae:conditional'[source]
class tcGAN_Conv4Architecture(seq_len: int, feat_dim: int, latent_dim: int, output_dim: int)[source]

Architecture for Temporal Conditional Generative Adversarial Network (tcGAN) with Convolutional Layers.

Initializes the tcGAN_Conv4Architecture.

Parameters:
  • seq_len (int) – Length of input sequence.

  • feat_dim (int) – Dimensionality of input features.

  • latent_dim (int) – Dimensionality of latent space.

  • output_dim (int) – Dimensionality of output.

arch_type = 'gan:t-conditional'[source]

Simulators

class BaseSimulator[source]

Abstract base class for simulators. This class defines the interface for simulators.

Methods

generate(num_samples: int, *args) -> tsgm.dataset.Dataset

Generate a dataset with the specified number of samples.

dump(path: str, format: str = "csv") -> None

Save the generated dataset to a file in the specified format.

abstract dump(path: str, format: str = 'csv') None[source]

Abstract method to save the generated dataset to a file.

Parameters

path : str

The file path where the dataset will be saved.

format : str, optional

The format in which to save the dataset, by default “csv”.

abstract generate(num_samples: int, *args) Dataset[source]

Abstract method to generate a dataset.

Parameters

num_samples : int

Number of samples to generate.

*args

Additional arguments to be passed to the method.

Returns

tsgm.dataset.Dataset

The generated dataset.
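
The generate/dump interface described above can be sketched with Python's abc module. This is only an illustration under stated assumptions, not tsgm's implementation: SimulatorInterface and CountingSimulator are hypothetical names, and the sketch returns a plain list of rows, whereas the documented generate returns a tsgm.dataset.Dataset.

```python
import csv
from abc import ABC, abstractmethod


class SimulatorInterface(ABC):
    """Hypothetical stand-in that mirrors the generate/dump interface."""

    @abstractmethod
    def generate(self, num_samples: int, *args):
        """Return a dataset with num_samples samples."""

    @abstractmethod
    def dump(self, path: str, format: str = "csv") -> None:
        """Save the most recently generated dataset to path."""


class CountingSimulator(SimulatorInterface):
    """Illustrative subclass: sample i is the row [i, 2 * i]."""

    def __init__(self):
        self._rows = []

    def generate(self, num_samples: int, *args):
        self._rows = [[i, 2 * i] for i in range(num_samples)]
        return self._rows

    def dump(self, path: str, format: str = "csv") -> None:
        # Only CSV is handled in this sketch.
        if format != "csv":
            raise ValueError(f"Unsupported format: {format}")
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows(self._rows)
```

Because both methods are marked abstract, instantiating SimulatorInterface directly raises TypeError; only subclasses that override generate and dump can be created.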

class LotkaVolterraSimulator(data: DatasetProperties, alpha: float = 1, beta: float = 1, gamma: float = 1, delta: float = 1, x0: float = 1, y0: float = 1)[source]

Simulates the Lotka-Volterra equations, which model the dynamics of biological systems in which two species interact, one as a predator and the other as prey.

For the details refer to https://en.wikipedia.org/wiki/Lotka%E2%80%93Volterra_equations

Initializes the Lotka-Volterra simulator with given parameters.

Args:

  • data (tsgm.dataset.DatasetProperties) – The dataset properties.

  • alpha (float) – The maximum prey per capita growth rate. Default is 1.

  • beta (float) – The effect of the presence of predators on the prey death rate. Default is 1.

  • gamma (float) – The predator’s per capita death rate. Default is 1.

  • delta (float) – The effect of the presence of prey on the predator’s growth rate. Default is 1.

  • x0 (float) – The initial population density of prey. Default is 1.

  • y0 (float) – The initial population density of predator. Default is 1.

clone() LotkaVolterraSimulator[source]

Creates a deep copy of the current LotkaVolterraSimulator instance.

Returns:

LotkaVolterraSimulator: A new instance of LotkaVolterraSimulator with copied data and parameters.

generate(num_samples: int, tmax: float = 1)[source]

Generates the simulation data based on the Lotka-Volterra equations.

Args:

  • num_samples (int) – The number of sample points to generate.

  • tmax (float) – The maximum time value for the simulation. Default is 1.

Returns:

np.ndarray: An array containing the population densities of prey and predators over time.
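
As an illustration of what such a simulation computes, the sketch below integrates the Lotka-Volterra equations dx/dt = alpha*x - beta*x*y and dy/dt = delta*x*y - gamma*y with a fixed-step fourth-order Runge-Kutta scheme. The function names and the choice of solver are assumptions made for this example; the library may use a different integrator and returns an np.ndarray rather than a list of tuples.

```python
import math


def lotka_volterra_rk4(num_samples, tmax=1.0, alpha=1.0, beta=1.0,
                       gamma=1.0, delta=1.0, x0=1.0, y0=1.0):
    """Return (prey, predator) pairs at num_samples evenly spaced times in [0, tmax]."""
    def deriv(x, y):
        return alpha * x - beta * x * y, delta * x * y - gamma * y

    if num_samples < 1:
        return []
    # Step size between consecutive sample points (0 if only one point).
    h = tmax / (num_samples - 1) if num_samples > 1 else 0.0
    x, y = x0, y0
    out = [(x, y)]
    for _ in range(num_samples - 1):
        k1x, k1y = deriv(x, y)
        k2x, k2y = deriv(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
        k3x, k3y = deriv(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
        k4x, k4y = deriv(x + h * k3x, y + h * k3y)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        out.append((x, y))
    return out


def invariant(x, y, alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
    """Quantity conserved along exact Lotka-Volterra trajectories."""
    return delta * x - gamma * math.log(x) + beta * y - alpha * math.log(y)
```

With every parameter left at 1, the initial state (x0, y0) = (1, 1) coincides with the equilibrium x = gamma/delta, y = alpha/beta, so the trajectory is constant. Changing x0 or y0 (for example x0=2.0) produces the familiar oscillations, and invariant can be used to check that the numerical solution stays on a closed orbit.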

set_params(alpha, beta, gamma, delta, x0, y0, **kwargs)[source]

Sets the parameters for the simulator.

Args:

  • alpha (float) – The maximum prey per capita growth rate.

  • beta (float) – The effect of the presence of predators on the prey death rate.

  • gamma (float) – The predator’s per capita death rate.

  • delta (float) – The effect of the presence of prey on the predator’s growth rate.

  • x0 (float) – The initial population density of prey.

  • y0 (float) – The initial population density of predator.

  • **kwargs – Arbitrary keyword arguments for setting simulator parameters.

class ModelBasedSimulator(data: DatasetProperties)[source]

A simulator that is based on a model. This class extends the Simulator class and provides additional methods for handling model parameters.

Methods

params() -> T.Dict[str, T.Any]

Get a dictionary of the simulator’s parameters.

set_params(params: T.Dict[str, T.Any]) -> None

Set the simulator’s parameters from a dictionary.

generate(num_samples: int, *args) -> None

Generate a dataset with the specified number of samples.

Initialize the ModelBasedSimulator with dataset properties.

Parameters

data : tsgm.dataset.DatasetProperties

Properties of the dataset to be used.

abstract generate(num_samples: int, *args) None[source]

Abstract method to generate a dataset. Must be implemented by subclasses.

Parameters

num_samples : int

Number of samples to generate.

*args

Additional arguments to be passed to the method.

Raises

NotImplementedError

This method is not implemented in this class and must be overridden by subclasses.

params() Dict[str, Any][source]

Get a dictionary of the simulator’s parameters.

Returns

dict

A dictionary containing the simulator’s parameters.

set_params(params: Dict[str, Any]) None[source]

Set the simulator’s parameters from a dictionary.

Parameters

params : dict

A dictionary containing the parameters to set.
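
The params/set_params pair amounts to a dictionary round trip. The sketch below shows one way such a pair could behave; ParamHolder and its rate and offset attributes are hypothetical, and treating every instance attribute as a parameter is an assumption of this example rather than documented library behaviour.

```python
from typing import Any, Dict


class ParamHolder:
    """Hypothetical class whose instance attributes act as parameters."""

    def __init__(self, rate: float = 1.0, offset: float = 0.0):
        self.rate = rate
        self.offset = offset

    def params(self) -> Dict[str, Any]:
        # Return a copy so callers cannot mutate internal state by accident.
        return dict(vars(self))

    def set_params(self, params: Dict[str, Any]) -> None:
        # Assign each entry of the dictionary as an attribute.
        for name, value in params.items():
            setattr(self, name, value)
```

A typical use is to read the current parameters, modify the returned dictionary, and pass it back through set_params.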

class NNSimulator(data: DatasetProperties, driver: Any | None = None)[source]

Initialize the Simulator with dataset properties and an optional model.

Parameters

data : tsgm.dataset.DatasetProperties

Properties of the dataset to be used.

driver : Optional[tsgm.types.Model], optional

The model to be used for generating data, by default None.

clone() NNSimulator[source]

Create a deep copy of the simulator.

Returns

NNSimulator

A deep copy of the current simulator instance.

class PredictiveMaintenanceSimulator(data: DatasetProperties)[source]

Predictive Maintenance Simulator class that extends the ModelBasedSimulator base class. The simulator is based on https://github.com/AaltoPML/human-in-the-loop-predictive-maintenance From publication: Nikitin, Alexander, and Samuel Kaski. “Human-in-the-loop large-scale predictive maintenance of workstations.” Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022.

Attributes:

  • CAT_FEATURES (list) – List of categorical feature indices.

  • encoders (dict) – Dictionary of OneHotEncoders for categorical features.

Methods:

  • __init__(data) – Initializes the simulator with dataset properties and sets encoders.

  • S(lmbd, t) – Calculates the survival curve.

  • R(rho, lmbd, t) – Calculates the recovery curve parameter.

  • set_params(**kwargs) – Sets the parameters for the simulator.

  • mixture_function(a, x) – Calculates the mixture function.

  • sample_equipment(num_samples) – Samples equipment data and generates the dataset.

  • generate(num_samples) – Generates the predictive maintenance dataset.

  • clone() -> PredictiveMaintenanceSimulator – Creates and returns a deep copy of the current simulator.

Initializes the PredictiveMaintenanceSimulator with dataset properties and sets encoders for categorical features.

Args:

data (tsgm.dataset.DatasetProperties): Dataset properties for the simulator.

CAT_FEATURES = [0, 1, 2, 3, 4, 5, 6, 7][source]
R(rho, lmbd, t)[source]

Calculates the recovery curve parameter.

Args:

  • rho – Rho parameter for the recovery function.

  • lmbd – Lambda parameter for the exponential distribution.

  • t – Time variable.

Returns:

float: Recovery curve parameter at time t.

S(lmbd, t)[source]

Calculates the survival curve.

Args:

  • lmbd – Lambda parameter for the exponential distribution.

  • t – Time variable.

Returns:

float: Survival probability at time t.
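
For reference, the survival function of an exponential distribution with rate lmbd is exp(-lmbd * t). The sketch below evaluates that textbook formula; it assumes S corresponds to this standard form, which the description above suggests but does not state explicitly, and exponential_survival is a name chosen for this example.

```python
import math


def exponential_survival(lmbd: float, t: float) -> float:
    """P(T > t) for T ~ Exponential(rate=lmbd), i.e. exp(-lmbd * t)."""
    if lmbd < 0 or t < 0:
        raise ValueError("lmbd and t must be non-negative")
    return math.exp(-lmbd * t)
```

The value starts at 1 for t = 0 and decays monotonically towards 0; a larger lmbd means faster decay, that is, earlier expected failures.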

clone() PredictiveMaintenanceSimulator[source]

Creates a deep copy of the current PredictiveMaintenanceSimulator instance.

Returns:

PredictiveMaintenanceSimulator: A new instance of PredictiveMaintenanceSimulator with copied data and parameters.

generate(num_samples: int)[source]

Generates the predictive maintenance dataset.

Args:

num_samples (int): Number of samples to generate.

Returns:

tuple: A tuple containing the dataset and equipment information.

mixture_function(a, x)[source]

Calculates the mixture function.

Args:

  • a – Mixture parameter.

  • x – Input variable.

Returns:

float: Mixture function value.

sample_equipment(num_samples)[source]

Samples equipment data and generates the dataset.

Args:

num_samples (int): Number of samples to generate.

Returns:

tuple: A tuple containing the dataset and equipment information.

set_params(**kwargs)[source]

Sets the parameters for the simulator.

Args:

**kwargs: Arbitrary keyword arguments for setting simulator parameters.

class Simulator(data: DatasetProperties, driver: Any | None = None)[source]

Concrete class for a basic simulator. This class implements fitting the underlying model (fit) and copying the simulator (clone); generate and dump are not implemented in this class and raise NotImplementedError.

Attributes

_data : tsgm.dataset.DatasetProperties

Properties of the dataset to be used by the simulator.

_driver : Optional[tsgm.types.Model]

The model to be used for generating data.

Initialize the Simulator with dataset properties and an optional model.

Parameters

data : tsgm.dataset.DatasetProperties

Properties of the dataset to be used.

driver : Optional[tsgm.types.Model], optional

The model to be used for generating data, by default None.

clone() Simulator[source]

Create a deep copy of the simulator.

Returns

Simulator

A deep copy of the current simulator instance.
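
The point of a deep copy is that the clone shares no mutable state with the original. The sketch below illustrates this with copy.deepcopy; CloneableSimulator is a hypothetical stand-in, and the dictionary and list used for data and driver are placeholders for the real DatasetProperties and model objects.

```python
import copy


class CloneableSimulator:
    """Hypothetical simulator holding dataset properties and a driver."""

    def __init__(self, data, driver=None):
        self._data = data
        self._driver = driver

    def clone(self):
        # Deep-copy both fields so later changes to the clone
        # do not leak back into the original instance.
        return CloneableSimulator(copy.deepcopy(self._data),
                                  copy.deepcopy(self._driver))
```

Mutating the clone's data afterwards leaves the original untouched, which is what makes clones safe to reconfigure independently.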

dump(path: str, format: str = 'csv') None[source]

Method to save the generated dataset to a file. Not implemented in this class.

Parameters

path : str

The file path where the dataset will be saved.

format : str, optional

The format in which to save the dataset, by default “csv”.

Raises

NotImplementedError

This method is not implemented in this class.

fit(**kwargs) None[source]

Fit the model using the dataset properties.

Parameters

**kwargs

Additional keyword arguments to pass to the model’s fit method.

generate(num_samples: int, *args) Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic[source]

Method to generate a dataset. Not implemented in this class.

Parameters

num_samples : int

Number of samples to generate.

*args

Additional arguments to be passed to the method.

Returns

TensorLike

The generated dataset.

Raises

NotImplementedError

This method is not implemented in this class.

class SineConstSimulator(data: DatasetProperties, max_scale: float = 10.0, max_const: float = 5.0)[source]

Sine and Constant Function Simulator class that extends the ModelBasedSimulator base class.

Attributes:

  • _scale – TensorFlow probability distribution for scaling factor.

  • _const – TensorFlow probability distribution for constant.

  • _shift – TensorFlow probability distribution for shift.

Methods:

  • __init__(data, max_scale=10.0, max_const=5.0) – Initializes the simulator with dataset properties and optional parameters.

  • set_params(max_scale, max_const, *args, **kwargs) – Sets the parameters for scale, constant, and shift distributions.

  • generate(num_samples, *args) -> tsgm.dataset.Dataset – Generates a dataset based on sine and constant functions.

  • clone() -> SineConstSimulator – Creates and returns a deep copy of the current simulator.

Initializes the SineConstSimulator with dataset properties and optional maximum scale and constant values.

Args:

  • data (tsgm.dataset.DatasetProperties) – Dataset properties for the simulator.

  • max_scale (float, optional) – Maximum value for the scale parameter. Defaults to 10.0.

  • max_const (float, optional) – Maximum value for the constant parameter. Defaults to 5.0.

clone() SineConstSimulator[source]

Creates a deep copy of the current SineConstSimulator instance.

Returns:

SineConstSimulator: A new instance of SineConstSimulator with copied data and parameters.

generate(num_samples: int, *args) Dataset[source]

Generates a dataset based on sine and constant functions.

Args:

num_samples (int): Number of samples to generate.

Returns:

tsgm.dataset.Dataset: A dataset containing generated samples.
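
To give a feel for the kind of data a sine-versus-constant generator produces, here is a pure-Python sketch. Everything in it is an assumption made for illustration: the function name, the seq_len and seed arguments, the alternating labels, and the uniform sampling ranges are not taken from the library, which uses TensorFlow Probability distributions and returns a tsgm.dataset.Dataset.

```python
import math
import random


def sine_const_samples(num_samples, seq_len=8, max_scale=10.0,
                       max_const=5.0, seed=0):
    """Return (X, y): label 0 marks a scaled, shifted sine; label 1 a constant."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(num_samples):
        label = i % 2  # alternate the two classes (illustrative choice)
        if label == 0:
            scale = rng.uniform(0.0, max_scale)
            shift = rng.uniform(0.0, 2.0 * math.pi)
            series = [scale * math.sin(t + shift) for t in range(seq_len)]
        else:
            const = rng.uniform(0.0, max_const)
            series = [const] * seq_len
        X.append(series)
        y.append(label)
    return X, y
```

Sine samples are bounded in absolute value by max_scale, while constant samples hold a single value between 0 and max_const for the whole sequence.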

set_params(max_scale: float, max_const: float, *args, **kwargs)[source]

Sets the parameters for scale, constant, and shift distributions.

Args:

  • max_scale (float) – Maximum value for the scale parameter.

  • max_const (float) – Maximum value for the constant parameter.

Data Processing Utils

class TSFeatureWiseScaler(feature_range: Tuple[float, float] = (0, 1))[source]

Scales time series data feature-wise.

Parameters:

feature_range : tuple(float, float), optional

Tuple representing the minimum and maximum feature values (default is (0, 1)).

Attributes:

_min_v : float

Minimum feature value.

_max_v : float

Maximum feature value.

Initializes a new instance of the TSFeatureWiseScaler class.

Parameters:
  • feature_range (tuple(float, float), optional) – Tuple representing the minimum and maximum feature values, defaults to (0, 1).

fit(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) TSFeatureWiseScaler[source]

Fits the scaler to the data.

Parameters:

X (TensorLike) – Input data.

Returns:

The fitted scaler object.

Return type:

TSFeatureWiseScaler

fit_transform(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic[source]

Fits the scaler to the data and transforms it.

Parameters:

X (TensorLike) – Input data.

Returns:

Scaled input data X

Return type:

TensorLike

inverse_transform(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic[source]

Inverse-transforms the data.

Parameters:

X (TensorLike) – Input data.

Returns:

Original data.

Return type:

TensorLike

transform(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic[source]

Transforms the data.

Parameters:

X (TensorLike) – Input data.

Returns:

Scaled X.

Return type:

TensorLike
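
The idea of feature-wise scaling can be sketched in plain Python as per-feature min-max scaling. The sketch assumes data laid out as [sample][time step][feature] and assumes that the minimum and maximum of each feature are taken over all samples and time steps; the function names are hypothetical and this is not the library's implementation.

```python
def fit_feature_wise(X):
    """Return per-feature (mins, maxs) over all samples and time steps."""
    n_features = len(X[0][0])
    mins = [min(step[f] for sample in X for step in sample)
            for f in range(n_features)]
    maxs = [max(step[f] for sample in X for step in sample)
            for f in range(n_features)]
    return mins, maxs


def transform_feature_wise(X, mins, maxs, feature_range=(0.0, 1.0)):
    """Map each feature linearly from [min, max] onto feature_range."""
    lo, hi = feature_range

    def scale(v, mn, mx):
        if mx == mn:  # constant feature: map everything to the lower bound
            return lo
        return lo + (hi - lo) * (v - mn) / (mx - mn)

    return [[[scale(v, mins[f], maxs[f]) for f, v in enumerate(step)]
             for step in sample] for sample in X]


def inverse_feature_wise(X_scaled, mins, maxs, feature_range=(0.0, 1.0)):
    """Undo transform_feature_wise for non-constant features."""
    lo, hi = feature_range
    return [[[mins[f] + (v - lo) * (maxs[f] - mins[f]) / (hi - lo)
              for f, v in enumerate(step)]
             for step in sample] for sample in X_scaled]
```

Each feature is scaled independently, so features with very different magnitudes all end up spanning the same target range.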

class TSGlobalScaler[source]

Scales time series data globally.

Attributes:

min : float

Minimum value encountered in the data.

max : float

Maximum value encountered in the data.

fit(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) TSGlobalScaler[source]

Fits the scaler to the data.

Parameters:

X (TensorLike) – Input data.

Returns:

The fitted scaler object.

Return type:

TSGlobalScaler

fit_transform(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic[source]

Fits the scaler to the data and transforms it.

Parameters:

X (TensorLike) – Input data.

Returns:

Scaled input data X

Return type:

TensorLike

inverse_transform(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic[source]

Inverse-transforms the data.

Parameters:

X (TensorLike) – Input data.

Returns:

Original data.

Return type:

TensorLike

transform(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic[source]

Transforms the data.

Parameters:

X (TensorLike) – Input data.

Returns:

Scaled X.

Return type:

TensorLike
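
Global scaling, by contrast, uses a single minimum and maximum for the whole array. The sketch below assumes the same [sample][time step][feature] layout and assumes the target range is [0, 1], which the description above does not state; the function names are hypothetical and this is not the library's code.

```python
def fit_global(X):
    """Return the single (min, max) pair over every value in X."""
    flat = [v for sample in X for step in sample for v in step]
    return min(flat), max(flat)


def transform_global(X, mn, mx):
    """Scale all values with the same min and max (assumes mx > mn)."""
    return [[[(v - mn) / (mx - mn) for v in step] for step in sample]
            for sample in X]


def inverse_global(X_scaled, mn, mx):
    """Map scaled values back to the original range."""
    return [[[mn + v * (mx - mn) for v in step] for step in sample]
            for sample in X_scaled]
```

Because one pair of bounds is shared by all features, relative magnitudes between features are preserved, unlike in feature-wise scaling.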