TSGM

Datasets

class UCRDataManager(path: str = '/home/docs/checkouts/readthedocs.org/user_builds/tsgm/envs/latest/lib/python3.8/site-packages/tsgm-0.0.5-py3.8.egg/tsgm/utils/../../data', ds: str = 'gunpoint')[source]

A manager for the UCR collection of time series datasets.

Parameters:
  • path (str) – a relative path to the stored UCR dataset.

  • ds (str) – Name of the dataset. Should be in (beef | coffee | ecg200 | freezer | gunpoint | insect | mixed_shapes | starlight).

Raises:

ValueError – When there is no stored UCR archive, or the name of the dataset is incorrect.

default_path = '/home/docs/checkouts/readthedocs.org/user_builds/tsgm/envs/latest/lib/python3.8/site-packages/tsgm-0.0.5-py3.8.egg/tsgm/utils/../../data'
get() Tuple[TensorLike, TensorLike, TensorLike, TensorLike][source]

Returns a tuple containing training and testing data.

Returns:

A tuple (X_train, y_train, X_test, y_test).

Return type:

tuple[TensorLike, TensorLike, TensorLike, TensorLike]

get_classes_distribution() Dict[source]

Returns a dictionary with the fraction of occurrences for each class.

Returns:

A dictionary containing the fraction of occurrences for each class.

Return type:

dict[Any, float]
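
The fractions described above can be pictured with a small standalone sketch. This is not tsgm's implementation: the helper name class_fractions is hypothetical, and it assumes the labels form a flat sequence of hashable values.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def class_fractions(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Return the fraction of occurrences of each class label (hypothetical helper)."""
    counts = Counter(labels)
    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {label: count / total for label, count in counts.items()}
```

The returned fractions sum to one, which makes the dictionary convenient for spotting class imbalance before augmentation.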

key = 'someone'
mirrors = ['https://www.cs.ucr.edu/~eamonn/time_series_data_2018/']
resources = [('UCRArchive_2018.zip', 0)]
summary() None[source]

Prints a summary of the dataset.

Augmentations

class BaseAugmenter(per_feature: bool)[source]
generate(X: TensorLike, y: TensorLike | None = None, n_samples: int = 1) TensorLike | Tuple[TensorLike, TensorLike][source]
class BaseCompose(augmentations: List[BaseAugmenter])[source]
class DTWBarycentricAveraging[source]
DTW Barycenter Averaging (DBA) [1] method, estimated through the Expectation-Maximization algorithm [2], as in tslearn (https://github.com/tslearn-team/tslearn/).

References

generate(X: TensorLike, y: TensorLike | None = None, n_samples: int = 1, num_initial_samples: int | None = None, initial_timeseries: List[TensorLike] | None = None, initial_labels: List[int] | None = None, **kwargs) TensorLike | Tuple[TensorLike, TensorLike][source]

Parameters:
  • X (TensorLike) – The time series dataset.

  • y (TensorLike or None) – The classes.

  • n_samples (int) – Number of samples to generate (per class, if y is given).

  • num_initial_samples (int or None) – The number of time series to draw (per class) from the dataset before computing DTW_BA. If None, use the entire set (per class). Default is None.

  • initial_timeseries (array or None) – Initial time series to start from for the optimization process, with shape (original_size, d). If y is given, the shape of initial_timeseries is assumed to be (n_classes, original_size, d). Default is None.

  • initial_labels (array or None) – Labels for samples from initial_timeseries. Default is None.

Returns:

np.array of shape (n_samples, original_size, d) if y is None, or (n_classes * n_samples, original_size, d) otherwise, and np.array of labels (or None).
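
DBA averages time series under dynamic time warping (DTW) alignments. The sketch below shows only the DTW distance that such alignments minimize, not the barycenter averaging itself. It assumes univariate series and an absolute-difference local cost; the function name dtw_distance is hypothetical and is not part of tsgm or tslearn.

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic O(len(a) * len(b)) dynamic-programming DTW with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost: List[List[float]] = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because warping may repeat points, a series and a time-stretched copy of it can have zero DTW distance even though their lengths differ.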

class GaussianNoise(per_feature: bool = True)[source]

Apply noise to the input time series.

Args:

variance ((float, float) or float): Variance range for noise. If variance is a single float, the range will be (0, variance). Default: (10.0, 50.0).

mean (float): Mean of the noise. Default: 0.

per_feature (bool): If set to True, noise will be sampled for each feature independently. Otherwise, the noise will be sampled once for all features. Default: True.

generate(X: TensorLike, y: TensorLike | None = None, n_samples: int = 1, mean: float = 0, variance: float = 1.0) TensorLike | Tuple[TensorLike, TensorLike][source]

Generate synthetic data with Gaussian noise.

Parameters:
  • X (TensorLike) – Input data tensor of shape (n_data, n_timesteps, n_features).

  • y (Optional[TensorLike]) – Optional labels tensor. If provided, labels will also be returned

  • n_samples (int) – Number of augmented samples to generate. Default is 1.

  • mean (float) – The mean of the noise. Default is 0.

  • variance (float) – The variance of the noise. Default is 1.0.

Returns:

Augmented data tensor of shape (n_samples, n_timesteps, n_features) and optionally augmented labels if ‘y’ is provided.

Return type:

Union[TensorLike, Tuple[TensorLike, TensorLike]]
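
As an illustration of the idea, the following NumPy sketch adds Gaussian noise to randomly drawn series. It is a simplified stand-in rather than tsgm's code: the function name, the seed argument, and the choice to draw base series with replacement are assumptions made for this example.

```python
import numpy as np


def gaussian_noise_sketch(X, n_samples=1, mean=0.0, variance=1.0, per_feature=True, seed=None):
    """Draw n_samples series from X (with replacement) and add Gaussian noise."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)  # expected shape: (n_data, n_timesteps, n_features)
    base = X[rng.integers(0, X.shape[0], size=n_samples)]
    # per_feature=True: independent noise for every feature.
    # per_feature=False: one noise value per timestep, broadcast across all features.
    noise_shape = base.shape if per_feature else base.shape[:2] + (1,)
    noise = rng.normal(loc=mean, scale=np.sqrt(variance), size=noise_shape)
    return base + noise
```

Note that the noise scale passed to the generator is the standard deviation, hence the square root of the variance.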

class MagnitudeWarping[source]

Magnitude warping changes the magnitude of each sample by convolving the data window with a smooth curve varying around one. See https://dl.acm.org/doi/pdf/10.1145/3136755.3136817

generate(X: TensorLike, y: TensorLike | None = None, n_samples: int = 1, sigma: float = 0.2, n_knots: int = 4) TensorLike | Tuple[TensorLike, TensorLike][source]

Generates augmented samples via MagnitudeWarping for (X, y)

Parameters:
  • X (TensorLike) – Input data tensor of shape (n_data, n_timesteps, n_features).

  • y (Optional[TensorLike]) – Optional labels tensor. If provided, labels will also be returned

  • n_samples (int) – Number of augmented samples to generate. Default is 1.

  • sigma (float) – Standard deviation for the random warping. Default is 0.2.

  • n_knots (int) – Number of knots used for warping curve. Default is 4.

Returns:

Augmented data tensor of shape (n_samples, n_timesteps, n_features) and optionally augmented labels if ‘y’ is provided.

Return type:

Union[TensorLike, Tuple[TensorLike, TensorLike]]
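
A rough sketch of magnitude warping for a single series is given below. It is not tsgm's implementation: the function name and seed argument are hypothetical, the use of n_knots + 2 knot points is an assumption, and the smooth curve is approximated by linear interpolation between random knots (a cubic spline would be smoother).

```python
import numpy as np


def magnitude_warp_sketch(x, sigma=0.2, n_knots=4, seed=None):
    """Scale a series of shape (n_timesteps, n_features) by a random curve around one."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n_timesteps, n_features = x.shape
    # Knots are spread evenly over time; their values are drawn around 1.0.
    knot_positions = np.linspace(0, n_timesteps - 1, n_knots + 2)
    knot_values = rng.normal(loc=1.0, scale=sigma, size=(n_knots + 2, n_features))
    t = np.arange(n_timesteps)
    # One warping curve per feature, interpolated to every timestep.
    curve = np.column_stack(
        [np.interp(t, knot_positions, knot_values[:, f]) for f in range(n_features)]
    )
    return x * curve
```

With sigma set to zero every knot equals one, so the series is returned unchanged; larger sigma values distort the magnitudes more strongly.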

class Shuffle[source]

Shuffles time series features. Shuffling is beneficial when each feature corresponds to interchangeable sensors.

generate(X: TensorLike, y: TensorLike | None = None, n_samples: int = 1) TensorLike | Tuple[TensorLike, TensorLike][source]

Generate synthetic data using Shuffle strategy. Features are randomly shuffled to generate novel samples.

Parameters:
  • X (TensorLike) – Input data tensor of shape (n_data, n_timesteps, n_features).

  • y (Optional[TensorLike]) – Optional labels tensor. If provided, labels will also be returned

  • n_samples (int) – Number of augmented samples to generate. Default is 1.

Returns:

Augmented data tensor of shape (n_samples, n_timesteps, n_features) and optionally augmented labels if ‘y’ is provided.

Return type:

Union[TensorLike, Tuple[TensorLike, TensorLike]]
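
The feature shuffling described above can be sketched in a few lines of NumPy. The function name and seed argument are hypothetical, and drawing the base series at random is an assumption of this example rather than a statement about tsgm's behavior.

```python
import numpy as np


def shuffle_features_sketch(X, n_samples=1, seed=None):
    """Create n_samples series by permuting the feature axis of randomly drawn series."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X)  # expected shape: (n_data, n_timesteps, n_features)
    samples = []
    for _ in range(n_samples):
        series = X[rng.integers(0, X.shape[0])]
        # Reorder the feature columns; the time axis is left untouched.
        samples.append(series[:, rng.permutation(X.shape[2])])
    return np.stack(samples)
```

Each generated series contains exactly the same feature columns as its source, only in a different order, which is why the method suits interchangeable sensors.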

class SliceAndShuffle(per_feature: bool = False)[source]

Slice the time series into k pieces and create a new time series by shuffling them.

Args:

per_feature (bool): If set to True, each feature is sliced independently. Otherwise, all features are sliced in the same way. Default: False.

generate(X: TensorLike, y: TensorLike | None = None, n_samples: int = 1, n_segments: int = 2) TensorLike | Tuple[TensorLike, TensorLike][source]

Generate synthetic data using Slice-And-Shuffle strategy. Slices are randomly selected.

Parameters:
  • X (TensorLike) – Input data tensor of shape (n_data, n_timesteps, n_features).

  • y (Optional[TensorLike]) – Optional labels tensor. If provided, labels will also be returned

  • n_segments (int) – The number of slices, default is 2.

  • n_samples (int) – Number of augmented samples to generate. Default is 1.

Returns:

Augmented data tensor of shape (n_samples, n_timesteps, n_features) and optionally augmented labels if ‘y’ is provided.

Return type:

Union[TensorLike, Tuple[TensorLike, TensorLike]]
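
The slice-and-shuffle strategy can be illustrated for a single series as follows. This sketch slices all features in the same way and picks cut points uniformly at random; both choices, as well as the function name and seed argument, are assumptions and not taken from tsgm.

```python
import numpy as np


def slice_and_shuffle_sketch(x, n_segments=2, seed=None):
    """Cut a series of shape (n_timesteps, n_features) into segments and reorder them."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    n_timesteps = x.shape[0]
    if n_segments > n_timesteps:
        raise ValueError("n_segments must not exceed the number of timesteps")
    # Pick distinct interior cut points so that every segment is non-empty.
    cuts = np.sort(rng.choice(np.arange(1, n_timesteps), size=n_segments - 1, replace=False))
    segments = np.split(x, cuts, axis=0)
    order = rng.permutation(n_segments)
    return np.concatenate([segments[i] for i in order], axis=0)
```

The output has the same length and the same set of timesteps as the input; only the order of the segments changes.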

class WindowWarping[source]

https://halshs.archives-ouvertes.fr/halshs-01357973/document

generate(X: TensorLike, y: TensorLike | None = None, window_ratio: float = 0.2, scales: Tuple = (0.25, 1.0), n_samples: int = 1) TensorLike | Tuple[TensorLike, TensorLike][source]

Generates augmented samples via WindowWarping for (X, y).

Parameters:
  • X (TensorLike) – Input data tensor of shape (n_data, n_timesteps, n_features).

  • y (Optional[TensorLike]) – Optional labels tensor. If provided, labels will also be returned

  • window_ratio (float) – The ratio of the window size relative to the total number of timesteps. Default is 0.2.

  • scales (tuple) – A tuple specifying the scale range for warping. Default is (0.25, 1.0).

  • n_samples (int) – Number of augmented samples to generate. Default is 1.

Returns:

Augmented data tensor of shape (n_samples, n_timesteps, n_features) and optionally augmented labels if ‘y’ is provided.

Return type:

Union[TensorLike, Tuple[TensorLike, TensorLike]]
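
One possible reading of window warping is sketched below for a single series: a random window is stretched or compressed in time, and the result is resampled back to the original length. The function name, the seed argument, picking the scale from the given tuple, and the use of linear interpolation are all assumptions made for illustration, not tsgm's actual procedure.

```python
import numpy as np


def _resample(segment, length):
    """Linearly resample a (n, n_features) segment to the given number of timesteps."""
    old = np.linspace(0.0, 1.0, segment.shape[0])
    new = np.linspace(0.0, 1.0, length)
    return np.column_stack(
        [np.interp(new, old, segment[:, f]) for f in range(segment.shape[1])]
    )


def window_warp_sketch(x, window_ratio=0.2, scales=(0.25, 1.0), seed=None):
    """Warp a random window of a (n_timesteps, n_features) series along the time axis."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n_timesteps = x.shape[0]
    window = min(n_timesteps, max(2, int(round(window_ratio * n_timesteps))))
    start = int(rng.integers(0, n_timesteps - window + 1))
    scale = float(rng.choice(scales))
    warped_length = max(2, int(round(window * scale)))
    warped = np.concatenate(
        [x[:start], _resample(x[start:start + window], warped_length), x[start + window:]],
        axis=0,
    )
    # Bring the series back to its original length.
    return _resample(warped, n_timesteps)
```

The first and last values of the series are preserved, while the timing of events inside and around the window shifts.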

Metrics

class BaseDownstreamEvaluator[source]
evaluate(*args, **kwargs)[source]
class ConsistencyMetric(evaluators: List)[source]

The predictive consistency metric measures whether a set of evaluators yields consistent results on real and synthetic data.

Parameters:

evaluators (list) – A list of evaluators (each item should implement method .evaluate(D))

class DemographicParityMetric[source]

Measuring demographic parity between two datasets.

This metric assesses the disparity in the distributions of a target variable among different groups in two datasets. By default, it uses the Kolmogorov-Smirnov statistic to quantify the maximum vertical deviation between the cumulative distribution functions of the target variable for the historical and synthetic data within each group.

Args:

d_hist (tsgm.dataset.DatasetOrTensor): The historical input dataset or tensor.

groups_hist (TensorLike): The group assignments for the historical data.

d_synth (tsgm.dataset.DatasetOrTensor): The synthetic input dataset or tensor.

groups_synth (TensorLike): The group assignments for the synthetic data.

metric (callable, optional): The metric used to compare the target variable distributions within each group. Default is the Kolmogorov-Smirnov statistic.

Returns:

dict: A dictionary mapping each group to the computed demographic parity metric.

Example:
>>> metric = DemographicParityMetric()
>>> dataset_hist = tsgm.dataset.Dataset(...)
>>> dataset_synth = tsgm.dataset.Dataset(...)
>>> groups_hist = [0, 1, 0, 1, 1, 0]
>>> groups_synth = [1, 1, 0, 0, 0, 1]
>>> result = metric(dataset_hist, groups_hist, dataset_synth, groups_synth)
>>> print(result)
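
To make the Kolmogorov-Smirnov comparison concrete, the sketch below computes the two-sample KS statistic per group using only the standard library. It is an illustration under assumptions: the helper names are hypothetical, targets are passed as flat lists, and only groups present in both datasets are compared.

```python
from bisect import bisect_right
from typing import Dict, Hashable, Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Maximum vertical distance between the empirical CDFs of a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    # The supremum is attained at one of the observed values.
    return max(
        abs(bisect_right(a_sorted, v) / len(a_sorted) - bisect_right(b_sorted, v) / len(b_sorted))
        for v in a_sorted + b_sorted
    )


def ks_per_group(y_hist, groups_hist, y_synth, groups_synth) -> Dict[Hashable, float]:
    """Compare target distributions of historical and synthetic data within each group."""
    result = {}
    for group in set(groups_hist) & set(groups_synth):
        hist = [y for y, g in zip(y_hist, groups_hist) if g == group]
        synth = [y for y, g in zip(y_synth, groups_synth) if g == group]
        result[group] = ks_statistic(hist, synth)
    return result
```

A value of 0 means the two empirical distributions coincide within a group, and a value of 1 means they do not overlap at all.
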
class DiscriminativeMetric[source]

The DiscriminativeMetric measures the discriminative performance of a model in distinguishing between synthetic and real datasets.

This metric evaluates a discriminative model by training it on a combination of synthetic and real datasets and assessing its performance on a test set.

Parameters:
  • d_hist (tsgm.dataset.DatasetOrTensor) – Real dataset.

  • d_syn (tsgm.dataset.DatasetOrTensor) – Synthetic dataset.

  • model (T.Callable) – Discriminative model to be evaluated.

  • test_size (T.Union[float, int]) – Proportion of the dataset to include in the test split or the absolute number of test samples.

  • n_epochs (int) – Number of training epochs for the model.

  • metric (T.Optional[T.Callable]) – Optional evaluation metric to use (default: accuracy).

  • random_seed (T.Optional[int]) – Optional random seed for reproducibility.

Returns:

Discriminative performance metric.

Return type:

float

Example:

>>> from my_module import DiscriminativeMetric, MyDiscriminativeModel
>>> import tsgm.dataset
>>> import numpy as np
>>> import sklearn
>>>
>>> # Create real and synthetic datasets
>>> real_dataset = tsgm.dataset.Dataset(...)  # Replace ... with appropriate arguments
>>> synthetic_dataset = tsgm.dataset.Dataset(...)  # Replace ... with appropriate arguments
>>>
>>> # Create a discriminative model
>>> model = MyDiscriminativeModel()  # Replace with the actual discriminative model class
>>>
>>> # Create and use the DiscriminativeMetric
>>> metric = DiscriminativeMetric()
>>> result = metric(real_dataset, synthetic_dataset, model, test_size=0.2, n_epochs=10)
>>> print(result)
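
The train-then-test procedure behind this metric can be sketched without any deep learning framework. In the example below a nearest-centroid classifier stands in for the discriminative model, which is a deliberate simplification; the function name and its arguments are hypothetical and the code does not reproduce tsgm's internals.

```python
import numpy as np


def discriminative_score_sketch(real, synth, test_size=0.2, seed=0):
    """Accuracy of a nearest-centroid classifier at telling real from synthetic series."""
    rng = np.random.default_rng(seed)
    real = np.asarray(real, dtype=float)
    synth = np.asarray(synth, dtype=float)
    # Flatten every series and label it: 1 for real, 0 for synthetic.
    X = np.concatenate([real.reshape(len(real), -1), synth.reshape(len(synth), -1)])
    y = np.concatenate([np.ones(len(real), dtype=int), np.zeros(len(synth), dtype=int)])
    order = rng.permutation(len(X))
    n_test = max(1, int(round(test_size * len(X))))
    test_idx, train_idx = order[:n_test], order[n_test:]
    # "Training" amounts to computing one centroid per class (row 0: synthetic, row 1: real).
    centroids = np.stack([X[train_idx][y[train_idx] == c].mean(axis=0) for c in (0, 1)])
    distances = np.linalg.norm(X[test_idx][:, None, :] - centroids[None, :, :], axis=2)
    predictions = distances.argmin(axis=1)
    return float((predictions == y[test_idx]).mean())
```

An accuracy near 0.5 suggests that the classifier cannot separate synthetic from real data, whereas an accuracy near 1.0 indicates that the synthetic data is easy to identify.
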
class DistanceMetric(statistics: list, discrepancy: Callable)[source]

Metric that measures similarity between synthetic and real time series

Parameters:
  • statistics (list) – A list of summary statistics (callable)

  • discrepancy (Callable) – Discrepancy function, measures the distance between the vectors of summary statistics.

discrepancy(stats1: Tensor | ndarray[Any, dtype[ScalarType]], stats2: Tensor | ndarray[Any, dtype[ScalarType]]) float[source]
Parameters:
  • stats1 (tsgm.types.Tensor) – A vector of summary statistics.

  • stats2 (tsgm.types.Tensor) – A vector of summary statistics.

Returns:

the distance between two vectors calculated by self._discrepancy.

stats(X: Tensor | ndarray[Any, dtype[ScalarType]]) Tensor | ndarray[Any, dtype[ScalarType]][source]
Parameters:

X (tsgm.types.Tensor) – A time series dataset.

Returns:

a tensor with calculated summary statistics.
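
The interplay of summary statistics and a discrepancy function can be sketched as follows. The particular statistics, the Euclidean discrepancy, and all function names are assumptions chosen for illustration; they mirror the roles of stats and discrepancy described above without reproducing tsgm's code.

```python
import numpy as np

# Hypothetical summary statistics: each maps a dataset of shape
# (n_samples, n_timesteps, n_features) to a single number.
STATISTICS = [np.mean, np.std, np.max, np.min]


def stats(X):
    """Evaluate every summary statistic on X and collect the results in a vector."""
    return np.array([statistic(X) for statistic in STATISTICS], dtype=float)


def discrepancy(stats1, stats2):
    """Euclidean distance between two vectors of summary statistics."""
    return float(np.linalg.norm(stats1 - stats2))


def distance(real, synthetic):
    """Distance between two datasets, measured in summary-statistic space."""
    return discrepancy(stats(real), stats(synthetic))
```

Identical datasets yield a distance of zero; how informative larger values are depends entirely on the chosen statistics.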

class DownstreamPerformanceMetric(evaluator: BaseDownstreamEvaluator)[source]

The downstream performance metric evaluates the performance of a model on a downstream task. It returns performance gains achieved with the addition of synthetic data.

Parameters:

evaluator (BaseDownstreamEvaluator) – An evaluator, should implement method .evaluate(D)

class EntropyMetric[source]

Calculates the spectral entropy of a dataset or tensor.

This metric measures the randomness or disorder in a dataset or tensor using spectral entropy, which is a measure of the distribution of energy in the frequency domain.

Args:

d (tsgm.dataset.DatasetOrTensor): The input dataset or tensor.

Returns:

float: The computed spectral entropy.

Example:
>>> metric = EntropyMetric()
>>> dataset = tsgm.dataset.Dataset(...)
>>> result = metric(dataset)
>>> print(result)
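
Spectral entropy can be sketched for a single univariate signal as the Shannon entropy of its normalized power spectrum. The function name and the natural-logarithm convention are assumptions of this example; how tsgm aggregates over samples and features is not shown here.

```python
import numpy as np


def spectral_entropy_sketch(x):
    """Shannon entropy (in nats) of the normalized power spectrum of a 1D signal."""
    power = np.abs(np.fft.rfft(np.asarray(x, dtype=float))) ** 2
    total = power.sum()
    if total == 0.0:
        return 0.0  # a signal without energy carries no spectral information
    p = power / total
    p = p[p > 0.0]  # skip empty bins to avoid log(0)
    return float(-(p * np.log(p)).sum())
```

A pure sinusoid concentrates its energy in one frequency bin and therefore has an entropy close to zero, while white noise spreads energy across all bins and scores much higher.
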
class MMDMetric(kernel: Callable = <function exp_quad_kernel>)[source]

This metric calculates the maximum mean discrepancy (MMD) between real and synthetic samples.

Args:

Two datasets or tensors (tsgm.dataset.DatasetOrTensor): the real and the synthetic data to compare.

Returns:

float: The computed MMD.

Example:
>>> metric = MMDMetric(kernel)
>>> dataset, synth_dataset = tsgm.dataset.Dataset(...), tsgm.dataset.Dataset(...)
>>> result = metric(dataset, synth_dataset)
>>> print(result)
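
For intuition, a biased estimate of the squared MMD with an exponentiated quadratic kernel can be written in a few lines. The kernel parameterization, the lengthscale argument, and the function names are assumptions for this sketch and may differ from tsgm's exp_quad_kernel.

```python
import numpy as np


def exp_quad_kernel_sketch(A, B, lengthscale=1.0):
    """Exponentiated quadratic (RBF) kernel matrix between the rows of A and B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))


def mmd2_sketch(X, Y, kernel=exp_quad_kernel_sketch):
    """Biased estimate of the squared MMD between two sets of time series."""
    # Every series is flattened into one feature vector.
    X = np.asarray(X, dtype=float).reshape(len(X), -1)
    Y = np.asarray(Y, dtype=float).reshape(len(Y), -1)
    return float(kernel(X, X).mean() + kernel(Y, Y).mean() - 2.0 * kernel(X, Y).mean())
```

The estimate is zero when both sets are identical and grows as the two samples drift apart.
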
class Metric[source]
class PrivacyMembershipInferenceMetric(attacker: Any, metric: Callable | None = None)[source]

The metric that measures the possibility of membership inference attacks.

Parameters:
  • attacker (Callable) – An attacker, a one-class classifier (OCC) that implements methods .fit and .predict

  • metric – Measures quality of attacker (precision by default)

GANs

class ConditionalGAN(*args, **kwargs)[source]

Conditional GAN implementation for labeled and temporally labeled time series.

Parameters:
  • discriminator (keras.Model) – A discriminator model which takes a time series as input and checks whether the sample is real or fake.

  • generator (keras.Model) – Takes as input a random noise vector of latent_dim length and returns a simulated time-series.

  • latent_dim (int) – The size of the noise vector.

  • temporal (bool) – Indicates whether the time series is temporally labeled or not.

compile(d_optimizer: OptimizerV2, g_optimizer: OptimizerV2, loss_fn: Callable) None[source]

Compiles the generator and discriminator models.

Parameters:
  • d_optimizer (OptimizerV2) – An optimizer for the GAN’s discriminator.

  • g_optimizer (OptimizerV2) – An optimizer for the GAN’s generator.

  • loss_fn (keras.losses.Loss) – Loss function.

generate(labels: Tensor | ndarray[Any, dtype[ScalarType]]) Tensor | ndarray[Any, dtype[ScalarType]][source]

Generates new data from the model.

Parameters:

labels (tsgm.types.Tensor) – the labels for the samples to be generated.

Returns:

generated samples

Return type:

tsgm.types.Tensor

property metrics: List[source]
Returns:

A list of metrics trackers (e.g., generator’s loss and discriminator’s loss).

Return type:

T.List

train_step(data: Tuple) Dict[str, float][source]

Performs a training step using a batch of data, stored in data.

Parameters:

data (tsgm.types.Tensor) – A batch of data in a format batch_size x seq_len x feat_dim

Returns:

A dictionary with generator (key “g_loss”) and discriminator (key “d_loss”) losses

Return type:

T.Dict[str, float]

class GAN(*args, **kwargs)[source]

GAN implementation for unlabeled time series.

Parameters:
  • discriminator (keras.Model) – A discriminator model which takes a time series as input and checks whether the sample is real or fake.

  • generator (keras.Model) – Takes as input a random noise vector of latent_dim length and returns a simulated time-series.

  • latent_dim (int) – The size of the noise vector.

clone() GAN[source]

Clones GAN object

Returns:

The exact copy of the object

Return type:

“GAN”

compile(d_optimizer: OptimizerV2, g_optimizer: OptimizerV2, loss_fn: Loss) None[source]

Compiles the generator and discriminator models.

Parameters:
  • d_optimizer (OptimizerV2) – An optimizer for the GAN’s discriminator.

  • g_optimizer (OptimizerV2) – An optimizer for the GAN’s generator.

  • loss_fn (keras.losses.Loss) – Loss function.

generate(num: int) Tensor | ndarray[Any, dtype[ScalarType]][source]

Generates new data from the model.

Parameters:

num (int) – the number of samples to be generated.

Returns:

Generated samples

Return type:

tsgm.types.Tensor

property metrics: List[source]
Returns:

A list of metrics trackers (e.g., generator’s loss and discriminator’s loss).

train_step(data: Tensor | ndarray[Any, dtype[ScalarType]]) Dict[str, float][source]

Performs a training step using a batch of data, stored in data.

Parameters:

data (tsgm.types.Tensor) – A batch of data in a format batch_size x seq_len x feat_dim

Returns:

A dictionary with generator (key “g_loss”) and discriminator (key “d_loss”) losses

Return type:

T.Dict[str, float]
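
The two losses reported by a training step can be illustrated with plain binary cross-entropy. The sketch below assumes the discriminator outputs probabilities and averages its loss over real and fake batches; these choices and the function names are assumptions, since the actual loss is whatever loss_fn is passed to compile.

```python
import math
from typing import Dict, Sequence


def binary_cross_entropy(probs: Sequence[float], target: float) -> float:
    """Mean binary cross-entropy of predicted probabilities against a constant target."""
    eps = 1e-7  # keeps the logarithm finite for probabilities of exactly 0 or 1
    return -sum(
        target * math.log(max(p, eps)) + (1.0 - target) * math.log(max(1.0 - p, eps))
        for p in probs
    ) / len(probs)


def gan_losses_sketch(d_real: Sequence[float], d_fake: Sequence[float]) -> Dict[str, float]:
    """Losses from discriminator outputs on real samples (d_real) and generated ones (d_fake)."""
    # The discriminator should output 1 on real data and 0 on generated data.
    d_loss = 0.5 * (binary_cross_entropy(d_real, 1.0) + binary_cross_entropy(d_fake, 0.0))
    # The generator is rewarded when the discriminator labels its samples as real.
    g_loss = binary_cross_entropy(d_fake, 1.0)
    return {"d_loss": d_loss, "g_loss": g_loss}
```

When the discriminator is undecided and outputs 0.5 everywhere, both losses equal log 2; a confident and correct discriminator drives d_loss down and g_loss up.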

VAEs

class BetaVAE(*args, **kwargs)[source]

beta-VAE implementation for unlabeled time series.

Parameters:
  • encoder (keras.Model) – An encoder model which takes a time series as input and maps it to a latent representation.

  • decoder (keras.Model) – Takes as input a random noise vector of latent_dim length and returns a simulated time-series.

  • latent_dim (int) – The size of the noise vector.

call(X: Tensor | ndarray[Any, dtype[ScalarType]]) Tensor | ndarray[Any, dtype[ScalarType]][source]

Encodes and decodes time series dataset X.

Parameters:

X (tsgm.types.Tensor) – A time series dataset.

Returns:

Generated samples

Return type:

tsgm.types.Tensor

generate(n: int) Tensor | ndarray[Any, dtype[ScalarType]][source]

Generates new data from the model.

Parameters:

n (int) – the number of samples to be generated.

Returns:

A tensor with generated samples.

Return type:

tsgm.types.Tensor

property metrics: List[source]
Returns:

A list of metrics trackers (e.g., total loss, reconstruction loss, and KL loss).

train_step(data: Tensor | ndarray[Any, dtype[ScalarType]]) Dict[source]

Performs a training step using a batch of data, stored in data.

Parameters:

data (tsgm.types.Tensor) – A batch of data in a format batch_size x seq_len x feat_dim

Returns:

A dict with losses

Return type:

T.Dict
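
The losses of a beta-VAE step are commonly composed of a reconstruction term and a KL term weighted by beta. The sketch below shows that composition with NumPy; the squared-error reconstruction loss, the dictionary keys, and the function name are assumptions for illustration and are not taken from tsgm's source.

```python
import numpy as np


def beta_vae_loss_sketch(x, x_reconst, z_mean, z_log_var, beta=1.0):
    """Combine reconstruction and KL losses for a batch of shape (batch, seq_len, feat_dim)."""
    x, x_reconst = np.asarray(x, dtype=float), np.asarray(x_reconst, dtype=float)
    z_mean, z_log_var = np.asarray(z_mean, dtype=float), np.asarray(z_log_var, dtype=float)
    # Squared error summed over time and features, averaged over the batch.
    reconstruction_loss = float(np.mean(np.sum((x - x_reconst) ** 2, axis=(1, 2))))
    # Closed-form KL divergence between N(z_mean, exp(z_log_var)) and N(0, I).
    kl_per_sample = -0.5 * np.sum(1.0 + z_log_var - z_mean ** 2 - np.exp(z_log_var), axis=1)
    kl_loss = float(np.mean(kl_per_sample))
    return {
        "loss": reconstruction_loss + beta * kl_loss,
        "reconstruction_loss": reconstruction_loss,
        "kl_loss": kl_loss,
    }
```

Setting beta above one puts more weight on matching the prior, usually at the cost of reconstruction quality.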

class cBetaVAE(*args, **kwargs)[source]
call(data: Tensor | ndarray[Any, dtype[ScalarType]]) Tensor | ndarray[Any, dtype[ScalarType]][source]

Encodes and decodes time series dataset X.

Parameters:

data (tsgm.types.Tensor) – The time series data to encode and decode.

Returns:

Generated samples

Return type:

tsgm.types.Tensor

generate(labels: Tensor | ndarray[Any, dtype[ScalarType]]) Tuple[Tensor | ndarray[Any, dtype[ScalarType]], Tensor | ndarray[Any, dtype[ScalarType]]][source]

Generates new data from the model.

Parameters:

labels (tsgm.types.Tensor) – the labels for the samples to be generated.

Returns:

a tuple of synthetically generated data and labels.

Return type:

T.Tuple[tsgm.types.Tensor, tsgm.types.Tensor]

property metrics: List[source]

Returns the list of loss tracker: [loss, reconstruction_loss, kl_loss].

train_step(data: Tensor | ndarray[Any, dtype[ScalarType]]) Dict[str, float][source]

Performs a training step using a batch of data, stored in data.

Parameters:

data (tsgm.types.Tensor) – A batch of data in a format batch_size x seq_len x feat_dim

Returns:

A dict with losses

Return type:

T.Dict[str, float]

ABC

class ABCAlgorithm[source]

A base class for ABC algorithms.

sample_parameters(n_samples: int) List[source]
class RejectionSampler(simulator: ModelBasedSimulator, data: Dataset, statistics: List, epsilon: float, discrepancy: Callable, priors: Dict | None = None, **kwargs)[source]

Rejection sampling algorithm for approximate Bayesian computation.

Parameters:
  • simulator (class tsgm.simulator.ModelBasedSimulator) – A model based simulator

  • data (class tsgm.dataset.Dataset) – Historical dataset storage

  • statistics (list) – contains a list of summary statistics

  • epsilon (float) – tolerance of synthetically generated data to a set of summary statistics

  • discrepancy (Callable) – discrepancy measure function

  • priors – set of priors for each of the simulator parameters, defaults to DEFAULT_PRIOR

sample_parameters(n_samples: int) List[source]

Samples parameters from the rejection sampler.

Parameters:

n_samples – Number of samples

Returns:

A list of samples. Each sample is represented as a dict.

Return type:

T.List[T.Dict]
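
The rejection scheme can be sketched with the standard library alone: draw parameters from the priors, simulate data, and keep the parameters whenever the summary statistics land within epsilon of the observed ones. Everything in this example, including the function name, the callable-based priors, the toy Gaussian simulator, and the max_trials guard, is hypothetical and independent of tsgm's classes.

```python
import random
import statistics
from typing import Callable, Dict, List


def abc_rejection_sketch(observed, simulate, priors, summary, discrepancy,
                         epsilon, n_samples, seed=0, max_trials=100_000) -> List[Dict]:
    """Return n_samples parameter dicts accepted by ABC rejection sampling."""
    rng = random.Random(seed)
    observed_stat = summary(observed)
    accepted: List[Dict] = []
    for _ in range(max_trials):
        # Draw one candidate value per parameter from its prior.
        params = {name: draw(rng) for name, draw in priors.items()}
        simulated = simulate(params, rng)
        if discrepancy(summary(simulated), observed_stat) <= epsilon:
            accepted.append(params)
            if len(accepted) == n_samples:
                return accepted
    raise RuntimeError("too few samples accepted; consider a larger epsilon")


# Toy problem: recover the mean of a unit-variance Gaussian.
data_rng = random.Random(42)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(200)]
samples = abc_rejection_sketch(
    observed=observed,
    simulate=lambda p, rng: [rng.gauss(p["mean"], 1.0) for _ in range(50)],
    priors={"mean": lambda rng: rng.uniform(-5.0, 5.0)},
    summary=statistics.mean,
    discrepancy=lambda s1, s2: abs(s1 - s2),
    epsilon=0.3,
    n_samples=20,
)
```

A smaller epsilon yields samples closer to the posterior but rejects more candidates, so the tolerance trades accuracy for run time.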

prior_samples(priors: Dict, params: List) Dict[source]

Generate prior samples for the specified parameters.

Parameters:
  • priors (T.Dict) – A dictionary containing probability distributions for each parameter. Keys are parameter names, and values are instances of probability distribution classes. If a parameter is not present in the dictionary, a default prior distribution is used.

  • params (T.List) – A list of parameter names for which prior samples are to be generated.

Returns:

A dictionary where keys are parameter names and values are samples drawn from their respective prior distributions.

Return type:

T.Dict

Example:

priors = {'mean': NormalDistribution(0, 1), 'std_dev': UniformDistribution(0, 2)}
params = ['mean', 'std_dev']
samples = prior_samples(priors, params)

STS

class STS(model: StructuralTimeSeries | None = None)[source]

Class for training and generating from a structural time series model.

Initializes a new instance of the STS class.

Parameters:

model (tfp.sts.StructuralTimeSeriesModel or None) – Structural time series model to use. If None, default model is used.

elbo_loss() float[source]

Returns the evidence lower bound (ELBO) loss from training.

Returns:

The value of the ELBO loss.

Return type:

float

generate(num_samples: int) Tensor | ndarray[Any, dtype[ScalarType]][source]

Generates samples from the trained model.

Parameters:

num_samples (int) – Number of samples to generate.

Returns:

Generated samples.

Return type:

tsgm.types.Tensor

train(ds: Dataset, num_variational_steps: int = 200, steps_forw: int = 10) None[source]

Trains the structural time series model.

Parameters:
  • ds (tsgm.dataset.Dataset) – Dataset containing time series data.

  • num_variational_steps (int) – Number of variational optimization steps, defaults to 200.

  • steps_forw (int) – Number of steps to forecast, defaults to 10.

Visualization

visualize_dataset(dataset: Dataset | Tensor | ndarray[Any, dtype[ScalarType]], obj_id: int = 0, palette: dict = {'gen': 'blue', 'hist': 'red'}, path: str = '/tmp/generated_data.pdf') None[source]

Visualizes a time series dataset with target values.

Parameters:

dataset (tsgm.dataset.DatasetOrTensor) – A time series dataset.

visualize_original_and_reconst_ts(original: Tensor | ndarray[Any, dtype[ScalarType]], reconst: Tensor | ndarray[Any, dtype[ScalarType]], num: int = 5, vmin: int = 0, vmax: int = 1) None[source]

Visualizes original and reconstructed time series data.

This function generates side-by-side visualizations of the original and reconstructed time series data. It randomly selects a specified number of samples from the input tensors original and reconst and displays them as images using imshow.

Parameters:
  • original (tsgm.types.Tensor) – Original time series data tensor.

  • reconst (tsgm.types.Tensor) – Reconstructed time series data tensor.

  • num (int, optional) – Number of samples to visualize, defaults to 5.

  • vmin (int, optional) – Minimum value for colormap normalization, defaults to 0.

  • vmax (int, optional) – Maximum value for colormap normalization, defaults to 1.

visualize_training_loss(loss_vector: Tensor | ndarray[Any, dtype[ScalarType]], labels: tuple = (), path: str = '/tmp/training_loss.pdf') None[source]

Plot training losses as a function of the epochs

Parameters:
  • loss_vector – np.array of shape (number of metrics, number of epochs)

  • labels – list of strings

  • path – str, where to save the plot

visualize_ts(ts: Tensor | ndarray[Any, dtype[ScalarType]], num: int = 5) None[source]

Visualizes time series tensor.

This function generates a plot to visualize time series data. It displays a specified number of time series from the input tensor.

Parameters:
  • ts (tsgm.types.Tensor) – The time series data tensor of shape (num_samples, num_timesteps, num_features).

  • num (int, optional) – The number of time series to display. Defaults to 5.

Raises:

AssertionError: If the input tensor does not have three dimensions.

Example:
>>> visualize_ts(time_series_tensor, num=10)
visualize_ts_lineplot(ts: Tensor | ndarray[Any, dtype[ScalarType]], ys: Tensor | ndarray[Any, dtype[ScalarType]] | None = None, num: int = 5, unite_features: bool = True) None[source]

Visualizes time series data using line plots.

This function generates line plots to visualize the time series data. It randomly selects a specified number of samples from the input tensor ts and plots each sample as a line plot. If ys is provided, it can be either a 1D or 2D tensor representing the target variable(s), and the function will optionally overlay it on the line plot.

Parameters:
  • ts (tsgm.types.Tensor) – Input time series data tensor.

  • ys (tsgm.types.OptTensor, optional) – Optional target variable(s) tensor, defaults to None.

  • num (int, optional) – Number of samples to visualize, defaults to 5.

  • unite_features (bool, optional) – Whether to plot all features together or separately, defaults to True.

visualize_tsne(X: Tensor | ndarray[Any, dtype[ScalarType]], y: Tensor | ndarray[Any, dtype[ScalarType]], X_gen: Tensor | ndarray[Any, dtype[ScalarType]], y_gen: Tensor | ndarray[Any, dtype[ScalarType]], path: str = '/tmp/tsne_embeddings.pdf', feature_averaging: bool = False, perplexity=30.0) None[source]

Visualizes t-SNE embeddings of real and synthetic data.

This function generates a scatter plot of t-SNE embeddings for real and synthetic data. Each data point is represented by a marker on the plot, and the colors of the markers correspond to the corresponding class labels of the data points.

Parameters:
  • X (tsgm.types.Tensor) – The original real data tensor of shape (num_samples, num_features).

  • y (tsgm.types.Tensor) – The labels of the original real data tensor of shape (num_samples,).

  • X_gen (tsgm.types.Tensor) – The generated synthetic data tensor of shape (num_samples, num_features).

  • y_gen (tsgm.types.Tensor) – The labels of the generated synthetic data tensor of shape (num_samples,).

  • path (str, optional) – The path to save the visualization as a PDF file. Defaults to “/tmp/tsne_embeddings.pdf”.

  • feature_averaging (bool, optional) – Whether to compute the average features for each class. Defaults to False.

visualize_tsne_unlabeled(X: Tensor | ndarray[Any, dtype[ScalarType]], X_gen: Tensor | ndarray[Any, dtype[ScalarType]], palette: dict = {'gen': 'blue', 'hist': 'red'}, alpha: float = 0.25, path: str = '/tmp/tsne_embeddings.pdf', fontsize: int = 20, markerscale: int = 3, markersize: int = 1, feature_averaging: bool = False, perplexity: float = 30.0) None[source]

Visualizes t-SNE embeddings of unlabeled data.

Parameters:
  • X (tsgm.types.Tensor) – The original data tensor of shape (num_samples, num_features).

  • X_gen (tsgm.types.Tensor) – The generated data tensor of shape (num_samples, num_features).

  • palette (dict, optional) – A dictionary mapping class labels to colors. Defaults to DEFAULT_PALETTE_TSNE.

  • alpha (float, optional) – The transparency level of the plotted points. Defaults to 0.25.

  • path (str, optional) – The path to save the visualization as a PDF file. Defaults to “/tmp/tsne_embeddings.pdf”.

  • fontsize (int, optional) – The font size of the class labels in the legend. Defaults to 20.

  • markerscale (int, optional) – The scaling factor for the size of the markers in the legend. Defaults to 3.

  • markersize (int, optional) – The size of the markers in the scatter plot. Defaults to 1.

  • feature_averaging (bool, optional) – Whether to compute the average features for each class. Defaults to False.

  • perplexity (float, optional) – The perplexity value used for t-SNE. Defaults to 30.0.

Monitors

class GANMonitor(num_samples: int, latent_dim: int, labels: Tensor | ndarray[Any, dtype[ScalarType]], save: bool = True, save_path: str | None = None, mode: str = 'clf')[source]

GANMonitor is a Keras callback for monitoring and visualizing generated samples during training.

Parameters:
  • num_samples (int) – The number of samples to generate and visualize.

  • latent_dim (int) – The dimensionality of the latent space.

  • labels (tsgm.types.Tensor) – The labels to use for the generated samples.

  • save (bool) – Whether to save the generated samples. Defaults to True.

  • save_path (str) – The path to save the generated samples. Defaults to None.

  • mode (str) – The mode of the monitor, either ‘clf’ or ‘reg’. Defaults to ‘clf’.

Raises:

ValueError – If the mode is not one of [‘clf’, ‘reg’]

Note:

If save is True and save_path is not specified, the default save path is “/tmp/”.

Warning:

If save_path is specified but save is False, a warning is issued.

on_epoch_end(epoch: int, logs: Dict | None = None) None[source]

Callback function called at the end of each training epoch.

Parameters:
  • epoch (int) – Current epoch number.

  • logs (dict) – Dictionary containing the training loss values.

class VAEMonitor(num_samples: int = 6, latent_dim: int = 128, output_dim: int = 2, save: bool = True, save_path: str | None = None)[source]

VAEMonitor is a Keras callback for monitoring and visualizing generated samples from a Variational Autoencoder (VAE) during training.

Parameters:
  • num_samples (int) – The number of samples to generate and visualize. Defaults to 6.

  • latent_dim (int) – The dimensionality of the latent space. Defaults to 128.

  • output_dim (int) – The dimensionality of the output space. Defaults to 2.

  • save (bool) – Whether to save the generated samples. Defaults to True.

  • save_path (str) – The path to save the generated samples. Defaults to None.

Raises:

ValueError – If output_dim is less than or equal to 0.

Note:

If save is True and save_path is not specified, the default save path is “/tmp/”.

Warning:

If save_path is specified but save is False, a warning is issued.

on_epoch_end(epoch: int, logs: Dict | None = None) None[source]

Callback function called at the end of each training epoch.

Parameters:
  • epoch (int) – The current epoch number.

  • logs (dict) – Dictionary containing the training loss values.

Zoo

class Architecture[source]
abstract property arch_type[source]
class BaseClassificationArchitecture(seq_len: int, feat_dim: int, output_dim: int)[source]

Base class for classification architectures.

Parameters:
  • seq_len (int) – Length of input sequences.

  • feat_dim (int) – Dimensionality of input features.

  • output_dim (int) – Dimensionality of the output.

Initializes the base classification architecture.

arch_type = 'downstream:classification'[source]
get() Dict[source]

Returns a dictionary containing the model.

Returns:

A dictionary containing the model.

Return type:

dict

property model: Model[source]

Property to access the underlying Keras model.

Returns:

The Keras model.

Return type:

keras.models.Model

class BaseGANArchitecture[source]

Base class for defining architectures of Generative Adversarial Networks (GANs).

property discriminator: Model[source]

Property for accessing the discriminator model.

Returns:

The discriminator model.

Return type:

keras.models.Model

Raises:

NotImplementedError – If the discriminator model is not found.

property generator: Model[source]

Property for accessing the generator model.

Returns:

The generator model.

Return type:

keras.models.Model

Raises:

NotImplementedError – If the generator model is not implemented.

get() Dict[source]

Retrieves both discriminator and generator models as a dictionary.

Returns:

A dictionary containing discriminator and generator models.

Return type:

dict

Raises:

NotImplementedError – If either discriminator or generator models are not implemented.

class BaseVAEArchitecture[source]

Base class for defining architectures of Variational Autoencoders (VAEs).

property decoder: Model[source]

Property for accessing the decoder model.

Returns:

The decoder model.

Return type:

keras.models.Model

Raises:

NotImplementedError – If the decoder model is not implemented.

property encoder: Model[source]

Property for accessing the encoder model.

Returns:

The encoder model.

Return type:

keras.models.Model

Raises:

NotImplementedError – If the encoder model is not implemented.

get() Dict[source]

Retrieves both encoder and decoder models as a dictionary.

Returns:

A dictionary containing encoder and decoder models.

Return type:

dict

Raises:

NotImplementedError – If either encoder or decoder models are not implemented.
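
The contract described above (properties that raise NotImplementedError until a subclass provides the models, and a get() method that returns both as a dictionary) can be sketched in plain Python. This is only an illustration of the pattern, not the library's implementation: the class names, the `_encoder`/`_decoder` attributes, and the dictionary keys are assumptions, and strings stand in for Keras models.

```python
class SketchVAEArchitecture:
    """Hypothetical base class mirroring the encoder/decoder/get() contract."""

    @property
    def encoder(self):
        if hasattr(self, "_encoder"):
            return self._encoder
        raise NotImplementedError("encoder is not implemented")

    @property
    def decoder(self):
        if hasattr(self, "_decoder"):
            return self._decoder
        raise NotImplementedError("decoder is not implemented")

    def get(self):
        # Accessing either property raises NotImplementedError if it is missing.
        return {"encoder": self.encoder, "decoder": self.decoder}


class ToyVAE(SketchVAEArchitecture):
    def __init__(self):
        # Placeholder strings stand in for Keras models in this sketch.
        self._encoder = "encoder-model"
        self._decoder = "decoder-model"


print(ToyVAE().get())
```

Calling get() on the bare base class raises NotImplementedError, which is the behaviour the Raises entries describe.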

class BasicRecurrentArchitecture(hidden_dim: int, output_dim: int, n_layers: int, network_type: str, name: str = 'Sequential')[source]

Base class for basic recurrent neural network architectures.

Inherits from Architecture.

Parameters:
  • hidden_dim (int) – The number of units (e.g. 24).

  • output_dim (int) – The number of output units (e.g. 1).

  • n_layers (int) – The number of layers (e.g. 3).

  • network_type (str) – One of ‘gru’, ‘lstm’, or ‘lstmLN’.

  • name (str, optional) – The model name. Defaults to “Sequential”.

arch_type = 'rnn_architecture'[source]
build(activation: str = 'sigmoid', return_sequences: bool = True) Model[source]

Builds the recurrent neural network model.

Parameters:
  • activation (str) – Activation function for the output layer (default is ‘sigmoid’).

  • return_sequences (bool) – Whether to return the full sequence of outputs (default is True).

Returns:

The built Keras model.

Return type:

keras.models.Model

class BlockClfArchitecture(seq_len: int, feat_dim: int, output_dim: int, blocks: list)[source]

Architecture for classification using a sequence of blocks.

Inherits from BaseClassificationArchitecture.

Initializes the BlockClfArchitecture.

Parameters:
  • seq_len (int) – Length of input sequences.

  • feat_dim (int) – Dimensionality of input features.

  • output_dim (int) – Dimensionality of the output.

  • blocks (list) – List of blocks used in the architecture.

arch_type = 'downstream:classification'[source]
class ConvnArchitecture(seq_len: int, feat_dim: int, output_dim: int, n_conv_blocks: int = 1)[source]

Convolutional neural network architecture for classification. Inherits from BaseClassificationArchitecture.

Initializes the convolutional neural network architecture.

Parameters:
  • seq_len (int) – Length of input sequences.

  • feat_dim (int) – Dimensionality of input features.

  • output_dim (int) – Dimensionality of the output.

  • n_conv_blocks (int, optional) – Number of convolutional blocks to use (default is 1).

class ConvnLSTMnArchitecture(seq_len: int, feat_dim: int, output_dim: int, n_conv_lstm_blocks: int = 1)[source]

Initializes the base classification architecture.

Parameters:
  • seq_len (int) – Length of input sequences.

  • feat_dim (int) – Dimensionality of input features.

  • output_dim (int) – Dimensionality of the output.

  • n_conv_lstm_blocks (int, optional) – Number of convolutional LSTM blocks to use (default is 1).

class Sampling(*args, **kwargs)[source]

Custom Keras layer for sampling from a latent space.

This layer samples from a latent space using the reparameterization trick during training. It takes as input the mean and log variance of the latent distribution and generates samples by adding random noise scaled by the standard deviation to the mean.

call(inputs: Tuple[Tensor | ndarray[Any, dtype[ScalarType]], Tensor | ndarray[Any, dtype[ScalarType]]]) Tensor | ndarray[Any, dtype[ScalarType]][source]

Generates samples from a latent space.

Parameters:

inputs (tuple[tsgm.types.Tensor, tsgm.types.Tensor]) – Tuple containing mean and log variance tensors of the latent distribution.

Returns:

Sampled latent vector.

Return type:

tsgm.types.Tensor
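
The reparameterization step described for this layer can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the layer's actual code: the function name `sample_latent` and the use of the standard-library `random` module are hypothetical, and the Keras layer operates on tensors rather than lists.

```python
import math
import random


def sample_latent(z_mean, z_log_var, rng=None):
    """Draw z = mean + exp(0.5 * log_var) * eps with eps ~ N(0, 1), element-wise."""
    rng = rng or random.Random()
    return [
        mean + math.exp(0.5 * log_var) * rng.gauss(0.0, 1.0)
        for mean, log_var in zip(z_mean, z_log_var)
    ]


# Hypothetical 3-dimensional latent distribution.
z = sample_latent([0.0, 1.0, -1.0], [0.0, -2.0, 0.5], rng=random.Random(0))
print(z)
```

Because the noise is drawn separately and only scaled and shifted by the predicted statistics, gradients can flow through the mean and log variance during training.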

class VAE_CONV5Architecture(seq_len: int, feat_dim: int, latent_dim: int)[source]

This class defines the architecture for a Variational Autoencoder (VAE) with Convolutional Layers.

Initializes the VAE_CONV5Architecture.

Parameters:
  • seq_len (int) – Length of input sequences.

  • feat_dim (int) – Dimensionality of input features.

  • latent_dim (int) – Dimensionality of latent space.

arch_type = 'vae:unconditional'[source]
class Zoo(*arg, **kwargs)[source]

A collection of architectures. It supports the Python dict API.

Initializes the Zoo.

summary() None[source]

Prints a summary of architectures in the Zoo.
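
A dict-like registry with a summary() method can be sketched as a small dict subclass. This is a hedged illustration only: `MiniZoo`, the toy architecture classes, and the summary format are hypothetical and do not reproduce the actual Zoo contents or output.

```python
class MiniZoo(dict):
    """Hypothetical registry: a dict from architecture names to classes."""

    def summary(self):
        """Print one line per registered architecture."""
        for name, arch in self.items():
            print(f"{name}: {arch.__name__} ({arch.arch_type})")


class ToyVAE:
    arch_type = "vae:unconditional"


class ToyCGAN:
    arch_type = "gan:conditional"


zoo = MiniZoo({"toy_vae": ToyVAE, "toy_cgan": ToyCGAN})
zoo.summary()
arch_cls = zoo["toy_vae"]  # ordinary dict lookup
```

Subclassing dict gives key lookup, iteration, and membership tests without extra code, which is what supporting the dict API amounts to in this sketch.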

class cGAN_Conv4Architecture(seq_len: int, feat_dim: int, latent_dim: int, output_dim: int)[source]

Architecture for Conditional Generative Adversarial Network (cGAN) with Convolutional Layers.

Initializes the cGAN_Conv4Architecture.

Parameters:
  • seq_len (int) – Length of input sequence.

  • feat_dim (int) – Dimensionality of input features.

  • latent_dim (int) – Dimensionality of latent space.

  • output_dim (int) – Dimensionality of output.

arch_type = 'gan:conditional'[source]
class cGAN_LSTMConv3Architecture(seq_len: int, feat_dim: int, latent_dim: int, output_dim: int)[source]

Architecture for Conditional Generative Adversarial Network (cGAN) with LSTM and Convolutional Layers.

Initializes the cGAN_LSTMConv3Architecture.

Parameters:
  • seq_len (int) – Length of input sequence.

  • feat_dim (int) – Dimensionality of input features.

  • latent_dim (int) – Dimensionality of latent space.

  • output_dim (int) – Dimensionality of output.

arch_type = 'gan:conditional'[source]
class cGAN_LSTMnArchitecture(seq_len: int, feat_dim: int, latent_dim: int, output_dim: int, n_blocks: int = 1, output_activation: str = 'tanh')[source]

Conditional Generative Adversarial Network (cGAN) with LSTM-based architecture.

Inherits from BaseGANArchitecture.

Initializes the cGAN_LSTMnArchitecture.

Parameters:
  • seq_len (int) – Length of input sequences.

  • feat_dim (int) – Dimensionality of input features.

  • latent_dim (int) – Dimensionality of the latent space.

  • output_dim (int) – Dimensionality of the output.

  • n_blocks (int, optional) – Number of LSTM blocks in the architecture (default is 1).

  • output_activation (str, optional) – Activation function for the output layer (default is “tanh”).

arch_type = 'gan:conditional'[source]
class cVAE_CONV5Architecture(seq_len: int, feat_dim: int, latent_dim: int, output_dim: int = 2)[source]
arch_type = 'vae:conditional'[source]
class tcGAN_Conv4Architecture(seq_len: int, feat_dim: int, latent_dim: int, output_dim: int)[source]

Architecture for Temporal Conditional Generative Adversarial Network (tcGAN) with Convolutional Layers.

Initializes the tcGAN_Conv4Architecture.

Parameters:
  • seq_len (int) – Length of input sequence.

  • feat_dim (int) – Dimensionality of input features.

  • latent_dim (int) – Dimensionality of latent space.

  • output_dim (int) – Dimensionality of output.

arch_type = 'gan:t-conditional'[source]

Datasets

class UCRDataManager(path: str = '/home/docs/checkouts/readthedocs.org/user_builds/tsgm/envs/latest/lib/python3.8/site-packages/tsgm-0.0.5-py3.8.egg/tsgm/utils/../../data', ds: str = 'gunpoint')[source]

A manager for UCR collection of time series datasets.

Parameters:
  • path (str) – A relative path to the stored UCR dataset.

  • ds (str) – Name of the dataset. Should be in (beef | coffee | ecg200 | freezer | gunpoint | insect | mixed_shapes | starlight).

Raises:

ValueError – When there is no stored UCR archive, or the name of the dataset is incorrect.

default_path = '/home/docs/checkouts/readthedocs.org/user_builds/tsgm/envs/latest/lib/python3.8/site-packages/tsgm-0.0.5-py3.8.egg/tsgm/utils/../../data'[source]
get() Tuple[Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic][source]

Returns a tuple containing training and testing data.

Returns:

A tuple (X_train, y_train, X_test, y_test).

Return type:

tuple[TensorLike, TensorLike, TensorLike, TensorLike]

get_classes_distribution() Dict[source]

Returns a dictionary with the fraction of occurrences for each class.

Returns:

A dictionary containing the fraction of occurrences for each class.

Return type:

dict[Any, float]
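
The returned mapping can be illustrated with a short standalone sketch. The helper name `classes_distribution` and the sample labels are hypothetical; the method itself works on the labels stored in the manager.

```python
from collections import Counter


def classes_distribution(labels):
    """Return {class_label: fraction of samples with that label}."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical binary labels, as in a two-class dataset.
dist = classes_distribution([1, 1, 2, 1, 2, 2, 2, 2])
print(dist)  # {1: 0.375, 2: 0.625}
```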

key = 'someone'[source]
mirrors = ['https://www.cs.ucr.edu/~eamonn/time_series_data_2018/'][source]
resources = [('UCRArchive_2018.zip', 0)][source]
summary() None[source]

Prints a summary of the dataset.

y_all: Collection[Hashable] | None
download_physionet2012() None[source]

Downloads the Physionet 2012 dataset files from the Physionet website and extracts them into the local folder ‘physionet2012’.

gen_sine_const_switch_dataset(N: int, T: int, D: int, max_value: int = 10, const: int = 0, frequency_switch: float = 0.1) Tuple[Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic][source]

Generates a dataset with alternating constant and sinusoidal sequences.

Parameters:
  • N (int) – Number of samples in the dataset.

  • T (int) – Length of each sequence in the dataset.

  • D (int) – Number of dimensions in each sequence.

  • max_value (int, optional) – Maximum value for amplitude and shift of the sinusoids. Defaults to 10.

  • const (int, optional) – Value indicating whether the sequence is constant or sinusoidal. Defaults to 0.

  • frequency_switch (float, optional) – Probability of switching between constant and sinusoidal sequences. Defaults to 0.1.

Returns:

Tuple containing input data (X) and target labels (y).

Return type:

tuple[numpy.ndarray, numpy.ndarray]

gen_sine_dataset(N: int, T: int, D: int, max_value: int = 10) ndarray[Any, dtype[ScalarType]][source]

Generates a dataset of sinusoidal waves with random parameters.

Parameters:
  • N (int) – Number of samples in the dataset.

  • T (int) – Length of each time series in the dataset.

  • D (int) – Number of dimensions (sinusoids) in each time series.

  • max_value (int, optional) – Maximum value for amplitude and shift of the sinusoids. Defaults to 10.

Returns:

Generated dataset with shape (N, T, D).

Return type:

numpy.ndarray
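
A dataset with the (N, T, D) layout described above can be sketched without dependencies. The sampling scheme here is an assumption: amplitude and shift are drawn uniformly from [0, max_value] and the sinusoid is evaluated at integer time steps. The library function may sample its parameters differently and returns a NumPy array rather than nested lists.

```python
import math
import random


def sine_dataset(n, t, d, max_value=10, seed=None):
    """Return a nested list of shape (n, t, d); each feature is one sinusoid."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        # One (amplitude, shift) pair per feature, both drawn from [0, max_value].
        params = [(rng.uniform(0, max_value), rng.uniform(0, max_value)) for _ in range(d)]
        series = [
            [amp * math.sin(step) + shift for amp, shift in params]
            for step in range(t)
        ]
        data.append(series)
    return data


X = sine_dataset(n=4, t=50, d=2, seed=0)
print(len(X), len(X[0]), len(X[0][0]))  # 4 50 2
```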

gen_sine_vs_const_dataset(N: int, T: int, D: int, max_value: int = 10, const: int = 0) Tuple[Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic][source]

Generates a dataset with alternating sinusoidal and constant sequences.

Parameters:
  • N (int) – Number of samples in the dataset.

  • T (int) – Length of each sequence in the dataset.

  • D (int) – Number of dimensions in each sequence.

  • max_value (int, optional) – Maximum value for amplitude and shift of the sinusoids. Defaults to 10.

  • const (int, optional) – Maximum value for the constant sequence. Defaults to 0.

Returns:

Tuple containing input data (X) and target labels (y).

Return type:

tuple[numpy.ndarray, numpy.ndarray]

get_covid_19() Tuple[Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tuple, List][source]

Loads the Covid-19 dataset with additional graph information.

The dataset is based on data from The New York Times, based on reports from state and local health agencies [1], and was adapted to the graph case in [2].

[1] The New York Times. (2021). Coronavirus (Covid-19) Data in the United States. Retrieved [Insert Date Here], from https://github.com/nytimes/covid-19-data.

[2] Alexander V. Nikitin, St John, Arno Solin, Samuel Kaski. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:10640-10660, 2022.

Returns:

tuple

First element is time series data (n_nodes x n_timestamps x n_features). Each timestamp consists of the number of deaths, cases, deaths normalized by the population, and cases normalized by the population. The second element is the graph tuple (nodes, edges). The third element is the order of states.

get_eeg() Tuple[Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic][source]

Loads the EEG Eye State dataset.

This function downloads the EEG Eye State dataset from the UCI Machine Learning Repository and returns the input features (X) and target labels (y).

Returns:

A tuple containing the input features (X) and target labels (y).

Return type:

tuple[TensorLike, TensorLike]

get_energy_data() ndarray[Any, dtype[ScalarType]][source]

Retrieves the energy consumption dataset.

This function downloads and loads the energy consumption dataset from the UCI Machine Learning Repository. It returns the dataset as a NumPy array.

Returns:

Energy consumption dataset.

Return type:

numpy.ndarray

get_gp_samples_data(num_samples: int, max_time: int, covar_func: ~typing.Callable = <function _exponential_quadratic>) ndarray[Any, dtype[ScalarType]][source]

Generates samples from a Gaussian process.

This function generates samples from a Gaussian process using the specified covariance function. It returns the generated samples as a NumPy array.

Parameters:
  • num_samples (int) – Number of samples to generate.

  • max_time (int) – Maximum time value for the samples.

  • covar_func (Callable, optional) – Covariance function to use. Defaults to _exponential_quadratic.

Returns:

Generated samples from the Gaussian process.

Return type:

numpy.ndarray
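
Sampling from a Gaussian process with a given covariance function can be sketched as follows. This is a dependency-free illustration under stated assumptions: a zero mean, an integer time grid 0, 1, ..., max_time - 1, a squared-exponential covariance with unit length scale, and a small diagonal jitter for numerical stability. The helper names are hypothetical and the library function may differ in all of these details.

```python
import math
import random


def exponential_quadratic(t1, t2, length_scale=1.0):
    """Squared-exponential covariance between two time points."""
    return math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)


def cholesky(matrix):
    """Lower-triangular L with L @ L.T == matrix (matrix must be positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - acc)
            else:
                lower[i][j] = (matrix[i][j] - acc) / lower[j][j]
    return lower


def gp_samples(num_samples, max_time, covar_func=exponential_quadratic, seed=None):
    """Draw zero-mean GP sample paths on the grid t = 0, 1, ..., max_time - 1."""
    rng = random.Random(seed)
    times = range(max_time)
    # Small diagonal jitter keeps the covariance numerically positive definite.
    cov = [[covar_func(a, b) + (1e-6 if a == b else 0.0) for b in times] for a in times]
    lower = cholesky(cov)
    samples = []
    for _ in range(num_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in times]
        # Correlated path = L @ eps, using only the lower triangle.
        samples.append([sum(lower[i][k] * eps[k] for k in range(i + 1)) for i in range(max_time)])
    return samples


paths = gp_samples(num_samples=3, max_time=20, seed=0)
print(len(paths), len(paths[0]))  # 3 20
```

Multiplying standard normal noise by the Cholesky factor of the covariance matrix yields draws whose covariance matches the chosen kernel.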

get_mauna_loa() Tuple[Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic][source]

Loads the Mauna Loa CO2 dataset.

This function loads the Mauna Loa CO2 dataset, which contains measurements of atmospheric CO2 concentrations at the Mauna Loa Observatory in Hawaii.

Returns:

A tuple containing the input data (X) and target labels (y).

Return type:

tuple[TensorLike, TensorLike]

get_mnist_data() Tuple[Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic][source]

Retrieves the MNIST dataset.

This function loads the MNIST dataset, which consists of 28x28 grayscale images of handwritten digits, and returns the training and testing data along with their corresponding labels.

Returns:

A tuple containing the training data, training labels, testing data, and testing labels.

Return type:

tuple[TensorLike, TensorLike, TensorLike, TensorLike]

get_physionet2012() Tuple[Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic][source]

Retrieves the Physionet 2012 dataset.

This function downloads and retrieves the Physionet 2012 dataset, which consists of physiological data and corresponding outcomes. It returns the training, testing, and validation datasets along with their labels.

Returns:

A tuple containing the training, testing, and validation datasets along with their labels.

Return type:

tuple[TensorLike, TensorLike, TensorLike, TensorLike, TensorLike, TensorLike]

get_power_consumption() ndarray[Any, dtype[ScalarType]][source]

Retrieves the household power consumption dataset.

This function downloads and loads the household power consumption dataset from the UCI Machine Learning Repository. It returns the dataset as a NumPy array.

Returns:

Household power consumption dataset.

Return type:

numpy.ndarray

get_stock_data(stock_name: str) ndarray[Any, dtype[ScalarType]][source]

Downloads historical stock data for the specified stock ticker.

This function downloads historical stock data for the specified stock ticker using the Yahoo Finance API. It returns the stock data as a NumPy array with an additional axis representing the batch dimension.

Parameters:

stock_name (str) – Ticker symbol of the stock.

Returns:

Historical stock data.

Return type:

numpy.ndarray

Raises:

ValueError – If the provided stock ticker is invalid or no data is available.

load_arff(path: str) DataFrame[source]

Loads data from an ARFF (Attribute-Relation File Format) file.

This function reads data from an ARFF file located at the specified path and returns it as a pandas DataFrame.

Parameters:

path (str) – Path to the ARFF file.

Returns:

DataFrame containing the loaded data.

Return type:

pandas.DataFrame

split_dataset_into_objects(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, y: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, step: int = 10) Tuple[Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic, Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic][source]

Splits the dataset into objects of fixed length.

This function splits the input dataset into objects of fixed length along the first dimension, 0-padding if necessary.

Parameters:
  • X (TensorLike) – Input data.

  • y (TensorLike) – Target labels.

  • step (int, optional) – Length of each object. Defaults to 10.

Returns:

A tuple containing input data objects and corresponding target label objects.

Return type:

tuple[TensorLike, TensorLike]
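
The splitting and zero-padding behaviour can be sketched with plain lists. This is an illustration only: the helper name `split_into_objects` is hypothetical, X is assumed to be a 2-D sequence of rows, and padding the labels with zeros is an assumption about how the trailing chunk is completed.

```python
def split_into_objects(X, y, step=10):
    """Cut rows of X (and entries of y) into consecutive chunks of length `step`."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same length")
    n_features = len(X[0]) if X else 0
    X_objects, y_objects = [], []
    for start in range(0, len(X), step):
        x_chunk = [list(row) for row in X[start:start + step]]
        y_chunk = list(y[start:start + step])
        missing = step - len(x_chunk)
        # Zero-pad the last chunk so every object has exactly `step` rows.
        x_chunk += [[0] * n_features for _ in range(missing)]
        y_chunk += [0] * missing
        X_objects.append(x_chunk)
        y_objects.append(y_chunk)
    return X_objects, y_objects


# 25 time steps with 2 features each, split into objects of length 10.
X = [[i, 2 * i] for i in range(25)]
y = [i % 2 for i in range(25)]
X_obj, y_obj = split_into_objects(X, y, step=10)
print(len(X_obj), len(X_obj[-1]), X_obj[-1][-1])  # 3 10 [0, 0]
```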

Data Processing Utils

class TSFeatureWiseScaler(feature_range: Tuple[float, float] = (0, 1))[source]

Scales time series data feature-wise.

Parameters:

feature_range (tuple(float, float), optional) – Tuple representing the minimum and maximum feature values. Defaults to (0, 1).

Attributes:
  • _min_v (float) – Minimum feature value.

  • _max_v (float) – Maximum feature value.

Initializes a new instance of the TSFeatureWiseScaler class.

fit(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) TSFeatureWiseScaler[source]

Fits the scaler to the data.

Parameters:

X (TensorLike) – Input data.

Returns:

The fitted scaler object.

Return type:

TSFeatureWiseScaler

fit_transform(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic[source]

Fits the scaler to the data and transforms it.

Parameters:

X (TensorLike) – Input data.

Returns:

Scaled input data.

Return type:

TensorLike

inverse_transform(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic[source]

Inverse-transforms the data.

Parameters:

X (TensorLike) – Input data.

Returns:

Original data.

Return type:

TensorLike

transform(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic[source]

Transforms the data.

Parameters:

X (TensorLike) – Input data.

Returns:

Scaled X.

Return type:

TensorLike
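
Feature-wise min-max scaling of data shaped (N, T, D) can be sketched as follows. This is a dependency-free illustration, not the class's implementation: the helper names are hypothetical, the per-feature minimum and maximum are assumed to be taken over all samples and time steps, and every feature is assumed to be non-constant so that the division is defined.

```python
def fit_feature_ranges(X):
    """Return per-feature (min, max) over all samples and time steps of X (N, T, D)."""
    n_features = len(X[0][0])
    columns = [[row[f] for series in X for row in series] for f in range(n_features)]
    return [(min(col), max(col)) for col in columns]


def scale_featurewise(X, ranges, feature_range=(0, 1)):
    """Map each feature from its fitted [min, max] onto feature_range."""
    low, high = feature_range
    return [
        [
            [low + (v - lo) / (hi - lo) * (high - low) for v, (lo, hi) in zip(row, ranges)]
            for row in series
        ]
        for series in X
    ]


def unscale_featurewise(X_scaled, ranges, feature_range=(0, 1)):
    """Invert scale_featurewise using the same fitted ranges."""
    low, high = feature_range
    return [
        [
            [lo + (v - low) / (high - low) * (hi - lo) for v, (lo, hi) in zip(row, ranges)]
            for row in series
        ]
        for series in X_scaled
    ]


# Two samples, two time steps, two features on very different scales.
X = [[[1.0, 100.0], [2.0, 300.0]], [[3.0, 200.0], [5.0, 500.0]]]
ranges = fit_feature_ranges(X)
X_scaled = scale_featurewise(X, ranges)
print(ranges)
print(X_scaled)
```

Each feature is mapped onto the full target range independently, so features with very different magnitudes end up on a comparable scale.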

class TSGlobalScaler[source]

Scales time series data globally.

Attributes:
  • min (float) – Minimum value encountered in the data.

  • max (float) – Maximum value encountered in the data.

fit(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) TSGlobalScaler[source]

Fits the scaler to the data.

Parameters:

X (TensorLike) – Input data.

Returns:

The fitted scaler object.

Return type:

TSGlobalScaler

fit_transform(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic[source]

Fits the scaler to the data and transforms it.

Parameters:

X (TensorLike) – Input data.

Returns:

Scaled input data.

Return type:

TensorLike

inverse_transform(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic[source]

Inverse-transforms the data.

Parameters:

X (TensorLike) – Input data.

Returns:

Original data.

Return type:

TensorLike

transform(X: Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic) Tensor | TensorProtocol | int | float | bool | str | bytes | complex | tuple | list | ndarray | generic[source]

Transforms the data.

Parameters:

X (TensorLike) – Input data.

Returns:

Scaled X.

Return type:

TensorLike
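
Global scaling uses a single minimum and maximum for the whole dataset. The sketch below illustrates the idea under stated assumptions: the helper names are hypothetical, the target range is assumed to be [0, 1], and the data is assumed to contain at least two distinct values.

```python
def fit_global_range(X):
    """Return the single (min, max) over every value in X (N, T, D)."""
    values = [v for series in X for row in series for v in row]
    return min(values), max(values)


def scale_global(X, lo, hi):
    """Map all values from [lo, hi] onto [0, 1] with one shared range."""
    return [[[(v - lo) / (hi - lo) for v in row] for row in series] for series in X]


def unscale_global(X_scaled, lo, hi):
    """Invert scale_global using the same fitted range."""
    return [[[v * (hi - lo) + lo for v in row] for row in series] for series in X_scaled]


X = [[[1.0, 100.0], [2.0, 300.0]], [[3.0, 200.0], [5.0, 500.0]]]
lo, hi = fit_global_range(X)
X_scaled = scale_global(X, lo, hi)
print(lo, hi)  # 1.0 500.0
print(X_scaled)
```

With one shared range, the relative magnitudes between features are preserved, so in this example the first feature stays close to 0 after scaling.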