tsgm.metrics
Package Contents
- class DistanceMetric(statistics: list, discrepancy: Callable)
Bases:
Metric
Metric that measures similarity between synthetic and real time series
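- Parameters:
statistics (list) – A list of summary statistics to compute.
discrepancy (Callable) – A function that measures the discrepancy between the summary statistics of two datasets.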
- stats(X: tsgm.types.Tensor) → tsgm.types.Tensor
- Parameters:
X (tsgm.types.Tensor) – A time series dataset.
- Returns:
A tensor of computed summary statistics.
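As a rough illustration of the idea (not tsgm's implementation), the sketch below computes a vector of summary statistics for a real and a synthetic dataset and compares the two vectors with a discrepancy function. The helper names `summary_stats` and `euclidean_discrepancy`, the plain-list datasets, and the choice of mean, minimum, and maximum as statistics are all assumptions made for this example.

```python
import math
from statistics import fmean


def summary_stats(dataset, statistics):
    """Apply each statistic to the flattened dataset and collect the results."""
    flat = [value for series in dataset for value in series]
    return [stat(flat) for stat in statistics]


def euclidean_discrepancy(stats_a, stats_b):
    """Euclidean distance between two vectors of summary statistics."""
    return math.dist(stats_a, stats_b)


# Hypothetical statistics and toy data (each inner list is one time series).
statistics = [fmean, min, max]
real = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0]]
synthetic = [[0.0, 1.0, 2.0], [1.0, 2.0, 4.0]]

distance = euclidean_discrepancy(
    summary_stats(real, statistics),
    summary_stats(synthetic, statistics),
)
print(distance)
```

A value of zero means the chosen statistics cannot tell the two datasets apart; it does not mean the datasets are identical.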
- class ConsistencyMetric(evaluators: List)
Bases:
Metric
The predictive consistency metric measures whether a set of evaluators yields consistent results on real and synthetic data.
- Parameters:
evaluators (list) – A list of evaluators (each item should implement the method .evaluate(D)).
- __call__(D1: tsgm.dataset.DatasetOrTensor, D2: tsgm.dataset.DatasetOrTensor, D_test: tsgm.dataset.DatasetOrTensor) → float
- Parameters:
D1 (tsgm.dataset.DatasetOrTensor) – A time series dataset.
D2 (tsgm.dataset.DatasetOrTensor) – A time series dataset.
D_test (tsgm.dataset.DatasetOrTensor) – A time series dataset used for testing.
- Returns:
The consistency metric between D1 and D2.
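One way to read "consistent results" is that the evaluators are ranked in the same order on both datasets. The sketch below counts the fraction of evaluator pairs whose ordering agrees. This pairwise definition, the `LookupEvaluator` stub, and the use of string keys in place of datasets are assumptions for illustration, not a statement of how tsgm computes the metric.

```python
from itertools import combinations


class LookupEvaluator:
    """Hypothetical evaluator: returns a precomputed score for each dataset key."""

    def __init__(self, scores):
        self.scores = scores

    def evaluate(self, D):
        return self.scores[D]


def pairwise_consistency(evaluators, D1, D2):
    """Fraction of evaluator pairs that are ordered the same way on D1 and D2."""
    s1 = [e.evaluate(D1) for e in evaluators]
    s2 = [e.evaluate(D2) for e in evaluators]
    pairs = list(combinations(range(len(evaluators)), 2))
    agree = sum((s1[i] > s1[j]) == (s2[i] > s2[j]) for i, j in pairs)
    return agree / len(pairs)


evaluators = [
    LookupEvaluator({"real": 0.90, "synthetic": 0.85}),
    LookupEvaluator({"real": 0.80, "synthetic": 0.88}),
    LookupEvaluator({"real": 0.70, "synthetic": 0.60}),
]
score = pairwise_consistency(evaluators, "real", "synthetic")
print(score)
```

The sketch needs at least two evaluators, since a single evaluator has no pair to compare.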
- class BaseDownstreamEvaluator
Bases:
abc.ABC
Helper class that provides a standard way to create an ABC using inheritance.
- class DownstreamPerformanceMetric(evaluator: BaseDownstreamEvaluator)
Bases:
Metric
The downstream performance metric evaluates the performance of a model on a downstream task. It returns performance gains achieved with the addition of synthetic data.
- Parameters:
evaluator (BaseDownstreamEvaluator) – An evaluator that implements the method .evaluate(D).
- __call__(D1: tsgm.dataset.DatasetOrTensor, D2: tsgm.dataset.DatasetOrTensor, D_test: tsgm.dataset.DatasetOrTensor | None, return_std: bool = False) → float
- Parameters:
D1 (tsgm.dataset.DatasetOrTensor) – A time series dataset.
D2 (tsgm.dataset.DatasetOrTensor) – A time series dataset.
D_test (tsgm.dataset.DatasetOrTensor | None) – An optional time series dataset used for testing.
- Returns:
The downstream performance metric between D1 and D2.
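A common way to quantify "performance gains" is to compare an evaluator's score on one dataset alone with its score on that dataset augmented with a second one. The sketch below takes that difference as the metric. The `MeanPredictorEvaluator` class, the list-based datasets, and the decision to store the test set inside the evaluator are hypothetical; tsgm's actual evaluation protocol may differ.

```python
from statistics import fmean


class MeanPredictorEvaluator:
    """Hypothetical evaluator: predicts the training mean, scores by negative MAE."""

    def __init__(self, D_test):
        self.D_test = D_test

    def evaluate(self, D):
        prediction = fmean(D)
        # Higher is better, so the mean absolute error is negated.
        return -fmean(abs(y - prediction) for y in self.D_test)


def downstream_gain(evaluator, D1, D2):
    """Score on D1 augmented with D2, minus the score on D1 alone."""
    baseline = evaluator.evaluate(D1)
    augmented = evaluator.evaluate(D1 + D2)
    return augmented - baseline


D1 = [1.0, 2.0, 3.0]  # stand-in for real data
D2 = [5.0, 6.0, 7.0]  # stand-in for synthetic data
D_test = [4.0, 4.0]

gain = downstream_gain(MeanPredictorEvaluator(D_test), D1, D2)
print(gain)
```

A positive gain means the added data improved the score on the test set; a negative gain means it hurt.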
- class PrivacyMembershipInferenceMetric(attacker: Any, metric: Callable | None = None)
Bases:
Metric
The metric measures the possibility of membership inference attacks.
- Parameters:
attacker (Callable) – An attacker: a one-class classifier (OCC) that implements the methods .fit and .predict.
metric – Measures the quality of the attacker (precision by default).
- __call__(d_tr: tsgm.dataset.Dataset, d_syn: tsgm.dataset.Dataset, d_test: tsgm.dataset.Dataset) → float
- Parameters:
d_tr (tsgm.dataset.DatasetOrTensor) – Training dataset (the dataset that was used to produce d_syn).
d_syn (tsgm.dataset.DatasetOrTensor) – Synthetic dataset (the dataset on which the attacker is trained).
d_test (tsgm.dataset.DatasetOrTensor) – Test dataset (data that was not used to produce d_syn).
- Returns:
How well the attacker can distinguish d_tr from d_test when it is trained on d_syn.
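The procedure described above can be sketched roughly as follows: fit a one-class classifier on the synthetic data, let it label training and test points as members or non-members, and score those labels. The `NearestNeighborAttacker` class, the +1/-1 label convention, and the hand-written `precision` helper are assumptions for this example, not part of tsgm.

```python
import math


class NearestNeighborAttacker:
    """Hypothetical one-class classifier: a point is 'in' if it lies near the fitted data."""

    def __init__(self, radius):
        self.radius = radius
        self.points = []

    def fit(self, X):
        self.points = list(X)
        return self

    def predict(self, X):
        # +1 = flagged as a member, -1 = flagged as a non-member
        return [
            1 if min(math.dist(x, p) for p in self.points) <= self.radius else -1
            for x in X
        ]


def precision(y_true, y_pred):
    """Share of points flagged as members that really are members."""
    flagged = [t for t, p in zip(y_true, y_pred) if p == 1]
    return sum(t == 1 for t in flagged) / len(flagged) if flagged else 0.0


def membership_inference_score(attacker, d_tr, d_syn, d_test, metric=precision):
    """Train the attacker on d_syn, then check how well it separates d_tr from d_test."""
    attacker.fit(d_syn)
    y_true = [1] * len(d_tr) + [-1] * len(d_test)
    y_pred = attacker.predict(d_tr + d_test)
    return metric(y_true, y_pred)


d_tr = [(0.0, 0.0), (1.0, 1.0)]
d_syn = [(0.1, 0.0), (1.0, 0.9)]
d_test = [(5.0, 5.0), (0.9, 1.0)]

score = membership_inference_score(NearestNeighborAttacker(radius=0.2), d_tr, d_syn, d_test)
print(score)
```

In this toy setup a high score suggests the synthetic data sits close to the training points, which is the situation a membership inference attack exploits.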
- class MMDMetric(kernel: Callable = tsgm.utils.mmd.exp_quad_kernel)
Bases:
Metric
This metric calculates the maximum mean discrepancy (MMD) between real and synthetic samples.
- Args:
kernel (Callable): The kernel function used to compute MMD. Defaults to tsgm.utils.mmd.exp_quad_kernel.
- Returns:
float: The computed MMD.
- Example:
>>> metric = MMDMetric(kernel)
>>> dataset, synth_dataset = tsgm.dataset.Dataset(...), tsgm.dataset.Dataset(...)
>>> result = metric(dataset, synth_dataset)
>>> print(result)
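For intuition, a minimal standalone sketch of a squared-MMD estimate is given below. It uses the standard biased estimator, E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)], with a locally defined exponentiated quadratic kernel. The function names `rbf_kernel` and `mmd_squared` and the `length_scale` parameter are assumptions for this example; the kernel is a stand-in and may not match the parameterization of `tsgm.utils.mmd.exp_quad_kernel`.

```python
import math


def rbf_kernel(x, y, length_scale=1.0):
    """Exponentiated quadratic kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


def mmd_squared(X, Y, kernel=rbf_kernel):
    """Biased estimate of squared MMD between samples X and Y."""

    def mean_kernel(A, B):
        return sum(kernel(a, b) for a in A for b in B) / (len(A) * len(B))

    return mean_kernel(X, X) + mean_kernel(Y, Y) - 2.0 * mean_kernel(X, Y)


real = [(0.0,), (1.0,)]
synthetic = [(3.0,), (4.0,)]

print(mmd_squared(real, real))       # identical samples
print(mmd_squared(real, synthetic))  # shifted samples
```

Identical samples give an estimate of zero, and the value grows as the two samples move apart relative to the kernel's length scale.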
- class DiscriminativeMetric
Bases:
Metric
The DiscriminativeMetric measures the discriminative performance of a model in distinguishing between synthetic and real datasets.
This metric evaluates a discriminative model by training it on a combination of synthetic and real datasets and assessing its performance on a test set.
- Parameters:
d_hist (tsgm.dataset.DatasetOrTensor) – Real dataset.
d_syn (tsgm.dataset.DatasetOrTensor) – Synthetic dataset.
model (T.Callable) – Discriminative model to be evaluated.
test_size (T.Union[float, int]) – Proportion of the dataset to include in the test split or the absolute number of test samples.
n_epochs (int) – Number of training epochs for the model.
metric (T.Optional[T.Callable]) – Optional evaluation metric to use (default: accuracy).
random_seed (T.Optional[int]) – Optional random seed for reproducibility.
- Returns:
Discriminative performance metric.
- Return type:
- Example:
>>> from tsgm.metrics import DiscriminativeMetric
>>> from my_module import MyDiscriminativeModel  # Placeholder for your own model class
>>> import tsgm.dataset
>>>
>>> # Create real and synthetic datasets
>>> real_dataset = tsgm.dataset.Dataset(...)  # Replace ... with appropriate arguments
>>> synthetic_dataset = tsgm.dataset.Dataset(...)  # Replace ... with appropriate arguments
>>>
>>> # Create a discriminative model
>>> model = MyDiscriminativeModel()  # Replace with the actual discriminative model class
>>>
>>> # Create and use the DiscriminativeMetric
>>> metric = DiscriminativeMetric()
>>> result = metric(real_dataset, synthetic_dataset, model, test_size=0.2, n_epochs=10)
>>> print(result)
- class EntropyMetric
Bases:
Metric
Calculates the spectral entropy of a dataset or tensor as a sum of individual entropies.
- Args:
d (tsgm.dataset.DatasetOrTensor): The input dataset or tensor.
- Returns:
float: The computed spectral entropy.
- Example:
>>> metric = EntropyMetric()
>>> dataset = tsgm.dataset.Dataset(...)
>>> result = metric(dataset)
>>> print(result)
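Spectral entropy is commonly defined as the Shannon entropy of a series' normalized power spectrum. The sketch below follows that definition with a naive discrete Fourier transform and sums the result over the series in a dataset. The natural logarithm, the use of the full (two-sided) spectrum, and the function names are assumptions for illustration and may differ from what `EntropyMetric` does internally.

```python
import cmath
import math


def power_spectrum(series):
    """Squared magnitude of the discrete Fourier transform (naive O(n^2) DFT)."""
    n = len(series)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(series))) ** 2
        for k in range(n)
    ]


def spectral_entropy(series):
    """Shannon entropy (natural log) of the normalized power spectrum."""
    spectrum = power_spectrum(series)
    total = sum(spectrum)
    if total == 0.0:
        return 0.0  # an all-zero series has no spectral content
    probs = [p / total for p in spectrum]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def dataset_spectral_entropy(dataset):
    """Sum of the spectral entropies of the individual series."""
    return sum(spectral_entropy(series) for series in dataset)


sine = [math.sin(2.0 * math.pi * t / 8) for t in range(8)]
impulse = [1.0] + [0.0] * 7

print(spectral_entropy(sine))     # energy concentrated in two frequency bins
print(spectral_entropy(impulse))  # energy spread evenly over all bins
```

A pure sinusoid concentrates its energy in few frequency bins and scores low, while an impulse spreads energy evenly across all bins and scores high.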
- class DemographicParityMetric
Bases:
Metric
Measures demographic parity between two datasets.
This metric assesses the difference in the distributions of a target variable among different groups in two datasets. By default, it uses the Kolmogorov-Smirnov statistic to quantify the maximum vertical deviation between the cumulative distribution functions of the target variable for the historical and synthetic data within each group.
- Args:
d_hist (tsgm.dataset.DatasetOrTensor): The historical input dataset or tensor.
groups_hist (TensorLike): The group assignments for the historical data.
d_synth (tsgm.dataset.DatasetOrTensor): The synthetic input dataset or tensor.
groups_synth (TensorLike): The group assignments for the synthetic data.
metric (callable, optional): The metric used to compare the target variable distributions within each group. Default is the Kolmogorov-Smirnov statistic.
- Returns:
dict: A dictionary mapping each group to the computed demographic parity metric.
- Example:
>>> metric = DemographicParityMetric()
>>> dataset_hist = tsgm.dataset.Dataset(...)
>>> dataset_synth = tsgm.dataset.Dataset(...)
>>> groups_hist = [0, 1, 0, 1, 1, 0]
>>> groups_synth = [1, 1, 0, 0, 0, 1]
>>> result = metric(dataset_hist, groups_hist, dataset_synth, groups_synth)
>>> print(result)
- __call__(d_hist: tsgm.dataset.DatasetOrTensor, groups_hist: tensorflow.python.types.core.TensorLike, d_synth: tsgm.dataset.DatasetOrTensor, groups_synth: tensorflow.python.types.core.TensorLike, metric: Callable = _DEFAULT_KS_METRIC) → Dict
Calculate the demographic parity metric for the input datasets.
- Args:
d_hist (tsgm.dataset.DatasetOrTensor): The historical input dataset or tensor.
groups_hist (TensorLike): The group assignments for the historical data.
d_synth (tsgm.dataset.DatasetOrTensor): The synthetic input dataset or tensor.
groups_synth (TensorLike): The group assignments for the synthetic data.
metric (callable, optional): The metric used to compare the target variable distributions within each group. Default is the Kolmogorov-Smirnov statistic.
- Returns:
dict: A dictionary mapping each group to the computed demographic parity metric.
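To make the per-group comparison concrete, here is a minimal standalone sketch that computes a two-sample Kolmogorov-Smirnov statistic for each group. It works on plain lists of target values rather than tsgm datasets, and the names `ks_statistic` and `demographic_parity` are hypothetical; restricting the result to groups present in both datasets is also an assumption of this example.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""

    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)

    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(ecdf(sample_a, x) - ecdf(sample_b, x)) for x in points)


def demographic_parity(values_hist, groups_hist, values_synth, groups_synth, metric=ks_statistic):
    """Map each shared group to the metric between its historical and synthetic values."""
    result = {}
    for group in sorted(set(groups_hist) & set(groups_synth)):
        hist = [v for v, g in zip(values_hist, groups_hist) if g == group]
        synth = [v for v, g in zip(values_synth, groups_synth) if g == group]
        result[group] = metric(hist, synth)
    return result


values_hist = [1.0, 2.0, 3.0, 4.0]
groups_hist = [0, 0, 1, 1]
values_synth = [1.0, 2.0, 10.0, 11.0]
groups_synth = [0, 0, 1, 1]

parity = demographic_parity(values_hist, groups_hist, values_synth, groups_synth)
print(parity)
```

Here group 0 has identical target values in both datasets and group 1 has completely separated values, so the two groups land at opposite ends of the KS range.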
- class ShannonEntropyMetric
Bases:
Metric
Shannon Entropy calculated over the labels of a dataset. This index is a measure of diversity that accounts for categories present in a dataset.
- _shannon_entropy(labels)
Private method to calculate the Shannon Entropy for a given set of labels.
Parameters: labels (array-like): The labels or categories for which the diversity measure is to be calculated.
Returns: float: The Shannon Entropy value.
- __call__(d: tsgm.dataset.DatasetOrTensor) → float
Calculate the Shannon entropy for the dataset.
Parameters: d (tsgm.dataset.DatasetOrTensor): The dataset or tensor object containing the labels.
Returns: float: The Shannon entropy value.
Raises: AssertionError: If the dataset does not contain labels.
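The label entropy described above can be sketched in a few lines: estimate each label's relative frequency and sum -p log p over the labels. The function name `shannon_entropy` and the use of the natural logarithm are assumptions for this example.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (natural log) of the empirical label distribution."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


balanced = [0, 0, 1, 1]
single_class = [1, 1, 1]

print(shannon_entropy(balanced))      # two equally frequent labels
print(shannon_entropy(single_class))  # only one label present
```

A dataset with a single label has zero entropy, and the value is largest when all labels are equally frequent.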
- class PairwiseDistanceMetric
Bases:
Metric
Measures pairwise distances in a set of time series.
- pairwise_euclidean_distances(ts: tensorflow.python.types.core.TensorLike) → tensorflow.python.types.core.TensorLike
Computes the pairwise Euclidean distances for a set of time series.
Parameters: ts (numpy.ndarray): A 2D array where each row represents a time series.
Returns: numpy.ndarray: A 2D array representing the pairwise Euclidean distance matrix.
- __call__(d: tsgm.dataset.DatasetOrTensor) → tensorflow.python.types.core.TensorLike
Calculates the pairwise Euclidean distances for a dataset or tensor.
Parameters: d (tsgm.dataset.DatasetOrTensor): The input dataset or tensor containing time series data.
Returns: numpy.ndarray: The pairwise Euclidean distance matrix of the input data.
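As a small standalone illustration of the distance matrix described above, the sketch below treats each row as one time series and computes the Euclidean distance between every pair of rows. It uses nested lists instead of arrays, and the function here is a plain-Python stand-in for the method of the same name.

```python
import math


def pairwise_euclidean_distances(ts):
    """Return the matrix whose entry [i][j] is the distance between series i and j."""
    return [[math.dist(a, b) for b in ts] for a in ts]


series = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
distances = pairwise_euclidean_distances(series)

for row in distances:
    print(row)
```

The resulting matrix is symmetric with zeros on the diagonal, so only its upper triangle carries distinct information.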
- class PredictiveParityMetric
Measures predictive parity between two datasets.
This metric assesses the discrepancy in the predictive performance of a model among different groups in two datasets. By default, it uses precision to quantify the predictive performance of the model within each group.
- Args:
y_true_hist (TensorLike): The true target values for the historical data.
y_pred_hist (TensorLike): The predicted target values for the historical data.
groups_hist (TensorLike): The group assignments for the historical data.
y_true_synth (TensorLike): The true target values for the synthetic data.
y_pred_synth (TensorLike): The predicted target values for the synthetic data.
groups_synth (TensorLike): The group assignments for the synthetic data.
metric (callable, optional): The metric used to compare the predictive performance within each group. Default is precision score.
- Returns:
dict: A dictionary mapping each group to the computed predictive parity metric.
- Example:
>>> metric = PredictiveParityMetric()
>>> y_true_hist = [0, 1, 0, 1, 1, 0]
>>> y_pred_hist = [0, 1, 0, 0, 1, 1]
>>> groups_hist = [0, 1, 0, 1, 1, 0]
>>> y_true_synth = [1, 0, 1, 0, 0, 1]
>>> y_pred_synth = [1, 0, 1, 1, 0, 0]
>>> groups_synth = [1, 1, 0, 0, 0, 1]
>>> result = metric(y_true_hist, y_pred_hist, groups_hist, y_true_synth, y_pred_synth, groups_synth)
>>> print(result)
- __call__(y_true_hist: tensorflow.python.types.core.TensorLike, y_pred_hist: tensorflow.python.types.core.TensorLike, groups_hist: tensorflow.python.types.core.TensorLike, y_true_synth: tensorflow.python.types.core.TensorLike, y_pred_synth: tensorflow.python.types.core.TensorLike, groups_synth: tensorflow.python.types.core.TensorLike, metric: Callable = _DEFAULT_METRIC) → Dict[int, float]
Calculate the predictive parity metric for the input datasets.
- Args:
y_true_hist (TensorLike): The true target values for the historical data.
y_pred_hist (TensorLike): The predicted target values for the historical data.
groups_hist (TensorLike): The group assignments for the historical data.
y_true_synth (TensorLike): The true target values for the synthetic data.
y_pred_synth (TensorLike): The predicted target values for the synthetic data.
groups_synth (TensorLike): The group assignments for the synthetic data.
metric (callable, optional): The metric used to compare the predictive performance within each group. Default is precision score.
- Returns:
dict: A dictionary mapping each group to the computed predictive parity metric.
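One plausible reading of this metric is sketched below: compute precision within each group for the historical and the synthetic data, then report the signed difference per group. The hand-written `precision` helper, the function name `predictive_parity`, and the choice of "historical minus synthetic" as the discrepancy are assumptions for illustration; the library may combine the per-group scores differently.

```python
def precision(y_true, y_pred):
    """Share of positive predictions that are correct (0.0 if there are none)."""
    predicted_pos = [t for t, p in zip(y_true, y_pred) if p == 1]
    return sum(t == 1 for t in predicted_pos) / len(predicted_pos) if predicted_pos else 0.0


def per_group_scores(y_true, y_pred, groups, metric):
    """Evaluate the metric separately on the samples of each group."""
    scores = {}
    for group in set(groups):
        idx = [i for i, g in enumerate(groups) if g == group]
        scores[group] = metric([y_true[i] for i in idx], [y_pred[i] for i in idx])
    return scores


def predictive_parity(y_true_hist, y_pred_hist, groups_hist,
                      y_true_synth, y_pred_synth, groups_synth, metric=precision):
    """Per-group difference (historical minus synthetic) in the chosen metric."""
    hist = per_group_scores(y_true_hist, y_pred_hist, groups_hist, metric)
    synth = per_group_scores(y_true_synth, y_pred_synth, groups_synth, metric)
    return {g: hist[g] - synth[g] for g in sorted(set(hist) & set(synth))}


result = predictive_parity(
    [0, 1, 0, 1, 1, 0], [0, 1, 0, 0, 1, 1], [0, 1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 1], [1, 0, 1, 1, 0, 0], [1, 1, 0, 0, 0, 1],
)
print(result)
```

With this convention, a value near zero for a group means the model's precision on that group is similar for historical and synthetic data.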