teras.preprocessing.CTGANDataSampler

class teras.preprocessing.CTGANDataSampler(metadata, categorical_features, continuous_features=None, batch_size=512, seed=1337)

CTGANDataSampler class based on the data sampler class in the official CTGAN implementation.

Reference(s): https://arxiv.org/abs/1907.00503, sdv-dev/CTGAN

Parameters:

metadata (dict) – A dictionary of metadata computed during data transformation. You can obtain it from the .get_metadata() method of a CTGANDataTransformer instance.
categorical_features (Union[List[str], Tuple[str]]) – List of categorical feature names. CTGAN requires the dataset to have at least one categorical feature; if your dataset doesn't contain any, consider using a different generative model.
continuous_features (Union[List[str], Tuple[str]]) – List of continuous feature names. Defaults to None.
batch_size (int) – Batch size to use for the dataset. Defaults to 512.
seed (int) – Seed for random ops. Defaults to 1337.

__init__(metadata, categorical_features, continuous_features=None, batch_size=512, seed=1337)

Methods

`__init__`(metadata, categorical_features[, ...])
`generator`(x_transformed)	Used to create a TensorFlow dataset.
`get_dataset`(x_transformed, x_original)
`sample_cond_vectors_for_generation`(batch_size)	Samples conditional vectors for generation. Unlike the training method, this one samples indices purely at random rather than according to the calculated probabilities proposed in the paper.
`sample_cond_vectors_for_training`(batch_size)
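
The difference between the two sampling methods can be illustrated with a small NumPy sketch. This is not the teras implementation: the helper names `category_probs` and `sample_cond_vectors`, the integer-coded toy columns, and the `+ 1` smoothing inside the logarithm are all assumptions made for illustration. It follows the training-by-sampling idea from the CTGAN paper, where a categorical feature is chosen at random and a category is then drawn in proportion to the logarithm of its frequency; "purely at random" is interpreted here as a uniform draw over the categories.

```python
import numpy as np


def category_probs(column, num_categories):
    """Log-frequency probabilities for one integer-coded categorical column.

    The +1 inside the log is a smoothing choice assumed for this sketch.
    """
    counts = np.bincount(column, minlength=num_categories).astype(float)
    log_freq = np.log(counts + 1.0)
    return log_freq / log_freq.sum()


def sample_cond_vectors(columns, num_categories, batch_size, rng, training=True):
    """Return (cond_vectors, mask) for one batch (hypothetical helper).

    cond_vectors: concatenated one-hot blocks, one block per categorical feature.
    mask: one-hot over features, marking which feature was selected per row.
    """
    # Start position of each feature's block inside the conditional vector.
    offsets = np.concatenate([[0], np.cumsum(num_categories)[:-1]])
    cond = np.zeros((batch_size, int(np.sum(num_categories))), dtype=np.float32)
    mask = np.zeros((batch_size, len(columns)), dtype=np.float32)

    # Pick one categorical feature per row, uniformly at random.
    feature_idx = rng.integers(0, len(columns), size=batch_size)
    for row, f in enumerate(feature_idx):
        if training:
            # Training: draw the category by log-frequency probability.
            probs = category_probs(columns[f], num_categories[f])
            cat = rng.choice(num_categories[f], p=probs)
        else:
            # Generation: draw the category uniformly (assumed reading of
            # "purely at random").
            cat = rng.integers(0, num_categories[f])
        cond[row, offsets[f] + cat] = 1.0
        mask[row, f] = 1.0
    return cond, mask


# Toy data: two integer-coded categorical features with 2 and 3 categories.
columns = [np.array([0, 0, 0, 1, 0, 0]), np.array([2, 0, 1, 1, 2, 0])]
num_categories = [2, 3]

rng = np.random.default_rng(1337)
train_cond, train_mask = sample_cond_vectors(columns, num_categories, 4, rng)
gen_cond, gen_mask = sample_cond_vectors(
    columns, num_categories, 4, rng, training=False
)
```

In both modes each row of the conditional vector has exactly one active entry, located inside the block of the feature marked in the mask; only the distribution used to choose the category differs.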
