teras.preprocessing.CTGANDataTransformer
- class teras.preprocessing.CTGANDataTransformer(continuous_features=None, categorical_features=None, max_clusters=10, std_multiplier=4, weight_threshold=0.005, covariance_type='full', weight_concentration_prior_type='dirichlet_process', weight_concentration_prior=0.001)
Data transformation class based on the data transformation described in the official CTGAN paper and used in its official implementation.
- Reference(s):
- Parameters:
categorical_features (Union[List[str], Tuple[str]]) – List of categorical feature names in the dataset.
continuous_features (Union[List[str], Tuple[str]]) – List of continuous feature names in the dataset.
max_clusters (int) – Maximum number of clusters to use in ModeSpecificNormalization. Defaults to 10.
std_multiplier (int) – Factor by which the standard deviation is multiplied in the normalization. Defaults to 4.
weight_threshold (float) – Minimum weight a mixture component must have to be considered valid. Components whose weights_ entry falls under this value are ignored. (Taken from the official implementation.) Defaults to 0.005.
covariance_type (str) – Parameter passed to the scikit-learn Gaussian mixture model class. Defaults to "full".
weight_concentration_prior_type (str) – Parameter passed to the scikit-learn Gaussian mixture model class (a parameter of this name exists on sklearn.mixture.BayesianGaussianMixture). Defaults to "dirichlet_process".
weight_concentration_prior (float) – Parameter passed to the scikit-learn Gaussian mixture model class (a parameter of this name exists on sklearn.mixture.BayesianGaussianMixture). Defaults to 0.001.
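The roles of std_multiplier and weight_threshold can be illustrated with a minimal, dependency-free sketch. It is not the teras implementation: the Mode container and the helper functions are hypothetical names introduced here, the mixture components are hard-coded rather than fitted, and the formula (x - mean) / (std_multiplier * std) together with the strict "greater than" threshold test follows the official CTGAN implementation as an assumption about what this class does internally.

```python
from dataclasses import dataclass


@dataclass
class Mode:
    """One Gaussian mixture component (hypothetical container for this sketch)."""
    weight: float
    mean: float
    std: float


def valid_modes(modes, weight_threshold=0.005):
    """Keep only components whose weight exceeds the threshold."""
    return [m for m in modes if m.weight > weight_threshold]


def normalize(value, mode, std_multiplier=4):
    """Scale a value relative to its mode: (x - mean) / (std_multiplier * std)."""
    return (value - mode.mean) / (std_multiplier * mode.std)


def denormalize(scaled, mode, std_multiplier=4):
    """Invert `normalize` for the same mode."""
    return scaled * std_multiplier * mode.std + mode.mean


# Hard-coded components standing in for a fitted mixture model.
modes = [
    Mode(weight=0.600, mean=10.0, std=2.0),
    Mode(weight=0.398, mean=50.0, std=5.0),
    Mode(weight=0.002, mean=90.0, std=1.0),  # below 0.005, so it is dropped
]
kept = valid_modes(modes)
scaled = normalize(14.0, kept[0])        # (14 - 10) / (4 * 2) = 0.5
restored = denormalize(scaled, kept[0])  # 0.5 * 4 * 2 + 10 = 14.0
```

With the default std_multiplier of 4, values within four standard deviations of their mode's mean map into the interval [-1, 1], and the inverse mapping is the arithmetic a reverse transformation would need to recover the original scale.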
- __init__(continuous_features=None, categorical_features=None, max_clusters=10, std_multiplier=4, weight_threshold=0.005, covariance_type='full', weight_concentration_prior_type='dirichlet_process', weight_concentration_prior=0.001)
Methods
__init__([continuous_features, ...])
fit(x)
fit_transform(x)
get_metadata()
load(filename) – Loads the saved state of a CTGANDataTransformer from a JSON file.
reverse_transform(x_generated) – Reverse-transforms the generated data to the original data format.
save(filename) – Saves the fitted state of a CTGANDataTransformer instance in JSON format, for portability.
transform(**kwargs)
Attributes
metadata