autopandas.generators package¶
Submodules¶
autopandas.generators.anm module¶
class autopandas.generators.anm.ANM(model=None)[source]¶
Bases: object

__init__(model=None)[source]¶
Data generator using multiple imputations with a random forest (or another model).
Parameters: model – Model used for the imputations.
fit(data, noise=False)[source]¶
Fit one random forest (or another model) for each column, given the others.
Parameters: - data – Data to fit.
- noise – If True, add noise during sampling relative to the residual matrix.
partial_fit_generate(n=1, p=0.8, replace=True, noise=False)[source]¶
Fit and generate for the high-dimensional case. To avoid memory errors, features are trained and generated one by one.
Parameters: - n – Number of examples to sample.
- p – Probability of changing a value: if p=0, the generated dataset equals the original; if p=1, it contains only new values.
- replace – If True, sample the original data with replacement before the imputations.
- noise – If True, add noise relative to the residual matrix. NOT IMPLEMENTED (not possible?)
Returns: Generated data
Return type: pd.DataFrame
sample(n=1, p=0.8, replace=True, noise=False)[source]¶
Generate n rows by copying the data and then imputing values.
Parameters: - n – Number of examples to sample.
- p – Probability of changing a value: if p=0, the generated dataset equals the original; if p=1, it contains only new values.
- replace – If True, sample the original data with replacement before the imputations.
- noise – If True, add noise relative to the residual matrix.
Returns: Generated data
Return type: pd.DataFrame
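The role of p in sample can be illustrated with a simplified, hypothetical stand-in: rows are copied from the data and each cell is re-imputed with probability p. Column means replace the per-column random-forest models here, purely for illustration; the masking logic is the point.

```python
import numpy as np

def sample_sketch(data, n=1, p=0.8, replace=True, rng=None):
    """Simplified sketch of ANM.sample: copy rows from `data`, then
    re-impute each masked cell. A real ANM uses one fitted model per
    column; column means stand in for those models here."""
    rng = np.random.default_rng(rng)
    data = np.asarray(data, dtype=float)
    # Copy n rows from the original data (with or without replacement).
    idx = rng.choice(len(data), size=n, replace=replace)
    generated = data[idx].copy()
    # Each cell is replaced with probability p: p=0 keeps the copies
    # unchanged, p=1 re-imputes every value.
    mask = rng.random(generated.shape) < p
    column_means = data.mean(axis=0)  # stand-in "models"
    generated[mask] = np.broadcast_to(column_means, generated.shape)[mask]
    return generated

X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
print(sample_sketch(X, n=2, p=1.0, rng=0))  # every cell re-imputed
```

With p=1.0 every generated row collapses to the stand-in imputation; with p=0.0 the output is a plain resample of the original rows.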
autopandas.generators.artificial module¶
autopandas.generators.autoencoder module¶
class autopandas.generators.autoencoder.AE(input_dim, layers=[], latent_dim=2, architecture='fully', loss='nll', optimizer='rmsprop', decoder_layers=None)[source]¶
Bases: object

__init__(input_dim, layers=[], latent_dim=2, architecture='fully', loss='nll', optimizer='rmsprop', decoder_layers=None)[source]¶
Autoencoder with fully connected layers.
Default behaviour: symmetric layers but no weight sharing. For the CNN architecture, if latent_dim is None there are no dense layers; the latent space dimension then depends on the convolutional layers.
Parameters: - input_dim – Input/output size.
- layers – Dimension of intermediate layers (encoder and decoder). It can be an integer (one intermediate layer) or a list of integers (several intermediate layers).
- latent_dim – Dimension of the latent space layer.
- architecture – 'fully' or 'cnn'.
- epsilon_std – Standard deviation of the Gaussian prior distribution.
- decoder_layers – Dimension of intermediate decoder layers for asymmetrical architectures.
distance(X, Y, **kwargs)[source]¶
Step 1: project X and Y into the learned latent space. Step 2: compute the distance between the projections (NNAA score by default).
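The NNAA score is not defined on this page; assuming the usual nearest-neighbor adversarial accuracy, step 2 could look like the following numpy sketch, where 0.5 means the two projected samples are indistinguishable:

```python
import numpy as np

def nnaa_sketch(X, Y):
    """Hedged sketch of a nearest-neighbor adversarial accuracy (NNAA)
    style score between two samples X and Y: for each point, check
    whether its nearest neighbor (excluding itself) lies in its own set
    or in the other one."""
    X, Y = np.asarray(X, float), np.asarray(Y, float)

    def min_dist(A, B, exclude_self=False):
        # Pairwise Euclidean distances; minimum over B for each row of A.
        d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
        if exclude_self:
            np.fill_diagonal(d, np.inf)
        return d.min(axis=1)

    d_xx = min_dist(X, X, exclude_self=True)
    d_xy = min_dist(X, Y)
    d_yy = min_dist(Y, Y, exclude_self=True)
    d_yx = min_dist(Y, X)
    # 1.0: perfectly separable samples; 0.0: every point's nearest
    # neighbor is in the other set.
    return 0.5 * ((d_xy > d_xx).mean() + (d_yx > d_yy).mean())
```

This is a sketch under the stated assumption about NNAA, not autopandas's actual implementation.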
autopandas.generators.copula module¶
class autopandas.generators.copula.Copula[source]¶
Bases: object
autopandas.generators.copula.copula_generate(X, generator=None, n=None)[source]¶
Generate using the copula trick.
Parameters: - X – Data to generate from.
- generator – Model to fit and sample from. KDE by default.
- n – Number of examples to generate. Defaults to the number of observations in X.
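The "copula trick" can be sketched as: map each column to normal scores via its empirical ranks, fit and sample a generator in that space, then map samples back through the empirical quantiles of the original columns. In this hedged sketch a multivariate Gaussian stands in for the generator (the library defaults to KDE):

```python
import numpy as np
from statistics import NormalDist

def copula_generate_sketch(X, n=None, seed=0):
    """Hedged sketch of the copula trick: rank-transform each column to
    normal scores, sample a generator there (a Gaussian here; KDE is
    the library default), then back-transform via empirical quantiles."""
    rng = np.random.default_rng(seed)
    nd = NormalDist()
    X = np.asarray(X, float)
    n = len(X) if n is None else n
    # Step 1: empirical ranks -> uniforms in (0, 1) -> normal scores.
    ranks = np.argsort(np.argsort(X, axis=0), axis=0) + 1
    scores = np.vectorize(nd.inv_cdf)(ranks / (len(X) + 1))
    # Step 2: fit and sample a generator in the transformed space.
    mean, cov = scores.mean(axis=0), np.atleast_2d(np.cov(scores.T))
    samples = rng.multivariate_normal(mean, cov, size=n)
    # Step 3: back-transform through each column's empirical quantiles,
    # which preserves the original marginal distributions.
    u = np.vectorize(nd.cdf)(samples)
    out = np.empty_like(u)
    for j in range(X.shape[1]):
        out[:, j] = np.quantile(X[:, j], u[:, j])
    return out
```

Because the back-transform uses empirical quantiles, generated values always stay within each column's observed range.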
autopandas.generators.copycat module¶
autopandas.generators.gmm module¶
autopandas.generators.kde module¶
autopandas.generators.sae module¶
class autopandas.generators.sae.SAE(layers, normalization=False, **kwargs)[source]¶
autopandas.generators.vae module¶
class autopandas.generators.vae.KLDivergenceLayer(*args, **kwargs)[source]¶
Bases: tensorflow.python.keras.engine.base_layer.Layer

Identity transform layer that adds the KL divergence to the final model loss.
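The KL term such a layer would add can be written out directly. Assuming the standard VAE setup of a diagonal Gaussian posterior against a standard normal prior, a numpy stand-in for the layer's math is:

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)) for a diagonal Gaussian, computed
    per example from the encoder's mean and log-variance outputs.
    A numpy stand-in for the math the Keras layer adds to the loss;
    the layer itself passes its inputs through unchanged."""
    mu, log_var = np.asarray(mu, float), np.asarray(log_var, float)
    return -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=-1)
```

The term is zero exactly when the posterior matches the prior (mu=0, log_var=0) and grows as the encoder drifts away from it.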
class autopandas.generators.vae.VAE(input_dim, layers=[], latent_dim=2, architecture='fully', epsilon_std=1.0, loss='nll', optimizer='rmsprop', decoder_layers=None)[source]¶
Bases: autopandas.generators.autoencoder.AE

__init__(input_dim, layers=[], latent_dim=2, architecture='fully', epsilon_std=1.0, loss='nll', optimizer='rmsprop', decoder_layers=None)[source]¶
Variational Autoencoder.
Parameters: - input_dim – Input/output size.
- layers – Dimension of intermediate layers (encoder and decoder). It can be an integer (one intermediate layer) or a list of integers (several intermediate layers).
- latent_dim – Dimension of the latent space layer.
- architecture – 'fully' or 'cnn'.
- epsilon_std – Standard deviation of the Gaussian prior distribution.
- decoder_layers – Dimension of intermediate decoder layers for asymmetrical architectures.
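Where epsilon_std enters is the reparameterization step of a standard VAE. A minimal numpy sketch (the name mirrors the constructor argument; the actual Keras internals are not shown here):

```python
import numpy as np

def reparameterize(mu, log_var, epsilon_std=1.0, seed=0):
    """Sketch of the VAE reparameterization trick where epsilon_std
    enters: z = mu + sigma * eps with eps ~ N(0, epsilon_std^2), so the
    sampling stays differentiable with respect to mu and log_var."""
    rng = np.random.default_rng(seed)
    mu, log_var = np.asarray(mu, float), np.asarray(log_var, float)
    eps = rng.normal(0.0, epsilon_std, size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps
```

Setting epsilon_std=0 degenerates to a deterministic autoencoder (z = mu); larger values draw noisier latent samples around the encoder's mean.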