autopandas.generators package¶

Submodules¶

autopandas.generators.anm module¶

class autopandas.generators.anm.ANM(model=None)[source]¶

Bases: object

__init__(model=None)[source]¶

Data generator using multiple imputations with random forest (or another model).

Parameters:	model – Model used for imputations.

fit(data, noise=False)[source]¶

Fit one random forest (or another model) for each column, given the others.

Parameters:	noise – If True, add noise during sampling relative to the residual matrix

partial_fit_generate(n=1, p=0.8, replace=True, noise=False)[source]¶

Fit and generate for high dimensional case. To avoid memory error, features are trained and generated one by one.

Parameters:

n – Number of examples to sample.
p – The probability of changing a value. If p=0, the generated dataset will be equal to the original; if p=1, the generated dataset will contain only new values.
replace – If True, sample the original data with replacement before the imputations.
noise – If True, add noise relative to the residual matrix. NOT IMPLEMENTED (not possible?)
Returns:	Generated data
Return type:	pd.DataFrame

sample(n=1, p=0.8, replace=True, noise=False)[source]¶

Generate n rows by copying the data and then imputing values.

Parameters:

n – Number of examples to sample.
p – The probability of changing a value. If p=0, the generated dataset will be equal to the original; if p=1, the generated dataset will contain only new values.
replace – If True, sample the original data with replacement before the imputations.
noise – If True, add noise relative to the residual matrix.
Returns:	Generated data
Return type:	pd.DataFrame
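
The role of p and replace can be illustrated with a small standard-library sketch. It is not the library's implementation: the function name and the impute callback (standing in for the per-column model) are hypothetical, and rows are plain lists rather than a pd.DataFrame.

```python
import random


def sample_with_imputation(rows, impute, n=1, p=0.8, replace=True, seed=None):
    """Copy n rows, then overwrite each cell with probability p.

    `impute(row, j)` is a hypothetical callback standing in for the
    per-column model; it returns a new value for column j of `row`.
    """
    rng = random.Random(seed)
    if replace:
        # Rows may be picked more than once.
        picked = [list(rng.choice(rows)) for _ in range(n)]
    else:
        # Each row is picked at most once; requires n <= len(rows).
        picked = [list(r) for r in rng.sample(rows, n)]
    for row in picked:
        for j in range(len(row)):
            # p=0 never changes a value, p=1 changes every value.
            if rng.random() < p:
                row[j] = impute(row, j)
    return picked
```

With p=0 the output only contains copies of original rows; with p=1 every cell comes from the imputation callback.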

autopandas.generators.artificial module¶

class autopandas.generators.artificial.Artificial(method='moons')[source]¶

Bases: object

__init__(method='moons')[source]¶

Artificial data generator. Generate 2D classification datasets.

Parameters:	method – ‘moons’, ‘blobs’ or ‘circles’.

sample(n=1, noise=0.01)[source]¶

Sample data from the artificial data generator.

Parameters:	n – Number of artificial points to create.
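
For orientation, the 'moons' method can be pictured as the familiar two-moons toy dataset. The sketch below is an assumption about the shape of the data, not the library's code: the function name is hypothetical, and noise is taken to be the standard deviation of Gaussian jitter added to each coordinate.

```python
import math
import random


def two_moons(n=100, noise=0.01, seed=None):
    """Return (points, labels) for two interleaving half circles."""
    rng = random.Random(seed)
    points, labels = [], []
    for i in range(n):
        t = rng.uniform(0.0, math.pi)
        label = i % 2  # alternate between the two classes
        if label == 0:
            # Upper half circle centred at (0, 0).
            x, y = math.cos(t), math.sin(t)
        else:
            # Lower half circle centred at (1, 0.5).
            x, y = 1.0 - math.cos(t), 0.5 - math.sin(t)
        # Gaussian jitter on both coordinates (assumed meaning of `noise`).
        points.append((x + rng.gauss(0.0, noise), y + rng.gauss(0.0, noise)))
        labels.append(label)
    return points, labels
```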

autopandas.generators.autoencoder module¶

class autopandas.generators.autoencoder.AE(input_dim, layers=[], latent_dim=2, architecture='fully', loss='nll', optimizer='rmsprop', decoder_layers=None)[source]¶

Bases: object

__init__(input_dim, layers=[], latent_dim=2, architecture='fully', loss='nll', optimizer='rmsprop', decoder_layers=None)[source]¶

Autoencoder with fully connected layers.

Default behaviour: symmetric layers, but no weight sharing. For the CNN architecture, if latent_dim is None then there are no dense layers; in that case the latent space dimension depends on the convolutional layers.

Parameters:

input_dim – Input/output size.
layers – Dimension of intermediate layers (encoder and decoder). It can be an integer (one intermediate layer) or a list of integers (several intermediate layers).
latent_dim – Dimension of latent space layer.
architecture – ‘fully’, ‘cnn’.
epsilon_std – Standard deviation of the Gaussian prior distribution.
decoder_layers – Dimension of intermediate decoder layers for asymmetrical architectures.

distance(X, Y, **kwargs)[source]¶

Step 1: project X and Y into the learned latent space. Step 2: compute the distance between the projections (NNAA score by default).
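
The two steps can be sketched as follows. This is a sketch under assumptions: nnaa uses one common definition of the nearest-neighbor adversarial accuracy (NNAA) with Euclidean distance, which may differ from the library's own scoring, and encode is a hypothetical callable standing in for the trained encoder. Each set needs at least two points.

```python
import math


def _nn_dist(point, others, skip=None):
    """Distance from `point` to its nearest neighbour in `others`.

    `skip` is the index of `point` itself when it belongs to `others`.
    """
    return min(math.dist(point, o) for k, o in enumerate(others) if k != skip)


def nnaa(T, S):
    """Nearest-neighbor adversarial accuracy between point sets T and S.

    Close to 1.0: the sets are easy to tell apart.
    Close to 0.5: the sets are hard to tell apart.
    Close to 0.0: S is (nearly) a copy of T.
    """
    t_term = sum(
        _nn_dist(t, S) > _nn_dist(t, T, skip=i) for i, t in enumerate(T)
    ) / len(T)
    s_term = sum(
        _nn_dist(s, T) > _nn_dist(s, S, skip=i) for i, s in enumerate(S)
    ) / len(S)
    return 0.5 * (t_term + s_term)


def latent_distance(X, Y, encode):
    """Step 1: project with `encode` (hypothetical). Step 2: score with NNAA."""
    return nnaa([encode(x) for x in X], [encode(y) for y in Y])
```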

fit(X, X2=None, **kwargs)[source]¶

get_autoencoder()[source]¶

get_decoder()[source]¶

get_encoder()[source]¶

init_loss(loss='nll')[source]¶

init_model(architecture='fully')[source]¶

Parameters:	architecture – ‘fully’, ‘cnn’

sample(n=100, loc=0, scale=1)[source]¶

Parameters:	scale – Standard deviation of the Gaussian prior distribution.
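
A minimal sketch of what sampling from the prior amounts to, assuming latent vectors are drawn from a Gaussian with mean loc and standard deviation scale and then passed through the decoder. The function name, the latent_dim argument and the decode callable are hypothetical stand-ins, not the library's API.

```python
import random


def sample_from_prior(decode, n=100, latent_dim=2, loc=0.0, scale=1.0, seed=None):
    """Draw n latent vectors from N(loc, scale^2) and decode each one."""
    rng = random.Random(seed)
    latent = [
        [rng.gauss(loc, scale) for _ in range(latent_dim)] for _ in range(n)
    ]
    # `decode` stands in for the trained decoder network.
    return [decode(z) for z in latent]
```

A larger scale spreads the latent draws further from loc, which usually yields more varied but less faithful samples.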

siamese_distance(x, y, **kwargs)[source]¶

autopandas.generators.copula module¶

class autopandas.generators.copula.Copula[source]¶

Bases: object

__init__()[source]¶

Copula generator.

fit(data)[source]¶

Use the copula trick and train the generator with data.

Parameters:	data – Data frame to use as training set.
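
The forward half of the copula trick can be sketched as below: each column is replaced by its ranks, and the ranks are mapped to standard-normal scores so that a generator can be fitted on data with well-behaved marginals. The helper names are hypothetical and the exact rank-to-quantile convention is an assumption; the module's own helpers may differ.

```python
from statistics import NormalDist


def to_ranks(x):
    """Return the 0-based rank of each value (ties broken by position)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0] * len(x)
    for rank, i in enumerate(order):
        ranks[i] = rank
    return ranks


def to_normal_scores(x):
    """Map a column to standard-normal scores through its ranks."""
    n = len(x)
    std_normal = NormalDist()
    # (rank + 0.5) / n keeps every quantile strictly inside (0, 1).
    return [std_normal.inv_cdf((r + 0.5) / n) for r in to_ranks(x)]
```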

sample(n=1, replace=False)[source]¶

Sample from trained generator.

Parameters:

n – Number of examples to sample.
replace – If True, sample with replacement.

autopandas.generators.copula.copula_generate(X, generator=None, n=None)[source]¶

Generate using copula trick.

Parameters:

generator – Model to fit and sample from. KDE by default.
n – Number of examples to generate. By default it is the number of observations in X.

autopandas.generators.copula.marginal_retrofit(Xartif, Xreal)[source]¶

Retrofit the marginal distributions of the features in Xartif to those in Xreal.
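
For a single column, retrofitting a marginal can be sketched as a rank-matching step: each artificial value is replaced by the real value that holds the same rank. The function name is hypothetical and this is one plausible reading of the description, not the library's code.

```python
def retrofit_column(artif, real):
    """Give `artif` the empirical marginal of `real`, keeping artif's ordering."""
    real_sorted = sorted(real)
    order = sorted(range(len(artif)), key=lambda i: artif[i])
    out = [None] * len(artif)
    for rank, i in enumerate(order):
        # Scale the rank so columns of different lengths still line up.
        k = rank * len(real_sorted) // len(artif)
        out[i] = real_sorted[k]
    return out
```

The output contains only values observed in the real column, while the relative order of the artificial values (and hence the dependence between columns) is preserved.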

autopandas.generators.copula.matrix_to_rank(X)[source]¶

autopandas.generators.copula.rank_matrix_to_inverse(X)[source]¶

autopandas.generators.copula.rank_vector_to_inverse(x)[source]¶

autopandas.generators.copula.vector_to_rank(x, reverse=False)[source]¶

autopandas.generators.copycat module¶

class autopandas.generators.copycat.Copycat[source]¶

Bases: object

__init__()[source]¶

Baseline generator: simply copy training data.

fit(data)[source]¶

Train the generator with data.

Parameters:	data – The data to copy.

sample(n=1, replace=False)[source]¶

Sample from train data.

Parameters:

n – Number of examples to sample.
replace – If True, sample with replacement.
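
The effect of replace can be shown with a standard-library sketch (the function name is hypothetical; rows are plain Python objects rather than a data frame). In this sketch, sampling without replacement cannot return more rows than were stored.

```python
import random


def copycat_sample(rows, n=1, replace=False, seed=None):
    """Return n stored rows, with or without replacement."""
    rng = random.Random(seed)
    if replace:
        # The same row may appear several times; n may exceed len(rows).
        return rng.choices(rows, k=n)
    # Each row appears at most once; raises ValueError if n > len(rows).
    return rng.sample(rows, n)
```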

autopandas.generators.gmm module¶

class autopandas.generators.gmm.GMM(**kwargs)[source]¶

Bases: object

__init__(**kwargs)[source]¶

Gaussian Mixture Model.

fit(data, **kwargs)[source]¶

Train the generator with data.

Parameters:	data – The training data.

sample(n=1, **kwargs)[source]¶

Sample from trained GMM.

Parameters:	n – Number of examples to sample.

autopandas.generators.kde module¶

class autopandas.generators.kde.KDE(**kwargs)[source]¶

Bases: object

__init__(**kwargs)[source]¶

Kernel Density Estimation (Parzen windows).

fit(data, **kwargs)[source]¶

Train the generator with data.

Parameters:	data – The training data.

sample(n=1, **kwargs)[source]¶

Sample from trained KDE.

Parameters:	n – Number of examples to sample.
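
Sampling from a kernel density estimate can be sketched in two steps: pick a training point at random, then perturb it with kernel noise. The sketch assumes a Gaussian kernel with a hypothetical bandwidth argument; the function name is also hypothetical and the library may delegate to another implementation.

```python
import random


def kde_sample(points, n=1, bandwidth=0.1, seed=None):
    """Draw n samples from a Gaussian KDE fitted on `points`."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Step 1: choose one training point uniformly at random.
        center = rng.choice(points)
        # Step 2: add Gaussian noise with standard deviation `bandwidth`.
        samples.append([rng.gauss(c, bandwidth) for c in center])
    return samples
```

A bandwidth close to zero reproduces the training points, while a large bandwidth blurs the structure of the data.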

autopandas.generators.sae module¶

class autopandas.generators.sae.SAE(layers, normalization=False, **kwargs)[source]¶

Bases: autopandas.generators.autoencoder.AE

__init__(layers, normalization=False, **kwargs)[source]¶

Stacked Autoencoder. AE with submodel training.

Parameters:	layers – Dimension list of layers including input, intermediate (at least one) and latent layer.

autoencode(X)[source]¶

decode(X)[source]¶

encode(X)[source]¶

fit(X, epochs=10, validation_data=None, **kwargs)[source]¶

normalize(X, i=None)[source]¶

reset_normalization()[source]¶

sample(n=100, loc=0, scale=1)[source]¶

Parameters:	scale – Standard deviation of the Gaussian prior distribution.

autopandas.generators.sae.merge(model1, model2)[source]¶

autopandas.generators.vae module¶

class autopandas.generators.vae.KLDivergenceLayer(*args, **kwargs)[source]¶

Bases: tensorflow.python.keras.engine.base_layer.Layer

Identity transform layer that adds KL divergence to the final model loss.

__init__(*args, **kwargs)[source]¶

call(inputs)[source]¶

This is where the layer’s logic lives.

Parameters:

inputs – Input tensor, or list/tuple of input tensors.
**kwargs – Additional keyword arguments.
Returns:	A tensor or list/tuple of tensors.
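
The quantity such a layer adds to the loss can be illustrated with the closed-form KL divergence between a diagonal Gaussian and a standard normal prior. The sketch assumes the layer receives the mean and log-variance of the approximate posterior and that the prior has unit variance; the function name is hypothetical and the layer itself operates on tensors, not Python lists.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one latent vector."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )
```

The value is zero when the posterior equals the prior (mu = 0, log_var = 0) and grows as the posterior moves away from it.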

class autopandas.generators.vae.VAE(input_dim, layers=[], latent_dim=2, architecture='fully', epsilon_std=1.0, loss='nll', optimizer='rmsprop', decoder_layers=None)[source]¶

Bases: autopandas.generators.autoencoder.AE

__init__(input_dim, layers=[], latent_dim=2, architecture='fully', epsilon_std=1.0, loss='nll', optimizer='rmsprop', decoder_layers=None)[source]¶

Variational Autoencoder.

Parameters:

input_dim – Input/output size.
layers – Dimension of intermediate layers (encoder and decoder). It can be an integer (one intermediate layer) or a list of integers (several intermediate layers).
latent_dim – Dimension of latent space layer.
architecture – ‘fully’, ‘cnn’.
epsilon_std – Standard deviation of the Gaussian prior distribution.
decoder_layers – Dimension of intermediate decoder layers for asymmetrical architectures.

Module contents¶