Tag Archives: encoder

Compression of data using Autoencoders

In the last blog, we discussed what autoencoders are. In this blog, we will learn how autoencoders can be used to compress data and then reconstruct the original data.

Here I have used the MNIST dataset. First, I downloaded the MNIST dataset, which contains images of handwritten digits (0 to 9) and is about 45 MB in total. Let’s see the code to download the data using Python.

# download the MNIST training images, scale them to [0, 1] and add a channel axis
from keras.datasets import mnist
(X_train, _), (_, _) = mnist.load_data()
X_train = X_train.astype('float32') / 255.
output_X_train = X_train.reshape(-1,28,28,1)

Since we want to compress the dataset and later reconstruct the original data, we first have to create a convolutional autoencoder. Let’s see the code:

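# creating autoencoder model
from keras.layers import (Input, Conv2D, MaxPooling2D, Flatten, Dense,
                          Reshape, Conv2DTranspose)
from keras.models import Model

encoder_inputs = Input(shape = (28,28,1))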

conv1 = Conv2D(16, (3,3), activation = 'relu', padding = "SAME")(encoder_inputs)
pool1 = MaxPooling2D(pool_size = (2,2), strides = 2)(conv1)
conv2 = Conv2D(32, (3,3), activation = 'relu', padding = "SAME")(pool1)
pool2 = MaxPooling2D(pool_size = (2,2), strides = 2)(conv2)
flat = Flatten()(pool2)

enocoder_outputs = Dense(32, activation = 'relu')(flat)
#upsampling in decoder

dense_layer_d = Dense(7*7*32, activation = 'relu')(enocoder_outputs)
output_from_d = Reshape((7,7,32))(dense_layer_d)
conv1_1 = Conv2D(32, (3,3), activation = 'relu', padding = "SAME")(output_from_d)
upsampling_1 = Conv2DTranspose(32, 3, padding='same', activation='relu', strides=(2, 2))(conv1_1)
upsampling_2 = Conv2DTranspose(16, 3, padding='same', activation='relu', strides=(2, 2))(upsampling_1)
decoded_outputs = Conv2DTranspose(1, 3, padding='same', activation='relu')(upsampling_2)

autoencoder = Model(encoder_inputs, decoded_outputs)

From this autoencoder model, I have created separate encoder and decoder models. The encoder model compresses the data, and the decoder model is used to reconstruct the original data. Then I trained the autoencoder model. Because the encoder and decoder reuse its layers, they share the trained weights.

decoder_input = Input(shape = (32,))
next_layer = decoder_input
for layer in autoencoder.layers[-6:]:  # the last six layers form the decoder
    next_layer = layer(next_layer)

decoder = Model(decoder_input, next_layer)

encoder = Model(encoder_inputs, enocoder_outputs)

m = 256 # batch size
n_epoch = 100
autoencoder.compile(optimizer='adam', loss='mse')
autoencoder.fit(output_X_train,output_X_train, epochs=n_epoch, batch_size=m, shuffle=True)

Using the encoder model, we can save the compressed data into a text file. The file is about 18 MB, much less than the original 45 MB.

encoded = encoder.predict(output_X_train)
with open('compressed_data.txt', 'w') as data_file:
    for data in encoded:
        for each_data in data:
            data_file.write(str(each_data))
            data_file.write('\n')
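
As a rough check on the sizes quoted above, the arithmetic can be sketched as follows. This assumes the 60,000 MNIST training images are stored as one byte per pixel; the float32 figure is only an illustration of what a binary format would need and is not something measured in this post.

```python
# Rough storage arithmetic for the MNIST training set (60,000 images of 28x28 pixels).
n_images = 60000
latent_dim = 32                              # size of the encoder output used above

raw_bytes = n_images * 28 * 28               # one byte (uint8) per pixel
float32_bytes = n_images * latent_dim * 4    # four bytes per float32 latent value

MIB = 1024 ** 2
print(f"raw pixels:      {raw_bytes / MIB:.1f} MiB")
print(f"float32 latents: {float32_bytes / MIB:.1f} MiB")
```

A text file spends several characters on every value, which is why it ends up larger than a binary file holding the same numbers would be.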

The next question is how to reconstruct the compressed data when the original data is needed. The simple solution is to save the decoder model and its weights, which can be used later to reconstruct the compressed data. Let’s save the decoder model and its weights.

decoder.save_weights('decoder.h5')
decoder_json = decoder.to_json()
with open('decoder.json', 'w') as json_file:
    json_file.write(decoder_json)

Finally, we have our compressed data and the decoder model. Let’s see how we can reconstruct the images using these two.

# reading compressed data
import numpy as np
from keras.models import model_from_json

with open('compressed_data.txt') as data_file:
    data = data_file.readlines()

compressed_data = [float(x.strip()) for x in data]
# regroup the flat list into rows of 32 features, one row per image
compressed_data = [compressed_data[i:i+32] for i in range(0, len(compressed_data), 32)]

# load decoder model and its weights
with open('decoder.json', 'r') as json_file:
    loaded_json_model = json_file.read()
decoder = model_from_json(loaded_json_model)
decoder.load_weights('decoder.h5')

decoded_imgs = decoder.predict(np.array(compressed_data))
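
The write-then-regroup step can be tried without a trained model. The sketch below is only an illustration: random numbers stand in for the encoder output, a temporary directory is used for the file, and np.savetxt / np.loadtxt replace the explicit loops while keeping the same one-value-per-line layout.

```python
import os
import tempfile

import numpy as np

# Stand-in for encoder.predict(...): 5 samples with 32 features each.
rng = np.random.default_rng(0)
encoded = rng.random((5, 32)).astype("float32")

path = os.path.join(tempfile.mkdtemp(), "compressed_data.txt")

# Write one value per line, the same layout the loop above produces.
np.savetxt(path, encoded.reshape(-1))

# Read everything back and regroup into rows of 32 features.
restored = np.loadtxt(path).reshape(-1, 32)

print(restored.shape)
print(np.allclose(restored, encoded))
```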

Above is the output from the decoder model.

It looks fascinating to compress data to a smaller size and get the same data back when we need it, but there are some real problems with this method.

The problem is that autoencoders do not generalize well. An autoencoder can only reconstruct images similar to those it was trained on. But with the advancement of deep learning, the day is not far away when this type of compression will be in common use.
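
One way to see how well or badly a reconstruction matches its source is to compute the error for each image. The helper below is a minimal sketch; per_image_mse is a hypothetical name, and the arrays are toy data standing in for the real images and decoder outputs.

```python
import numpy as np

def per_image_mse(original, reconstructed):
    """Mean squared error for each image in a batch of shape (n, 28, 28, 1)."""
    diff = original.astype("float64") - reconstructed.astype("float64")
    return np.mean(diff ** 2, axis=(1, 2, 3))

# Toy data standing in for real images and decoder outputs.
original = np.zeros((2, 28, 28, 1))
reconstructed = np.zeros((2, 28, 28, 1))
reconstructed[1] += 0.5          # the second "reconstruction" is off by 0.5 everywhere

errors = per_image_mse(original, reconstructed)
print(errors)
```

Comparing these errors on training images against images from a different source would show the generalization gap described above.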

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Variational Autoencoders

Variational autoencoders are an extension of autoencoders and are used as generative models. You can generate data like text, images and even music with the help of variational autoencoders.

Autoencoders are neural networks used to reconstruct their original input. To know more about autoencoders, please go through this blog. They have certain applications, such as denoising and dimensionality reduction for data visualization. But apart from that, they are fairly limited.

To overcome this limitation, variational autoencoders come into play. A common autoencoder learns a function that is not trained to generate images from a particular distribution. Also, if you try to create a generative model using autoencoders, you do not want to generate data exactly as it is in the input. You want output data with some variations that still mostly looks like the input data.

Variational Autoencoder Model

A variational autoencoder has encoder and decoder parts that are mostly the same as in autoencoders. The difference is that instead of producing a compact representation from its encoder, it learns a latent variable model. These latent variables are used to create a probability distribution from which the input for the decoder is sampled. Another difference is that instead of using only a mean squared error or cross-entropy loss function (as in autoencoders), it has its own loss function.

I will not go further into the mathematics behind it. Let’s jump into the code, which will give more understanding of variational autoencoders. To know more about the mathematics behind it, please go through this tutorial.

I have implemented a variational autoencoder in Keras using the MNIST dataset. So let’s first download the data.

# download training and test data from mnist and reshape it

from keras.datasets import mnist
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
X_train = X_train.astype('float32') / 255.
X_train = X_train.reshape(-1,28,28,1)

X_test = X_test.astype('float32') / 255.
X_test = X_test.reshape(-1,28,28,1)
print(X_train.shape, X_test.shape)

Now create an encoder model in the same way as for a plain autoencoder.

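# Create encoder network
import keras
from keras import backend as K
from keras.layers import (Input, Conv2D, MaxPooling2D, Flatten, Dense,
                          Lambda, Reshape, Conv2DTranspose)
from keras.models import Model

inputs = Input(shape = (28,28,1))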
conv1 = Conv2D(16, (3,3), activation = 'relu', padding = "SAME")(inputs)
conv1_1 = Conv2D(16, (3,3), activation = 'relu', padding = "SAME")(conv1)
pool1 = MaxPooling2D(pool_size = (2,2), strides = 2)(conv1_1)
conv2 = Conv2D(32, (3,3), activation = 'relu', padding = "SAME")(pool1)
pool2 = MaxPooling2D(pool_size = (2,2), strides = 2)(conv2)

flat = Flatten()(pool2)
input_to_z = Dense(32, activation = 'relu')(flat)

Latent Distribution Parameters and Function

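Now map the output of the encoder to the latent distribution parameters. Here, I have created two parameters, mu and sigma, which represent the mean and the log-variance of the latent distribution. Despite its name, sigma holds the log-variance, as its layer name log_var and the KL term in the loss function show.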

latent_dim = 2 # dimension of latent variable
mu = Dense(latent_dim, name='mu')(input_to_z)
sigma = Dense(latent_dim, name='log_var')(input_to_z)

encoder = Model(inputs, mu)

Here I have taken the latent space dimension equal to 2. This is the bottleneck, which means each image is squeezed into just two variables. If we increase the latent space dimension to 5, 10 or higher, we can get better results in the output, but this will put more data in the bottleneck.

Now create a sampling function that draws noise from a Gaussian distribution with mean zero and standard deviation 1, then shifts and scales it using mu and sigma. This noise gives variation in the input to the decoder, which helps to get variation in the output. The decoder then predicts the output from the sampled vector.

# create latent distribution function and generate vectors

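def sampling(args):
    mu, sigma = args  # sigma holds the log-variance (layer 'log_var')
    epsilon = K.random_normal(shape=(K.shape(mu)[0], latent_dim),
                              mean=0., stddev=1.)
    # standard deviation = exp(0.5 * log-variance), consistent with the KL term
    return mu + K.exp(0.5 * sigma) * epsilon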

z = Lambda(sampling)([mu, sigma])

#create decoder network which is reverse of encoder

decoder_inputs = Input(K.int_shape(z)[1:])
dense_layer_d = Dense(7*7*32, activation = 'relu')(decoder_inputs)
output_from_z_d = Reshape((7,7,32))(dense_layer_d)
trans1_d = Conv2DTranspose(32, 3, padding='same', activation='relu', strides=(2, 2))(output_from_z_d)
trans1_1_d = Conv2DTranspose(16, 3, padding='same', activation='relu', strides=(2, 2))(trans1_d)
trans2_d = Conv2DTranspose(1, 3, padding='same', activation='relu')(trans1_1_d)


decoder = Model(decoder_inputs, trans2_d)
z_decoded = decoder(z)
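
The sampling step can be checked outside Keras. The NumPy sketch below assumes the second parameter is a log-variance, as the layer name log_var suggests; sample_latent and the chosen mean and variance are hypothetical values picked only for the check.

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Draw z = mu + std * epsilon with epsilon ~ N(0, 1)."""
    epsilon = rng.standard_normal(mu.shape)
    std = np.exp(0.5 * log_var)          # log-variance -> standard deviation
    return mu + std * epsilon

rng = np.random.default_rng(0)
n = 100000
mu = np.full((n, 2), 3.0)                # hypothetical mean
log_var = np.full((n, 2), np.log(4.0))   # variance 4, so standard deviation 2

z = sample_latent(mu, log_var, rng)
print(z.mean(axis=0), z.std(axis=0))
```

The printed mean and standard deviation should land close to 3 and 2, which confirms that the samples follow the distribution described by the two parameters.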

Loss Function

For the loss function, a variational autoencoder uses the sum of two losses. One is the generative (reconstruction) loss, which here is a binary cross-entropy loss and measures how accurately the image is reconstructed. The other is the latent loss, a KL divergence loss that measures how closely the latent variables match a standard Gaussian distribution. This KL divergence makes sure that the distribution produced by the encoder does not drift far from the origin. Then train the model.

#calculate reconstruction loss and KL divergence

class calc_output_with_los(keras.layers.Layer):

    def vae_loss(self, x, z_decoded):
        x = K.flatten(x)
        z_decoded = K.flatten(z_decoded)

        xent_loss = keras.metrics.binary_crossentropy(x, z_decoded)

        kl_loss = -5e-4 * K.mean(1 + sigma - K.square(mu) - K.exp(sigma), axis=-1)
        return K.mean(xent_loss + kl_loss)

    def call(self, inputs):
        x = inputs[0]
        z_decoded = inputs[1]
        loss = self.vae_loss(x, z_decoded)
        self.add_loss(loss, inputs=inputs)
        return x

outputs = calc_output_with_los()([inputs, z_decoded])

# define variational autoencoder model and train it

vae = Model(inputs, outputs)
m = 256
n_epoch = 10
vae.compile(optimizer='adam', loss=None)
vae.fit(X_train, epochs=n_epoch, batch_size=m, shuffle=True, validation_data=(X_test, None))
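
For reference, the KL term has a closed form when the encoder outputs a diagonal Gaussian and the target is a standard Gaussian. The NumPy sketch below shows that textbook form, summed over the latent dimensions with a factor of 0.5; the code above uses a mean and a much smaller factor (5e-4), which weights the KL term less. The function name kl_to_standard_normal is hypothetical.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=-1)

mu = np.array([[0.0, 0.0], [1.0, 1.0]])
log_var = np.zeros((2, 2))               # variance 1 in every dimension

kl = kl_to_standard_normal(mu, log_var)
print(kl)
```

The first row already matches the standard Gaussian, so its divergence is zero; the second row is shifted away from the origin and is penalized.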

Our model is ready, and we can generate images from it very easily. All we need to do is sample a latent variable from the distribution and pass it to the decoder. Let’s test with the following code:

import numpy as np
import matplotlib.pyplot as plt

n = 15  # figure with 15x15 digits

digit_size = 28
figure = np.zeros((digit_size * n, digit_size * n))

grid_x = np.linspace(-1, 1, n)
grid_y = np.linspace(-1, 1, n)

for i, yi in enumerate(grid_x):
    for j, xi in enumerate(grid_y):
        z_sample = np.array([[xi, yi]]) * 1.
        x_decoded = decoder.predict(z_sample)

        digit = x_decoded[0].reshape(digit_size, digit_size)
        figure[i * digit_size: (i + 1) * digit_size,
               j * digit_size: (j + 1) * digit_size] = digit

plt.figure(figsize=(10, 10))
plt.imshow(figure)
plt.show()

Here is the output generated from the sampled latent vectors in the above code.

The full code can be found here.

Hope you understand the basics of variational autoencoders. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Referenced papers: Auto-Encoding Variational Bayes, Tutorial on Variational Autoencoders

Denoising Autoencoders

In my previous blog, we discussed what an autoencoder is, its applications, and a simple implementation in Keras. In this blog, we will see a variant of the autoencoder: the denoising autoencoder.

A denoising autoencoder is an extension of the autoencoder. An autoencoder tries to learn the identity function (output equal to input), which puts it at risk of not learning useful features. One method to overcome this problem is to use denoising autoencoders.

To train a denoising autoencoder, we need noisy input data. For that, we add some noise to the original images. The amount of corruption depends on the amount of information present in the data. Usually, 25-30% of the data is corrupted. This can be higher if your data contains less information. Let’s see how you can add noise to the data in code:

# adding some Gaussian noise to data
import numpy as np

input_x_train = output_X_train + 0.5 * np.random.normal(loc=0.0, scale=1.0, size=output_X_train.shape)
input_x_test = output_X_test + 0.5 * np.random.normal(loc=0.0, scale=1.0, size=output_X_test.shape)
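
Two variations on this step are sketched below with random stand-in images. Neither is part of the code above: clipping the noisy values back into [0, 1] is an optional extra, and masking noise (zeroing a fraction of the pixels) is a different corruption scheme that maps directly onto the "25-30%" figure mentioned earlier. The 0.3 fraction is an assumed example value.

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.random((4, 28, 28, 1)).astype("float32")   # stand-in for scaled MNIST images

# Additive Gaussian noise as above, then clipped back into the valid pixel range.
gaussian_noisy = np.clip(images + 0.5 * rng.normal(0.0, 1.0, images.shape), 0.0, 1.0)

# Masking noise: set roughly 30% of the pixels to zero.
keep = rng.random(images.shape) >= 0.3
masked = images * keep

print(gaussian_noisy.min(), gaussian_noisy.max())
print(1.0 - keep.mean())                 # fraction of pixels that were zeroed
```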

To calculate the loss, the output of the denoising autoencoder is compared to the original input instead of the corrupted one. Such a loss function trains the model to learn interesting features rather than the identity function.

I have implemented a denoising autoencoder in Keras using MNIST data, which will give you an overview of how a denoising autoencoder works.

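# creating denoising autoencoder model
from keras.layers import Input, Conv2D, MaxPooling2D, Conv2DTranspose
from keras.models import Model

inputs = Input(shape = (28,28,1))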

conv1 = Conv2D(16, (3,3), activation = 'relu', padding = "SAME")(inputs)
pool1 = MaxPooling2D(pool_size = (2,2), strides = 2)(conv1)
conv2 = Conv2D(32, (3,3), activation = 'relu', padding = "SAME")(pool1)
pool2 = MaxPooling2D(pool_size = (2,2), strides = 2)(conv2)

upsampling_1 = Conv2DTranspose(32, 3, padding='same', activation='relu', strides=(2, 2))(pool2)
upsampling_2 = Conv2DTranspose(16, 3, padding='same', activation='relu', strides=(2, 2))(upsampling_1)
outputs = Conv2DTranspose(1, 3, padding='same', activation='relu')(upsampling_2)

autoencoder = Model(inputs, outputs)
m = 256
n_epoch = 10
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(input_x_train,output_X_train, epochs=n_epoch, batch_size=m, shuffle=True)

Following is the result of the denoising autoencoder.

The full code can be found here.

Hope you understand the usefulness of denoising autoencoders. In the next blog, we will feature variational autoencoders. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Autoencoders

Let’s start with a simple definition of autoencoders: ‘Autoencoders are neural networks trained to reconstruct their original input’.

Now, you might be wondering what the use of reconstructing the same data is. Let me give you an example. If you want to transfer gigabytes of data, and you can somehow compress it into megabytes and then reconstruct the data at the original size, isn’t that a better way to transfer data? This is one of the applications of autoencoders.

Autoencoders generally consist of two parts: an encoder and a decoder. The encoder downscales the data to a smaller number of features, and the decoder upscales the extracted features back to the original size.

There are some practical applications of autoencoders:

Dimensionality reduction for data visualization
Image Denoising
Generative Models

Visualizing a 10-dimensional vector is difficult. To overcome this problem, we need to reduce that 10-dimensional vector to 2-D or 3-D. One famous algorithm, PCA (Principal Component Analysis), tries to solve this problem. PCA uses linear transformations, while autoencoders can use both linear and non-linear transformations for dimensionality reduction. This lets autoencoders learn more complex and interesting features than PCA.
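
For comparison, the linear reduction that PCA performs fits in a few lines of NumPy. This is a minimal sketch using the singular value decomposition; pca_project is a hypothetical helper name and the data is random, standing in for real 10-dimensional points.

```python
import numpy as np

def pca_project(data, n_components=2):
    """Project the rows of `data` onto their top principal components."""
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, components

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 10))        # 200 hypothetical 10-dimensional points

projected, components = pca_project(data)
print(projected.shape)
```

The projection is a single matrix multiplication, which is what "linear" means here; an autoencoder replaces it with stacked layers and non-linear activations.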

Autoencoders can be used to remove the noise present in an image. They can also be used to generate new images required for a specific task. We will see more about these two applications in the next blog.

Now, let’s start with the simple implementation of autoencoders in Keras using MNIST data. First, let’s download MNIST training and test data and reshape it.

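# download MNIST training and test data, scale to [0, 1] and add a channel axis
from keras.datasets import mnist

(X_train, Y_train), (X_test, Y_test) = mnist.load_data()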
X_train = X_train.astype('float32') / 255.
output_X_train = X_train.reshape(-1,28,28,1)

X_test = X_test.astype('float32') / 255.
output_X_test = X_test.reshape(-1,28,28,1)

print(X_train.shape, X_test.shape)

Encoder

MNIST data consists of images of digits, so it is better to use a convolutional neural network in our encoder and decoder. In the encoder, I have used conv and max-pooling layers to extract the compressed representation. The encoder output is then flattened and reduced to 32 features by a dense layer; these features will be the input to the decoder.

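from keras.layers import (Input, Conv2D, MaxPooling2D, Flatten, Dense,
                          Reshape, Conv2DTranspose)
from keras.models import Model

encoder_inputs = Input(shape = (28,28,1))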

conv1 = Conv2D(16, (3,3), activation = 'relu', padding = "SAME")(encoder_inputs)
pool1 = MaxPooling2D(pool_size = (2,2), strides = 2)(conv1)
conv2 = Conv2D(32, (3,3), activation = 'relu', padding = "SAME")(pool1)
pool2 = MaxPooling2D(pool_size = (2,2), strides = 2)(conv2)
flat = Flatten()(pool2)

enocder_outputs = Dense(32, activation = 'relu')(flat)
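
To see how the 28x28 image shrinks before the dense layer, the spatial sizes can be traced by hand. The sketch below uses two hypothetical helpers, same_conv and max_pool, and assumes the Keras defaults used above: 'same' padding for the convolutions and 'valid' padding for max pooling.

```python
import math

def same_conv(size, stride=1):
    """Output length of a 'same'-padded convolution."""
    return math.ceil(size / stride)

def max_pool(size, pool=2, stride=2):
    """Output length of max pooling with 'valid' padding."""
    return (size - pool) // stride + 1

size = 28
size = same_conv(size)     # conv1: 28 -> 28, 16 channels
size = max_pool(size)      # pool1: 28 -> 14
size = same_conv(size)     # conv2: 14 -> 14, 32 channels
size = max_pool(size)      # pool2: 14 -> 7
flat_features = size * size * 32   # length of the flattened vector

print(size, flat_features)
```

The flattened vector of 7*7*32 values is what the final dense layer reduces to 32 features.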

Decoder

In the decoder, we need to upsample the extracted 32 features to the original size of the image. To achieve this, I have used Conv2DTranspose layers from Keras. The final layer of the decoder gives the reconstructed output, which will be similar to the original input.

dense_layer_d = Dense(7*7*32, activation = 'relu')(enocder_outputs)
output_from_d = Reshape((7,7,32))(dense_layer_d)
conv1_1 = Conv2D(32, (3,3), activation = 'relu', padding = "SAME")(output_from_d)
upsampling_1 = Conv2DTranspose(32, 3, padding='same', activation='relu', strides=(2, 2))(conv1_1)
upsampling_2 = Conv2DTranspose(16, 3, padding='same', activation='relu', strides=(2, 2))(upsampling_1)
decoded_outputs = Conv2DTranspose(1, 3, padding='same', activation='relu')(upsampling_2)

autoencoder = Model(encoder_inputs, decoded_outputs)
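
The decoder mirrors the encoder's shape changes in reverse. As a rough sketch, with 'same' padding a transposed convolution multiplies the spatial size by its stride, so the shapes can be traced as follows; same_conv_transpose is a hypothetical helper, not a Keras function.

```python
def same_conv_transpose(size, stride=1):
    """Output length of a 'same'-padded transposed convolution."""
    return size * stride

shapes = [(7, 7, 32)]                     # after Reshape (and the stride-1 Conv2D)
size = 7
for filters, stride in [(32, 2), (16, 2), (1, 1)]:
    size = same_conv_transpose(size, stride)
    shapes.append((size, size, filters))

print(shapes)
```

The last entry has the same shape as the input image, which is what allows the output to be compared with the input pixel by pixel during training.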

To minimize the reconstruction loss, we train the network on a large dataset and update its weights. Now that our model is created, the next thing is to compile and train it.

m = 256
n_epoch = 10
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(output_X_train,output_X_train, epochs=n_epoch, batch_size=m, shuffle=True)
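
To make the loss less of a black box, here is a minimal NumPy sketch of binary cross-entropy averaged over pixels. The clipping constant eps is an assumption added to avoid log(0); the toy arrays stand in for target and predicted pixel values.

```python
import numpy as np

def binary_crossentropy(target, predicted, eps=1e-7):
    """Mean binary cross-entropy between targets and predictions in [0, 1]."""
    predicted = np.clip(predicted, eps, 1.0 - eps)   # avoid log(0)
    return -np.mean(target * np.log(predicted)
                    + (1.0 - target) * np.log(1.0 - predicted))

target = np.array([0.0, 1.0, 1.0, 0.0])
loss_good = binary_crossentropy(target, np.array([0.01, 0.99, 0.99, 0.01]))
loss_uncertain = binary_crossentropy(target, np.full(4, 0.5))

print(loss_good, loss_uncertain)
```

Predictions close to the targets give a loss near zero, while predicting 0.5 everywhere gives ln 2, so a falling training loss means the reconstructions are moving toward the input pixels.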

Below are the results from autoencoder trained above. The first line of digits shows the original input (test images) while the second line represents the reconstructed inputs from the model.

The full code can be found here.

Hope you understand the basics of autoencoders, where they can be used, and how a simple autoencoder can be implemented. In the next blog, we will see how to denoise an image using autoencoders. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Referenced Research Paper: http://proceedings.mlr.press/v27/baldi12a/baldi12a.pdf

TheAILearner

Mastering Artificial Intelligence
