EAT-NAS: Elastic Architecture Transfer for Neural Architecture Search

Recently, Jiemin Fang et al. published a paper that introduces a method to accelerate neural architecture search, called “Elastic Architecture Transfer for Accelerating Large-scale Neural Architecture Search”. In this blog we will learn what neural architecture search is, what limitations are associated with it, and how this paper overcomes those limitations.

Neural Architecture Search

Neural architecture search, as its name suggests, is a method to automatically search for the best network architecture for a given problem. If you have worked with neural networks, you may have encountered the problem of selecting the best hyperparameters for the network, i.e. which optimizer to select, what learning rate to use, how many layers to add, and so on. To solve this problem, different methods have been proposed, such as evolutionary search, reinforcement learning, and gradient-based optimization.
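To make that selection problem concrete, here is a minimal, purely hypothetical sketch that treats it as a search: a tiny space of configuration choices and a random-search loop that keeps the configuration with the best (stubbed) validation score. The option values and the `evaluate` stub are my own illustration, not from the paper.

```python
import random

# Hypothetical example: a toy hyperparameter search space and a simple random search.
# The options and the evaluate() stub are illustrative, not from the paper.
search_space = {
    "optimizer": ["sgd", "adam", "rmsprop"],
    "learning_rate": [1e-1, 1e-2, 1e-3],
    "num_layers": [4, 8, 12],
}

def evaluate(config):
    # Stand-in for training a network with `config` and measuring validation accuracy.
    return random.random()

best_config, best_score = None, float("-inf")
for _ in range(20):  # sample 20 random configurations
    config = {name: random.choice(options) for name, options in search_space.items()}
    score = evaluate(config)
    if score > best_score:
        best_config, best_score = config, score

print(best_config, best_score)
```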

A neural architecture search method is not fully automated, as it relies on a human-designed architecture as its starting point. These methods consist of three components (a minimal code sketch follows the list below):

  1. Search Space: A well-designed space of architectures in which the method searches for the best parameters.
  2. Search Method: The strategy used to explore that space, e.g. reinforcement learning, evolutionary search, or gradient-based optimization.
  3. Evaluation Strategy: How candidate architectures are scored and compared to find the best model, e.g. by validation accuracy.
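The sketch below, under the same kind of simplifying assumptions, shows how these three components fit together: a toy search space of layer operations, a tiny evolutionary search method, and an evaluation stub standing in for validation accuracy. None of the names or values come from the paper.

```python
import random

# Illustrative sketch of the three NAS components wired together; the operation set,
# mutation rule, and evaluation stub are my own assumptions, not the paper's.

# 1. Search space: an architecture is a sequence of per-layer operation choices.
OPS = ["conv3x3", "conv5x5", "skip"]

def random_architecture(num_layers=6):
    return [random.choice(OPS) for _ in range(num_layers)]

# 3. Evaluation strategy: normally "train the model and measure validation accuracy";
#    stubbed here with a random score so the sketch runs on its own.
def evaluate(arch):
    return random.random()

# 2. Search method: a tiny evolutionary loop that mutates the best architecture so far.
def mutate(arch):
    child = list(arch)
    child[random.randrange(len(child))] = random.choice(OPS)
    return child

best = random_architecture()
best_score = evaluate(best)
for _ in range(50):
    child = mutate(best)
    score = evaluate(child)
    if score > best_score:
        best, best_score = child, score

print(best, best_score)
```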

Problem: Any neural architecture search method requires a large amount of computation. Even with recent advances in the field, finding the best architecture can still take many GPU-days.

EAT-NAS

To reduce this computational cost, many existing approaches first search for architectures on a small dataset and then apply them directly to a large dataset. However, an architecture searched on a small dataset gives no guarantee of good performance when applied directly to a large dataset. To address this, the authors of EAT-NAS introduce an elastic architecture transfer method to accelerate neural architecture search.

How EAT-NAS works:

In this method, architectures are first searched on a small dataset, and the best one is selected as the basic architecture for the large dataset. This basic architecture is then transferred elastically to the large dataset to accelerate the search process there (as illustrated in the figure in the paper).
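A rough way to picture this transfer step, under my own simplifying assumptions, is shown below: the best architecture from the small-dataset search becomes the seed of the population for the large-dataset search, so the second search starts from a good region instead of from random architectures. The stubs (`evaluate_on`, `perturb`) are placeholders, and the real EAT-NAS search is evolutionary rather than the single pass shown here.

```python
import random

# Sketch of the elastic-transfer idea, under simplifying assumptions: the best
# architecture found on the small dataset seeds the population for the large-dataset
# search instead of starting from scratch. evaluate_on() and perturb() are stubs.

OPS = ["conv3x3", "conv5x5", "skip"]

def evaluate_on(dataset, arch):
    # Stand-in for training `arch` on `dataset` and returning validation accuracy.
    return random.random()

def perturb(arch):
    # Small "elastic" change to one element of the architecture encoding.
    child = list(arch)
    child[random.randrange(len(child))] = random.choice(OPS)
    return child

# Step 1: search on the small dataset (abbreviated here to random sampling).
small_candidates = [[random.choice(OPS) for _ in range(6)] for _ in range(10)]
basic_arch = max(small_candidates, key=lambda a: evaluate_on("CIFAR-10", a))

# Step 2: elastic transfer -- the large-dataset population starts from the seed
# architecture and small perturbations of it, not from random architectures.
population = [basic_arch] + [perturb(basic_arch) for _ in range(9)]
best_large = max(population, key=lambda a: evaluate_on("ImageNet", a))
print(best_large)
```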

The authors search for an architecture on the CIFAR-10 dataset and then elastically transfer it to the large ImageNet dataset. Let’s walk through the whole EAT-NAS process.

  1. Framework: First, search for a top-performing architecture on CIFAR-10; here the architectures are based on MobileNetV2.
  2. Search Space: A well-designed search space is required, which consists of five elements (conv operation, kernel size, skip connection, width factor and depth factor).
  3. Population Quality: The selection of top-performing models depends on the population quality, which is computed from the mean and variance of the models’ accuracies.
  4. Architecture Scale Search: The method also searches the width factor, denoting the expansion ratio of the filter number, and the depth factor, denoting the number of layers per block (in the MobileNetV2-based blocks).
  5. Offspring Architecture Generator: After the basic architecture is transferred to the large dataset, a generator takes it as the initial seed. A transformation function is then applied to this seed to generate offspring architectures, from which the best architecture for the large dataset is found. (A toy sketch of the population quality metric and the offspring generator appears after this list.)
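To tie the pieces together, here is a toy sketch of the population quality metric and the offspring generator described above. The architecture encoding (the five elements per block), the exact quality weighting (`penalty`), and the mutation rule are assumptions made for illustration; the paper defines its own versions of each.

```python
import random
from statistics import mean, variance

# Toy sketch of two of the components above: a population-quality score computed from
# the mean and variance of model accuracies, and an offspring generator that mutates
# the five search-space elements. Encoding, weighting and mutation rule are assumed.

CONV_OPS = ["conv", "depthwise_conv"]
KERNEL_SIZES = [3, 5, 7]
WIDTH_FACTORS = [3, 6]
DEPTH_FACTORS = [1, 2, 3, 4]

def random_block():
    # One block of the architecture encoding: the five elements from the search space.
    return {
        "op": random.choice(CONV_OPS),
        "kernel": random.choice(KERNEL_SIZES),
        "skip": random.choice([True, False]),
        "width": random.choice(WIDTH_FACTORS),
        "depth": random.choice(DEPTH_FACTORS),
    }

def population_quality(accuracies, penalty=1.0):
    # Higher mean accuracy is better; a large variance (an unstable population) is penalized.
    return mean(accuracies) - penalty * variance(accuracies)

def generate_offspring(seed_arch, num_children=5):
    # Each child changes one randomly chosen element of one randomly chosen block.
    children = []
    for _ in range(num_children):
        child = [dict(block) for block in seed_arch]
        block = random.choice(child)
        key = random.choice(list(block.keys()))
        block[key] = random_block()[key]
        children.append(child)
    return children

seed = [random_block() for _ in range(5)]            # basic architecture transferred from CIFAR-10
offspring = generate_offspring(seed)
accuracies = [random.random() for _ in offspring]    # stand-in for real validation accuracies
print(population_quality(accuracies))
```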

In the figure from the paper, the upper architecture is the basic architecture searched on CIFAR-10, which is transferred elastically to search an architecture for the ImageNet dataset (the lower one). It takes 22 hours on 4 GPUs to search the basic architecture on CIFAR-10 and 4 days on 8 GPUs to transfer it to ImageNet, which is considerably less than other neural architecture search methods. The resulting model also achieves 73.8% accuracy on ImageNet, surpassing the accuracy of architectures searched from scratch on ImageNet.

Referenced Research Paper: EAT-NAS

Hope you enjoyed reading.

If you have any doubts or suggestions, please feel free to ask, and I will do my best to help or improve myself. Good-bye until next time.
