Recently, Jiemin Fang et al. published a paper that introduces a method to accelerate neural architecture search, named "Elastic Architecture Transfer for Accelerating Large-scale Neural Architecture Search" (EAT-NAS). In this article, let's look at what the method is and how it works.
Neural Architecture Search
Neural architecture search, as its name suggests, is a method to automatically search for the best network architecture for a given problem. If you have worked on neural networks, you may have encountered the problem of selecting the best hyperparameters for your model.
Neural architecture search methods are not fully automated, as they rely on a human-designed architecture as the starting point. These methods consist of three components, sketched in the toy example after this list:
- Search Space: A well-designed space of candidate architectures and parameters over which the method searches
- Search Method: The algorithm used to explore the search space, such as reinforcement learning or evolutionary search
- Evaluation Strategy: The metric used to compare candidates and pick the best architecture
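Here is a minimal, self-contained sketch of how these three components fit together, assuming a toy search space and a stub evaluation function (random sampling stands in for heavier search methods such as reinforcement learning or evolution; the knob names and values are illustrative, not from the paper):

```python
import random

# Search space: candidate choices per architecture knob (illustrative values).
SEARCH_SPACE = {
    "conv_op": ["conv3x3", "depthwise_sep", "mbconv"],
    "kernel_size": [3, 5, 7],
    "skip_connection": [True, False],
    "width_factor": [0.5, 1.0, 1.5],
    "depth_factor": [1, 2, 3],
}

def sample_architecture(space):
    """Search method (here: random sampling) draws one candidate architecture."""
    return {knob: random.choice(options) for knob, options in space.items()}

def evaluate(arch):
    """Evaluation strategy: train/validate the candidate and return a score.
    A random stub score is used here; in practice this is validation accuracy."""
    return random.random()

def search(num_trials=20):
    """Run the search loop and keep the best-scoring architecture."""
    best_arch, best_score = None, float("-inf")
    for _ in range(num_trials):
        arch = sample_architecture(SEARCH_SPACE)
        score = evaluate(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score

if __name__ == "__main__":
    arch, score = search()
    print("best architecture:", arch, "score:", round(score, 3))
```

The expensive part in real systems is `evaluate`, since every candidate has to be trained; this is exactly the cost the next section talks about.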
Problem: Any neural architecture search method requires a large amount of computation. Even with recent advances in the field, it still takes many GPU-days to find the best architecture.
EAT-NAS
To reduce this computational cost, existing studies first search for architectures on small datasets and then apply them directly to large datasets. However, an architecture searched on a small dataset does not guarantee good performance on a large one. To address this, the authors of EAT-NAS introduce an elastic architecture transfer method to accelerate neural architecture search.
How EAT-NAS works:
In this method, architectures are first searched on a small dataset, and the best one is selected as the basic architecture for the large dataset. This basic architecture is then transferred elastically to the large dataset to accelerate the search process there, as shown in the figure below.
The authors search for an architecture on the CIFAR-10 dataset and then elastically transfer it to the much larger ImageNet dataset. Let's walk through the whole EAT-NAS process.
- Framework: First, search for a top-performing architecture on CIFAR-10; here it is based on MobileNetV2.
- Search Space: A well-designed search space is required, consisting of five elements (conv operation, kernel size, skip connection, width factor, and depth factor).
- Population Quality: Selecting the top-performing model depends on the quality of the population, which is determined by the mean and variance of the models' accuracies (a toy formula is sketched after this list).
- Architecture Scale Search: EAT-NAS also searches the width factor, denoting the expansion ratio of the number of filters, and the depth factor, denoting the number of layers per block (in the selected MobileNetV2 backbone).
- Offspring Architecture Generator: After the basic architecture is transferred to the large dataset, a generator takes it as the initial seed. A transformation function is then applied to this model to generate the best architecture for the large dataset (see the sketch after this list).
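On population quality: one plausible way to express the idea is to reward a population whose models have a high mean accuracy and a low variance. The exact weighting in the EAT-NAS paper may differ; the `variance_weight` knob below is a hypothetical parameter, not from the paper.

```python
from statistics import mean, pvariance

def population_quality(accuracies, variance_weight=1.0):
    """Higher mean accuracy raises quality; higher variance lowers it.
    `variance_weight` is a hypothetical knob, not the paper's exact formula."""
    return mean(accuracies) - variance_weight * pvariance(accuracies)

print(population_quality([0.91, 0.92, 0.90, 0.93]))  # tight, strong population
print(population_quality([0.70, 0.95, 0.60, 0.93]))  # scattered population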
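And on the offspring generator: a minimal sketch of the idea is that the basic architecture found on the small dataset seeds the population, and a transformation (mutation) function perturbs one knob at a time to produce offspring for the large-dataset search. The encoding, seed values, and mutation rule below are illustrative assumptions, not the paper's exact procedure.

```python
import copy
import random

SEARCH_SPACE = {
    "conv_op": ["conv3x3", "depthwise_sep", "mbconv"],
    "kernel_size": [3, 5, 7],
    "skip_connection": [True, False],
    "width_factor": [0.5, 1.0, 1.5],
    "depth_factor": [1, 2, 3],
}

# Hypothetical basic architecture transferred from the small-dataset search.
BASIC_ARCH = {
    "conv_op": "mbconv",
    "kernel_size": 3,
    "skip_connection": True,
    "width_factor": 1.0,
    "depth_factor": 2,
}

def transform(arch, space):
    """Mutate a single randomly chosen knob to a different value."""
    offspring = copy.deepcopy(arch)
    knob = random.choice(list(space))
    choices = [v for v in space[knob] if v != offspring[knob]]
    offspring[knob] = random.choice(choices)
    return offspring

def generate_offspring(seed_arch, space, population_size=8):
    """Seed the large-dataset search with mutated copies of the basic architecture."""
    return [transform(seed_arch, space) for _ in range(population_size)]

if __name__ == "__main__":
    for child in generate_offspring(BASIC_ARCH, SEARCH_SPACE):
        print(child)
```

Each offspring is then evaluated on the large dataset, and the population quality above guides which models survive to the next generation.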
In the above figure, the upper architecture is the basic one searched on CIFAR-10, which is transferred elastically to search an architecture for the ImageNet dataset (the lower one). It takes 22 hours on 4 GPUs to search the basic architecture on CIFAR-10 and 4 days on 8 GPUs to transfer it to ImageNet, which is considerably less than other neural architecture search methods. The resulting model also achieves 73.8% accuracy on the ImageNet dataset.
Referenced Research Paper: EAT-NAS
Hope you enjoyed reading.
If you have any doubts or suggestions, please feel free to ask, and I will do my best to help or improve myself. Goodbye until next time.