Two-stage Neural Architecture Optimization with Separated Training and Search - 2023

Research Area:  Machine Learning

Abstract:

Neural architecture search (NAS) has been a popular research topic for designing deep neural networks (DNNs) automatically. It can significantly improve the design efficiency of neural architectures for given learning tasks. Recently, instead of conducting architecture search in the original neural architecture space, many NAS approaches have been proposed that learn continuous representations of neural architectures for architecture search or estimation. In particular, Neural Architecture Optimization (NAO) is a representative method that encodes neural architectures as continuous representations with an auto-encoder and then performs continuous optimization in the encoded space with gradient-based methods. However, because NAO considers only the top-ranked architectures when learning the continuous representation, it may fail to construct a satisfactory continuous optimization space that contains the expected high-quality neural architectures. Taking this cue, in this paper we propose a two-stage NAO (TNAO) to learn a more complete continuous representation of neural architectures, which provides a better optimization space for NAS. Specifically, by designing a pipeline that separates the training and search stages, we first build the training set via random sampling from the entire neural architecture search space, with the aim of collecting well-distributed neural architectures for training. Moreover, to exploit architectural semantic information effectively with limited data, we propose an improved Transformer auto-encoder for learning the continuous representation, supervised by ranking information of neural architecture performance. Lastly, for more effective optimization of neural architectures, we adopt a population-based swarm intelligence algorithm, i.e., competitive swarm optimization (CSO), with a newly designed remapping scoring scheme. To evaluate the efficiency of the proposed TNAO, comprehensive experimental studies are conducted on two common search spaces, i.e., NAS-Bench-101 and NAS-Bench-201. An architecture within the top 0.02% of performance is discovered on NAS-Bench-101, and the best architecture for the CIFAR-10 dataset is obtained on NAS-Bench-201.
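Below is a minimal Python sketch of the core idea behind the search stage: competitive swarm optimization (CSO) applied to continuous latent codes of architectures. The surrogate scorer, latent dimension, and hyper-parameters are hypothetical placeholders (a toy quadratic stands in for the learned, ranking-supervised predictor), and the paper's Transformer auto-encoder and remapping scoring scheme are not reproduced; this only illustrates the pairwise-competition update of CSO, not the authors' implementation.

import numpy as np

def surrogate_score(z: np.ndarray) -> float:
    # Hypothetical stand-in for a learned performance predictor on latent codes.
    # Higher is better; the toy optimum is at the origin.
    return -float(np.sum(z ** 2))

def cso_search(dim=32, swarm_size=64, iters=200, phi=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(swarm_size, dim))  # latent codes (particles)
    v = np.zeros_like(x)                                 # velocities
    for _ in range(iters):
        scores = np.array([surrogate_score(p) for p in x])
        mean = x.mean(axis=0)
        order = rng.permutation(swarm_size)
        # Random pairwise competitions: the winner is kept unchanged,
        # the loser learns from the winner and from the swarm mean.
        for a, b in zip(order[::2], order[1::2]):
            w, l = (a, b) if scores[a] >= scores[b] else (b, a)
            r1, r2, r3 = rng.random((3, dim))
            v[l] = r1 * v[l] + r2 * (x[w] - x[l]) + phi * r3 * (mean - x[l])
            x[l] = x[l] + v[l]
    best = x[np.argmax([surrogate_score(p) for p in x])]
    return best

if __name__ == "__main__":
    z_star = cso_search()
    print("best surrogate score:", surrogate_score(z_star))

In a pipeline like the one described above, the best latent code returned by the search would then be decoded back into a discrete architecture and evaluated on the benchmark.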

Keywords:  
Training
Representation learning
Pipelines
Semantics
Estimation
Transformers
Feature extraction

Author(s) Name:  Longze He; Boyu Hou; Junwei Dong; Liang Feng

Conference name:  International Joint Conference on Neural Networks (IJCNN)

Publisher name:  IEEE

DOI:  10.1109/IJCNN54540.2023.10191955
