Journal / Conference
Thirty-third Conference on Neural Information Processing Systems (NeurIPS 2019)
[PDF link: here]
[Code link: PENDING]
Keywords
Neural Architecture Transformation Search (NATS), Object Detection
Abstract
Recently, neural architecture search has achieved great success in large-scale image classification. In contrast, there has been limited work on architecture search for object detection, mainly because detectors typically require costly ImageNet pretraining. Training from scratch, as a substitute, demands more epochs to converge and brings no computation savings. To overcome this obstacle, we introduce a practical neural architecture transformation search (NATS) algorithm for object detection. Instead of searching for and constructing an entire network, NATS explores the architecture space on the basis of an existing network and reuses its weights. We propose a novel neural architecture search strategy at the channel level instead of the path level, and devise a search space specifically targeting object detection. Combining these two designs, a transformation scheme can be discovered that adapts a network designed for image classification to the task of object detection. Since our method is gradient-based and only searches for a transformation scheme, the weights of models pretrained on ImageNet can be utilized in both the searching and retraining stages, which makes the whole process very efficient. The transformed network requires no extra parameters or FLOPs, is friendly to hardware optimization, and is practical for real-time applications. In experiments, we demonstrate the effectiveness of NATS on networks such as ResNet and ResNeXt. Our transformed networks, combined with various detection frameworks, achieve significant improvements on the COCO dataset while remaining fast.
Method/Framework
We build up a hyper-net with all candidate operations included. Rather than treating each operation as a searchable unit, we split each operation into channel groups and treat each channel group as a searchable unit. When decoding the hyper-net into a discrete architecture, we calculate the intensity of each operation in the channel domain and rebuild the final structure according to these intensities.
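A minimal PyTorch sketch of this channel-level mixing is given below; it is our own illustration rather than the authors' released code, and the names `ChannelLevelMixedOp`, `alpha`, and `decode` are hypothetical. Each channel group carries its own softmax over the candidate operations, and decoding keeps the highest-intensity operation per group.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelLevelMixedOp(nn.Module):
    """Sketch of a channel-level searchable unit: every candidate op runs on
    the input, and each channel group of the output mixes the candidates with
    its own softmax weights (illustrative, not the paper's implementation)."""

    def __init__(self, ops, out_channels, num_groups):
        super().__init__()
        assert out_channels % num_groups == 0
        self.ops = nn.ModuleList(ops)           # candidate operations
        self.num_groups = num_groups
        self.group_size = out_channels // num_groups
        # one architecture weight per (candidate op, channel group)
        self.alpha = nn.Parameter(torch.zeros(len(ops), num_groups))

    def forward(self, x):
        # softmax over candidate ops, independently for each channel group
        w = F.softmax(self.alpha, dim=0)        # (num_ops, num_groups)
        outs = [op(x) for op in self.ops]       # each: (N, C, H, W)
        mixed = []
        for g in range(self.num_groups):
            lo, hi = g * self.group_size, (g + 1) * self.group_size
            group = sum(w[i, g] * outs[i][:, lo:hi] for i in range(len(outs)))
            mixed.append(group)
        return torch.cat(mixed, dim=1)

    def decode(self):
        """Pick, for every channel group, the op with the highest intensity."""
        return F.softmax(self.alpha, dim=0).argmax(dim=0).tolist()
```

Because each channel group may select a different operation, the decoded unit can be a mixture of operations across channels, which is what distinguishes this search space from path-level schemes that pick a single op per edge.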
Experiments
We use MS-COCO for the experiments in this paper. It contains 83K training images in train2014 and 40K validation images in val2014. In its 2017 version, it has 118K images in the train2017 set and 5K images in val2017 (a.k.a. minival). The dataset is widely considered challenging, in particular due to the huge variation in object scales and the large number of objects per image. We use AP@IoU as the evaluation metric, which averages mAP across IoU thresholds ranging from 0.50 to 0.95 with an interval of 0.05. During the searching stage, we use train2014 for training model parameters and 35K images from val2014 that are not in minival for calibrating architecture parameters. During the retraining stage, our searched model is trained on train2017 and evaluated on minival, as is convention.
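For reference, the reported metric is the standard COCO AP averaged over IoU thresholds 0.50:0.95, which can be computed with the stock pycocotools API. A minimal sketch (the file paths are placeholders) follows.

```python
# Minimal sketch of the COCO AP@IoU metric (averaged over IoU 0.50:0.95)
# using the standard pycocotools API; file paths are placeholders.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("annotations/instances_val2017.json")   # ground truth (minival)
coco_dt = coco_gt.loadRes("detections_val2017.json")   # detector outputs

coco_eval = COCOeval(coco_gt, coco_dt, iouType="bbox")
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # first line printed is AP over IoU=0.50:0.95
print("AP@[0.50:0.95] =", coco_eval.stats[0])
```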
Highlight
- We propose a neural architecture transformation search method that can utilize pretrained models.
- We propose a new channel-level search space in which mixed operations can be searched.
- Our method is gradient-based and search-efficient (a minimal sketch of the alternating update follows this list).
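The gradient-based search alternates between the two data splits described above: model weights are updated on train2014 while architecture parameters are calibrated on the held-out val2014 subset. The sketch below is an assumed reconstruction of that loop, not the authors' code; the optimizer choices, the `alpha` naming convention, and the loaders are placeholders.

```python
# Illustrative alternating, gradient-based search loop (assumed from the
# paper's setup: weights trained on one split, architecture parameters
# calibrated on another; optimizers and splits are placeholders).
import torch

def search(hypernet, weight_loader, arch_loader, criterion, epochs=1):
    arch_params = [p for n, p in hypernet.named_parameters() if "alpha" in n]
    weight_params = [p for n, p in hypernet.named_parameters() if "alpha" not in n]
    w_opt = torch.optim.SGD(weight_params, lr=0.01, momentum=0.9)
    a_opt = torch.optim.Adam(arch_params, lr=3e-4)

    for _ in range(epochs):
        for (xw, yw), (xa, ya) in zip(weight_loader, arch_loader):
            # 1) update network weights on the training split
            w_opt.zero_grad()
            criterion(hypernet(xw), yw).backward()
            w_opt.step()
            # 2) update architecture parameters on the held-out split
            a_opt.zero_grad()
            criterion(hypernet(xa), ya).backward()
            a_opt.step()
```

Because only the transformation scheme is searched and the pretrained weights are reused throughout, each step fine-tunes an already-initialized network, which is what keeps the overall search efficient.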
Citation
@incollection{NIPS2019_9576,
title = {Efficient Neural Architecture Transformation Search in Channel-Level for Object Detection},
author = {Peng, Junran and Sun, Ming and Zhang, Zhao-Xiang and Tan, Tieniu and Yan, Junjie},
booktitle = {Advances in Neural Information Processing Systems 32},
editor = {H. Wallach and H. Larochelle and A. Beygelzimer and F. d\textquotesingle Alch\'{e}-Buc and E. Fox and R. Garnett},
pages = {14313--14322},
year = {2019},
publisher = {Curran Associates, Inc.},
url = {http://papers.nips.cc/paper/9576-efficient-neural-architecture-transformation-search-in-channel-level-for-object-detection.pdf}
}