Scale-Aware Trident Networks for Object Detection (ICCV 2019)

Journal / Conference

The IEEE International Conference on Computer Vision (ICCV, 2019)

[PDF link: here]

[Code link: here]

Keywords

Object Detection, TridentNet

Abstract

Scale variation is one of the key challenges in object detection. In this work, we first present a controlled experiment to investigate the effect of receptive fields for scale variation in object detection. Based on the findings from the exploration experiments, we propose a novel Trident Network (TridentNet) aiming to generate scale-specific feature maps with a uniform representational power. We construct a parallel multi-branch architecture in which each branch shares the same transformation parameters but with different receptive fields. Then, we adopt a scale-aware training scheme to specialize each branch by sampling object instances of proper scales for training. As a bonus, a fast approximation version of TridentNet could achieve significant improvements without any additional parameters and computational cost compared with the vanilla detector. On the COCO dataset, our TridentNet with ResNet-101 backbone achieves state-of-the-art single-model results of 48.4 mAP.
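The scale-aware training scheme mentioned in the abstract can be illustrated with a minimal sketch: each branch only trains on objects whose scale falls inside that branch's valid range. The names `VALID_RANGES`, `box_scale`, and `assign_to_branches` are hypothetical, and the numeric ranges and the use of sqrt(width × height) as the scale measure are assumptions chosen for illustration, not values stated on this page.

```python
import math
from typing import Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels

# Assumed (lower, upper) bounds on object scale, one pair per branch.
# The ranges overlap so that mid-sized objects can train two branches.
VALID_RANGES: List[Tuple[float, float]] = [
    (0.0, 90.0),        # branch 0: small receptive field, small objects
    (30.0, 160.0),      # branch 1: medium objects
    (90.0, math.inf),   # branch 2: large receptive field, large objects
]


def box_scale(box: Box) -> float:
    """Scale of a box, assumed here to be sqrt(width * height)."""
    x1, y1, x2, y2 = box
    return math.sqrt(max(x2 - x1, 0.0) * max(y2 - y1, 0.0))


def assign_to_branches(
    boxes: Sequence[Box],
    valid_ranges: Sequence[Tuple[float, float]] = VALID_RANGES,
) -> Dict[int, List[Box]]:
    """Map each branch index to the ground-truth boxes it is trained on."""
    assignment: Dict[int, List[Box]] = {i: [] for i in range(len(valid_ranges))}
    for box in boxes:
        scale = box_scale(box)
        for branch, (lower, upper) in enumerate(valid_ranges):
            if lower <= scale <= upper:
                assignment[branch].append(box)
    return assignment


if __name__ == "__main__":
    ground_truth = [(0, 0, 20, 20), (0, 0, 100, 100), (0, 0, 400, 400)]
    for branch, kept in assign_to_branches(ground_truth).items():
        print(f"branch {branch}: {len(kept)} training boxes")
```

In this sketch a box can be assigned to more than one branch, and boxes outside a branch's range are simply ignored by that branch during training.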

Method/Framework

Illustration of the proposed TridentNet. The multiple branches in trident blocks share the same parameters but use different dilation rates to generate scale-specific feature maps. Objects of specified scales are sampled for each branch during training. The final proposals or detections from the multiple branches are combined using Non-Maximum Suppression (NMS). Only the backbone network of TridentNet is shown here. The RPN and Fast R-CNN heads are shared among branches and omitted for simplicity.
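The weight sharing described above can be made concrete with a small, dependency-free sketch: one kernel is applied once per branch, and only the dilation rate changes. The function names are hypothetical, the dilation rates (1, 2, 3) are an assumption for illustration, and a real trident block would operate on multi-channel feature maps inside a residual unit rather than on a single 2D grid.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def dilated_conv2d(image: Matrix, kernel: Matrix, dilation: int) -> Matrix:
    """2D cross-correlation with a square, odd-sized kernel.

    Positions outside the image count as zero (zero padding), so the
    output has the same spatial size as the input.
    """
    k = len(kernel)
    half = k // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spread apart by the dilation rate.
                    yy = y + (i - half) * dilation
                    xx = x + (j - half) * dilation
                    if 0 <= yy < height and 0 <= xx < width:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def trident_block(
    image: Matrix, shared_kernel: Matrix, dilations: Sequence[int] = (1, 2, 3)
) -> List[Matrix]:
    """Apply the same kernel in every branch; only the dilation differs."""
    return [dilated_conv2d(image, shared_kernel, d) for d in dilations]


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a dilated kernel."""
    return dilation * (kernel_size - 1) + 1


if __name__ == "__main__":
    image = [[float(7 * r + c) for c in range(7)] for r in range(7)]
    kernel = [[1.0] * 3 for _ in range(3)]
    branches = trident_block(image, kernel)
    print(len(branches), "branches from one set of weights")
    print([effective_kernel_size(3, d) for d in (1, 2, 3)])  # [3, 5, 7]
```

Because every branch reuses `shared_kernel`, adding branches does not add parameters; only the area each output position covers grows with the dilation rate.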

Experiments

We conduct experiments on the COCO dataset. We train models on the union of the 80k training images and a 35k-image subset of the validation images (trainval35k), and evaluate on a set of 5k validation images (minival). We also report the final results on a set of 20k test images (test-dev).

Highlight

  • We present our investigation of the effect of the receptive field on scale variation. To the best of our knowledge, we are the first to design controlled experiments that explore the receptive field in the object detection task.
  • We propose a novel Trident Network to deal with the scale variation problem in object detection.
  • We propose a fast approximation, TridentNet Fast, which uses only one major branch thanks to our weight-sharing trident-block design, thus introducing no additional parameters or computational cost during inference.
  • We validate the effectiveness of our approach on the standard COCO benchmark with thorough ablation studies. Compared with the state-of-the-art methods, our proposed method achieves an mAP of 48.4 using a single model with ResNet-101 backbone.

Citation

@InProceedings{Li_2019_ICCV,
  author    = {Li, Yanghao and Chen, Yuntao and Wang, Naiyan and Zhang, Zhaoxiang},
  title     = {Scale-Aware Trident Networks for Object Detection},
  booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2019}
}

Zhaoxiang Zhang © 2020