Large-Scale Object Detection in the Wild with Imbalanced Data Distribution, and Multi-Labels (TPAMI 2024)

Abstract

Training with more data has consistently been the most stable and effective way to improve performance in the deep learning era. The Open Images dataset, the largest object detection dataset, presents significant opportunities and challenges for general and sophisticated scenarios. However, the semi-automatic collection and labeling process designed to manage its huge data scale leads to label-related problems, including explicit or implicit multiple labels per object and a highly imbalanced label distribution. In this work, we quantitatively analyze the major problems in large-scale object detection and present our solutions in detail. First, we design a concurrent softmax to handle the multi-label problem in object detection, and we propose a soft-balance sampling method with a hybrid training scheduler to address label imbalance. This approach yields a notable improvement of 3.34 points, achieving the best single-model performance, with an mAP of 60.90% on the public object detection test set of Open Images. Then, we introduce a well-designed ensemble mechanism that substantially improves on the single model, achieving an overall mAP of 67.17%, which is 4.29 points higher than the best result on the Open Images 2018 public test set. Our results are published at https://www.kaggle.com/c/open-images-2019-object-detection/leaderboard.
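The abstract names a soft-balance sampling method but does not give its formula. As a rough illustration of the general idea only, the sketch below interpolates between sampling classes at their natural label frequency and sampling all classes uniformly. The function name `soft_balanced_class_probs`, the parameter `lam`, the linear interpolation, and the toy labels are all assumptions made for this example; they are not taken from the paper and may differ from the authors' actual method.

```python
from collections import Counter


def soft_balanced_class_probs(labels, lam):
    """Return a dict mapping each class to a sampling probability.

    Hypothetical illustration, not the paper's formula:
      lam = 0.0 -> follow the raw label frequencies (no rebalancing)
      lam = 1.0 -> sample every class with equal probability
    Values in between trade off the two extremes linearly.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")

    counts = Counter(labels)
    total = sum(counts.values())
    num_classes = len(counts)

    probs = {}
    for cls, n in counts.items():
        p_natural = n / total          # frequency of the class in the data
        p_balanced = 1.0 / num_classes  # uniform over classes
        probs[cls] = (1.0 - lam) * p_natural + lam * p_balanced
    return probs


# Toy, made-up label list with a long-tailed distribution.
labels = ["person"] * 90 + ["car"] * 9 + ["torch"] * 1
probs = soft_balanced_class_probs(labels, lam=0.5)
```

With `lam=0.5`, the rare class `torch` is drawn far more often than its 1% share of the labels, while `person` still dominates. The resulting probabilities could be passed as `weights` to `random.choices` to pick the class for each training sample.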

Zhaoxiang Zhang (张兆翔)
