The DEtection TRansformer (DETR) algorithm has received considerable attention in the research community and is gradually emerging as a mainstream approach for object detection and other perception tasks. However, the field currently lacks a unified and comprehensive benchmark specifically tailored to DETR-based models. To address this gap, we develop detrex, a unified, highly modular, and lightweight codebase that supports most mainstream DETR-based instance recognition algorithms, covering fundamental tasks including object detection, segmentation, and pose estimation. We conduct extensive experiments under detrex and present a comprehensive benchmark of DETR-based models. Moreover, we enhance the performance of detection transformers by refining training hyper-parameters, providing strong baselines for the supported algorithms. We hope that detrex can offer the research community a standardized and unified platform for evaluating and comparing DETR-based models, while fostering a deeper understanding of, and driving advances in, DETR-based instance recognition.
The modular design of detrex for DETR-based algorithms
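To make the modularity concrete, the sketch below assembles a DETR-style detector from interchangeable components. This is a minimal illustration, not detrex's verbatim API: the class name `DETRStyleDetector` and its constructor arguments are hypothetical, standing in for the backbone, transformer, and head modules that detrex lets you swap independently.

```python
# Minimal sketch of a modular DETR-style detector (illustrative names, not
# detrex's exact API). In detrex, each component below would typically be a
# node in a Python LazyConfig, so backbones, transformers, and heads can be
# exchanged without touching the rest of the pipeline.
import torch
from torch import nn

class DETRStyleDetector(nn.Module):
    def __init__(self, backbone, transformer, num_classes,
                 num_queries=300, embed_dim=256):
        super().__init__()
        self.backbone = backbone                 # e.g. ResNet-50, Swin, ViT
        self.transformer = transformer           # encoder-decoder stack
        self.query_embed = nn.Embedding(num_queries, embed_dim)
        self.class_head = nn.Linear(embed_dim, num_classes)
        self.box_head = nn.Linear(embed_dim, 4)  # (cx, cy, w, h), normalized

    def forward(self, images):
        feats = self.backbone(images)                           # image features
        hs = self.transformer(feats, self.query_embed.weight)  # (B, Q, D) queries
        return {"pred_logits": self.class_head(hs),
                "pred_boxes": self.box_head(hs).sigmoid()}
```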
Design comparisons between detrex and other codebases
Benchmarking the performance of DETR variants with a ResNet-50 backbone on COCO val2017. The best and second-best results are highlighted in bold and underlined, respectively.
Comparison of the effectiveness of various backbones with the DINO-4scale detector.
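Because detrex builds on detectron2's LazyConfig system, a backbone swap for this kind of comparison can be expressed as a small config edit. The snippet below is a hedged sketch: `model` is assumed to be the detector node of an existing DINO config, and the ResNet wrapper shown is detectron2's, which detrex reuses.

```python
# Sketch: swapping the backbone of a DINO-4scale config (LazyCall-style).
# Assumes `model` is the detector node loaded from an existing detrex config.
from detectron2.config import LazyCall as L
from detectron2.modeling.backbone import ResNet, BasicStem

model.backbone = L(ResNet)(
    stem=L(BasicStem)(in_channels=3, out_channels=64, norm="FrozenBN"),
    stages=L(ResNet.make_default_stages)(depth=50, norm="FrozenBN"),
    out_features=["res3", "res4", "res5"],  # multi-scale features for the detector
    freeze_at=1,                            # freeze only the stem
)
```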
Ablation study on DETR variants with NMS post-processing. We set the default NMS threshold to 0.8.
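For reference, the post-processing in this ablation amounts to running class-wise NMS on each image's top-scoring queries. The sketch below is one plausible implementation using torchvision; the function name and the top-k cutoff are assumptions, but the 0.8 IoU threshold matches the caption.

```python
# Sketch of class-wise NMS post-processing for DETR outputs (illustrative;
# detrex's own implementation may differ in details such as the top-k cutoff).
import torch
from torchvision.ops import batched_nms

def postprocess_with_nms(pred_logits, pred_boxes_xyxy, iou_threshold=0.8, topk=300):
    """pred_logits: (Q, K) class logits; pred_boxes_xyxy: (Q, 4) absolute xyxy boxes."""
    scores, labels = pred_logits.sigmoid().max(dim=-1)        # best class per query
    scores, keep_topk = scores.topk(min(topk, scores.numel()))
    boxes, labels = pred_boxes_xyxy[keep_topk], labels[keep_topk]
    keep = batched_nms(boxes, scores, labels, iou_threshold)  # NMS within each class
    return boxes[keep], scores[keep], labels[keep]
```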
Ablation study on DINO and DETA with different frozen stages, using a ResNet-50 backbone. Frozen stage "0" means no layers in the backbone are frozen; stage "1" means only the stem is frozen; stage "2" means the stem and the first residual stage are frozen.
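The frozen-stage convention above maps directly onto a ResNet's parameter groups. As a sketch, using torchvision's ResNet attribute names (detrex's backbone wrapper may expose this differently, e.g. via a `freeze_at` argument):

```python
# Sketch of the frozen-stage convention with a torchvision ResNet-50.
# frozen_stage=0: train everything; 1: freeze the stem (conv1 + bn1);
# 2: additionally freeze the first residual stage (layer1).
import torchvision

def freeze_resnet_stages(model, frozen_stage):
    if frozen_stage >= 1:
        for module in (model.conv1, model.bn1):  # the stem
            for p in module.parameters():
                p.requires_grad_(False)
    if frozen_stage >= 2:
        for p in model.layer1.parameters():      # first residual stage
            p.requires_grad_(False)
    return model

backbone = freeze_resnet_stages(torchvision.models.resnet50(), frozen_stage=1)
```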
Ablation studies on training hyper-parameters for DETR variants. For a fair comparison, we use ResNet-50 as the default backbone and freeze its stem.
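Since detrex configs are plain Python (detectron2 LazyConfig), sweeping such hyper-parameters reduces to loading a config and applying dotted-key overrides. The snippet below sketches this; the config path and the specific keys are illustrative assumptions modeled on detectron2/detrex conventions.

```python
# Sketch of a hyper-parameter sweep entry point (illustrative path and keys).
from detectron2.config import LazyConfig

cfg = LazyConfig.load("projects/dino/configs/dino_r50_4scale_12ep.py")
cfg = LazyConfig.apply_overrides(cfg, [
    "optimizer.lr=1e-4",           # base learning rate
    "train.max_iter=90000",        # 12-epoch schedule on COCO
    "model.backbone.freeze_at=1",  # freeze only the stem, as in the table above
])
```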
Comparison of the performance of DETR variants between their detrex implementations and their original implementations
@misc{ren2023detrex,
  title={detrex: Benchmarking Detection Transformers},
  author={Tianhe Ren and Shilong Liu and Feng Li and Hao Zhang and Ailing Zeng and Jie Yang and Xingyu Liao and Ding Jia and Hongyang Li and He Cao and Jianan Wang and Zhaoyang Zeng and Xianbiao Qi and Yuhui Yuan and Jianwei Yang and Lei Zhang},
  year={2023},
  eprint={2306.07265},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}