
DEtection TRansformer (DETR)

Its architecture is similar to Faster R-CNN's,
but the anchor generation and NMS stages are replaced by a Transformer.

  • Backbone: ResNet
  • Transformer encoder-decoder
  • FFN

Transformer

Official implementation, from FAIR

https://github.com/facebookresearch/detr

git clone https://github.com/facebookresearch/detr.git
pip install torch==1.5.0+cu101 torchvision==0.6.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
conda install cython scipy
pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'

sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install gcc
sudo apt-get upgrade libstdc++6

python main.py --batch_size 2 --no_aux_loss --eval --resume https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth --coco_path ../simpledet/data/coco/images/val2017/

Set the class labels starting from 0; then
num_classes = max_id + 1
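For example, a hypothetical remapping of sparse annotation ids to contiguous labels starting at 0 (the ids and counts here are made up for illustration):

```python
# Sparse category ids from a hypothetical annotation file. DETR's
# classifier must cover ids 0..max_id, so after remapping to 0..k
# we size it as max_id + 1 (the "no object" slot is added on top).
original_ids = [1, 3, 7]
remap = {cid: i for i, cid in enumerate(sorted(original_ids))}  # {1:0, 3:1, 7:2}
labels = [remap[c] for c in [3, 1, 7, 7]]
num_classes = max(remap.values()) + 1
```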

epochs = 1000, because the Transformer needs long training to converge.
Delete the classifier weight and bias from the checkpoint.

  • You can also remove the Transformer's object query embeddings.
import torch

# Drop the classification head so the checkpoint can be fine-tuned
# with a different number of classes.
checkpoint = torch.load("detr-r50-e632da11.pth", map_location='cpu')
del checkpoint["model"]["class_embed.weight"]
del checkpoint["model"]["class_embed.bias"]
torch.save(checkpoint, "detr-r50_no-class-head.pth")

#del checkpoint["model"]["query_embed.weight"]
#torch.save(checkpoint,"detr-r50_no-class-head_no-query.pth")
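To fine-tune from the stripped checkpoint, one option is to load it with `strict=False`, so the deleted head stays at its fresh initialization while everything else is restored. A minimal sketch with a toy stand-in model (not the real DETR class, but the same `class_embed.*` key names as the snippet above):

```python
import torch
import torch.nn as nn

class ToyModel(nn.Module):
    """Stand-in for DETR with a backbone and a classification head."""
    def __init__(self, num_classes):
        super().__init__()
        self.backbone = nn.Linear(8, 8)
        self.class_embed = nn.Linear(8, num_classes + 1)

old = ToyModel(num_classes=91)
ckpt = {"model": old.state_dict()}
del ckpt["model"]["class_embed.weight"]   # mirror the DETR snippet
del ckpt["model"]["class_embed.bias"]

new = ToyModel(num_classes=5)             # different label count
# strict=False: missing head keys are reported, not an error.
missing, unexpected = new.load_state_dict(ckpt["model"], strict=False)
```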

Training strategy for DETR

They recommend starting from the pre-trained weight file, 'detr-r50-e632da11.pth'.
Go with that default pre-trained state, but try changing some parts:
low lr: 1e-4, 1e-5
many epochs: 500, 1000
lr_scheduler: gradual, cosine
a number of object queries appropriate for your dataset
eos_coef: relative classification weight of the "no object" class
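The knobs above can be sketched as a fine-tuning setup. This is an assumption-laden stub, not an official recipe: the model is a toy stand-in, and the per-group learning rates echo DETR's convention of a lower rate for the backbone.

```python
import torch
from torch import nn, optim

# Toy stand-in with a "backbone" and a "head" parameter group.
model = nn.ModuleDict({"backbone": nn.Linear(4, 4),
                       "head": nn.Linear(4, 4)})

# Low lr overall, lower still for the backbone (assumed values).
optimizer = optim.AdamW([
    {"params": model["backbone"].parameters(), "lr": 1e-5},
    {"params": model["head"].parameters(), "lr": 1e-4},
], weight_decay=1e-4)

epochs = 500
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

for epoch in range(3):          # training loop stub: one step per epoch
    optimizer.step()
    scheduler.step()
```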

DETR is hard to train.