MMDetection Custom Config
When creating an MMDetection model, the model config object is stored in model.cfg and the pretrained weight file path is stored in model.weights_path. To update model config attributes, you need to create a cfg_options dictionary:
Example: Changing loss weights for loss_bbox and loss_cls
You can pass the cfg_options argument when creating a model:
model_type = models.mmdet.retinanet
backbone = model_type.backbones.resnet50_fpn_1x
cfg_options = {
    'model.bbox_head.loss_bbox.loss_weight': 2,
    'model.bbox_head.loss_cls.loss_weight': 0.8,
}
# Passing cfg_options to the `model()` method to update loss weights
model = model_type.model(backbone=backbone(pretrained=True), num_classes=len(parser.class_map), cfg_options=cfg_options)
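Each dotted key in cfg_options addresses a nested attribute of the underlying MMDetection config. Here is a minimal sketch of that idea (a hypothetical helper for illustration only, not IceVision's actual implementation):

# Hypothetical helper: walk a nested dict along a dotted key and set the leaf value
def set_by_dotted_key(cfg: dict, dotted_key: str, value):
    *parents, leaf = dotted_key.split('.')
    node = cfg
    for name in parents:
        node = node.setdefault(name, {})  # descend (or create) one nesting level
    node[leaf] = value

cfg = {'model': {'bbox_head': {'loss_bbox': {'loss_weight': 1.0}}}}
set_by_dotted_key(cfg, 'model.bbox_head.loss_bbox.loss_weight', 2)
assert cfg['model']['bbox_head']['loss_bbox']['loss_weight'] == 2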
Install IceVision and IceData
The following downloads and runs a short shell script. The script installs IceVision, IceData, the MMDetection library, and YOLOv5, as well as the fastai and pytorch lightning engines.
Install from pypi...
# # Torch - Torchvision - IceVision - IceData - MMDetection - YOLOv5 - EfficientDet Installation
# !wget https://raw.githubusercontent.com/airctic/icevision/master/icevision_install.sh
# # Choose your installation target: cuda11 or cuda10 or cpu
# !bash icevision_install.sh cuda11
... or from icevision master
# Torch - Torchvision - IceVision - IceData - MMDetection - YOLOv5 - EfficientDet Installation
!wget https://raw.githubusercontent.com/airctic/icevision/master/icevision_install.sh
# Choose your installation target: cuda11 or cuda10 or cpu
!bash icevision_install.sh cuda11 master
# Restart kernel after installation
import IPython
IPython.Application.instance().kernel.do_shutdown(True)
Imports
All of the IceVision components can be easily imported with a single line.
from icevision.all import *
Download and prepare a dataset
Now we can start by downloading the Fridge Objects dataset. This tiny dataset contains 134 images of 4 classes: can, carton, milk bottle, and water bottle.
IceVision provides methods to load a dataset, parse annotation files, and more.
For more information about the fridge dataset and its corresponding parser, check out the fridge folder in icedata.
# Download the dataset
url = "https://cvbp-secondary.z19.web.core.windows.net/datasets/object_detection/odFridgeObjects.zip"
dest_dir = "fridge"
data_dir = icedata.load_data(url, dest_dir)
Parse the dataset
# Create the parser
parser = parsers.VOCBBoxParser(annotations_dir=data_dir / "odFridgeObjects/annotations", images_dir=data_dir / "odFridgeObjects/images")
# Parse annotations to create records
train_records, valid_records = parser.parse()
parser.class_map
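To sanity-check what the parser produced, you can display a single record. This sketch assumes the show_record helper exported by icevision.all; the exact arguments may differ across versions:

# Show the image, bounding boxes, and labels of one parsed record
show_record(train_records[0], display_label=True, figsize=(10, 10))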
Creating datasets with augmentations and transforms
# Transforms
# size is set to 384 because EfficientDet requires its inputs to be divisible by 128
image_size = 384
train_tfms = tfms.A.Adapter([*tfms.A.aug_tfms(size=image_size, presize=512), tfms.A.Normalize()])
valid_tfms = tfms.A.Adapter([*tfms.A.resize_and_pad(image_size), tfms.A.Normalize()])
# Datasets
train_ds = Dataset(train_records, train_tfms)
valid_ds = Dataset(valid_records, valid_tfms)
# Show an element of the train_ds with augmentation transformations applied
samples = [train_ds[0] for _ in range(3)]
show_samples(samples, ncols=3)
Creating a model
This shows how to customize a retinanet model by passing the cfg_options dictionary to the model() method:
model_type = models.mmdet.retinanet
backbone = model_type.backbones.resnet50_fpn_1x
model = model_type.model(backbone=backbone(pretrained=True), num_classes=len(parser.class_map), cfg_options=cfg_options)
We've selected a few of the many options below. You can pick which option you want to try by setting the value of selection. This shows how easy it is to customize your model.
# Just the config options you would like to update
selection = 0

extra_args = {}

# Example: changing both loss weights: loss_bbox, loss_cls
if selection == 0:
    model_type = models.mmdet.retinanet
    backbone = model_type.backbones.resnet50_fpn_1x
    cfg_options = {
        'model.bbox_head.loss_bbox.loss_weight': 2,
        'model.bbox_head.loss_cls.loss_weight': 0.8,
    }

# Example: changing anchor box ratios
elif selection == 1:
    model_type = models.mmdet.retinanet
    backbone = model_type.backbones.resnet50_fpn_1x
    cfg_options = {
        'model.bbox_head.anchor_generator.ratios': [1.0]
    }
cfg_options
{'model.bbox_head.loss_bbox.loss_weight': 2,
'model.bbox_head.loss_cls.loss_weight': 0.8}
Now it is just a one-liner to instantiate the model. If you want to try another option, just edit the line at the top of the previous cell.
# Instantiate the model and pass the `cfg_options` dictionary
model = model_type.model(backbone=backbone(pretrained=True), num_classes=len(parser.class_map), cfg_options=cfg_options)
print(model.bbox_head.loss_bbox.loss_weight)
print(model.bbox_head.loss_cls.loss_weight)
print(model.bbox_head.anchor_generator.ratios)
print(model.cfg.model.bbox_head.loss_cls.loss_weight)
print(model.cfg.model.bbox_head.loss_bbox.loss_weight)
print(model.cfg.model.bbox_head.anchor_generator.ratios)
2
0.8
tensor([0.50000, 1.00000, 2.00000])
0.8
2
[0.5, 1.0, 2.0]
Print config settings
# You have access to the model's weights_path
model.weights_path
# You also have access to the whole config object
model.cfg.__dict__
# Double-checking the model's new attributes for `loss_cls`
model.cfg.model.bbox_head.loss_cls
# Double-checking the model's new attributes for `anchor_generator`
model.cfg.model.bbox_head.anchor_generator
Path('checkpoints/retinanet/retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth')
{'alpha': 0.25,
'gamma': 2.0,
'loss_weight': 0.8,
'type': 'FocalLoss',
'use_sigmoid': True}
{'octave_base_scale': 4,
'ratios': [0.5, 1.0, 2.0],
'scales_per_octave': 3,
'strides': [8, 16, 32, 64, 128],
'type': 'AnchorGenerator'}
Data Loader
The Data Loader is specific to a model_type. The job of the data loader is to get items from a dataset and batch them up in the specific format required by each model. This is why creating the data loaders is separated from creating the datasets.
We can take a look at the first batch of items from the valid_dl, as shown below. Remember that valid_tfms only resize (with padding) and normalize the records, so the same images are returned each time. This is important to provide consistent validation during training.
# Data Loaders
train_dl = model_type.train_dl(train_ds, batch_size=8, num_workers=4, shuffle=True)
valid_dl = model_type.valid_dl(valid_ds, batch_size=8, num_workers=4, shuffle=False)
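To actually look at that first validation batch, you can combine fastai's first utility with the model type's show_batch helper (both exported by icevision.all; a sketch, arguments may vary by version):

# Fetch the first batch from valid_dl and display the (resized, padded) images
model_type.show_batch(first(valid_dl), ncols=4)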
Metrics
The fastai and pytorch lightning engines collect metrics to track progress during training. IceVision provides metric classes that work across the engines and libraries.
The same metrics can be used for both fastai and pytorch lightning.
metrics = [COCOMetric(metric_type=COCOMetricType.bbox)]
Training
IceVision is an agnostic framework, meaning it can be plugged into deep learning engines such as fastai and pytorch lightning.
Training using fastai
learn = model_type.fastai.learner(dls=[train_dl, valid_dl], model=model, metrics=metrics)
learn.lr_find()
# For Sparse-RCNN, use lower `end_lr`
# learn.lr_find(end_lr=0.005)
SuggestedLRs(lr_min=8.317637839354575e-05, lr_steep=0.00010964782268274575)
learn.fine_tune(20, 1e-4, freeze_epochs=1)
epoch | train_loss | valid_loss | COCOMetric | time |
---|---|---|---|---|
0 | 1.229304 | 0.983039 | 0.092624 | 00:06 |
epoch | train_loss | valid_loss | COCOMetric | time |
---|---|---|---|---|
0 | 0.899269 | 0.741807 | 0.156066 | 00:05 |
1 | 0.810302 | 0.558525 | 0.365440 | 00:05 |
2 | 0.728987 | 0.518454 | 0.478187 | 00:05 |
3 | 0.663082 | 0.367077 | 0.609001 | 00:06 |
4 | 0.604693 | 0.365837 | 0.717380 | 00:05 |
5 | 0.555109 | 0.308284 | 0.837230 | 00:05 |
6 | 0.507140 | 0.260290 | 0.871975 | 00:05 |
7 | 0.464969 | 0.248942 | 0.881214 | 00:05 |
8 | 0.429737 | 0.238713 | 0.864140 | 00:05 |
9 | 0.398711 | 0.215525 | 0.916083 | 00:05 |
10 | 0.372950 | 0.206479 | 0.911718 | 00:05 |
11 | 0.356441 | 0.191768 | 0.909661 | 00:05 |
12 | 0.343991 | 0.215923 | 0.911696 | 00:05 |
13 | 0.331874 | 0.218360 | 0.887766 | 00:05 |
14 | 0.313435 | 0.194519 | 0.908697 | 00:05 |
15 | 0.299692 | 0.212282 | 0.900371 | 00:05 |
16 | 0.286332 | 0.189399 | 0.916208 | 00:05 |
17 | 0.282384 | 0.195011 | 0.911806 | 00:05 |
18 | 0.275653 | 0.200540 | 0.904800 | 00:05 |
19 | 0.273256 | 0.200690 | 0.904800 | 00:05 |
Training using Pytorch Lightning
class LightModel(model_type.lightning.ModelAdapter):
    def configure_optimizers(self):
        return SGD(self.parameters(), lr=1e-4)

light_model = LightModel(model, metrics=metrics)

trainer = pl.Trainer(max_epochs=20, gpus=1)
trainer.fit(light_model, train_dl, valid_dl)
Using the model - inference and showing results
The first step in reviewing the model is to show results from the validation dataset. This is easy to do with the show_results function.
model_type.show_results(model, valid_ds, detection_threshold=.5)
Prediction
NOTE: For a more detailed look at inference, check out the inference tutorial.
infer_dl = model_type.infer_dl(valid_ds, batch_size=4, shuffle=False)
preds = model_type.predict_from_dl(model, infer_dl, keep_images=True)
show_preds(preds=preds[:4])
Happy Learning!
If you need any assistance, feel free to join our forum.