How to use Mask RCNN
Installing IceVision
We usually install IceVision with [all], but we can also use [inference] to install only the packages that inference methods depend on.
!pip install icevision[all] icedata
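To confirm the install worked before moving on, a minimal sketch like the one below can check whether the packages are importable. The helper name `is_installed` is hypothetical and not part of IceVision; it relies only on the standard library, and the module names are assumed to match the pip package names.

```python
from importlib.util import find_spec

def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return find_spec(module_name) is not None

# Module names assumed to match the pip packages installed above.
for name in ("icevision", "icedata"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```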
Imports
from icevision.all import *
Data
We'll be using the Penn-Fudan dataset, which is already available under datasets.
data_dir = icedata.pennfudan.load_data()
class_map = icedata.pennfudan.class_map()
As usual, let's create the parser and perform a random data split.
parser = icedata.pennfudan.parser(data_dir)
train_records, valid_records = parser.parse()
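Calling parse() with no arguments leaves the split to the library's default splitter. The idea of a seeded random split can be sketched as below. The function `random_split`, the 80/20 proportions, and the seed are illustrative assumptions, not IceVision's implementation.

```python
import random

def random_split(ids, probs=(0.8, 0.2), seed=42):
    """Shuffle ids and cut them into consecutive groups sized by probs."""
    ids = list(ids)
    random.Random(seed).shuffle(ids)
    splits, start = [], 0
    for i, p in enumerate(probs):
        # The last group takes whatever is left, so no id is lost to rounding.
        end = len(ids) if i == len(probs) - 1 else start + round(p * len(ids))
        splits.append(ids[start:end])
        start = end
    return splits

# Penn-Fudan contains 170 images; the integer ids here are placeholders.
train_ids, valid_ids = random_split(range(170))
print(len(train_ids), len(valid_ids))
```

Fixing the seed makes the split reproducible, which keeps validation losses comparable between runs.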
Let's use the usual aug_tfms for training transforms, with two small modifications:
- Decrease the rotation limit from 45 to 10.
- Use a more aggressive crop function.
shift_scale_rotate = tfms.A.ShiftScaleRotate(rotate_limit=10)
crop_fn = partial(tfms.A.RandomSizedCrop, min_max_height=(384//2, 384), p=.5)
train_tfms = tfms.A.Adapter(
[
*tfms.A.aug_tfms(size=384, presize=512, shift_scale_rotate=shift_scale_rotate, crop_fn=crop_fn),
tfms.A.Normalize(),
]
)
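`partial` pre-binds keyword arguments so that the augmentation pipeline can later call `crop_fn` with only the arguments it supplies itself. The sketch below shows the mechanism with a hypothetical stand-in function, `make_crop`, in place of `tfms.A.RandomSizedCrop`. Which arguments `aug_tfms` actually passes is an assumption here.

```python
from functools import partial

def make_crop(min_max_height, height, width, p=1.0):
    """Hypothetical stand-in for a crop transform constructor."""
    return {"min_max_height": min_max_height, "height": height, "width": width, "p": p}

size = 384
# Crop height is drawn between half the target size and the full target size.
crop_fn = partial(make_crop, min_max_height=(size // 2, size), p=0.5)

# Assumed call made later by the augmentation pipeline, supplying the output size.
crop = crop_fn(height=size, width=size)
print(crop)
```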
And for validation transforms, the simple resize_and_pad.
valid_tfms = tfms.A.Adapter([*tfms.A.resize_and_pad(size=384), tfms.A.Normalize()])
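The geometry behind a resize-and-pad step can be sketched as follows, assuming it scales the longer side to `size` while keeping the aspect ratio and then pads the shorter side up to a square. The helper `resize_and_pad_shape` is hypothetical and only computes shapes; it does not touch pixels.

```python
def resize_and_pad_shape(width, height, size):
    """Return the resized (w, h) and the total padding (pad_w, pad_h) for a size x size canvas."""
    scale = size / max(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    return (new_w, new_h), (size - new_w, size - new_h)

# Example: a hypothetical 500x400 image and a 384-pixel target.
resized, padding = resize_and_pad_shape(500, 400, 384)
print(resized, padding)
```

Unlike the training transforms, this step involves no randomness, so validation images look the same every epoch.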
Now we can create the Dataset and take a look at how the images look after the transforms.
train_ds = Dataset(train_records, train_tfms)
valid_ds = Dataset(valid_records, valid_tfms)
samples = [train_ds[1] for _ in range(6)]
show_samples(samples, denormalize_fn=denormalize_imagenet, ncols=3, display_label=False, show=True)
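Indexing `train_ds[1]` six times is deliberate: transforms are applied each time a record is accessed, so the same image yields six different augmented samples. Here is a toy sketch of that behavior, using a hypothetical `ToyDataset` class and a random shift in place of real image augmentations:

```python
import random

class ToyDataset:
    """Hypothetical dataset that applies a transform on every access."""
    def __init__(self, records, tfm):
        self.records, self.tfm = records, tfm

    def __getitem__(self, i):
        # The stored record is untouched; only the returned value is transformed.
        return self.tfm(self.records[i])

rng = random.Random(0)

def random_shift(x):
    """Stand-in for a random augmentation."""
    return x + rng.uniform(-1, 1)

ds = ToyDataset([10.0, 20.0, 30.0], random_shift)
samples = [ds[1] for _ in range(6)]
print(samples)
```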
Now we're ready to create the DataLoaders, shuffling only the training set:
train_dl = mask_rcnn.train_dl(train_ds, batch_size=16, shuffle=True, num_workers=4)
valid_dl = mask_rcnn.valid_dl(valid_ds, batch_size=16, shuffle=False, num_workers=4)
Metrics
Metrics are a work in progress for Mask RCNN.
# metrics = [COCOMetric(COCOMetricType.mask)]
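COCO-style mask metrics are built on the intersection over union (IoU) between predicted and ground-truth masks. As a rough illustration of that underlying quantity only, and not of how `COCOMetric` is implemented, here is a sketch with a hypothetical `mask_iou` helper operating on nested lists of 0/1:

```python
def mask_iou(mask_a, mask_b):
    """IoU of two equally sized binary masks given as nested lists of 0/1."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Two empty masks have no union; report 0.0 rather than dividing by zero.
    return inter / union if union else 0.0

pred = [[1, 1, 0],
        [1, 1, 0]]
target = [[0, 1, 1],
          [0, 1, 1]]
iou = mask_iou(pred, target)
print(round(iou, 3))
```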
Model
Similarly to faster_rcnn, we just need the num_classes to create a Mask RCNN model.
model = mask_rcnn.model(num_classes=len(class_map))
Training - fastai
We just need to create the learner and fine-tune.
Optional: you can use learn.lr_find() to find a good learning rate.
learn = mask_rcnn.fastai.learner(dls=[train_dl, valid_dl], model=model)
learn.fine_tune(10, 5e-4, freeze_epochs=2)
epoch | train_loss | valid_loss | time |
---|---|---|---|
0 | 1.609195 | 0.804607 | 00:19 |
1 | 1.115979 | 0.536064 | 00:15 |
epoch | train_loss | valid_loss | time |
---|---|---|---|
0 | 0.777911 | 0.435238 | 00:22 |
1 | 0.618750 | 0.363259 | 00:19 |
2 | 0.555032 | 0.341701 | 00:18 |
3 | 0.507919 | 0.335720 | 00:19 |
4 | 0.486197 | 0.330338 | 00:19 |
5 | 0.450105 | 0.278701 | 00:19 |
6 | 0.426801 | 0.280237 | 00:18 |
7 | 0.417322 | 0.296473 | 00:16 |
8 | 0.411688 | 0.289382 | 00:17 |
9 | 0.401172 | 0.283575 | 00:17 |
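The two tables above correspond to the two phases of `fine_tune`: 2 epochs (`freeze_epochs=2`) with the pretrained layers frozen, then 10 epochs with everything unfrozen. The sketch below lays out that schedule as plain data. The halving of the base learning rate and the `lr_mult=100` divisor are assumed from fastai's defaults, and `fine_tune_plan` is a hypothetical helper, not a fastai function.

```python
def fine_tune_plan(epochs, base_lr, freeze_epochs=1, lr_mult=100):
    """Describe the two training phases as a list of dicts (no training is run)."""
    frozen = {"phase": "frozen", "epochs": freeze_epochs, "max_lr": base_lr}
    # Assumption: the unfrozen phase halves base_lr and spreads learning rates
    # across layer groups from base_lr / lr_mult up to base_lr.
    base_lr = base_lr / 2
    unfrozen = {
        "phase": "unfrozen",
        "epochs": epochs,
        "lr_range": (base_lr / lr_mult, base_lr),
    }
    return [frozen, unfrozen]

plan = fine_tune_plan(10, 5e-4, freeze_epochs=2)
for phase in plan:
    print(phase)
```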
Visualize predictions
Let's grab some images from valid_ds to visualize. For more info on how to do inference, check the inference tutorial.
mask_rcnn.show_results(model, valid_ds, class_map=class_map)
Happy Learning!
If you need any assistance, feel free to join our forum.