How to use negative examples
In some scenarios it might be useful to explicitly show the model images that should be considered background. These are called "negative examples": images that do not contain any annotations.
In this tutorial we're going to train two raccoon detectors and observe how they perform on images of dogs and cats. One of the models will be trained only with images of raccoons, while the other will also have access to images of dogs and cats (the negative examples).
As you might already have imagined, the model trained only with raccoon images predicts all animals to be raccoons!
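Concretely, a negative example in IceVision is just a record that carries an image but no detection annotations. As a rough sketch, using the record API we'll meet later in this tutorial (the filepath and image size here are hypothetical):

from icevision.all import *

# A minimal negative example: the record gets a filepath and an image size,
# but no bboxes or labels, so the whole image counts as background.
record = ObjectDetectionRecord()
record.set_filepath('images/dog_001.jpg')  # hypothetical path to a non-raccoon image
record.set_img_size(ImgSize(width=640, height=480))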
Installing IceVision and IceData
If you are on Colab, run the following cell; otherwise check the installation instructions.
Install from PyPI...
# Torch - Torchvision - IceVision - IceData - MMDetection - YOLOv5 - EfficientDet Installation
!wget https://raw.githubusercontent.com/airctic/icevision/master/icevision_install.sh
# Choose your installation target: cuda11 or cuda10 or cpu
!bash icevision_install.sh cuda11
... or from icevision master
# # Torch - Torchvision - IceVision - IceData - MMDetection - YOLOv5 - EfficientDet Installation
# !wget https://raw.githubusercontent.com/airctic/icevision/master/icevision_install.sh
# # Choose your installation target: cuda11 or cuda10 or cpu
# !bash icevision_install.sh cuda11 master
# Restart kernel after installation
import IPython
IPython.Application.instance().kernel.do_shutdown(True)
Imports
from icevision.all import *
import icedata
Raccoon dataset
The dataset is stored on GitHub, so a simple git clone will do.
!git clone https://github.com/datitran/raccoon_dataset
The raccoon dataset uses the VOC annotation format, which IceVision supports natively:
raccoon_data_dir = Path('raccoon_dataset')
raccoon_parser = parsers.VOCBBoxParser(annotations_dir=raccoon_data_dir / 'annotations', images_dir=raccoon_data_dir / 'images')
Let's go ahead and parse our data with the default 80% train / 20% valid split.
raccoon_train_records, raccoon_valid_records = raccoon_parser.parse()
show_records(random.choices(raccoon_train_records, k=3), ncols=3, class_map=raccoon_parser.class_map)
INFO - Autofixing records | icevision.parsers.parser:parse:122
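By the way, the split can also be made explicit and reproducible by passing a data splitter to parse. A sketch of the equivalent call, assuming IceVision's RandomSplitter (the seed is our own addition):

# Equivalent to the default behaviour: shuffle the record ids and split 80/20.
data_splitter = RandomSplitter([0.8, 0.2], seed=42)
raccoon_train_records, raccoon_valid_records = raccoon_parser.parse(data_splitter=data_splitter)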
Pets dataset
With icedata we can easily download the pets dataset:
pets_data_dir = icedata.pets.load_data() / 'images'
Here we have a twist: instead of using the standard parser (icedata.pets.parser), which would parse all annotations, we will create a custom parser that only parses the images.
Remember the steps for generating a custom parser (check this tutorial for more information).
pets_template_record = ObjectDetectionRecord()
Parser.generate_template(pets_template_record)
class MyParser(Parser):
    def __init__(self, template_record):
        super().__init__(template_record=template_record)
    def __iter__(self) -> Any:
    def __len__(self) -> int:
    def record_id(self, o: Any) -> Hashable:
    def parse_fields(self, o: Any, record: BaseRecord, is_new: bool):
        record.set_img_size(<ImgSize>)
        record.set_filepath(<Union[str, Path]>)
        record.detection.add_bboxes(<Sequence[BBox]>)
        record.detection.set_class_map(<ClassMap>)
        record.detection.add_labels(<Sequence[Hashable]>)
And now we use that template to fill in the required methods. We don't have to use the .detection methods, since we don't want bboxes for these images.
class PetsImageParser(Parser):
    def __init__(self, template_record, data_dir):
        super().__init__(template_record=template_record)
        self.image_filepaths = get_image_files(data_dir)

    def __iter__(self) -> Any:
        yield from self.image_filepaths

    def __len__(self) -> int:
        return len(self.image_filepaths)

    def record_id(self, o) -> Hashable:
        return o.stem

    def parse_fields(self, o, record, is_new):
        if is_new:
            record.set_img_size(get_img_size(o))
            record.set_filepath(o)
Now we're ready to instantiate the parser and parse the data:
pets_parser = PetsImageParser(pets_template_record, pets_data_dir)
pets_train_records, pets_valid_records = pets_parser.parse()
show_records(random.choices(pets_train_records, k=3), ncols=3)
INFO - Autofixing records | icevision.parsers.parser:parse:122
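Before moving on, it's worth sanity-checking that the parsed pet records really are negative examples. A quick sketch, assuming the detection component exposes its (empty) bboxes and labels:

record = pets_train_records[0]
print(record.filepath)           # path to a pet image
print(record.detection.bboxes)   # expected: [] -- no boxes
print(record.detection.labels)   # expected: [] -- no labels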
Transforms
Let's define a simple list of transforms; they are the same for both datasets.
presize = 512
size = 384
train_tfms = tfms.A.Adapter([*tfms.A.aug_tfms(size=size, presize=presize), tfms.A.Normalize()])
valid_tfms = tfms.A.Adapter([*tfms.A.resize_and_pad(size=size), tfms.A.Normalize()])
Datasets and DataLoaders
We create the raccoon dataset and dataloader as normal:
model_type = models.ross.efficientdet
batch_size = 8
raccoon_train_ds = Dataset(raccoon_train_records, train_tfms)
raccoon_valid_ds = Dataset(raccoon_valid_records, valid_tfms)
raccoon_train_dl = model_type.train_dl(raccoon_train_ds, batch_size=batch_size, num_workers=4, shuffle=True)
raccoon_valid_dl = model_type.valid_dl(raccoon_valid_ds, batch_size=batch_size, num_workers=4, shuffle=False)
To add the pets data, we simply combine the lists of records. Note that the pets dataset contains many more images than the raccoon dataset, so we'll take only 100 images for train and 30 for valid. Feel free to change these numbers and explore the results!
combined_train_ds = Dataset(raccoon_train_records + pets_train_records[:100], train_tfms)
combined_valid_ds = Dataset(raccoon_valid_records + pets_valid_records[:30], valid_tfms)
combined_train_dl = model_type.train_dl(combined_train_ds, batch_size=batch_size, num_workers=4, shuffle=True)
combined_valid_dl = model_type.valid_dl(combined_valid_ds, batch_size=batch_size, num_workers=4, shuffle=False)
Let's take a look at the combined dataset:
show_samples(random.choices(combined_train_ds, k=6), class_map=raccoon_parser.class_map, ncols=3)
Metrics
As usual, let's stick with our COCOMetric:
metrics = [COCOMetric(metric_type=COCOMetricType.bbox)]
Models
We're now ready to train a separate model for each dataset and see how the results change!
Raccoons only
backbone = model_type.backbones.tf_lite0
raccoon_model = model_type.model(
backbone=backbone(pretrained=True),
num_classes=len(raccoon_parser.class_map),
img_size=size
)
raccoon_learn = model_type.fastai.learner(
dls=[raccoon_train_dl, raccoon_valid_dl],
model=raccoon_model,
metrics=metrics
)
raccoon_learn.lr_find()
SuggestedLRs(valley=0.005248074419796467)
raccoon_learn.fine_tune(30, 0.005, freeze_epochs=5)
epoch | train_loss | valid_loss | COCOMetric | time |
---|---|---|---|---|
0 | 1.597212 | 1.354181 | 0.000307 | 00:05 |
1 | 1.480324 | 1.354777 | 0.001507 | 00:04 |
2 | 1.319088 | 1.254850 | 0.034810 | 00:04 |
3 | 1.087830 | 1.081156 | 0.053211 | 00:04 |
4 | 0.928458 | 0.898133 | 0.098449 | 00:04 |
epoch | train_loss | valid_loss | COCOMetric | time |
---|---|---|---|---|
0 | 0.579946 | 0.861432 | 0.069980 | 00:05 |
1 | 0.571890 | 0.824743 | 0.077679 | 00:05 |
2 | 0.546770 | 0.768505 | 0.090791 | 00:05 |
3 | 0.518292 | 0.676042 | 0.382004 | 00:05 |
4 | 0.507741 | 0.644902 | 0.433709 | 00:05 |
5 | 0.495793 | 0.692048 | 0.240042 | 00:05 |
6 | 0.471785 | 0.577972 | 0.508806 | 00:05 |
7 | 0.456929 | 0.624429 | 0.383962 | 00:05 |
8 | 0.436571 | 0.644672 | 0.429864 | 00:05 |
9 | 0.426388 | 0.581951 | 0.486592 | 00:05 |
10 | 0.398668 | 0.562105 | 0.508463 | 00:05 |
11 | 0.384981 | 0.508337 | 0.543620 | 00:05 |
12 | 0.366980 | 0.497647 | 0.554876 | 00:05 |
13 | 0.357205 | 0.494502 | 0.553882 | 00:05 |
14 | 0.343215 | 0.491536 | 0.568322 | 00:05 |
15 | 0.326071 | 0.490430 | 0.584215 | 00:05 |
16 | 0.322718 | 0.501275 | 0.553775 | 00:05 |
17 | 0.311050 | 0.486727 | 0.592019 | 00:05 |
18 | 0.299104 | 0.476103 | 0.588887 | 00:05 |
19 | 0.297170 | 0.500982 | 0.536913 | 00:05 |
20 | 0.286925 | 0.482961 | 0.560512 | 00:05 |
21 | 0.278372 | 0.486913 | 0.544892 | 00:05 |
22 | 0.272823 | 0.497099 | 0.557630 | 00:05 |
23 | 0.271815 | 0.485606 | 0.560342 | 00:05 |
24 | 0.261464 | 0.475296 | 0.577226 | 00:05 |
25 | 0.260394 | 0.479651 | 0.566950 | 00:05 |
26 | 0.256806 | 0.477946 | 0.575284 | 00:05 |
27 | 0.257800 | 0.480531 | 0.576717 | 00:05 |
28 | 0.257753 | 0.484309 | 0.560253 | 00:05 |
29 | 0.253717 | 0.482582 | 0.566893 | 00:05 |
If only raccoon photos are shown during training, everything is a raccoon!
model_type.show_results(raccoon_model, combined_valid_ds)
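To inspect this failure mode more closely, we can also run inference explicitly. A sketch, assuming model_type.predict and show_preds; expect the dog and cat images to come back with raccoon boxes:

# Predict on the combined validation set and visualize a few predictions.
preds = model_type.predict(raccoon_model, combined_valid_ds, keep_images=True)
show_preds(preds=preds[:4])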
Raccoons + pets
combined_model = model_type.model(
backbone(pretrained=True),
num_classes=len(raccoon_parser.class_map),
img_size=size
)
combined_learn = model_type.fastai.learner(
dls=[combined_train_dl, combined_valid_dl],
model=combined_model,
metrics=metrics
)
combined_learn.fine_tune(30, 1e-2, freeze_epochs=5)
epoch | train_loss | valid_loss | COCOMetric | time |
---|---|---|---|---|
0 | 2.676468 | 20.114677 | 0.001932 | 00:07 |
1 | 1.999201 | 2.396560 | 0.075933 | 00:06 |
2 | 1.456598 | 32.068226 | 0.075432 | 00:06 |
3 | 1.105171 | 27.473427 | 0.141161 | 00:06 |
4 | 0.951076 | 24.071144 | 0.057629 | 00:06 |
epoch | train_loss | valid_loss | COCOMetric | time |
---|---|---|---|---|
0 | 0.610488 | 9.588224 | 0.115317 | 00:08 |
1 | 0.580771 | 6.247436 | 0.336951 | 00:07 |
2 | 0.559531 | 4.243937 | 0.357085 | 00:08 |
3 | 0.520752 | 1.651748 | 0.401737 | 00:08 |
4 | 0.519708 | 2.837212 | 0.371190 | 00:08 |
5 | 0.521373 | 6.292931 | 0.328725 | 00:09 |
6 | 0.476777 | 2.089493 | 0.456670 | 00:08 |
7 | 0.499804 | 6.462561 | 0.316811 | 00:08 |
8 | 0.981566 | 21.373594 | 0.285337 | 00:09 |
9 | 0.986519 | 12.665498 | 0.379301 | 00:08 |
10 | 0.717527 | 8.835299 | 0.468607 | 00:08 |
11 | 0.584880 | 7.107671 | 0.488671 | 00:08 |
12 | 0.497443 | 6.248025 | 0.465282 | 00:08 |
13 | 0.495668 | 4.710539 | 0.476700 | 00:08 |
14 | 0.484720 | 1.786158 | 0.453496 | 00:08 |
15 | 0.419404 | 2.497736 | 0.438899 | 00:08 |
16 | 0.379092 | 2.484356 | 0.502668 | 00:08 |
17 | 0.362614 | 1.563345 | 0.490237 | 00:09 |
18 | 0.337136 | 1.657581 | 0.538952 | 00:09 |
19 | 0.316169 | 1.152076 | 0.509543 | 00:08 |
20 | 0.304117 | 1.089841 | 0.521393 | 00:07 |
21 | 0.299743 | 1.460610 | 0.500326 | 00:08 |
22 | 0.278153 | 1.020550 | 0.489060 | 00:08 |
23 | 0.265477 | 0.737581 | 0.546207 | 00:08 |
24 | 0.252644 | 0.702136 | 0.494025 | 00:09 |
25 | 0.248898 | 0.650576 | 0.548857 | 00:08 |
26 | 0.273597 | 0.597172 | 0.544202 | 00:08 |
27 | 0.263164 | 0.943378 | 0.550043 | 00:08 |
28 | 0.243946 | 0.758281 | 0.553848 | 00:08 |
29 | 0.245725 | 0.694360 | 0.548657 | 00:09 |
When negative samples are used during training, the model gets much better at understanding what is not a raccoon.
model_type.show_results(combined_model, combined_valid_ds)
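And the same explicit check for the model trained with negative examples. Under the same assumptions as before, the pet images should now come back with no (or far fewer) raccoon boxes:

preds = model_type.predict(combined_model, combined_valid_ds, keep_images=True)
show_preds(preds=preds[:4])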
Happy Learning!
That's it, folks! If you need any assistance, feel free to join our forum.