How to use negative examples
In some scenarios it can be useful to explicitly show the model images that should be considered background. These are called "negative examples": images that do not contain any annotations.
In this tutorial we're going to train two raccoon detectors and observe how they perform on images of dogs and cats. One of the models will be trained only with images of raccoons, while the other will also have access to images of dogs and cats (the negative examples).
As you might have already imagined, the model trained only with raccoon images predicts all animals to be raccoons!
Info
The first half of this tutorial is almost an exact copy of this tutorial.
Installation
!pip install icevision[all] icedata
Imports
from icevision.all import *
Define class_map
Even though we're also training with pets data, the only class we want to detect is raccoon.
class_map = ClassMap(['raccoon'])
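To make the id assignment concrete, here is a minimal plain-Python sketch of the name-to-id mapping a `ClassMap` provides. This is an illustration only, not the actual icevision implementation; it assumes the convention that `"background"` is reserved at index 0, so a single-class setup still yields two ids:

```python
# Sketch of the name <-> id mapping behind a class map
# (assumes index 0 is reserved for "background").
classes = ["background", "raccoon"]
name2id = {name: i for i, name in enumerate(classes)}
id2name = {i: name for name, i in name2id.items()}

print(name2id["raccoon"])  # 1 -- id 0 belongs to background
```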
Raccoon dataset
The dataset is stored on GitHub, so a simple git clone will do.
!git clone https://github.com/datitran/raccoon_dataset
The raccoon dataset uses the VOC annotation format, which icevision supports natively:
raccoon_data_dir = Path('raccoon_dataset')
raccoon_parser = parsers.voc(annotations_dir=raccoon_data_dir / 'annotations',
                             images_dir=raccoon_data_dir / 'images',
                             class_map=class_map)
Let's go ahead and parse our data with the default 80% train / 20% valid split.
raccoon_train_records, raccoon_valid_records = raccoon_parser.parse()
show_records(random.choices(raccoon_train_records, k=3), ncols=3, class_map=class_map)
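For intuition, the 80/20 split can be sketched in plain Python: shuffle the records with a fixed seed, then cut at the train fraction. This is a simplified stand-in, not icevision's actual splitter (which also supports deterministic id-based splits):

```python
import random

def random_split(items, train_frac=0.8, seed=42):
    # Shuffle a copy and cut at the train fraction -- a simple
    # stand-in for a default 80/20 train/valid split.
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * train_frac)
    return items[:cut], items[cut:]

train, valid = random_split(range(200))
print(len(train), len(valid))  # 160 40
```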
Pets dataset
With icedata we can easily download the pets dataset:
pets_data_dir = icedata.pets.load_data()
Here we have a twist: instead of using the standard parser (icedata.pets.parser), which would parse all annotations, we will create a custom parser that returns an empty list for labels and bboxes.
Remember the steps for generating a custom parser (check this tutorial for more information): first define all your mixins and call generate_template:
class PetsImageParser(parsers.Parser, parsers.FilepathMixin, parsers.LabelsMixin, parsers.BBoxesMixin):
pass
PetsImageParser.generate_template()
def __iter__(self) -> Any:
def imageid(self, o) -> Hashable:
def image_width_height(self, o) -> Tuple[int, int]:
    return get_image_size(self.filepath(o))
def filepath(self, o) -> Union[str, Path]:
def bboxes(self, o) -> List[BBox]:
def labels(self, o) -> List[int]:
And now we use that to fill the required methods:
class PetsImageParser(parsers.Parser, parsers.FilepathMixin, parsers.LabelsMixin, parsers.BBoxesMixin):
    def __init__(self, data_dir):
        self.image_filepaths = get_image_files(data_dir)

    def __iter__(self) -> Any:
        yield from self.image_filepaths

    def imageid(self, o) -> Hashable:
        return o.stem

    def filepath(self, o) -> Union[str, Path]:
        return o

    def image_width_height(self, o) -> Tuple[int, int]:
        return get_image_size(self.filepath(o))

    def labels(self, o) -> List[int]:
        return []

    def bboxes(self, o) -> List[BBox]:
        return []
Now we're ready to instantiate the parser and parse the data:
pets_parser = PetsImageParser(pets_data_dir)
pets_train_records, pets_valid_records = pets_parser.parse()
show_records(random.choices(pets_train_records, k=3), ncols=3, class_map=class_map)
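The key idea is that a "negative" record is just an image entry whose annotation lists are empty. The sketch below illustrates this with a hypothetical dict layout (not the actual icevision record structure); the file paths are made up:

```python
# Hypothetical record layout -- a negative example simply carries
# empty labels/bboxes, while a positive one has at least one of each.
positive = {"filepath": "raccoons/img1.jpg", "labels": [1], "bboxes": [(10, 20, 200, 180)]}
negative = {"filepath": "pets/dog1.jpg", "labels": [], "bboxes": []}

def is_negative(record):
    return len(record["labels"]) == 0

print(is_negative(negative))  # True
```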
Transforms
Let's define a simple list of transforms; they are the same for both datasets.
presize = 512
size = 384
train_tfms = tfms.A.Adapter([*tfms.A.aug_tfms(size=size, presize=presize), tfms.A.Normalize()])
valid_tfms = tfms.A.Adapter([*tfms.A.resize_and_pad(size=size), tfms.A.Normalize()])
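To see what a resize-and-pad (letterbox) transform does to the image geometry, here is a sketch that computes only the output dimensions: scale the longest side down to `size`, keep the aspect ratio, then pad the shorter side up to a square. This is an illustration of the idea, not the albumentations implementation icevision wraps:

```python
def resize_and_pad_dims(w, h, size=384):
    # Scale so the longest side equals `size` (aspect ratio kept),
    # then pad the shorter side to make the result square.
    scale = size / max(w, h)
    new_w, new_h = round(w * scale), round(h * scale)
    pad_w, pad_h = size - new_w, size - new_h
    return (new_w, new_h), (pad_w, pad_h)

print(resize_and_pad_dims(640, 480))  # ((384, 288), (0, 96))
```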
Datasets and DataLoaders
We create the raccoon dataset and dataloader as normal:
raccoon_train_ds = Dataset(raccoon_train_records, train_tfms)
raccoon_valid_ds = Dataset(raccoon_valid_records, valid_tfms)
raccoon_train_dl = efficientdet.train_dl(raccoon_train_ds, batch_size=16, num_workers=4, shuffle=True)
raccoon_valid_dl = efficientdet.valid_dl(raccoon_valid_ds, batch_size=16, num_workers=4, shuffle=False)
For adding the pets data, we simply combine the lists of records. Note that the pets dataset contains many more images than the raccoon dataset, so we'll take only 100 images for train and 30 for valid. Feel free to change these numbers and explore the results!
combined_train_ds = Dataset(raccoon_train_records + pets_train_records[:100], train_tfms)
combined_valid_ds = Dataset(raccoon_valid_records + pets_valid_records[:30], valid_tfms)
combined_train_dl = efficientdet.train_dl(combined_train_ds, batch_size=16, num_workers=4, shuffle=True)
combined_valid_dl = efficientdet.valid_dl(combined_valid_ds, batch_size=16, num_workers=4, shuffle=False)
Let's take a look at the combined dataset:
show_samples(random.choices(combined_train_ds, k=6), class_map=class_map, ncols=3)
Metrics
As usual, let's stick with our COCOMetric:
metrics = [COCOMetric(metric_type=COCOMetricType.bbox)]
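COCO-style mAP is built on intersection-over-union: a prediction counts as a match when its IoU with a ground-truth box clears a threshold (averaged over thresholds 0.5 to 0.95). A minimal sketch of the IoU computation for `(xmin, ymin, xmax, ymax)` boxes:

```python
def iou(box_a, box_b):
    # Intersection-over-union of two (xmin, ymin, xmax, ymax) boxes.
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))  # intersection width
    ih = max(0, min(ay2, by2) - max(ay1, by1))  # intersection height
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.1429
```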
Models
We're now ready to train a separate model for each dataset and see how the results change!
Raccoons only
raccoon_model = efficientdet.model('tf_efficientdet_lite0', num_classes=len(class_map), img_size=size)
raccoon_learn = efficientdet.fastai.learner(dls=[raccoon_train_dl, raccoon_valid_dl], model=raccoon_model, metrics=metrics)
raccoon_learn.fine_tune(30, 1e-2, freeze_epochs=5)
epoch | train_loss | valid_loss | COCOMetric | time |
---|---|---|---|---|
0 | 280.362213 | 288.541992 | 0.000031 | 00:09 |
1 | 255.560837 | 275.054901 | 0.001539 | 00:07 |
2 | 201.386978 | 203.328262 | 0.000882 | 00:07 |
3 | 138.040710 | 53.258076 | 0.000992 | 00:07 |
4 | 98.638695 | 21.866730 | 0.185121 | 00:07 |
epoch | train_loss | valid_loss | COCOMetric | time |
---|---|---|---|---|
0 | 0.789586 | 23.308876 | 0.220375 | 00:08 |
1 | 0.738028 | 13.141761 | 0.234339 | 00:08 |
2 | 0.692019 | 9.430372 | 0.289227 | 00:08 |
3 | 0.646733 | 5.213285 | 0.367285 | 00:08 |
4 | 0.598996 | 2.703029 | 0.499523 | 00:08 |
5 | 0.566802 | 2.002367 | 0.428408 | 00:08 |
6 | 0.540306 | 1.171565 | 0.512689 | 00:08 |
7 | 0.514873 | 1.075465 | 0.408631 | 00:08 |
8 | 0.494372 | 0.699056 | 0.547269 | 00:08 |
9 | 0.474496 | 0.772929 | 0.429607 | 00:08 |
10 | 0.456045 | 0.565817 | 0.545960 | 00:08 |
11 | 0.436963 | 0.516108 | 0.502583 | 00:08 |
12 | 0.422333 | 0.526793 | 0.552966 | 00:08 |
13 | 0.406885 | 0.607857 | 0.435570 | 00:08 |
14 | 0.396543 | 0.450901 | 0.538114 | 00:08 |
15 | 0.385096 | 0.433830 | 0.568332 | 00:08 |
16 | 0.372637 | 0.433433 | 0.556526 | 00:08 |
17 | 0.356291 | 0.412439 | 0.615498 | 00:08 |
18 | 0.346603 | 0.424790 | 0.583552 | 00:08 |
19 | 0.335853 | 0.394423 | 0.599287 | 00:08 |
20 | 0.323742 | 0.351422 | 0.660224 | 00:08 |
21 | 0.314364 | 0.338674 | 0.640041 | 00:08 |
22 | 0.307630 | 0.344830 | 0.655402 | 00:08 |
23 | 0.303521 | 0.322497 | 0.656412 | 00:08 |
24 | 0.297675 | 0.319093 | 0.654938 | 00:08 |
25 | 0.288858 | 0.333945 | 0.641190 | 00:08 |
26 | 0.286353 | 0.320516 | 0.664691 | 00:08 |
27 | 0.282811 | 0.305629 | 0.669700 | 00:08 |
28 | 0.280102 | 0.300712 | 0.678493 | 00:08 |
29 | 0.273880 | 0.301367 | 0.671941 | 00:08 |
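The two tables above come from how fastai's `fine_tune` works: it first trains `freeze_epochs` epochs with the pretrained backbone frozen (first table), then unfreezes everything and trains `epochs` more (second table). A rough sketch of that schedule, simplified to ignore the learning-rate handling the real method also does:

```python
def fine_tune_schedule(epochs=30, freeze_epochs=5):
    # Phase 1: backbone frozen; phase 2: everything trainable.
    plan = [("frozen", e) for e in range(freeze_epochs)]
    plan += [("unfrozen", e) for e in range(epochs)]
    return plan

plan = fine_tune_schedule()
print(len(plan))  # 35 epochs total: 5 frozen + 30 unfrozen
```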
If only raccoon photos are shown during training, everything is a raccoon!
efficientdet.show_results(raccoon_model, combined_valid_ds, class_map=class_map)
Raccoons + pets
combined_model = efficientdet.model('tf_efficientdet_lite0', num_classes=len(class_map), img_size=size)
combined_learn = efficientdet.fastai.learner(dls=[combined_train_dl, combined_valid_dl], model=combined_model, metrics=metrics)
combined_learn.fine_tune(30, 1e-2, freeze_epochs=5)
epoch | train_loss | valid_loss | COCOMetric | time |
---|---|---|---|---|
0 | 426.231720 | 3113.471436 | 0.000021 | 00:12 |
1 | 344.480865 | 2650.498291 | 0.000036 | 00:11 |
2 | 206.628662 | 606.958130 | 0.001016 | 00:11 |
3 | 126.901100 | 243.355576 | 0.056071 | 00:11 |
4 | 82.448853 | 109.352119 | 0.030934 | 00:10 |
epoch | train_loss | valid_loss | COCOMetric | time |
---|---|---|---|---|
0 | 1.062148 | 51.029591 | 0.089626 | 00:12 |
1 | 0.927982 | 24.122263 | 0.161439 | 00:12 |
2 | 0.834056 | 10.649617 | 0.218092 | 00:12 |
3 | 0.797795 | 11.055972 | 0.274900 | 00:12 |
4 | 0.765396 | 7.913944 | 0.280049 | 00:12 |
5 | 0.727915 | 3.130997 | 0.271282 | 00:12 |
6 | 0.714074 | 4.496913 | 0.217573 | 00:12 |
7 | 0.713696 | 3.117041 | 0.384929 | 00:12 |
8 | 0.671189 | 1.860370 | 0.311723 | 00:12 |
9 | 0.637322 | 1.757796 | 0.406194 | 00:12 |
10 | 0.611760 | 2.911715 | 0.369743 | 00:12 |
11 | 0.590360 | 6.035703 | 0.444911 | 00:12 |
12 | 0.574620 | 2.676531 | 0.419094 | 00:12 |
13 | 0.547478 | 4.088236 | 0.514613 | 00:12 |
14 | 0.518071 | 3.934689 | 0.492547 | 00:12 |
15 | 0.501104 | 2.299222 | 0.502294 | 00:12 |
16 | 0.476237 | 3.566992 | 0.424591 | 00:12 |
17 | 0.451466 | 3.532001 | 0.524310 | 00:12 |
18 | 0.437068 | 2.166881 | 0.551229 | 00:12 |
19 | 0.406420 | 2.166020 | 0.590745 | 00:12 |
20 | 0.383731 | 2.122064 | 0.548815 | 00:12 |
21 | 0.378371 | 2.332926 | 0.583251 | 00:12 |
22 | 0.381634 | 1.484042 | 0.551595 | 00:12 |
23 | 0.374729 | 2.162852 | 0.594631 | 00:12 |
24 | 0.364782 | 2.040329 | 0.628670 | 00:12 |
25 | 0.348715 | 1.812243 | 0.597921 | 00:12 |
26 | 0.335386 | 1.620427 | 0.630255 | 00:12 |
27 | 0.322967 | 1.752403 | 0.625841 | 00:12 |
28 | 0.327148 | 1.533834 | 0.629842 | 00:12 |
29 | 0.323913 | 1.508479 | 0.627119 | 00:12 |
When negative samples are used during training, the model gets a much better understanding of what is not a raccoon.
efficientdet.show_results(combined_model, combined_valid_ds, class_map=class_map)
Happy Learning!
That's it folks! If you need any assistance, feel free to join our forum.