OCHuman dataset
From the OCHuman repo:
This dataset focuses on heavily occluded humans, with comprehensive annotations including bounding boxes, human pose, and instance masks. It contains 13,360 elaborately annotated human instances within 5,081 images. With an average MaxIoU of 0.573 per person, OCHuman is the most complex and challenging dataset related to humans. Through this dataset, we want to emphasize occlusion as a challenging problem for researchers to study.
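Here, MaxIoU is the largest IoU between one person's bounding box and any other person's box in the same image, so a higher average means heavier person-to-person occlusion. A minimal sketch of that idea (plain Python, boxes as [x1, y1, x2, y2]; this is an illustration, not OCHuman's official evaluation code):
def box_iou(a, b):
    # Intersection-over-union of two boxes given as [x1, y1, x2, y2]
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union
def max_iou_per_person(boxes):
    # For each person, the highest IoU with any *other* person in the image
    return [max((box_iou(b, o) for j, o in enumerate(boxes) if j != i), default=0.0)
            for i, b in enumerate(boxes)]
max_iou_per_person([[0, 0, 100, 200], [50, 0, 150, 200]])  # -> [0.333..., 0.333...]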
Disclaimer: it is currently not possible to run this notebook in Colab right away, because you need to download the OCHuman dataset manually. We advise running the notebook locally once you have access to the dataset.
Installing IceVision
Install from PyPI...
# # Torch - Torchvision - IceVision - IceData - MMDetection - YOLOv5 - EfficientDet Installation
# !wget https://raw.githubusercontent.com/airctic/icevision/master/icevision_install.sh
# # Choose your installation target: cuda11 or cuda10 or cpu
# !bash icevision_install.sh cuda11
... or from icevision master
# Torch - Torchvision - IceVision - IceData - MMDetection - YOLOv5 - EfficientDet Installation
!wget https://raw.githubusercontent.com/airctic/icevision/master/icevision_install.sh
# Choose your installation target: cuda11 or cuda10 or cpu
!bash icevision_install.sh cuda11 master
# Restart kernel after installation
import IPython
IPython.Application.instance().kernel.do_shutdown(True)
Defining OCHuman parser
from icevision.all import *
_ = icedata.ochuman.load_data()
INFO - The mmdet config folder already exists. No need to download it. Path: /home/ubuntu/.icevision/mmdetection_configs/mmdetection_configs-2.16.0/configs | icevision.models.mmdet.download_configs:download_mmdet_configs:17
INFO - MANUALLY download AND unzip the dataset from https://cg.cs.tsinghua.edu.cn/dataset/form.html?dataset=ochuman.
You will need the path to the `ochuman.json` annotations file and the `images` directory. | icedata.datasets.ochuman.data:load_data:7
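Once unzipped, you should have the `ochuman.json` annotations file next to an `images` folder. A quick sanity check before parsing (a sketch; `../../OCHuman` is simply where this notebook assumes the archive was extracted):
from pathlib import Path
data_dir = Path("../../OCHuman")  # adjust to wherever you unzipped the dataset
assert (data_dir / "ochuman.json").is_file(), "annotations file not found"
assert (data_dir / "images").is_dir(), "images directory not found"
print(f"{len(list((data_dir / 'images').glob('*.jpg')))} images found")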
Parse data
Note: you might need to change the `../../` paths used from this point onwards to match your filesystem (i.e. wherever you stored the dataset).
parser = icedata.ochuman.parser("../../OCHuman/ochuman.json", "../../OCHuman/images/")
train_records, valid_records = parser.parse(
    data_splitter=RandomSplitter([0.8, 0.2]),
    cache_filepath="../../OCHuman/ochuman.pkl",
)
len(train_records), len(valid_records)
INFO - Loading cached records from ../../OCHuman/ochuman.pkl | icevision.parsers.parser:parse:113
(4064, 1017)
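It is worth eyeballing a few parsed records before building datasets. `show_records` is part of IceVision's visualization utilities; the call below is a sketch and the exact keyword arguments may vary between versions:
# Display ground-truth boxes, masks and keypoints for the first few records
show_records(train_records[:3], ncols=3)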
Datasets + augmentations
presize = 1024
size = 512
valid_tfms = tfms.A.Adapter([*tfms.A.resize_and_pad(size), tfms.A.Normalize()])
train_tfms = tfms.A.Adapter([*tfms.A.aug_tfms(size=size, presize=presize, crop_fn=None), tfms.A.Normalize()])
train_ds = Dataset(train_records, train_tfms)
valid_ds = Dataset(valid_records, valid_tfms)
# Draw the same record three times to visualize different random augmentations
samples = [train_ds[1] for _ in range(3)]
show_samples(samples, ncols=3)
len(train_ds), len(valid_ds)
(4064, 1017)
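As a quick sanity check, the transforms should produce 512x512 images with one keypoint set per annotated person. The attribute names below follow IceVision's composite record API and may differ slightly across versions, so treat this as a sketch:
sample = train_ds[0]
print(sample.img.shape)                 # expected: (512, 512, 3) after resizing/padding
print(len(sample.detection.keypoints))  # number of annotated people in this record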
Dataloaders
model_type = models.torchvision.keypoint_rcnn
train_dl = model_type.train_dl(train_ds, batch_size=16, num_workers=4, shuffle=True)
valid_dl = model_type.valid_dl(valid_ds, batch_size=16, num_workers=4, shuffle=False)
Model
model = model_type.model(num_keypoints=19)
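`num_keypoints=19` matches OCHuman's 19-keypoint annotation scheme. The returned model is torchvision's Keypoint R-CNN, so you can optionally inspect the keypoint head (a sketch; attribute paths follow torchvision's `KeypointRCNN` and may change between releases):
# The keypoint predictor should end in 19 output channels, one heatmap per keypoint
print(model.roi_heads.keypoint_predictor)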
Train a fastai learner
from fastai.callback.tracker import SaveModelCallback
learn = model_type.fastai.learner(dls=[train_dl, valid_dl], model=model, cbs=[SaveModelCallback()])
learn.lr_find()
SuggestedLRs(valley=4.365158383734524e-05)
learn.fine_tune(20, 3e-4, freeze_epochs=1)
| epoch | train_loss | valid_loss | time |
|---|---|---|---|
| 0 | 4.737989 | 4.609861 | 08:19 |
Better model found at epoch 0 with valid_loss value: 4.609861373901367.
| epoch | train_loss | valid_loss | time |
|---|---|---|---|
| 0 | 4.346546 | 4.307302 | 09:02 |
| 1 | 4.303442 | 4.230606 | 08:46 |
| 2 | 4.201407 | 4.191602 | 08:37 |
| 3 | 4.194221 | 4.123021 | 08:27 |
| 4 | 4.168465 | 4.063463 | 08:32 |
| 5 | 4.112132 | 4.037125 | 08:24 |
| 6 | 4.047480 | 3.952349 | 08:20 |
| 7 | 3.980796 | 3.875872 | 08:29 |
| 8 | 3.898531 | 3.818884 | 08:25 |
| 9 | 3.856582 | 3.771754 | 08:27 |
| 10 | 3.770988 | 3.699221 | 08:24 |
| 11 | 3.736982 | 3.637545 | 08:18 |
| 12 | 3.645181 | 3.561272 | 08:12 |
| 13 | 3.570732 | 3.501793 | 08:20 |
| 14 | 3.529509 | 3.464969 | 08:20 |
| 15 | 3.480687 | 3.416519 | 08:20 |
| 16 | 3.416651 | 3.388196 | 08:26 |
| 17 | 3.375072 | 3.358102 | 08:21 |
| 18 | 3.355783 | 3.351155 | 08:21 |
| 19 | 3.344901 | 3.357507 | 08:17 |
Better model found at epoch 0 with valid_loss value: 4.3073015213012695.
Better model found at epoch 1 with valid_loss value: 4.230605602264404.
Better model found at epoch 2 with valid_loss value: 4.1916022300720215.
Better model found at epoch 3 with valid_loss value: 4.123020648956299.
Better model found at epoch 4 with valid_loss value: 4.063462734222412.
Better model found at epoch 5 with valid_loss value: 4.037125110626221.
Better model found at epoch 6 with valid_loss value: 3.9523494243621826.
Better model found at epoch 7 with valid_loss value: 3.8758718967437744.
Better model found at epoch 8 with valid_loss value: 3.8188838958740234.
Better model found at epoch 9 with valid_loss value: 3.771754026412964.
Better model found at epoch 10 with valid_loss value: 3.699220657348633.
Better model found at epoch 11 with valid_loss value: 3.637545347213745.
Better model found at epoch 12 with valid_loss value: 3.561272144317627.
Better model found at epoch 13 with valid_loss value: 3.5017926692962646.
Better model found at epoch 14 with valid_loss value: 3.464968681335449.
Better model found at epoch 15 with valid_loss value: 3.4165189266204834.
Better model found at epoch 16 with valid_loss value: 3.388195753097534.
Better model found at epoch 17 with valid_loss value: 3.3581018447875977.
Better model found at epoch 18 with valid_loss value: 3.3511552810668945.
learn.recorder.plot_loss()
Show model results
model_type.show_results(model, valid_ds)
Save model
torch.save(model.state_dict(), "../../OCHuman/model.pth")
model = model_type.model(num_keypoints=19)
state_dict = torch.load("../../OCHuman/model.pth")
model.load_state_dict(state_dict)
<All keys matched successfully>
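If you reload the weights on a machine without a GPU, pass `map_location` and switch to eval mode before inference (a sketch):
state_dict = torch.load("../../OCHuman/model.pth", map_location="cpu")  # CPU-only load
model.load_state_dict(state_dict)
model.eval()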
Running inference on validation set
infer_dl = model_type.infer_dl(valid_ds, batch_size=8)
preds = model_type.predict_from_dl(model=model, infer_dl=infer_dl, keep_images=True)
show_preds(preds=preds[68:70], show=True, display_label=False, figsize=(10, 10))
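To predict on images that are not part of the parsed dataset, you can build a small `Dataset` directly from in-memory images and reuse the validation transforms. A sketch (the image path is a placeholder; check your IceVision version for the exact `Dataset.from_images` and `predict` signatures):
from PIL import Image
img = np.array(Image.open("path/to/your_image.jpg"))  # placeholder path
infer_ds = Dataset.from_images([img], valid_tfms)
preds = model_type.predict(model, infer_ds, keep_images=True)
show_preds(preds=preds, figsize=(10, 10))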
plot_top_losses
#model.train()
sorted_samples, sorted_preds, losses_stats = model_type.interp.plot_top_losses(
    model, valid_ds, sort_by="loss_total"
)
INFO - Losses returned by model: ['loss_classifier', 'loss_box_reg', 'loss_objectness', 'loss_rpn_box_reg', 'loss_keypoint'] | icevision.models.interpretation:plot_top_losses:218
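`plot_top_losses` returns the validation samples and predictions sorted from highest to lowest total loss, plus summary statistics per loss component, so you can drill into the hardest images. A sketch of how the returned values might be used (whether the returned predictions retain their images depends on the IceVision version):
# Summary statistics for each loss component over the validation set
print(losses_stats)
# The first entries correspond to the highest-loss validation images
show_preds(preds=sorted_preds[:2], figsize=(10, 10))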