RCNN Implementation

2022-12-07 20:19

I am trying to implement the RCNN paper from scratch. As proposed in the paper, I have successfully extracted the region proposals using selective search. The next step is to train a feature extractor, which is basically an (N+1)-class classifier, where N is the number of classes in the data and the extra class is for the background.

As suggested in the paper, I am using AlexNet (with ImageNet-pretrained weights) as the feature extractor, but I am facing an issue while training it. The training loss and accuracy behave as expected, but the validation metrics do not: the validation accuracy goes down while the validation loss goes up. Below is the snippet I am using for the loss and accuracy calculations:

for imgs, labels in tepoch:
    imgs = imgs.to(self.device)
    labels = labels.to(self.device)
    outputs = model(imgs)              # logits, one row per sample
    _, preds = torch.max(outputs, 1)   # predicted class index per sample
    loss = criterion(outputs, labels)

    if phase == "train":
        model.zero_grad()
        loss.backward()
        optimizer.step()
        lr_scheduler.step()            # note: stepped once per batch here

    step_loss = loss.item()
    step_acc = torch.sum(preds == labels.data) / imgs.shape[0]
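For reference, the per-step values above still have to be combined into epoch-level numbers before the training and validation curves can be compared. A minimal sketch of one way to do that is below. `EpochMeter` is a hypothetical helper, not part of my actual code, and it assumes `criterion` returns the batch mean (the default reduction for `CrossEntropyLoss`). Weighting by batch size matters because the last batch of an epoch is often smaller than the rest.

```python
class EpochMeter:
    """Accumulates per-batch loss and accuracy into epoch-level averages.

    Hypothetical helper for illustration; not part of the original snippet.
    """

    def __init__(self):
        self.loss_sum = 0.0
        self.correct = 0.0
        self.count = 0

    def update(self, step_loss, step_acc, batch_size):
        # step_loss and step_acc are batch means, so weight them by batch size.
        self.loss_sum += step_loss * batch_size
        self.correct += step_acc * batch_size
        self.count += batch_size

    @property
    def loss(self):
        return self.loss_sum / self.count

    @property
    def acc(self):
        return self.correct / self.count


meter = EpochMeter()
meter.update(step_loss=0.5, step_acc=0.75, batch_size=128)
meter.update(step_loss=1.0, step_acc=0.50, batch_size=64)  # smaller last batch
print(meter.loss, meter.acc)
```

Inside the loop this would be called as `meter.update(step_loss, float(step_acc), imgs.shape[0])`, with one meter per phase.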

One might think that the model is overfitting, but I beg to differ. The dataset is pretty huge and I am also using augmentation. To give some perspective:

Number of positive samples: 517565

Number of negative samples: 4436934

Augmentations:

import albumentations as A
from albumentations.pytorch import ToTensorV2

transformation = A.Compose(
    [
        A.ChannelShuffle(p=0.15),
        A.RandomBrightnessContrast(p=0.2),
        A.HueSaturationValue(p=0.2),
        A.HorizontalFlip(p=0.5),
        A.CLAHE(p=0.3),
        A.Sharpen(p=0.3),
        A.Resize(height=224, width=224, always_apply=True, p=1),
        A.Normalize(always_apply=True, p=1),
        ToTensorV2(),
    ]
)

The paper suggests using a batch size of 128 (96 negative samples and 32 positive samples). It seemed counterintuitive to me to use so few positive samples, but I tried to justify it after observing the training process:

Since negative samples dominate (around 75% of the batch), the model in the early epochs just predicts everything as 0 (the class of negative samples), which trivially gives an accuracy of 75%.

Using a small number of positive samples then forces the model to learn the representations of the other classes.

This is just my hypothesis, and I am open to discussion.
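To make the batch composition concrete, here is a minimal sketch of drawing one 32/96 batch and checking the accuracy of a model that always predicts background. The toy sample lists, the choice of 20 object classes, and the `sample_batch` helper are assumptions for illustration only; my real data loader works on cropped proposals.

```python
import random


def sample_batch(positives, negatives, n_pos=32, n_neg=96, rng=random):
    """Draw one mini-batch with a fixed positive/negative split."""
    batch = rng.sample(positives, n_pos) + rng.sample(negatives, n_neg)
    rng.shuffle(batch)
    return batch


# Toy data: (sample_id, label) pairs. Label 0 is background,
# labels 1..20 stand in for the object classes (assumed N = 20).
positives = [(f"pos_{i}", 1 + i % 20) for i in range(500)]
negatives = [(f"neg_{i}", 0) for i in range(4000)]

batch = sample_batch(positives, negatives, rng=random.Random(0))
labels = [label for _, label in batch]

# Accuracy of a degenerate model that always predicts class 0.
baseline_acc = labels.count(0) / len(labels)
print(len(batch), baseline_acc)  # 128 0.75
```

Any accuracy at or near 0.75 is therefore no evidence that the model has learned anything about the object classes, so accuracy on the positive samples alone is probably the more informative number to track.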

Another point of discussion is the region proposal generation. The paper says that about 2000 region proposals are generated for each image, and the threshold used to separate positive samples from negative samples leads to far more negative samples than positive ones, which will eventually affect the batch composition.
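The labelling step I am describing can be sketched as follows. The `(x1, y1, x2, y2)` box format, the helper names, and the example boxes are assumptions for illustration. The 0.5 IoU cut-off is the fine-tuning threshold as I understand it from the paper, so treat it as an assumption as well.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_proposal(proposal, gt_boxes, gt_labels, pos_thresh=0.5):
    """Class of the best-overlapping ground-truth box, or 0 (background)."""
    best_iou, best_label = 0.0, 0
    for box, label in zip(gt_boxes, gt_labels):
        overlap = iou(proposal, box)
        if overlap > best_iou:
            best_iou, best_label = overlap, label
    return best_label if best_iou >= pos_thresh else 0


gt_boxes, gt_labels = [(10, 10, 110, 110)], [3]
proposals = [(20, 20, 120, 120), (200, 200, 260, 260), (10, 10, 60, 60)]
print([label_proposal(p, gt_boxes, gt_labels) for p in proposals])  # [3, 0, 0]
```

Only the first proposal clears the threshold (IoU of about 0.68). The third one lies entirely inside the object but has an IoU of 0.25, so it is labelled background, which is one reason negatives end up outnumbering positives so heavily.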

Any help with training the feature extractor would be appreciated. I have been trying to train the model for weeks now...HELP!!!

Tags: classification deep-learning machine-learning object-detection pytorch
