EfficientSAM

Model from Meta. Paper

It is essentially the same as the MobileSAM model, but with a Masked Pretraining objective.

Baseline is trained with the SAM loss (a 20:1 ratio of Focal Loss to Dice Loss). TverskyFocal+Dice is trained with a 7:3 ratio of TverskyFocal Loss to Dice Loss. Tversky+Dice is trained with a 7:3 ratio of Tversky Loss to Dice Loss.
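For reference, here is a minimal NumPy sketch of how these mixtures are weighted. The function names, focal γ = 2, Tversky-focal γ = 0.75, and the Tversky α/β values are assumptions for illustration, not values pulled from our training code:

```python
import numpy as np

def focal_loss(pred, target, gamma=2.0, eps=1e-6):
    """Binary focal loss on predicted probabilities in [0, 1]."""
    pred = np.clip(pred, eps, 1 - eps)
    pt = np.where(target == 1, pred, 1 - pred)  # prob. of the true class
    return float(np.mean(-((1 - pt) ** gamma) * np.log(pt)))

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss: 1 - Dice coefficient."""
    inter = np.sum(pred * target)
    return float(1 - (2 * inter + eps) / (np.sum(pred) + np.sum(target) + eps))

def tversky_loss(pred, target, alpha=0.7, beta=0.3, eps=1e-6):
    """Tversky loss; here alpha weights false positives, beta false negatives."""
    tp = np.sum(pred * target)
    fp = np.sum(pred * (1 - target))
    fn = np.sum((1 - pred) * target)
    return float(1 - (tp + eps) / (tp + alpha * fp + beta * fn + eps))

def tversky_focal_loss(pred, target, gamma=0.75, **kw):
    """Focal-Tversky: the Tversky loss raised to a power gamma."""
    return tversky_loss(pred, target, **kw) ** gamma

def sam_loss(pred, target):
    """SAM-style 20:1 mix of Focal and Dice."""
    return (20 * focal_loss(pred, target) + dice_loss(pred, target)) / 21

def tverskyfocal_dice_loss(pred, target):
    """7:3 mix of TverskyFocal and Dice."""
    return 0.7 * tversky_focal_loss(pred, target) + 0.3 * dice_loss(pred, target)
```

All three training configurations are then just different convex mixes over these per-mask terms.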

Recall:

Precision:

F1:

mIoU:


MobileSAM

Baseline is trained with the SAM loss (a 20:1 ratio of Focal Loss to Dice Loss).

Recall:

Precision:

F1:

mIoU:


Normal SAM

Baseline for SAM models is trained with a 20:1 ratio of Focal Loss to Dice Loss (from the paper).

TverskyFocal+Dice is trained with a 7:3 ratio of TverskyFocal Loss to Dice Loss.


All Models

Recall:

Precision:

F1:

mIoU:


ViT-B Baseline VS TverskyFocal + Dice

mIoU:

F1:


ViT-L Baseline VS TverskyFocal + Dice

mIoU:

F1:


ViT-H Baseline VS TverskyFocal + Dice


DeepLab

Recall:

Precision:

F1:

mIoU:


Test Results

Link


Thoughts

I think for “framing”: the original paper presents Tversky loss as a tool for class-imbalanced datasets, but I feel like we are using it more to reduce the model's bias toward over-predicting (which is related, but maybe not quite the same thing?).
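To make that intuition concrete, here is a tiny sketch (the 0.7/0.3 weights are illustrative, not our actual hyperparameters). The Tversky index splits the error into separate false-positive and false-negative terms, so up-weighting the false-positive term directly penalizes over-prediction, which is the knob we are really turning:

```python
import numpy as np

def tversky_loss(pred, target, fp_weight=0.7, fn_weight=0.3, eps=1e-6):
    # Tversky index with an explicit asymmetry between error types:
    # a larger fp_weight punishes "over-guessing" (predicting foreground
    # where there is none) harder than missing foreground pixels.
    tp = np.sum(pred * target)
    fp = np.sum(pred * (1 - target))
    fn = np.sum((1 - pred) * target)
    return 1 - (tp + eps) / (tp + fp_weight * fp + fn_weight * fn + eps)

target = np.array([0, 0, 1, 1, 1, 1, 0, 0], dtype=float)
over   = np.array([0, 1, 1, 1, 1, 1, 1, 0], dtype=float)  # 2 false positives

fp_heavy = tversky_loss(over, target, fp_weight=0.7, fn_weight=0.3)
fn_heavy = tversky_loss(over, target, fp_weight=0.3, fn_weight=0.7)
# Weighting false positives more heavily raises the loss of the
# over-predicting mask, discouraging over-guessing.
print(fp_heavy > fn_heavy)  # True
```

So the class-imbalance framing and the over-prediction framing use the same mechanism; which story fits depends on whether the asymmetry is weighted toward false negatives (recall on rare classes) or false positives (our case).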