Transfer learning for endoscopy disease detection and segmentation with Mask-RCNN benchmark architecture
Rezvy S., Zebin T., Braden B., Pang W., Taylor S., Gao XW.
Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
We proposed and implemented a disease detection and semantic segmentation pipeline using a modified Mask-RCNN architecture on the EDD2020 dataset. On the images provided for the phase-I test dataset, we achieved an average precision of 51.14% for 'BE' and 50% for 'HGD' and 'polyp'. However, the detection scores for 'suspicious' and 'cancer' were low. For phase-I, we achieved a Dice coefficient of 0.4562 and an F2 score of 0.4508. We noticed that the missed detections and misclassifications were due to the imbalance between classes. Hence, we added a selective and balanced augmentation stage to our architecture to provide more accurate detection and segmentation. After balancing the dataset, our detection score on phase-II images increased to 0.29 from our phase-I score of 0.24, and our semantic segmentation score improved to 0.62 from our phase-I score of 0.52.
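For readers unfamiliar with the two segmentation metrics quoted above, the following is a minimal sketch of how a Dice coefficient and an F2 score can be computed for a pair of binary masks. It uses the standard textbook definitions and is not the challenge's official evaluation code. The function name `dice_and_f2` and the smoothing term `eps` are assumptions introduced here for illustration only.

```python
import numpy as np


def dice_and_f2(pred, target, eps=1e-7):
    """Return (Dice, F2) for two binary masks of the same shape.

    Hypothetical helper for illustration; `eps` is an assumed smoothing
    term that avoids division by zero when both masks are empty.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    # Pixel-level confusion counts.
    tp = np.logical_and(pred, target).sum()    # predicted and present
    fp = np.logical_and(pred, ~target).sum()   # predicted but absent
    fn = np.logical_and(~pred, target).sum()   # present but missed

    # Dice = 2TP / (2TP + FP + FN)
    dice = (2 * tp + eps) / (2 * tp + fp + fn + eps)
    # F-beta with beta = 2: 5TP / (5TP + 4FN + FP), which weights
    # missed pixels (FN) more heavily than false alarms (FP).
    f2 = (5 * tp + eps) / (5 * tp + 4 * fn + fp + eps)
    return float(dice), float(f2)


if __name__ == "__main__":
    predicted = [[1, 1], [1, 0]]
    ground_truth = [[1, 0], [0, 0]]
    print(dice_and_f2(predicted, ground_truth))
```

Because F2 penalizes false negatives more than false positives, an over-segmenting prediction such as the one in the usage example scores higher on F2 than on Dice, while an under-segmenting one scores lower.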