Combining Datasets with Different Label Sets for Improved Nucleus Segmentation and Classification

Published and presented in Bioimaging (BIOSTEC), Rome, 2024

Deep neural networks (DNNs) that segment and classify cell nuclei can help pathologists diagnose diseases, including cancer, faster. DNNs generally become more accurate as more annotated data becomes available for training, but the published datasets with nucleus annotations differ in their class label sets. We present a method for training DNNs on multiple datasets whose class sets are related but distinct. Our solution uses a class hierarchy, so each dataset can be labeled with classes at any level of the hierarchy. Our results show that pre-training on a dataset with a different class set can improve segmentation and classification metrics on the test split’s class set. This strategy makes it possible to enlarge the training set, and combining multiple datasets with different class sets improves generalization to new datasets. The improvements are both qualitative and quantitative. The proposed technique can be adapted to different loss functions, DNN architectures, and application areas.
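To make the class-hierarchy idea concrete, here is a minimal sketch of one way a loss could accept labels at any level of a hierarchy. It is an illustration under stated assumptions, not necessarily the formulation used in the paper: the network is assumed to predict probabilities over the finest-grained (leaf) classes, and the probability of a coarser class is taken to be the sum over the leaves it covers. The class names, the `HIERARCHY` mapping, and the function names are all hypothetical.

```python
import numpy as np

# Hypothetical leaf classes and hierarchy (not taken from the paper).
# Each label, fine or coarse, maps to the leaf indices it covers.
LEAVES = ["lymphocyte", "macrophage", "tumor", "stroma"]
HIERARCHY = {
    "lymphocyte": [0],
    "macrophage": [1],
    "tumor": [2],
    "stroma": [3],
    "inflammatory": [0, 1],  # coarse label used by a dataset with fewer classes
}

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def label_probability(leaf_probs, label):
    """Probability of a label at any level: sum over the leaves it covers."""
    return leaf_probs[..., HIERARCHY[label]].sum(axis=-1)

def hierarchical_nll(logits, labels):
    """Mean negative log-likelihood where each sample may carry a fine or coarse label."""
    probs = softmax(logits)
    per_sample = [
        -np.log(label_probability(p, lab) + 1e-12)  # small epsilon avoids log(0)
        for p, lab in zip(probs, labels)
    ]
    return float(np.mean(per_sample))

# One sample labeled with a fine class, one with a coarse class.
logits = np.zeros((2, len(LEAVES)))
loss = hierarchical_nll(logits, ["tumor", "inflammatory"])
```

With this kind of aggregation, a dataset that only distinguishes coarse classes still contributes a training signal to the leaf-level outputs, which is the sense in which datasets with different label sets can be combined.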

Recommended citation: Parulekar A., Kanwat U., Gupta R., Chippa M., Jacob T., Bameta T., Rane S. and Sethi A. (2024). Combining Datasets with Different Label Sets for Improved Nucleus Segmentation and Classification. In Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 1: BIOIMAGING; ISBN 978-989-758-688-0, SciTePress, pages 281-288. DOI: 10.5220/0012380800003657
Download Paper | Download Slides