Patch Based Analysis With Machine Learning to Aid Breast Cancer Recurrence Prediction
Authors
Rose, Madison
Publisher
East Carolina University
Abstract
Since the introduction of whole slide scanners, machine learning has become a popular area of research in digital pathology. Many studies have applied machine learning to pathology tasks such as breast cancer diagnosis and metastasis detection. However, less research is available on applying machine learning to predict patient recurrence risk categories. Because H&E-stained images are routinely collected for diagnostic purposes, an image-based recurrence prediction method could make recurrence risk category assessment more accessible and less costly for breast cancer patients. In this study, patches were extracted from a dataset of 102 whole slide images to train a machine learning model that predicts slide-level breast cancer Oncotype DX risk category using only H&E-stained images, with no additional clinical data or region-of-interest annotations. Multiple combinations of patch size and patch quantity were tested. For each whole slide image, features were extracted from the patches and then aggregated into a bag of features for the case. These bags were then used to train a logistic regression model. The best-scoring model used 2,000 patches of size 256 x 256 pixels and achieved an accuracy of 0.628 ± 0.044 under 5-fold cross-validation across the entire dataset.
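The bookkeeping around the pipeline described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the author's implementation: the abstract does not state how patches were sampled, how per-patch features were aggregated, or whether the ± value is a standard deviation, so random grid sampling, mean pooling, and a population standard deviation are all assumptions here. The function names are hypothetical, and the feature extractor and the logistic regression classifier are deliberately left out.

```python
import random
import statistics


def sample_patch_origins(slide_w, slide_h, patch_size=256, n_patches=2000, seed=0):
    """Pick top-left corners of patches on a non-overlapping grid.

    Assumption: patches are drawn at random from a regular grid; the
    abstract does not describe the actual sampling strategy.
    """
    cols = slide_w // patch_size
    rows = slide_h // patch_size
    grid = [(c * patch_size, r * patch_size) for r in range(rows) for c in range(cols)]
    rng = random.Random(seed)
    # Small slides may offer fewer grid cells than the requested quantity.
    return rng.sample(grid, min(n_patches, len(grid)))


def aggregate_bag(patch_features):
    """Collapse per-patch feature vectors into one slide-level vector.

    Assumption: mean pooling; other aggregations are equally plausible.
    """
    n = len(patch_features)
    dim = len(patch_features[0])
    return [sum(vec[i] for vec in patch_features) / n for i in range(dim)]


def kfold_indices(n_items, k=5, seed=0):
    """Shuffle item indices and deal them into k disjoint test folds."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def summarize(fold_accuracies):
    """Return (mean, spread) of per-fold accuracies.

    Assumption: spread is the population standard deviation.
    """
    return statistics.mean(fold_accuracies), statistics.pstdev(fold_accuracies)
```

With 102 slides and 5 folds, `kfold_indices(102)` yields test folds of 21, 21, 20, 20, and 20 slides. A classifier would be fit on the pooled vectors of the remaining slides in each round, and `summarize` would turn the five per-fold accuracies into a "mean ± spread" figure of the kind reported above.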
