Deep learning cancer classification systems have the potential to improve cancer diagnosis. However, development of these computational approaches depends on prior annotation by a pathologist. This initial step relies on a manual, low-resolution, time-consuming process that is highly variable between observers. To address this issue, we developed a novel method, the H&E Molecular neural network (HEMnet). This two-step process first uses immunohistochemistry as an initial molecular label for cancer cells on an H&E image and then trains a cancer classifier on the overlapping clinical histopathological images. Using this molecular transfer method, we show that HEMnet accurately distinguishes colorectal cancer from normal tissue at high resolution without the need for an initial manual histopathologic evaluation. In a validation study using histopathology images from TCGA samples, HEMnet accurately estimates tumor purity. Overall, our method provides a path towards fully automated delineation of any type of tumor, so long as a cancer-oriented molecular stain is available for subsequent learning.
H&E Molecular neural network workflow overview. a, Matched p53 IHC-stained and H&E-stained whole-slide images (WSIs) derived from two adjacent tissue sections. b, Training was performed on paired normal and cancer slides (five pairs). Test slides were held back and remained unseen during training. c, Preprocessing through stain normalization and image registration to account for technical variations in slide preparation. d, Molecular labels were transferred from p53 to H&E images. After label transfer, each image was tiled to generate thousands of small samples (224×224 pixels) to train a CNN. e, Application of HEMnet to predict cancer from new clinical H&E images.
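The tiling and label-transfer step in panel d can be illustrated with a minimal sketch. It is not the HEMnet implementation. It assumes the H&E image and a boolean stain-positive mask derived from the p53 image are already registered to the same pixel grid. The function name `tile_with_labels`, the mask input, the 0.5 cut-off, and the reading of label 1 as "cancer" are all hypothetical choices made for illustration; only the 224×224 tile size comes from the caption.

```python
import numpy as np

TILE = 224  # tile edge length in pixels, as given in the caption


def tile_with_labels(he_image, stain_mask, tile=TILE, min_fraction=0.5):
    """Cut a registered H&E image into non-overlapping tiles and label each.

    he_image:     (H, W, C) array holding the H&E image.
    stain_mask:   (H, W) boolean array, True where the registered IHC image
                  is treated as stain-positive (hypothetical input).
    min_fraction: assumed cut-off on the stain-positive pixel fraction above
                  which a tile gets label 1 (illustrative value only).
    """
    if he_image.shape[:2] != stain_mask.shape:
        raise ValueError("image and mask must share the same height and width")

    rows = he_image.shape[0] // tile  # partial tiles at the edges are dropped
    cols = he_image.shape[1] // tile
    tiles, labels = [], []
    for r in range(rows):
        for c in range(cols):
            y, x = r * tile, c * tile
            tiles.append(he_image[y:y + tile, x:x + tile])
            positive_fraction = stain_mask[y:y + tile, x:x + tile].mean()
            labels.append(int(positive_fraction >= min_fraction))

    if not tiles:  # image smaller than one tile
        empty = np.empty((0, tile, tile, he_image.shape[2]), dtype=he_image.dtype)
        return empty, np.empty(0, dtype=int)
    return np.stack(tiles), np.array(labels)


# Synthetic example: a 448x672 image yields a 2x3 grid of tiles.
he = np.zeros((448, 672, 3), dtype=np.uint8)
mask = np.zeros((448, 672), dtype=bool)
mask[:224, :224] = True  # only the top-left tile is stain-positive
tiles, labels = tile_with_labels(he, mask)
print(tiles.shape, labels.tolist())
```

The resulting tile and label arrays are in the shape a standard CNN training loop expects. In practice, a pipeline would likely also discard background tiles and may use separate thresholds for confident positive and negative labels; those choices are outside what this sketch covers.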