Survey of Neural Networks in Digital Pathology and Pathology Workflow

Christianne Dinsmore¹
¹ Department of Computing and Digital Media, DePaul University, Chicago, IL

Abstract

Digital pathology has already transformed the anatomic pathology laboratory. Digital slides now allow pathologists to access specimens and related clinical data from anywhere in the world, which permits more efficient workflow, enhanced communication between physicians, widely accessible educational opportunities, and improved data management for research applications. Interpreting pathology images requires data filtering, estimation, intuition, quantification, and hypothesis testing. The difficulty of this analysis is compounded by an increasing need to reproduce the diagnosis and interpretation of the stained tissue. Pathologists are able to view raw images and associate them with their knowledge of anatomy, cell theory, and oncology. In this survey, we explore the use of neural networks in digital pathology and their potential use in the laboratory workflow to improve turnaround time for specimen processing and diagnosis.
1 Introduction

Anatomic pathology is the branch of medicine that diagnoses disease by examining bodily organs and tissues. The process begins with the receipt of a pathology specimen, which a pathologist inspects to obtain diagnostic information before the specimen is processed for microscopic examination. The traditional pathology workflow requires manual acquisition, transportation, and storage of glass slides. Whole-slide digital imaging has emerged as a way to streamline this workflow and to support communication, education, and research across pathology laboratories.

The most popular staining method for tissue is hematoxylin and eosin, commonly known as H&E. Hematoxylin colors the nuclei of cells blue, while the eosin counterstain colors proteins, cytoplasm, and other cell structures pink. In a clinical laboratory, all tissue is initially stained with H&E. If the pathologist requires additional information to distinguish between two similar types of cancer, special and/or advanced stains may be ordered.

Figure 1: H&E stained lung tissue (1)

Slide volumes differ from laboratory to laboratory depending on specialty; however, data from several clinical laboratories show that the average ratio of H&E slides to advanced/special stain slides is 7.5:1.

Table 1: Slide volume for Ventana Medical Systems, Inc. customers (2)

Laboratory                        Annual H&E Volume    Annual Advanced/Special Volume
Dynacare                          200,000              52,200
Hackensack Medical Center         156,000              30,550
Alegent Health                    130,000              19,750
Good Samaritan, West Islip        204,000              22,000
Tricore Reference Laboratories    300,000              37,000

Quantification and qualification are the foundation of microscopic pathological diagnosis. With the introduction of digital pathology and whole slide image scanning, analytical applications are being developed to enhance the slide image, perform cell counts, estimate
protein expression in immunohistochemical assays, and carry out other tasks that quantify, classify, and analyze the specimen to aid in diagnosis (3). For example, the College of American Pathologists requires a count of lymph nodes, the dimensions of the specimen, and the dimensions of the tumor as part of the pathological report for lung tissue specimens (4). Gathering this information, however, requires tedious and time-consuming work from the pathologist.

What if the process of classification could be automated? What if there were a method of analyzing H&E stained slides that correctly alerted the pathologist to order advanced stains? What if automated image analysis could reduce the pathology workload, leaving pathologists more time to focus on more difficult morphologies and research? The Food and Drug Administration (FDA) is heavily involved with the pathology community to ensure that, based on scientific evidence, the probable benefits to a patient's health from the use of such devices outweigh any possible risks and that the devices will provide clinically significant results (5). Research in the use of neural networks and image processing is moving toward algorithms that can assist pathologists by providing clinically relevant classification of malignant and normal tissue.

2 Image classification

Neural networks are central to research on disease classification. In 2001, researchers at the National Laboratory for Novel Software Technology reported that an artificial neural network ensemble was able to identify lung cancer cells in images of needle biopsies (6). The neural algorithm used is the fast adaptive neural network classifier (FANNC), which draws on adaptive resonance theory and field theory. The algorithm requires only one-pass learning and is capable of incremental learning: when new instances are fed to the network, it does not retrain on the entire training set but simply learns the knowledge encoded in those instances and adjusts the network as necessary.
The FANNC network is composed of four layers of units. Gaussian weights connect the input units to the second-layer units, and the units use a sigmoid activation function. The second-layer units classify the inputs internally and the third-layer units classify the outputs internally. The associations between these two layers are used to implement supervised learning. Except for the connections between the first- and second-layer units, all of the connections are bidirectional and allow for feedback (7).

Figure 2: The architecture of FANNC (7)
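As a rough illustration of how an instance might flow from the input units through the Gaussian weights to the second- and third-layer units, the following Python sketch may help. It is not the implementation from (7): the Gaussian form exp(-((a - center) / width)^2), the subtraction of a bias before the sigmoid, every function name, and every numeric value are assumptions chosen for illustration. The competition among second-layer units, the fourth layer, and the feedback connections are omitted.

```python
import math


def sigmoid(u):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-u))


def gaussian_weight(a_i, center, width):
    """Response of one Gaussian weight (assumed form).

    Equals 1.0 when the input matches the responsive center and
    falls toward 0.0 as the input moves away, at a rate set by
    the responsive characteristic width.
    """
    return math.exp(-((a_i - center) / width) ** 2)


def second_layer_activation(instance, centers, widths, bias):
    """Sum the Gaussian responses over all input units, subtract the bias,
    and squash with the sigmoid (assumed form)."""
    total = sum(
        gaussian_weight(a, c, w) for a, c, w in zip(instance, centers, widths)
    )
    return sigmoid(total - bias)


def third_layer_activation(second_layer_values, weights, bias):
    """Weighted sum of second-layer activations through feed-forward
    weights, minus a bias, then the sigmoid (assumed form)."""
    total = sum(b * v for b, v in zip(second_layer_values, weights))
    return sigmoid(total - bias)


# Hypothetical second-layer unit tuned to inputs near (0.2, 0.8).
centers = [0.2, 0.8]
widths = [0.1, 0.1]

near = second_layer_activation([0.2, 0.8], centers, widths, bias=1.0)
far = second_layer_activation([0.9, 0.1], centers, widths, bias=1.0)
print(f"instance at the center: {near:.3f}")
print(f"instance far away:      {far:.3f}")
```

An instance that sits on the responsive centers drives the unit well above an instance that lies far from them, which is how knowledge can be stored in the center and width of each weight rather than in a single scalar.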
Assuming that the instances presented to the input units are $A_k = (a_1^k, a_2^k, \ldots, a_n^k)$, where $k$ is the index of the instance and $n$ is the number of input units, the value input to the second-layer unit $j$ from the first-layer unit $i$ is

Equation 1: Value input from first- to second-layer units (7)

$$ \mathit{bIn}_{ij} = \exp\left( -\left( \frac{a_i^k - \theta_{ij}}{\alpha_{ij}} \right)^2 \right) $$

where $\theta_{ij}$ is the responsive center and $\alpha_{ij}$ is the responsive characteristic width of the Gaussian weight connecting units $i$ and $j$. The dynamic property of a Gaussian weight is determined by the responsive center and the responsive characteristic width, which means that learned knowledge can be encoded in the weight through those two parameters. The second-layer activation function is

Equation 2: Second-layer activation function (7)

$$ b_j = f\left( \sum_{i=1}^{n} \mathit{bIn}_{ij} - \theta_j \right) $$

where $\theta_j$ is the bias of unit $j$ and $f$ is the sigmoid function. A further step, called leakage competition, is carried out among all of the second-layer units. The outputs of the winners are transferred to the third-layer units. The third-layer activation function is

Equation 3: Third-layer activation function (7)

$$ c_h = f\left( \sum_{j} b_j v_{jh} - \theta_h \right) $$

where $b_j$ is the activation value of the second-layer unit $j$, which must be a winner in the competition and be connected to unit $h$, $v_{jh}$ is the feed-forward weight connecting unit $j$ to unit $h$, $\theta_h$ is the bias of unit $h$, and $f$ is the sigmoid function. The error between the real network output and the expected output is computed using the average squared error (7).

By applying the FANNC algorithm, the National Laboratory for Novel Software Technology was able to design a procedure that performs automatic pathology diagnosis on needle biopsy images with a high rate of overall identification and a low rate of false identification for adenocarcinoma, squamous cell carcinoma, small cell carcinoma, and large cell carcinoma (6).

3 Laboratory workflow

The histology workflow begins again when the pathologist determines that there are cells of interest but the specific disease cannot be determined from a hematoxylin and eosin stain.
The pathologist submits a requisition for additional stains, such as Helicobacter pylori, Synaptophysin, and Glypican 3 (2). These stains are specifically designed to reveal the presence of highly specific antibodies, proteins, and cellular structures in paraffin-embedded
tissue. The requisition is sent back to the accessioning personnel, who enter the order into the laboratory information system (LIS). A histologist must then retrieve the tissue block from storage and cut additional slides for staining. Depending on the laboratory, the storage location may be on-site or at another facility. Some laboratories are not equipped to perform special stains, so the slides must be sent by courier to a reference laboratory to complete the staining process.

Lapses in patient safety have been linked to issues in histology workflow. Workflow analysis must be performed to understand the impact of quality issues on patient outcomes. In hematology, pressure to improve productivity and reduce costs led St. Joseph's Medical Center in Towson, Maryland, to purchase a Beckman Coulter LH 1500 hematology system, which automates sample sorting, loading and unloading of cassettes, rerun and reflex testing, sample storage, and sample tracking. Implementing the automated system reduced the slide review rate by 23% (8). The Ventana Symphony automated H&E staining platform helped Oklahoma University reduce the average time from gross station to assembled case by 12% and increase mean quarterly productivity by 8.5% (9). Dr. Lewis Hassell found that mobile viewing of digital slides on an iPad reduces slide evaluation time, with only 3% of cases showing clinically significant discrepancies in diagnosis (10). These studies report positive results from automating parts of the laboratory workflow.

4 The future

A predictive algorithm that could process raw whole-slide images of H&E stained tissue, determine whether an image contains features that would lead a pathologist to order additional stains, and perform the requisition process automatically would result in significant time savings for the pathology laboratory.
The proposed histology workflow, which combines automation in digital pathology and in the laboratory workflow, involves automated H&E processing, reflex ordering, and archival of blocks and slides. When a specimen is received by the laboratory, the tissue is processed normally and H&E slides are cut from the block. Each microtome workstation would have a method of preparing uncut blocks and slides for storage by an automated storage system. The slides would be loaded onto the Ventana Symphony platform for H&E staining. Upon completion of the stain run, a whole slide imaging system would create an image of each slide.

Image analysis software would then review each slide image and classify it either as likely to require additional staining or as not requiring additional analysis. Images that cannot be classified by the software would be sent to a pathologist for review. Upon review of the unknown slide, the pathologist would either order more slides or make a diagnosis, thereby providing the analysis software with further training examples.

When a slide image is classified as requiring additional slides, a reflex order request would be sent to the laboratory information system. The system would request retrieval of the archived block and route the block to the microtomy technician, who would cut the additional slides for staining. The additional slides would be processed on the advanced or special staining platforms and scanned to create a whole slide image. The image would be routed to the assigned pathologist for review and the slide would be archived using the automated system. If the pathologist wished to view the physical slide under the microscope, he or she would make a request and the slide would be retrieved and routed.
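The three-way triage described above (reflex order, no additional staining, or pathologist review) could be sketched in Python as follows. This is only an illustration under stated assumptions: the idea that the analysis software emits a single probability score, the threshold values 0.10 and 0.90, and all names (`Route`, `route_slide`, `TrainingExample`, `record_pathologist_decision`) are hypothetical and are not taken from any cited system.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    REFLEX_ORDER = "send reflex order request to the LIS"
    NO_ADDITIONAL_STAIN = "no additional staining required"
    PATHOLOGIST_REVIEW = "unclassified; send image to a pathologist"


def route_slide(p_additional_stain, low=0.10, high=0.90):
    """Route one H&E slide image based on a classifier score.

    p_additional_stain is an assumed probability that the slide will
    need advanced/special stains. Scores between the two hypothetical
    thresholds are treated as 'cannot be classified'.
    """
    if not 0.0 <= p_additional_stain <= 1.0:
        raise ValueError("score must be a probability in [0, 1]")
    if p_additional_stain >= high:
        return Route.REFLEX_ORDER
    if p_additional_stain <= low:
        return Route.NO_ADDITIONAL_STAIN
    return Route.PATHOLOGIST_REVIEW


@dataclass
class TrainingExample:
    slide_id: str
    needed_additional_stain: bool


def record_pathologist_decision(training_set, slide_id, ordered_more_slides):
    """Store the pathologist's decision on an unclassified slide so it can
    later serve as a further training example."""
    example = TrainingExample(slide_id, ordered_more_slides)
    training_set.append(example)
    return example


if __name__ == "__main__":
    training_set = []
    # Hypothetical slide identifiers and scores.
    for slide_id, score in [("S-001", 0.97), ("S-002", 0.03), ("S-003", 0.55)]:
        route = route_slide(score)
        print(slide_id, "->", route.value)
        if route is Route.PATHOLOGIST_REVIEW:
            # Placeholder for the pathologist's actual decision.
            record_pathologist_decision(
                training_set, slide_id, ordered_more_slides=True
            )
```

Keeping an explicit "unclassified" band between the two thresholds is one simple way to make sure that uncertain images reach a pathologist instead of being forced into either automated path; where the thresholds should sit would have to be determined empirically.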
A process similar to the cytology workflow may be used to ensure that a random sample of H&E slides classified as normal and as non-normal is reviewed, adding a further layer of safety and quality control to the automated system. The Ventana VANTAGE workflow solution could track whether or not the special stain was actually needed, and that record could be fed back to the neural network as feedback to the image processing algorithms on the correctness of each classification. Combined with an intelligent specimen storage model, the proposed system has the potential to significantly reduce the staffing needed for retrieval and re-processing of tissue.

Figure 3: Proposed neural-network driven workflow (flowchart running from specimen receipt and accessioning in the LIS through tissue processing, slide creation, H&E staining, image capture and image analysis, then either slide storage when no additional stains are needed or block retrieval and advanced/special staining when they are, ending with image review and diagnosis)

The introduction of automation, together with neural-network driven analysis software able to trigger reflex orders, would significantly decrease turnaround time for most specimens. Further work must be performed to determine whether such a system can be designed and at what cost. The combination of digital pathology, workflow management, and neural networks has the potential to provide a whole new level of diagnostic excellence.
References

1. Wikipedia Contributors. H&E stain. Wikipedia: The Free Encyclopedia. [Online] November 4, 2011. [Cited: November 6, 2011.] http://en.wikipedia.org/wiki/h%26e_stain.
2. Ventana Medical Systems, Inc. VentanaForce. Salesforce. [Online] November 2011. [Cited: November 7, 2011.] Available to internal Ventana employees only.
3. Benefits of Digital Pathology. Tecotzky, Raymond. 8, s.l. : Advance: For Administrators of the Laboratory, 2008, Vol. 17, p. 40.
4. Butnor, Kelly J., et al. Protocol for the Examination of Specimens from Patients with Primary Non-Small Cell Carcinoma, Small Cell Carcinoma, or Carcinoid Tumor of the Lung. [Online] February 1, 2011. [Cited: November 6, 2011.] http://www.cap.org/apps/docs/committees/cancer/cancer_protocols/2011/lung_11protocol.pdf. Lung 3.1.0.0.
5. Food and Drug Administration. CFR - Code of Federal Regulations Title 21, Volume 8. Part 860 -- Medical Device Classification Procedures. [Online] April 1, 2011. [Cited: November 6, 2011.] http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfcfr/cfrsearch.cfm?fr=860.7. 21CFR860.7.
6. Lung cancer cell identification based on artificial neural network ensembles. Zhou, Zhi Hua, et al. 1, Nanjing : Artificial Intelligence in Medicine, 2001, Vol. 24. S0933-3657(01)00094-X.
7. FANNC: A Fast Neural Network Classifier. Zhou, Z.-H., et al. 1, Nanjing : Knowledge and Information Systems, 2000, Vol. 2. doi:10.1007/s101150050006.
8. Workflow improvement and impact of the new Beckman Coulter LH 1500 high throughput automated hematology workcell. La Porta, AD, Bowden, AS and Barr, S. 2, Towson : Laboratory Hematology: official publication of the International Society for Laboratory Hematology, 2004, Vol. 10. doi:10.1532/LH96.04022.
9. The combined positive impact of Lean methodology and Ventana Symphony autostainer on histology lab workflow. Yip, Clinton, et al. 2, Oklahoma : BMC Clinical Pathology, 2010, Vol. 10. doi:10.1186/1472-6890-10-2.
10. Use of mobile high-resolution device for remote frozen section evaluation of whole slide images. Hassell, Lewis A, Fung, Kar Ming and Ramey, Joel. 41, Oklahoma : J Pathol Inform, 2011, Vol. 2. doi:10.4103/2153-3539.84276.
11. Wikipedia Contributors. Workflow. Wikipedia: The Free Encyclopedia. [Online] October 21, 2011. [Cited: November 7, 2011.] http://en.wikipedia.org/wiki/workflow.
12. Computer-assisted image classification: use of neural networks in anatomic pathology. Becker, L. Robert. 2-3, Washington, DC : Cancer Letters, 1993, Vol. 77. doi:10.1016/0304-3835(94)90093-0.
13. Toward automated workflow analysis and visualization in clinical environments. Vankipuram, Mithra, et al. 3, Phoenix : Journal of Biomedical Informatics, 2009, Vol. 44. doi:10.1016/j.jbi.2010.05.015.
14. Hidden Markov Models. Eddy, Sean R. Montpellier : Current Opinion in Structural Biology, 2003, Vol. 6.
15. Smyth, Padhraic. Hidden Markov Models and Neural Networks for Fault Detection in Dynamic Systems. [Presentation] Pasadena : California Institute of Technology, 1994.