FOCUS ON: Neuroradiology/Head and Neck Imaging
Review
Machine Learning in Neurooncology Imaging: From Study Request to Diagnosis and Treatment
OBJECTIVE. Machine learning has potential to play a key role across a variety of medical imaging applications. This review seeks to elucidate the ways in which machine learning can aid and enhance diagnosis, treatment, and follow-up in neurooncology.
CONCLUSION. Given the rapid pace of development in machine learning over the past several years, a basic proficiency in the key tenets and use cases of the field is critical to assessing the potential opportunities and challenges of this exciting new technology.
Keywords: artificial intelligence, machine learning, neuroimaging, neurooncology
Interest in machine learning has grown substantially over the past 5 years, particularly in the realm of medical imaging. This article focuses on the role machine learning can play in the journey from initial diagnosis through treatment and follow-up in neurooncology. Imaging, clinical history, and a detailed physical examination are critical for high-quality diagnosis and treatment. Future applications range from extending the structural and physiologic data we can infer from images to streamlining complex noninterpretive processes that impact patient satisfaction and care. This article explores both the current state of the art and near-future applications of machine learning. A discussion of the technical concepts involved in creating machine learning algorithms is presented in detail in a prior AJR article [1].
Selecting the appropriate imaging protocol is a common quality assurance problem in radiology. Inappropriate choice of protocol contributes to health care cost and waste. This time-consuming process relies on the radiologist's knowledge of imaging protocols and attention to the clinician's specific requests, which often requires reading through the medical record, reviewing prior imaging studies, or both. Although much of the interest in machine learning has focused on interpretation of pixel data, algorithms can also be applied to gaining knowledge from text using a set of techniques called natural language processing (NLP). NLP based on narrative clinical information from the electronic medical record has been used to identify the correct imaging study to order (decision support) as well as automating choice of examination protocol and prioritization. Recent studies show that machine learning algorithms accurately acquire knowledge from text and use ordering information like study indications to determine protocols for brain and body examinations, including the need for a contrast agent [2–6].
Scheduling is a persistent source of patient and ordering provider dissatisfaction. Some algorithms have shown promise in routing patients to imaging site on the basis of patient location and setting expectations by predicting examination and wait times [7, 8]. For example, patients referred for newly diagnosed tumors could be scheduled to be imaged at the site of neurosurgical consultation with a neuronavigation protocol that may be more extensive than a routine follow-up examination. Patients with long-term stability could be scanned at locations that are more convenient to them with a tailored protocol, which may be shorter and not require gadolinium-based contrast medium.
Machine learning methods can be used to improve quality at various stages of image acquisition and reconstruction. Differences in vendors, field strength, sequence parameters, and acquisition orientation can lead to image quality heterogeneity. Prescribing repeatable coverage from scan to scan, ensuring that sequence parameters are standardized, and assessing image quality after acquisition are areas of active investigation with some commercially available products [9–12]. Furthermore, the era of low-dose CT and fast MRI has heralded a boom of machine learning applications to acquire high-quality images in less time. Using advanced machine learning techniques such as sparse dictionary learning and convolutional neural networks (CNNs) for noise reduction in low-dose CT can reduce dose substantially [13–16]. Similarly, these techniques are being applied in MRI reconstruction from highly undersampled k-space [17–19]. Machine learning has been used to simulate higher or lower field strength after acquisition of paired data, generate superresolution images, improve the signal-to-noise ratio of perfusion imaging, and reduce scan time for lengthy acquisitions such as advanced diffusion imaging [20–25].
Machine learning can also be applied to decrease the amount of time needed to perform complex image reconstructions, bringing these techniques closer to clinical applicability [26]. Additionally, machine learning has been explored for modality conversion or creating a synthetic CT image from a conventional MRI study, which could then be used for surgical or radiotherapy planning [27–29]. Because multiparametric imaging is needed to arrive at the appropriate diagnosis and help guide therapy in many complex neurooncologic cases, the field would greatly benefit from efforts to improve image quality, reduce scan time, and potentially eliminate the need for redundant examinations.
Nearly every radiology practice prioritizes worklists by a number of factors including the time when a study was obtained, patient location, ordering provider concern (urgent vs routine), or even advanced criteria like same-day appointments. Recent work has used machine learning to improve triaging by identifying critical findings (e.g., hemorrhage on head CT) within the image data [30–33]. An important component of triage is early notification of the treatment team. One commercial product recently approved by the U.S. Food and Drug Administration (FDA) includes an algorithm for detecting potential stroke when evaluating CT images [34]. These algorithms are tireless and can evaluate studies regardless of patient location or class, increasing the likelihood that acute complications like hemorrhage or infection in patients with brain tumors will be detected quickly.
Consistent display of imaging sequences (a hanging protocol) regardless of vendor or protocol is a constant headache for radiologists and is a particular challenge with the myriad brain tumor follow-up MRI studies. Identifying pulse sequences from DICOM metadata (e.g., TR/TE, series description) relies on a set of hand-coded rules built individually at each institution. These rules must account for variation between sequence names applied by different vendors, repeated series, and manual editing [35]. There is tremendous potential in training models to use not only metadata but also pixel data to accurately identify modality, body part, image plane, and pulse sequence to drive hanging protocols. Studies have shown that high-quality hanging protocols have a direct impact on radiologist productivity. In neurooncology, assessing treatment response requires review of multiple pulse sequences over serial examinations, so high-quality hanging protocols would improve radiologist satisfaction and likely quality of care.
Radiologists understand the value of detailed patient context during image interpretation and that reviewing the electronic medical record is a time-consuming endeavor. Machine learning is being used to generate context-based summaries of the electronic health record [36, 37]. In neurooncology, medical histories often are complex and span long periods of time with several providers. As a result, much of the effort spent trying to put together the full clinical picture could be mitigated by electronic health record mining to present the interpreting radiologist with the most current and relevant information.
Computer-aided detection (CAD) was first developed to help radiologists interpret radiographic images, including mammograms and chest radiographs [38, 39]. The prevalent CAD paradigm highlights suspicious findings during image interpretation. These suspicious findings are identified using hand-crafted features derived from human knowledge of disease appearance. The difficulty of generalizing and scaling these hand-crafted features across modalities or to rare diseases is a major limitation of traditional CAD. Advances in machine learning, including deep CNNs, no longer rely on hand-crafted features and instead identify features for a particular diagnosis during the training process without human intervention. CNNs show promise in both lesion detection and segmentation [40].
Although lesion detection identifies the location of a potential abnormality within images, lesion segmentation marks the individual pixels containing an abnormality (a segmentation mask). A segmentation mask can be used to calculate lesion volume and quantify signal characteristics, edge morphology, and texture. In neurooncology, radiologists measure lesion size over time to assess treatment response. These measurements rely on manual segmentation, which is time-consuming and tedious. Because of the time and complexity of this task, radiologists often use approximations like largest single diameter or 2D measurements rather than tumor volume. In addition to saving time, automated segmentation avoids human subjectivity and produces reliably reproducible measurements. Whereas many traditional machine learning techniques have been used for segmentation with mixed results, recent deep learning–based algorithms have pushed the state of the art to near-human performance on a variety of benchmarks. Among the most commonly cited applications is the automated segmentation of various brain tumor components including regions of enhancement, edema, and necrosis. Such a tool would greatly impact a wide range of indications including diagnosis, surgical guidance, radiation therapy planning, and follow-up. Other well-studied segmentation tasks in neuroradiology with potential clinical utility include tools for detection and quantification of normal gray and white matter, microhemorrhage, and infarct [41–51].
The success of machine learning research is directly linked to large, high-quality datasets. Several open-source neuroradiology datasets are available, most arising from the Medical Image Computing and Computer Assisted Intervention Society. The Brain Tumor Image Segmentation dataset and the Cancer Imaging Archive/Cancer Genome Atlas provide brain tumor data specifically [52, 53]. Although publicly available datasets of labeled brain tumor images have facilitated the development of several promising applications, a number of associated challenges persist. These large datasets lack consistent labeling strategies, especially for fine details. Further, the included images are often older, acquired at lower field strengths, and of lower resolution. Because these images typically depict newly diagnosed lesions, an area for continued growth is the evaluation of real-world complexity like resection cavities and treatment effects. Machine learning algorithms are sensitive to the data used for training; although studies of their efficacy have shown promise, their performance can still be improved.
Radiomics, a process that converts medical images into mineable high-dimensional data for diagnosis, classification, and outcome prediction, has broadened the study of tumors beyond established imaging features and metrics [54, 55]. The 2016 World Health Organization classification of CNS tumors underscores the importance of genetic information for brain tumor diagnosis and treatment [56]. Correspondingly, the study of the relationship between imaging features and genetic data (radiogenomics) has seen a sharp rise. This increase is particularly evident in brain tumor research because of the large amount of imaging data collected on individual tumors. Identification of imaging phenotypes that correlate with genetic markers such as isocitrate dehydrogenase (IDH) mutation, 1p/19q codeletion, ATRX, and telomerase reverse transcriptase is highly relevant because these markers have been strongly associated with prognosis. The utility of CNNs in radiogenomics was first described in 2015 by Pan et al. [57], who used anatomic MR images to predict tumor grade. More recently, machine learning has been used to predict IDH mutation, 1p/19q codeletion, and methylguanine-DNA methyltransferase methylation, with the most promising results to date achieving prediction accuracies of 83–94% [58–62]. Further applications include survival prediction integrating anatomic imaging with clinical and therapeutic response assessment data.
One limitation of radiogenomics is that classification can be impeded by overlap of certain imaging features among different genetic alterations and the spatial heterogeneity of features that can change during the course of treatment. Recent studies have shown that genetic alterations caused by treatment result in intratumor heterogeneity; the recurrent portion of the tumor can have an entirely different genetic make-up than the original tumor site and typically manifests in a much more aggressive phenotype [63]. A strategy for incorporating multimodal imaging and machine learning to identify such differences, both spatially and in terms of severity, can help surgeons sample the most malignant tumor region and resect infiltrative tumor beyond the contrast-enhancing lesion. Such a strategy will also facilitate prediction of subsequent outcome measures at different points of care. To date, implementations in brain tumor imaging have focused primarily on anatomic and DW images and have not taken advantage of methodologic advances from other fields that can incorporate diverse datasets from multiple time points. Many new and flexible machine learning techniques including long short-term memory recurrent neural networks have been optimized for this task and show tremendous promise for incorporating time-series data for analysis and prediction [64]. Integrating other physiologic and metabolic imaging results from MRI, PET, or both will also be critical for accurate diagnosis.
As machine learning becomes more prevalent and data sources become more complex, the need for consistency in acquisition is magnified. Radiologists will need to increase attention to standardization of imaging and data collection methods. In the meantime, we need to collate datasets from multiple institutions and accurately represent the range of acquisition heterogeneity.
Recently, the FDA recognized that the traditional approach to medical device regulation does not fit well with the iterative nature of digital health application development [65]. The agency outlined plans to address this gap, including draft guidance on software designed for patient and clinical decision support (CDS) [66]. The draft guidance aims to identify types of decision support that are not considered medical devices and thus are not subject to regulation. Examples of exempt CDS include software matching patient-specific information with current practice and treatment guidelines and software that identifies drug-drug or drug-allergy interactions. In the draft guidance, the FDA explicitly states that applications to process or analyze medical images remain medical devices and are therefore subject to regulation under existing legislation.
Although the draft guidance does not mention machine learning directly, it states that to avoid regulation as a medical device “the CDS function must be intended to enable health care professionals to independently review the basis for the recommendations presented by the software so that they do not rely primarily on such recommendations, but rather on their own judgment, to make clinical decisions for individual patients” [66]. Even though these draft guidances are subject to change, machine learning algorithms relevant to neurooncology are unlikely to be exempt from review as medical devices in the foreseeable future.
In its response letter to the draft guidance, the American Medical Informatics Association called for the FDA to host a public forum to discuss standards for transparency and performance of decision support software in machine learning–based environments [67]. Open challenges to be addressed include developing quality assurance processes and validation paradigms as well as determining responsibility for algorithm mistakes. This space undoubtedly will continue to evolve.
Application of machine learning provides radiologists with tools to increase consistency and productivity and to uncover new diagnostic possibilities. Because of a complex interplay of factors, including federal regulation of algorithms that provide diagnosis, radiologists are likely to see the impact of machine learning in areas such as acquisition and workflow enhancements before general diagnostic support. The modern radiologist must therefore have a functional understanding of machine learning concepts and play an active role in developing and implementing these techniques.
