Original Research
FOCUS ON: Genitourinary Imaging
April 11, 2019

Unenhanced CT Texture Analysis of Clear Cell Renal Cell Carcinomas: A Machine Learning–Based Study for Predicting Histopathologic Nuclear Grade

Abstract

OBJECTIVE. The purpose of this study is to investigate the predictive performance of machine learning (ML)–based unenhanced CT texture analysis in distinguishing low (grades I and II) and high (grades III and IV) nuclear grade clear cell renal cell carcinomas (RCCs).
MATERIALS AND METHODS. For this retrospective study, 81 patients with clear cell RCC (56 high and 25 low nuclear grade) were included from a public database. Using 2D manual segmentation, 744 texture features were extracted from unenhanced CT images. Dimension reduction was done in three consecutive steps: reproducibility analysis by two radiologists, collinearity analysis, and feature selection. Models were created using artificial neural network (ANN) and binary logistic regression, with and without synthetic minority oversampling technique (SMOTE), and were validated using 10-fold cross-validation. The reference standard was histopathologic nuclear grade (low vs high).
RESULTS. Dimension reduction steps yielded five texture features for the ANN and six for the logistic regression algorithm. None of the clinical variables was selected. ANN alone and ANN with SMOTE correctly classified 81.5% and 70.5%, respectively, of clear cell RCCs, with AUC values of 0.714 and 0.702, respectively. The logistic regression algorithm alone and with SMOTE correctly classified 75.3% and 62.5%, respectively, of the tumors, with AUC values of 0.656 and 0.666, respectively. The ANN performed better than the logistic regression (p < 0.05). No statistically significant difference was present between the model performances created with and without SMOTE (p > 0.05).
CONCLUSION. ML-based unenhanced CT texture analysis using ANN can be a promising noninvasive method in predicting the nuclear grade of clear cell RCCs.
Histopathologic nuclear grade is an independent predictor of tumor aggressiveness and prognosis for renal cell carcinomas (RCCs) [1]. Although various nuclear grading systems have been defined in the past, the most widely used one has been the Fuhrman nuclear grading system [2]. This system consists of four grades: grades I and II are low grade with a good prognosis, and grades III and IV are high grade with a poor prognosis [3, 4]. Pretreatment assessment of tumor behavior has gained importance because of recently emerging nephron-sparing surgery, ablative therapies, and even active surveillance [5–7]. Because percutaneous tumor biopsy suffers from sampling bias, pretreatment assessment of the biologic aggressiveness is histopathologically problematic for RCCs [8, 9]. Hence, many studies have evaluated the potential value of medical imaging in predicting the nuclear grade of RCCs [10–14].
Texture analysis is a quantitative method for measuring repetitive morphologic patterns at the voxel or pixel level that are beyond human perception [15–17]. Numerous features can be produced by this method, making the technique rather high dimensional. Recently, texture analysis has been a significant area of interest within the field of radiomics because a growing body of literature suggests that the method can be used to predict histopathologic and genomic characteristics of the lesions or tumors for better risk stratification (e.g., tumor staging, nuclear grading, response to treatment, and survival) [14, 15, 18, 19]. Because nuclear grading is based on the assessment of uniformity or heterogeneity of the tumors microscopically [3], we think that these microscopic heterogeneity differences might affect the tumors macroscopically. Hence, texture analysis might be a plausible option in noninvasive assessment of macroscopic tumor heterogeneity for predicting nuclear grade. To the best of our knowledge, only a few studies have used CT texture analysis to predict the nuclear grade of clear cell RCCs [13, 14], which is the predominant type of all RCCs [20–23]. However, all of these studies were limited to the analysis of contrast-enhanced CT images (corticomedullary or nephrographic phase images). Because the nuclear grade is not directly associated with the assessment of tumor vascularity microscopically, it would be logical to use unenhanced CT for this purpose. To our knowledge, no previous research has investigated the value of unenhanced CT texture analysis for predicting the nuclear grade of clear cell RCCs, which might also provide information regarding the tumor heterogeneity or repetitive patterns that are beyond the perception of the human eye.
In this study, we aimed to evaluate the predictive performance of machine learning (ML)–based unenhanced CT texture analysis in distinguishing low and high nuclear grade clear cell RCCs.

Materials and Methods

Ethics

Ethical approval and informed consent were not obtained for this retrospective study because all of the deidentified patient data included in this study are publicly and freely available for scientific purposes.

Source of Data

The clinical, histopathologic, and imaging data of the patients were obtained from The Cancer Genome Atlas–Kidney Renal Clear Cell Carcinoma (TCGA-KIRC) database and The Cancer Imaging Archive [24–26]. Currently, the TCGA-KIRC database includes 537 patients with clear cell RCC. Nonetheless, only 267 patients' imaging data can be accessed using The Cancer Imaging Archive. The results shown here are, in whole or in part, based on the data generated by the TCGA Research Network [27]. The authors of this study acknowledge that they previously used this database (TCGA-KIRC) a few times in different contexts [28, 29].

Inclusion and Exclusion Criteria

To extract the preoperative unenhanced CT images, all imaging datasets were independently reviewed by a radiologist (with 3 years of experience) and a radiology resident (with 2 years of training) to minimize the chance of omission. The study inclusion and exclusion criteria are presented in Figure 1 in detail. Patients with multiple tumors were excluded because of the uncertainty in the database about which tumor had high and low nuclear grade. Imaging sources with fewer than five imaging studies were excluded from the study to minimize the heterogeneity of the imaging protocol further. Only 81 patients with 81 clear cell RCCs (56 high and 25 low nuclear grade) met our eligibility criteria. Patients' demographics are presented in Table 1.
Fig. 1 —Flowchart of inclusion and exclusion criteria. TCGA-KIRC = The Cancer Genome Atlas–Kidney Renal Clear Cell Carcinoma, TCIA = The Cancer Imaging Archive, CE = contrast-enhanced, RCC = renal cell carcinoma.
TABLE 1: Patient Demographics and Tumor Characteristics
Characteristic | Value
Age (y), mean | 62
Sex |
  Female | 38 (46.9)
  Male | 43 (53.1)
Tumor size (mm), mean (range)ᵃ | 75.9 (25–164)
Nuclear grade |
  Low | 25 (30.9)
  High | 56 (69.1)
TNM stage |
  Stage I | 32 (39.5)
  Stage II | 8 (9.9)
  Stage III | 23 (28.4)
  Stage IV | 18 (22.2)

Note—Except where noted otherwise, data are number (%) of patients.

ᵃBased on the maximum 3D diameter.

Reference Standard

The reference standard for statistical classifications was the nuclear grade of the clear cell RCCs, as reported in TCGA-KIRC [24, 25]. The nuclear grades I and II were grouped as low grade, and grades III and IV were grouped as high grade.

CT Parameters

The CT acquisition parameters were as follows: slice thickness, 5 mm; tube voltage, 120–140 kV; tube current, 95–575 mA; and pixel size, 0.562 × 0.562 mm² to 0.976 × 0.976 mm². The TCGA-KIRC database includes various data sources. Therefore, achieving a completely uniform imaging protocol was not possible, and all images underwent some preprocessing steps to minimize the possible consequences of the protocol heterogeneity.

Technical Study Design

To give a simplified overview to the readers, a general technical study pipeline is given in Figure 2.
Fig. 2 —Flowchart showing technical study pipeline. Please note that order of steps in dimension reduction is important for reproducibility of analysis. Asterisk denotes that clinical variables were included in this step. LoG = Laplacian of Gaussian, LR = logistic regression, ANN = artificial neural network, CV = cross-validation, SMOTE = synthetic minority oversampling technique.

Image Processing

Before texture feature extraction, all images underwent the following image processing steps: gray-level normalization using a ±3σ technique (scale, 100) [12, 30], pixel resampling and rescaling using cubic B-spline interpolation (resultant pixel size, 1 × 1 mm²) [31], and gray-level discretization (bin width, 0.3) [31].
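The normalization and discretization steps listed above can be illustrated with a minimal sketch. It rests on assumptions that the text does not confirm: the ±3σ technique is taken to mean z-scoring the intensities, clipping them at ±3 SD, and multiplying by the scale, and the discretization is taken to be a fixed-bin-width scheme in which the lowest occupied bin is numbered 1. The function names and the toy attenuation values are hypothetical, and the resampling step is omitted.

```python
import math
import statistics


def normalize_3sigma(values, scale=100.0):
    """Z-score the intensities, clip at +/-3 SD, and multiply by `scale`.

    Assumed reading of the "+/-3 sigma technique"; the cited method may differ.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("constant region: SD is zero")
    return [max(-3.0, min(3.0, (v - mean) / sd)) * scale for v in values]


def discretize_fixed_bin_width(values, bin_width):
    """Assign each value to a bin of fixed width; the lowest occupied bin is 1."""
    lowest = math.floor(min(values) / bin_width)
    return [math.floor(v / bin_width) - lowest + 1 for v in values]


# Hypothetical unenhanced attenuation values (HU) inside a tumor region.
roi_hu = [28.0, 31.0, 35.0, 30.0, 33.0, 40.0, 26.0, 32.0]
normalized = normalize_3sigma(roi_hu, scale=100.0)
bins = discretize_fixed_bin_width(normalized, bin_width=0.3)
```

Because the normalized values span several hundred units, a bin width of 0.3 produces a large number of gray levels in this sketch; the interaction between scale and bin width in the actual pipeline may differ.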

Tumor Segmentation

The tumors were manually segmented from the unenhanced CT images using 3D Slicer software (version 4.8.1). The largest cross-sectional area of the tumors was selected for the tumor segmentation (Fig. 3). While the segmentation was being performed, available contrast-enhanced CT or contrast-enhanced MRI studies were also used in all cases to ascertain the tumor boundaries.
Fig. 3 — Segmentation style in 79-year-old woman with right-sided high nuclear grade clear cell renal cell carcinoma.
A, Tumor is seen on unenhanced CT.
B, Unenhanced CT shows how tumor is segmented using largest axial cross-sectional area (blue shaded area), excluding perinephric fat tissue.
C, Contrast-enhanced CT shows better tumor contour delineation than unenhanced CT; this was used as guide for more accurate segmentation of tumors.

Texture Feature Extraction

Texture features were extracted from original, filtered, and wavelet-transformed images using PyRadiomics software (version 2.0.1.post26+g8cedd8f) [32]. A Laplacian of Gaussian (LoG) filter was used for image filtration with values of 2, 4, and 6 mm, which represent fine, medium, and coarse patterns, respectively. We generated the wavelet-based texture features using PyWavelets (version 0.5.2) [33]. The wavelet transformation is an image-processing technique that decomposes images using combinations of high- and low-frequency band-pass filters to obtain possible hidden and significant information from the images [34].
The extracted texture features were as follows: 72 first-order, 56 gray-level dependence matrix, 96 gray-level cooccurrence matrix, 64 gray-level run-length matrix, 64 gray-level size zone matrix, 20 neighboring gray-tone difference matrix, and 372 wavelet-based texture features using four wavelet-transformed images (high-high, high-low, low-high, and low-low frequency bands) along with the other six feature groups. The total number of the features extracted was 744 per lesion. Detailed definitions and mathematic formulas for these features have been described elsewhere [33, 35].
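The feature counts reported above can be checked with simple arithmetic. The per-class totals and the wavelet total are taken from the text; the split of each non-wavelet total across four image types (the original image plus the three LoG-filtered images) is an assumption made here for illustration, as are the variable names.

```python
# Totals per feature class for the original and LoG-filtered images (from the text).
non_wavelet_counts = {
    "first-order": 72,
    "gray-level dependence matrix": 56,
    "gray-level cooccurrence matrix": 96,
    "gray-level run-length matrix": 64,
    "gray-level size zone matrix": 64,
    "neighboring gray-tone difference matrix": 20,
}
wavelet_count = 372  # from the text

total_features = sum(non_wavelet_counts.values()) + wavelet_count

# Assumption: the non-wavelet totals are spread evenly over 4 image types
# (original + LoG 2, 4, and 6 mm).
non_wavelet_image_types = 4
per_image = {name: n // non_wavelet_image_types for name, n in non_wavelet_counts.items()}
per_image_total = sum(per_image.values())

# Four wavelet sub-bands (high-high, high-low, low-high, low-low), each with the six classes.
wavelet_bands = 4
wavelet_check = per_image_total * wavelet_bands

print(total_features, per_image_total, wavelet_check)
```

Under this assumption, each image type contributes 93 features, so the four wavelet sub-bands account for the 372 wavelet-based features and the overall total is 744.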

Dimension Reduction With Reproducibility Analysis

To evaluate the interobserver reproducibility of the texture features in the tumor segmentation process, two radiologists (with 3 years and 1 year of experience) independently segmented 25 randomly selected tumors regardless of their nuclear grade. The segmentations were done on the same image slice, which was determined by the radiologist with 3 years of experience. Both radiologists were blinded to the nuclear grade of the clear cell RCCs. Intraclass correlation coefficients (ICCs) were calculated for each texture feature using SPSS (version 20, IBM). Only the features with ICC ≥ 0.9, which indicates excellent reproducibility, were included in the further dimension reduction steps [36].
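As an illustration of the reproducibility filter, the sketch below computes an ICC for one feature measured by two readers and applies the 0.9 cutoff. The text does not state which ICC form was used in SPSS, so the two-way random-effects, absolute-agreement, single-measures form, ICC(2,1), is an assumption here. The function name and the toy feature values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a list of rows, one row per tumor and one column per reader."""
    n = len(ratings)        # number of tumors
    k = len(ratings[0])     # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-tumor mean square
    msc = ss_cols / (k - 1)                 # between-reader mean square
    mse = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical values of one texture feature from two readers on five tumors.
reader_a = [1.0, 2.0, 3.0, 4.0, 5.0]
reader_b = [1.1, 1.9, 3.2, 3.9, 5.1]
icc = icc_2_1([list(pair) for pair in zip(reader_a, reader_b)])

keep_feature = icc >= 0.9  # threshold for excellent reproducibility used in the study
```

In the study, this check would be repeated for each of the 744 features on the 25 doubly segmented tumors.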

Dimension Reduction With Collinearity Analysis

The level of collinearity among the features was assessed using the Pearson correlation coefficient (r). The threshold for collinearity was r = 0.7 [37]. The features with high collinearity were excluded from the analysis. When a feature pair had high collinearity, the feature with the lower collinearity with the other features remained in the analysis.
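The collinearity rule can be sketched as a greedy filter. This is one possible reading, not the authors' procedure: the order in which pairs are processed (most correlated pair first), the use of the absolute value of r, and the use of the mean absolute correlation with the remaining features as the tie-breaking measure are all assumptions. The function names and toy data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def mean_abs_r(features, name, others):
    """Mean absolute correlation of one feature with a list of other features."""
    if not others:
        return 0.0
    return sum(abs(pearson_r(features[name], features[o])) for o in others) / len(others)


def drop_collinear(features, threshold=0.7):
    """Repeatedly remove one member of the most correlated pair while |r| > threshold."""
    kept = list(features)
    while len(kept) > 1:
        r, a, b = max(
            (abs(pearson_r(features[x], features[y])), x, y)
            for i, x in enumerate(kept)
            for y in kept[i + 1:]
        )
        if r <= threshold:
            break
        rest = [f for f in kept if f not in (a, b)]
        # Keep the member of the pair that is less correlated with the other features.
        drop = a if mean_abs_r(features, a, rest) >= mean_abs_r(features, b, rest) else b
        kept.remove(drop)
    return kept


# Hypothetical feature values: f2 is nearly a multiple of f1, f3 is unrelated.
toy_features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "f2": [2.1, 3.9, 6.2, 8.1, 9.8, 12.0],
    "f3": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
}
kept = drop_collinear(toy_features, threshold=0.7)
```

With the toy data, one of the two nearly proportional features is removed and the unrelated feature is retained.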

Dimension Reduction With Feature Selection

The feature selection for artificial neural network (ANN) was performed using the Waikato Environment for Knowledge Analysis toolkit (version 3.8.2, The University of Waikato) [38, 39]. For feature selection, a wrapper attribute evaluator along with an incremental wrapper-based subset search method was used [38, 39]. In the search method, the texture features were ranked by their probabilistic significance computed as a two-way function (attribute-class and class-attribute association) [40]. The feature selection was performed using a nested cross-validation approach with 10-fold inner and 10-fold outer loops. The measures used to evaluate the performance of the attribute combinations in the feature selection process were accuracy for discrete class and root mean squared error for numeric class.
The feature selection for binary logistic regression analysis was performed with univariate analysis using SPSS. The variables with p < 0.05 in the univariate analysis were included in the multivariate binary logistic regression.
In addition to quantitative texture features, age, sex, and maximum 3D tumor size were also included as clinical variables in the feature selection process for both ANN and binary logistic regression.
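The nested cross-validation layout described above for the ANN feature selection (10-fold inner and 10-fold outer loops) can be sketched in terms of index splits. This sketch only shows how the folds nest; it does not reproduce the toolkit's internal splitting, it is not stratified by class, and the function names and seed handling are hypothetical.

```python
import random


def k_fold_indices(indices, k, seed):
    """Shuffle the indices and deal them into k folds of near-equal size."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=10, inner_k=10, seed=0):
    """Yield (outer_train, outer_test, inner_splits) for each outer fold.

    Feature selection would use only the inner splits, which are drawn from
    outer_train, so the outer test fold never influences the selected subset.
    """
    outer_folds = k_fold_indices(range(n_samples), outer_k, seed)
    for i, outer_test in enumerate(outer_folds):
        outer_train = [j for f, fold in enumerate(outer_folds) if f != i for j in fold]
        inner_folds = k_fold_indices(outer_train, inner_k, seed + i + 1)
        inner_splits = []
        for m, inner_val in enumerate(inner_folds):
            inner_train = [j for f, fold in enumerate(inner_folds) if f != m for j in fold]
            inner_splits.append((inner_train, inner_val))
        yield outer_train, outer_test, inner_splits


# 81 tumors, as in the study cohort.
splits = list(nested_cv_splits(81, outer_k=10, inner_k=10, seed=0))
```

Each tumor appears in exactly one outer test fold, and every inner training and validation index comes from the corresponding outer training set.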

Machine Learning–Based Classifications

ML-based classifications were performed using the Waikato Environment for Knowledge Analysis toolkit. ANN and binary logistic regression were used for model development. To create the ANN, we used a multilayer perceptron, which is the simplest form of deep neural network. Because our data were imbalanced between classes, we also used the synthetic minority oversampling technique (SMOTE) [41]. To balance the number of low nuclear grade clear cell RCCs against high nuclear grade clear cell RCCs, we applied SMOTE at 124% to the low nuclear grade clear cell RCCs, resulting in 56 labeled cases for each class (low vs high nuclear grade clear cell RCCs). The SMOTE algorithm creates synthetic instances that are not exact replications [41]. Hence, the method increases the representation of the minority group while preserving the structure of the actual data [42, 43]. Ten-fold cross-validation was performed for the validation of the models. The performance evaluation was done using the AUC. Accuracy, sensitivity, specificity, precision, F-measure (weighted harmonic mean of precision and recall), and Matthews correlation coefficient were also calculated. For sensitivity, specificity, and precision, overall weighted scores were also calculated. Comparison of the 10-fold cross-validated models created with and without SMOTE was made using the Wilcoxon signed-rank test [44]. A two-tailed p < 0.05 indicated statistical significance.
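The class-balancing arithmetic and the idea behind SMOTE can be illustrated with a simplified sketch. It is not the toolkit's implementation: the neighbor count of 5, the way base cases are cycled, and the two-feature toy data are assumptions, and the function name is hypothetical. Each synthetic case is placed at a random position on the line segment between a minority case and one of its nearest minority neighbors, which is why synthetic cases are not exact replications.

```python
import math
import random


def smote(minority, percent, k=5, seed=0):
    """Create (percent/100 x number of minority cases) synthetic cases by interpolation."""
    rng = random.Random(seed)
    n_new = (percent * len(minority)) // 100
    synthetic = []
    for i in range(n_new):
        base_idx = i % len(minority)
        base = minority[base_idx]
        others = [p for j, p in enumerate(minority) if j != base_idx]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(b + gap * (q - b) for b, q in zip(base, neighbor)))
    return synthetic


# Hypothetical two-feature data for the 25 low nuclear grade tumors.
data_rng = random.Random(42)
low_grade = [(data_rng.random(), data_rng.random()) for _ in range(25)]

synthetic = smote(low_grade, percent=124)
balanced_low = low_grade + synthetic
print(len(synthetic), len(balanced_low))
```

With 25 minority cases, 124% corresponds to 31 synthetic cases, which brings the low nuclear grade class to 56 and matches the 56 high nuclear grade cases.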

Results

Dimension Reduction With Reproducibility Analysis

Among 744 texture features, 653 had excellent reproducibility (ICC ≥ 0.9) in the analysis conducted by two radiologists. Therefore, these features were included in the further dimension reduction steps.

Dimension Reduction With Collinearity Analysis

By excluding the features with high collinearity, the number of features was further reduced to 47. These features were included in the algorithm-based feature selection process.

Dimension Reduction With Feature Selection

Using the wrapper-based classifier-specific algorithm, the number of the selected features decreased to five for the ANN classifier. None of the selected texture features was from the original image. Four of the selected features were from the LoG-filtered images. The remaining feature originated from the wavelet-transformed image. The dominant feature classes were the gray-level cooccurrence matrix and gray-level dependence matrix. The selected features for ANN and their respective ICCs are presented in Table 2. Distribution of the normalized texture feature values in the actual data (without SMOTE) is shown in graphs and plots in Figures 4A and 4C. The distribution of the normalized texture feature values with SMOTE is shown in graphs and plots in Figures 4B and 4D. Collinearity status of the selected features for ANN is given in a matrix in Figure 5.
TABLE 2: Selected Texture Features for Artificial Neural Network and Their Intraclass Correlation Coefficients (ICCs)
Code | Image Type | Feature Class | Feature Name | ICC
TexF1 | LoG filter (4 mm) | Gray-level dependence matrix | Large dependence high gray-level emphasis | 0.998
TexF2 | LoG filter (4 mm) | Gray-level size zone matrix | Zone entropy | 0.987
TexF3ᵃ | LoG filter (6 mm) | Gray-level dependence matrix | Large dependence low gray-level emphasis | 0.902
TexF4ᵃ | LoG filter (6 mm) | Gray-level cooccurrence matrix | Informational measure of correlation–2 | 0.963
TexF5ᵃ | Wavelet high-high frequency band | Gray-level cooccurrence matrix | Inverse difference moment normalized | 0.987

Note—LoG = Laplacian of Gaussian.

ᵃIndicates common features selected for artificial neural network and logistic regression algorithms.
Fig. 4 —Distribution of selected texture features for artificial neural network.
A–D, Graphs show distribution on actual data (A and C) without synthetic minority oversampling technique (SMOTE) and balanced data (B and D) with SMOTE. Deviation plots (A and B) show mean value (lines) of normalized (zero to one) texture parameters with their respective one SD (shaded areas). Colored heat maps (C and D) show distribution and differences of normalized (zero to one) texture parameter values by presenting each tumor's value. Despite significant overlaps between feature values as assessed visually in both deviation plots and heat maps, they all somehow contribute to models' performance. This refers to black-box approach of machine learning algorithms, particularly artificial neural network. High = high nuclear grade, low = low nuclear grade, TexF1 = large dependence high gray-level emphasis in image with Laplacian of Gaussian (LoG) filter of 4 mm, TexF2 = zone entropy in image with LoG filter of 4 mm, TexF3 = large dependence low gray-level emphasis in image with LoG filter of 6 mm, TexF4 = informational measure of correlation–2 in image with LoG filter of 6 mm, TexF5 = inverse difference moment normalized in wavelet image (high-high frequency).
Fig. 5 —Correlation matrix presenting auto- and cross-correlation of selected features for artificial neural network. No significant cross-correlation (Pearson correlation coefficient > 0.7) is present among features, which proves that our collinearity analysis during dimension reduction worked well. TexF1 = large dependence high gray-level emphasis in image with Laplacian of Gaussian (LoG) filter of 4 mm, TexF2 = zone entropy in image with LoG filter of 4 mm, TexF3 = large dependence low gray-level emphasis in image with LoG filter of 6 mm, TexF4 = informational measure of correlation–2 in image with LoG filter of 6 mm, TexF5 = inverse difference moment normalized in wavelet image (high-high frequency), abs = absolute value, cor = correlation.
For the binary logistic regression, the number of the selected features decreased to six after a univariate analysis. None of the selected texture features was from the original image. Five of the selected features were from the LoG-filtered images. The remaining feature originated from the wavelet-transformed image. The dominant feature class was the gray-level cooccurrence matrix. The selected features for the logistic regression and their respective ICCs are presented in Table 3. Distribution of the normalized texture feature values in the actual data (without SMOTE) is shown in graphs and plots in Figures S1A and S1C, which can be viewed in the AJR electronic supplement to this article (available at www.ajronline.org). The distribution of the normalized texture feature values with SMOTE is shown in graphs and plots in Figures S1B and S1D. Collinearity status of the selected features for logistic regression is given in a matrix in Figure S2, which can also be viewed in the AJR electronic supplement.
TABLE 3: Selected Texture Features for Logistic Regression (LR) and Their Intraclass Correlation Coefficients (ICCs)
Code | Image Type | Feature Class | Feature Name | ICC
LR-TexF1 | LoG filter (2 mm) | First-order | Mean | 0.958
LR-TexF2 | LoG filter (2 mm) | Gray-level size zone matrix | Zone entropy | 0.981
LR-TexF3 | LoG filter (4 mm) | Gray-level cooccurrence matrix | Informational measure of correlation–1 | 0.937
LR-TexF4ᵃ | LoG filter (6 mm) | Gray-level dependence matrix | Large dependence low gray-level emphasis | 0.902
LR-TexF5ᵃ | LoG filter (6 mm) | Gray-level cooccurrence matrix | Informational measure of correlation–2 | 0.963
LR-TexF6ᵃ | Wavelet high-high frequency band | Gray-level cooccurrence matrix | Inverse difference moment normalized | 0.987

Note—LoG = Laplacian of Gaussian.

ᵃIndicates common features selected for artificial neural network and logistic regression algorithms.
After the feature selection, three features were common in the subsets selected for ANN and logistic regression. For both ANN and binary logistic regression, none of the clinical variables was selected as valuable for the model development.

Machine Learning–Based Classifications

The ANN algorithm alone correctly classified 81.5% of the clear cell RCCs regarding nuclear grade (low vs high), with an AUC value of 0.714. Overall weighted sensitivity, specificity, and precision were 81.5%, 65.2%, and 81.4%, respectively.
The ANN with SMOTE correctly classified 70.5% of the clear cell RCCs, with an AUC value of 0.702. Overall weighted sensitivity, specificity, and precision were 70.5%, 70.5%, and 71.1%, respectively.
The logistic regression algorithm alone correctly classified 75.3% of the clear cell RCCs, with an AUC value of 0.656. Overall weighted sensitivity, specificity, and precision were 75.3%, 53.5%, and 74.2%, respectively.
The logistic regression algorithm with SMOTE correctly classified 62.5% of the clear cell RCCs, with an AUC value of 0.666. Overall weighted sensitivity, specificity, and precision were 62.5%, 62.5%, and 62.5%, respectively. Detailed performance metrics are presented in Table 4.
TABLE 4: Performance Metrics of the Classifications
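Algorithm | Confusion Matrix: Low Nuclear Grade, No. of Patients or Labeled Data | Confusion Matrix: High Nuclear Grade, No. of Patients or Labeled Data | Confusion Matrix: Reference Standard | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | F-Measure | Matthews Correlation Coefficient | AUC
ANN | 13 | 12 | Low | 81.5 | 52.0 | 94.6 | 81.3 | 0.634 | 0.541 | 0.714
 | 3 | 53 | High | | 94.6 | 52.0 | 81.5 | 0.876 | |
ANN plus SMOTE | 35 | 21 | Low | 70.5 | 62.5 | 78.6 | 74.5 | 0.680 | 0.416 | 0.702
 | 12 | 44 | High | | 78.6 | 62.5 | 67.7 | 0.727 | |
LR | 9 | 16 | Low | 75.3 | 36.0 | 92.9 | 69.2 | 0.474 | 0.363 | 0.656
 | 4 | 52 | High | | 92.9 | 36.0 | 76.5 | 0.839 | |
LR plus SMOTE | 34 | 22 | Low | 62.5 | 60.7 | 64.3 | 63.0 | 0.618 | 0.250 | 0.666
 | 20 | 36 | High | | 64.3 | 60.7 | 62.1 | 0.632 | |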

Note—Note that sensitivity, specificity, precision, and F-measure are given for each class as opposed to the values in the Results section, in which overall weighted values are presented. ANN = artificial neural network, SMOTE = synthetic minority oversampling technique, LR = logistic regression.
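The relationship between the per-class values in Table 4 and the overall weighted values in the Results text can be reproduced from a confusion matrix. The sketch below uses the counts reported for the ANN without SMOTE (rows are the reference standard, columns are the predicted class). Weighting each per-class value by the number of cases in that class is an assumption about how the overall scores were obtained, and the function names are hypothetical.

```python
import math


def class_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, precision, and F-measure for one class taken as positive."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


def matthews_cc(tp, fn, fp, tn):
    """Matthews correlation coefficient for a 2 x 2 confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator


# ANN without SMOTE: 13 low cases classified as low, 12 low as high,
# 3 high as low, and 53 high as high.
n_low, n_high = 25, 56
low = class_metrics(tp=13, fn=12, fp=3, tn=53)    # low grade as the positive class
high = class_metrics(tp=53, fn=3, fp=12, tn=13)   # high grade as the positive class

accuracy = (13 + 53) / (n_low + n_high)
mcc = matthews_cc(tp=13, fn=12, fp=3, tn=53)
weighted = {
    key: (n_low * low[key] + n_high * high[key]) / (n_low + n_high)
    for key in ("sensitivity", "specificity", "precision")
}

print(round(accuracy * 100, 1))                  # 81.5
print(round(weighted["specificity"] * 100, 1))   # 65.2
print(round(weighted["precision"] * 100, 1))     # 81.4
print(round(mcc, 3))                             # 0.541
```

These values agree with the accuracy, overall weighted specificity and precision, and Matthews correlation coefficient reported for the ANN alone.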

Comparison of Models With and Without Synthetic Minority Oversampling Technique

Regarding the concerns about the data imbalance between classes and its potential influence on the classifications using the ANN and logistic regression algorithms, no statistically significant difference was found between the performances (10-fold cross-validated AUCs) of the models created with actual data (without SMOTE) and those created with SMOTE (p > 0.05).

Comparison of Artificial Neural Network and Logistic Regression

With or without SMOTE, the performances (10-fold cross-validated AUCs) of the ANN and binary logistic regression in predicting nuclear grade (low vs high) were statistically significantly different (p < 0.05). The ANN performed better than the logistic regression algorithm.

Discussion

Study Overview

In reviewing the literature, no data were found regarding the use of unenhanced CT in the texture analysis of clear cell RCCs for predicting their nuclear grade. Hence, the current study was designed to determine the potential value of the method with an ML-based analysis using ANN and logistic regression. The results of this study indicate that the ML-based unenhanced CT texture analysis using ANN and logistic regression can correctly classify about three- to four-fifths of the clear cell RCCs regarding their nuclear grade (low vs high grade). Overall, the ANN performed better than the binary logistic regression algorithm.

Previous Works

To the best of our knowledge, there have been only two studies evaluating the potential value of CT texture analysis in discriminating the nuclear grade of clear cell RCCs [13, 14]. The authors of both studies used only contrast-enhanced phases. Despite several technical and methodologic differences, both studies reported high predictive performance for the nuclear grade of the clear cell RCCs.
Apart from texture analysis, the prediction of nuclear grade has also been studied by rather conventional methods, such as apparent diffusion coefficient maps [45]. In a recent meta-analysis, the predictive performance of the technique was moderate, with a pooled sensitivity of 78%, specificity of 86%, and hierarchic summary ROC AUC of 0.850 [45].

Practical Implications

The nuclear grade of RCCs is one of the most important prognostic factors associated with survival [46–48]. Percutaneous biopsy is a widely accepted method to predict nuclear grade. In most cases, the accuracy of the biopsy is within the range of 75% to 85% [49]. In a meta-analysis, the percutaneous biopsy showed a moderate concordance (87%) with surgical specimens [8]. However, nuclear grading with biopsy is an invasive method and is susceptible to significant sampling bias [8, 9]. At some institutions, the percutaneous biopsy failure rate may be as high as 37% [50]. Because there are expanding treatment options, including active surveillance and ablative therapies, it has become essential to determine the biologic behavior of the tumors in vivo. On the basis of our current study and the other two studies [13, 14], ML-based CT texture analysis, with its noninvasive nature, showed a predictive performance comparable to that of biopsy in predicting nuclear grade of the clear cell RCCs. Also, it may play a key role in active surveillance of renal masses by allowing repeated noninvasive assessment of the nuclear grade during follow-up [7, 51]. In the future, the combined use of unenhanced and contrast-enhanced CT images might also be useful in creating better models for this purpose.

Limitations and Generalizability

Several limitations of this study need to be acknowledged. First, the retrospective nature of the study was inherently disadvantageous. Second, our patient population was rather small and relatively imbalanced between classes, which might lead to over- or underfitting in ML-based classifications. To reduce this risk, we took a few measures. We used the simplest form of ANN or deep learning, the multilayer perceptron, as well as a conventional method, logistic regression. A convolutional neural network would have been an option, but given its complexity, it would almost certainly have led to over- or underfitting with such a small labeled dataset. We used a cross-validation algorithm that separates feature selection from model validation; otherwise, the performance estimates would have been overly optimistic. We also used SMOTE to address the data imbalance problem, which could otherwise have led to overly optimistic results in ML-based classifications [41]. Third, 2D tumor segmentation was used. Three-dimensional segmentation would have been much more representative of tumor texture, but it would not be useful for clinical practice because of the excessive segmentation time. In the future, integration of semiautomated or fully automated segmentation techniques with texture analysis software programs might provide an opportunity to overcome this limitation. Fourth, the use of a rather thick image slice of 5 mm might be considered a limitation. Nonetheless, we think that consistency of slice thickness is much more important than the use of thinner slices, so we constructed our inclusion and exclusion criteria on the basis of this issue. Fifth, the imaging database of TCGA-KIRC in The Cancer Imaging Archive includes patients from various centers and sources with different image acquisition protocols. To minimize interscanner variability and its effects, all image datasets in our study underwent preprocessing steps before texture analysis, such as normalization, discretization, and pixel rescaling [30, 31]. Sixth, we did not include contrast-enhanced CT images in the study because of inconsistent contrast administration protocols in the database. Seventh, we did not use an independent validation set for the ANN because of the small patient population. Instead, we used a nested cross-validation algorithm with 10-fold inner and 10-fold outer loops for the ANN. This method uses all the data by splitting it into training, test, and validation sets, which simulates an external validation process by reducing bias and giving an estimate of the error similar to that of independent validation [52, 53]. Nevertheless, independent external validation is still needed before clinical use of the method. Eighth, we did not use a nested cross-validation method for the logistic regression analysis because we intended to present a rather conventional analysis. Nonetheless, we used at least a simple 10-fold cross-validation method to avoid overly optimistic results. Finally, a fundamental problem in the texture field is the interpretation of texture features in context, even when they have been validated [15]. In the context of the nuclear grade of clear cell RCCs, the selected features might represent some inherent heterogeneity differences associated with patient survival. Although every feature is built on very detailed mathematic models, establishing meaningful clinical and pathologic correlates seems problematic at this stage of the developing field of texture analysis [15].
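The nested cross-validation scheme described among the limitations above (10-fold outer and 10-fold inner loops) can be illustrated with a minimal index-splitting sketch. This is not the implementation used in the study, which does not report its code at this level of detail. The function names `k_fold_indices` and `nested_cv_splits`, the seed handling, and the use of shuffled, unstratified folds are assumptions made only for illustration.

```python
import random


def k_fold_indices(indices, k, seed=0):
    """Shuffle `indices` and split them into k nearly equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=10, inner_k=10, seed=0):
    """Yield (inner_train, inner_validation, outer_test) index lists.

    The outer test fold is set aside before the inner loop runs, so any
    feature selection or tuning done on the inner folds never sees it.
    """
    outer_folds = k_fold_indices(range(n_samples), outer_k, seed)
    for o, outer_test in enumerate(outer_folds):
        # Everything outside the current outer test fold is available
        # to the inner loop (hypothetical, unstratified split).
        outer_train = [i for j, fold in enumerate(outer_folds)
                       if j != o for i in fold]
        inner_folds = k_fold_indices(outer_train, inner_k, seed + o + 1)
        for v, inner_val in enumerate(inner_folds):
            inner_train = [i for j, fold in enumerate(inner_folds)
                           if j != v for i in fold]
            yield inner_train, inner_val, outer_test


# 81 patients, as in the study; 10 outer x 10 inner folds give 100 splits.
splits = list(nested_cv_splits(81))
print(len(splits))
```

In this sketch, each case appears in exactly one outer test fold, and the three index lists of every split are mutually disjoint. That separation is why the outer-loop error can approximate the error of an independent validation set, although it does not replace external validation.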

Conclusion

The current study was designed to investigate the potential value of ML-based unenhanced CT texture analysis in predicting the nuclear grade of clear cell RCCs. On the basis of multiinstitutional data from TCGA-KIRC, ML-based unenhanced CT texture analysis using ANN can be a promising noninvasive method with favorable accuracy.

Footnote

WEB
This is a web exclusive article.

References

1. Minardi D, Lucarini G, Mazzucchelli R, et al. Prognostic role of Fuhrman grade and vascular endothelial growth factor in pT1a clear cell carcinoma in partial nephrectomy specimens. J Urol 2005; 174:1208–1212
2. Lohse CM, Blute ML, Zincke H, Weaver AL, Cheville JC. Comparison of standardized and nonstandardized nuclear grade of renal cell carcinoma to predict outcome among 2,042 patients. Am J Clin Pathol 2002; 118:877–886
3. Fuhrman SA, Lasky LC, Limas C. Prognostic significance of morphologic parameters in renal cell carcinoma. Am J Surg Pathol 1982; 6:655–663
4. Delahunt B. Advances and controversies in grading and staging of renal cell carcinoma. Mod Pathol 2009; 22(suppl 2):S24–S36
5. Campbell N, Rosenkrantz AB, Pedrosa I. MRI phenotype in renal cancer: is it clinically relevant? Top Magn Reson Imaging 2014; 23:95–115
6. Kunkle DA, Egleston BL, Uzzo RG. Excise, ablate or observe: the small renal mass dilemma—a meta-analysis and review. J Urol 2008; 179:1227–1233; discussion, 1233–1234
7. Jewett MAS, Mattar K, Basiuk J, et al. Active surveillance of small renal masses: progression patterns of early stage kidney cancer. Eur Urol 2011; 60:39–44
8. Marconi L, Dabestani S, Lam TB, et al. Systematic review and meta-analysis of diagnostic accuracy of percutaneous renal tumour biopsy. Eur Urol 2016; 69:660–673
9. Volpe A, Mattar K, Finelli A, et al. Contemporary results of percutaneous biopsy of 100 small renal masses: a single center experience. J Urol 2008; 180:2333–2337
10. Parada Villavicencio C, McCarthy RJ, Miller FH. Can diffusion-weighted magnetic resonance imaging of clear cell renal carcinoma predict low from high nuclear grade tumors. Abdom Radiol (NY) 2017; 42:1241–1249
11. Rosenkrantz AB, Niver BE, Fitzgerald EF, Babb JS, Chandarana H, Melamed J. Utility of the apparent diffusion coefficient for distinguishing clear cell renal cell carcinoma of low and high nuclear grade. AJR 2010; 195:[web]W344–W351
12. Schieda N, Lim RS, Krishna S, McInnes MDF, Flood TA, Thornhill RE. Diagnostic accuracy of unenhanced CT analysis to differentiate low-grade from high-grade chromophobe renal cell carcinoma. AJR 2018; 210:1079–1087
13. Ding J, Xing Z, Jiang Z, et al. CT-based radiomic model predicts high grade of clear cell renal cell carcinoma. Eur J Radiol 2018; 103:51–56
14. Bektas CT, Kocak B, Yardimci AH, et al. Clear cell renal cell carcinoma: machine learning-based quantitative computed tomography texture analysis for prediction of Fuhrman nuclear grade. Eur Radiol 2018 Aug 30 [Epub ahead of print]
15. Lubner MG, Smith AD, Sandrasegaran K, Sahani DV, Pickhardt PJ. CT texture analysis: definitions, applications, biologic correlates, and challenges. RadioGraphics 2017; 37:1483–1503
16. Gillies RJ, Kinahan PE, Hricak H. Radiomics: images are more than pictures, they are data. Radiology 2016; 278:563–577
17. Ganeshan B, Miles KA. Quantifying tumour heterogeneity with CT. Cancer Imaging 2013; 13:140–149
18. Lambin P, Rios-Velazquez E, Leijenaar R, et al. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer 2012; 48:441–446
19. Kocak B, Yardimci AH, Bektas CT, et al. Textural differences between renal cell carcinoma sub-types: machine learning-based quantitative computed tomography texture analysis with independent external validation. Eur J Radiol 2018; 107:149–157
20. Eble J, Sauter G, Epstein J, Sesterhenn I. World Health Organization: pathology and genetics of tumours of the urinary system and male genital organs. Lyon, France: IARC Press, 2004:9–11
21. Znaor A, Lortet-Tieulent J, Laversanne M, Jemal A, Bray F. International variations and trends in renal cell carcinoma incidence and mortality. Eur Urol 2015; 67:519–530
22. Gupta K, Miller JD, Li JZ, Russell MW, Charbonneau C. Epidemiologic and socioeconomic burden of metastatic renal cell carcinoma (mRCC): a literature review. Cancer Treat Rev 2008; 34:193–205
23. Jemal A, Siegel R, Ward E, Hao Y, Xu J, Thun MJ. Cancer statistics, 2009. CA Cancer J Clin 2009; 59:225–249
24. Akin O, Elnajjar P, Heller M, et al. TCGA-KIRC. Cancer Imaging Archive website. wiki.cancerimagingarchive.net/display/Public/TCGAKIRC. Published 2016. Updated January 4, 2019. Accessed January 15, 2019
25. Clark K, Vendt B, Smith K, et al. The Cancer Imaging Archive (TCIA): maintaining and operating a public information repository. J Digit Imaging 2013; 26:1045–1057
26. [No authors listed]. The Cancer Imaging Archive (TCIA). The Cancer Imaging Archive website. www.cancerimagingarchive.net/. Accessed January 15, 2019
27. [No authors listed]. The Cancer Genome Atlas (TCGA). National Cancer Institute, National Institutes of Health website. cancergenome.nih.gov/. Accessed January 15, 2019
28. Kocak B, Ates E, Durmaz ES, Ulusan MB, Kilickesmez O. Influence of segmentation margin on machine learning-based high-dimensional quantitative CT texture analysis: a reproducibility study on renal clear cell carcinomas. Eur Radiol 2019 Feb 12 [Epub ahead of print]
29. Kocak B, Durmaz ES, Ates E, Ulusan MB. Radiogenomics in clear cell renal cell carcinoma: machine learning–based high-dimensional quantitative CT texture analysis in predicting PBRM1 mutation status. AJR 2019; 212:[web]W55–W63
30. Collewet G, Strzelecki M, Mariette F. Influence of MRI acquisition protocols and image intensity normalization methods on texture classification. Magn Reson Imaging 2004; 22:81–91
31. Shafiq-Ul-Hassan M, Zhang GG, Latifi K, et al. Intrinsic dependencies of CT radiomic features on voxel size and number of gray levels. Med Phys 2017; 44:1050–1062
32. van Griethuysen JJM, Fedorov A, Parmar C, et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res 2017; 77:e104–e107
33. Lee G, Gommers R, Wasilewski F, et al. PyWavelets: wavelet transforms in Python. github.com/PyWavelets/pywt. Published 2006. Accessed December 5, 2018
34. McDonnell JTE, Bentley PM. Wavelet transforms: an introduction. Electron Commun Eng J 1994; 6:175–186
35. [No authors listed]. Radiomic features. pyradiomics.readthedocs.io/en/latest/features.html. Published 2016. Accessed January 15, 2019
36. Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med 2016; 15:155–163
37. Dormann CF, Elith J, Bacher S, et al. Collinearity: a review of methods to deal with it and a simulation study evaluating their performance. Ecography 2013; 36:27–46
38. Kohavi R, John GH. Wrappers for feature subset selection. Artif Intell 1997; 97:273–324
39. Bermejo P, Gamez JA, Puerta JM. Improving incremental wrapper-based subset selection via replacement and early stopping. Int J Pattern Recognit Artif Intell 2011; 25:605–625
40. Ahmad A, Dey L. A feature selection technique for classificatory analysis. Pattern Recognit Lett 2005; 26:43–56
41. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 2002; 16:321–357
42. Fehr D, Veeraraghavan H, Wibmer A, et al. Automatic classification of prostate cancer Gleason scores from multiparametric magnetic resonance images. Proc Natl Acad Sci USA 2015; 112:E6265–E6273
43. Zhang Y, Oikonomou A, Wong A, Haider MA, Khalvati F. Radiomics-based prognosis analysis for non-small cell lung cancer. Sci Rep 2017; 7:46349
44. Demšar J. Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 2006; 7:1–30
45. Woo S, Suh CH, Kim SY, Cho JY, Kim SH. Diagnostic performance of DWI for differentiating high- from low-grade clear cell renal cell carcinoma: a systematic review and meta-analysis. AJR 2017; 209:[web]W374–W381
46. Klatte T, Patard J-J, de Martino M, et al. Tumor size does not predict risk of metastatic disease or prognosis of small renal cell carcinomas. J Urol 2008; 179:1719–1726
47. Frank I, Blute ML, Cheville JC, Lohse CM, Weaver AL, Zincke H. An outcome prediction model for patients with clear cell renal cell carcinoma treated with radical nephrectomy based on tumor stage, size, grade and necrosis: the SSIGN score. J Urol 2002; 168:2395–2400
48. Zisman A, Pantuck AJ, Dorey F, et al. Mathematical model to predict individual survival for patients with renal cell carcinoma. J Clin Oncol 2002; 20:1368–1374
49. Lane BR, Samplaski MK, Herts BR, Zhou M, Novick AC, Campbell SC. Renal mass biopsy: a renaissance? J Urol 2008; 179:20–27
50. Lechevallier E, André M, Barriol D, et al. Fine-needle percutaneous biopsy of renal masses with helical CT guidance. Radiology 2000; 216:506–510
51. Abou Youssif T, Tanguay S. Natural history and management of small renal masses. Curr Oncol 2009; 16(suppl 1):S2–S7
52. Varma S, Simon R. Bias in error estimation when using cross-validation for model selection. BMC Bioinformatics 2006; 7:91
53. Kuhn M, Johnson K. Over-fitting and model tuning. In: Kuhn M, Johnson K. Applied predictive modeling. New York, NY: Springer New York, 2013:61–92

FOR YOUR INFORMATION

The data supplement accompanying this web exclusive article can be viewed by clicking “Supplemental” at the top of the article.

Information & Authors

Information

Published In

American Journal of Roentgenology
Pages: W132 - W139
PubMed: 30973779

History

Submitted: November 6, 2018
Accepted: December 20, 2018
Version of record online: April 11, 2019

Keywords

  1. clear cell renal cell carcinoma
  2. CT
  3. machine learning
  4. nuclear grade
  5. radiomics

Authors

Affiliations

Burak Kocak
Department of Radiology, Istanbul Training and Research Hospital, Samatya, Istanbul 34098, Turkey.
Emine Sebnem Durmaz
Department of Radiology, Buyukcekmece Mimar Sinan State Hospital, Istanbul, Turkey.
Ece Ates
Department of Radiology, Istanbul Training and Research Hospital, Samatya, Istanbul 34098, Turkey.
Ozlem Korkmaz Kaya
Department of Radiology, Koc University School of Medicine, Koc University Hospital, Istanbul, Turkey.
Ozgur Kilickesmez
Department of Radiology, Istanbul Training and Research Hospital, Samatya, Istanbul 34098, Turkey.

Notes

Address correspondence to B. Kocak ([email protected]).
