Original Research
Hepatobiliary Imaging
April 2006

CT of Colon Cancer Metastases to the Liver Using Modified RECIST Criteria: Determining the Ideal Number of Target Lesions to Measure


OBJECTIVE. We sought to define the ideal number of target lesions to be measured to assess disease response in patients undergoing chemotherapy for colon cancer metastases to the liver.
MATERIALS AND METHODS. Thirty consecutive patients were recruited for this study. Patients were part of a multisite, randomized, double-arm, phase 3 clinical trial involving chemotherapy with an investigational drug for metastatic colon cancer. Patients were recruited from U.S. and international sites. Institutional review board approval was obtained, and informed consent was obtained from all patients. Our study included CT measurements of hepatic metastases. All patients (n = 30) had a minimum of five target lesions in the liver. Target-lesion size was defined by Response Evaluation Criteria in Solid Tumors (RECIST) criteria. We calculated the patient response at 2 months and at 6 months (complete response, partial response, stable disease, and progressive disease) using RECIST. Patient response was calculated based on the percentage increase or decrease at 2 and 6 months in the greatest diameter of the single largest lesion, two large lesions, three large lesions, four lesions, and five lesions, respectively. The concordance between five-target-lesion measurement and lesser numbers of lesions was analyzed using kappa statistics (StatView, 5.0).
RESULTS. In 93.33% of patients (n = 28/30), there was agreement on patient response irrespective of the number of measurements made on CT. Of these 30 patients, 47% had a partial response (n = 14/30), 43% had stable disease (n = 13/30), and 10% had progressive disease at 2 months (n = 3/30). At 6 months, 43% had a partial response (n = 13/30), 47% had stable disease (n = 14/30), and 10% had progressive disease (n = 3/30). Agreement in response evaluation between lesion groups for multiple measurements was high, with values of 1.0 for multiple-lesion measurements and 0.88 for single-lesion measurements at 2 months. The concordance values were the same at 6 months.
CONCLUSION. In the majority of patients with hepatic metastases of colorectal cancer, measuring the maximal diameter of the single largest lesion yielded the same treatment-response classification as measuring up to five target lesions. This result suggests that it may be possible to reduce the number of lesions measured in clinical trials.


Response Evaluation Criteria in Solid Tumors (RECIST) criteria are the most widely used response evaluation criteria in oncologic clinical trials [1-3]. Modified RECIST uses unidimensional measurements for patient response evaluation.
Prasad et al. [2] in a study of 86 breast cancer patients suggested that bidimensional measurements do not have added value when compared with unidimensional measurements in patient response evaluation.
Target lesions are the most important radiologic markers in oncologic evaluation. Target-lesion size criteria are defined by RECIST [4, 5]. The minimum target-lesion diameter should be at least double the CT slice thickness. This requirement gives an accurate representation of the lesion diameter on CT, taking into account the potential errors caused by partial volume averaging.
RECIST defines a maximum of five target lesions per organ and a total of 10 target lesions per evaluation [4]. Nontarget lesions are those that do not fulfill the size criteria, as well as findings such as bone metastases, pleural effusion, and ascites [4].
The criterion of five target lesions per organ on CT is an arbitrary one and is not supported by any objective evidence or study. We sought to evaluate this criterion by comparing the response at follow-up based on a single-lesion measurement with the response based on measurement of two to five lesions. Our goal was to test whether these measurements produce similar categoric results (stable disease, partial response, and so on).

Materials and Methods


Thirty consecutive patients with metastatic colon cancer were recruited for this study. These patients were all adults. There were 16 men and 14 women with a mean age of 56 years. The patients were part of a multiinstitutional cohort study involving patients and clinical sites worldwide. The study was a double-arm, phase 3 clinical trial comparing the efficacy of an investigational drug with conventional chemotherapeutic agents for colon cancer. Institutional review board approval was obtained for this study, and informed consent was obtained from all patients.

Patient Categories

The patients were grouped into five categories based on the number of lesions measured (single largest lesion, two largest lesions, three largest lesions, and so on). The same 30 patients were placed in each category, so that each category had 30 patients. The responses obtained at 2 months and at 6 months by single-lesion measurement were compared with the responses obtained by multiple-lesion measurements.


These patients were evaluated retrospectively by CT for the presence of hepatic and lymph node metastatic lesions. Our patient subgroup (n = 30) included patients with a minimum of five hepatic metastatic lesions. All these lesions fulfilled the target-lesion size criterion. The lesions were well defined, nonconfluent, and nonnecrotic. They were ranked by greatest diameter, and the greatest diameters were then summed cumulatively (largest lesion diameter, sum of the two largest diameters, sum of the three largest diameters, and so on). The same sums were calculated for the same lesions at 2 months and at 6 months after the initiation of chemotherapy. Measurements were independently performed by two radiologists who were blinded to patient details. Intraobserver variability was tested at 2 months and at 6 months by giving test cases (n = 10) selected from the study cohort.
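The cumulative sums described above can be sketched as follows. This is a minimal illustration, not the authors' spreadsheet: the function name `sums_of_k_largest` and the diameters are hypothetical, and the sketch assumes that lesions are ranked once, at baseline, with the same lesions summed at follow-up.

```python
from typing import Sequence


def sums_of_k_largest(baseline_mm: Sequence[float],
                      followup_mm: Sequence[float],
                      max_k: int = 5) -> list[tuple[float, float]]:
    """Return (baseline_sum, followup_sum) for the k largest lesions, k = 1..max_k.

    Lesions are ranked by baseline diameter (an assumption of this sketch),
    and the same lesions are summed at follow-up.
    """
    if len(baseline_mm) != len(followup_mm):
        raise ValueError("each lesion needs a baseline and a follow-up diameter")
    if len(baseline_mm) < max_k:
        raise ValueError("fewer lesions than max_k")
    # Lesion indices ordered from largest to smallest baseline diameter.
    order = sorted(range(len(baseline_mm)),
                   key=lambda i: baseline_mm[i], reverse=True)
    result = []
    base_sum = follow_sum = 0.0
    for i in order[:max_k]:
        base_sum += baseline_mm[i]
        follow_sum += followup_mm[i]
        result.append((base_sum, follow_sum))
    return result


# Hypothetical greatest diameters (mm) for five hepatic lesions in one patient.
baseline = [42.0, 18.0, 30.0, 25.0, 12.0]
month2 = [28.0, 15.0, 24.0, 20.0, 12.0]
sums = sums_of_k_largest(baseline, month2)
```

Here `sums[0]` holds the single-largest-lesion pair and `sums[4]` the five-lesion pair, so each entry can be classified separately.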
A standard CT protocol was followed at the multiple sites where these patients were evaluated. The images were quality checked twice before evaluation to make sure that study protocol was followed at each individual study site. Images with variability in technique were not included in the study. All 30 patients underwent triple-phase CT evaluation with MDCT, and images with a slice thickness of 5 mm were obtained. Mechanical injection of 150 mL of nonionic iodinated contrast medium was made at a rate of 5 mL/sec. For the hepatic artery phase imaging, the scanning delay was 25 sec after the initiation of a contrast bolus. For the portal venous phase imaging, the scanning delay was 65 sec after the initiation of a contrast bolus.

Image Analysis

The sum of greatest diameters at baseline was compared with that at 2 months and at 6 months. Patients were categorized into those with complete response (total disappearance of the lesions), partial response (at least a 30% decrease in the sum of diameters of the lesions), progressive disease (at least a 20% increase in the sum of diameters of the lesions), and stable disease (between a 30% decrease and a 20% increase). Responses obtained using the single-largest-lesion diameter were compared with those obtained using multiple lesions.
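The categorization above can be written as a short function. This is a sketch under stated assumptions: the name `recist_response` is hypothetical, the cutoffs are treated as inclusive, and the follow-up sum is compared with the baseline sum, as this section describes.

```python
def recist_response(baseline_sum_mm: float, followup_sum_mm: float) -> str:
    """Classify response from the sums of greatest diameters (in mm)."""
    if baseline_sum_mm <= 0:
        raise ValueError("baseline sum must be positive")
    if followup_sum_mm == 0:
        return "complete response"       # total disappearance of the lesions
    change = (followup_sum_mm - baseline_sum_mm) / baseline_sum_mm
    if change <= -0.30:
        return "partial response"        # at least a 30% decrease
    if change >= 0.20:
        return "progressive disease"     # at least a 20% increase
    return "stable disease"              # between the two thresholds


# Hypothetical sums for one patient: single largest lesion vs. five lesions.
single = recist_response(42.0, 28.0)     # about a 33% decrease
five = recist_response(127.0, 99.0)      # about a 22% decrease
```

In this made-up case the single-lesion sum yields a partial response and the five-lesion sum yields stable disease, which is the kind of discordance the comparison is designed to detect.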

Statistical Analysis

The reviewed data were collected on a Microsoft Excel spreadsheet specifically designed to allow an automatic recalculation of each patient's response according to the following variables: criteria selected for the definition of progression, maximum number of lesions considered, the sums of greatest diameters of the lesions measured, bidimensional measurements, and threshold values (in percent) of tumor size variations for response assessment. The overall response from each recalculation (i.e., the response based on four, three, two, or one lesion) was then compared with the response based on five-lesion measurement using kappa statistics. The measure of concordance was calculated using StatView, 5.0 (SAS Institute). The 95% confidence intervals for the kappa statistics were calculated. The kappa values were interpreted on the basis of reports in the literature. Using this test, concordance may be considered as very satisfactory for values ≥ 0.75 [6].
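For readers unfamiliar with the kappa statistic, the following is a minimal sketch of how it can be computed from a square agreement table. It is not the StatView implementation: the function name `cohen_kappa` and the example table are hypothetical, and the confidence interval uses a simple large-sample approximation of the standard error, which is an assumption because the formula used by the software is not stated here.

```python
import math
from typing import Sequence


def cohen_kappa(table: Sequence[Sequence[int]], z: float = 1.96):
    """Return (kappa, (ci_low, ci_high)) for a square agreement table.

    table[i][j] = number of patients placed in category i by method A
    and in category j by method B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    if n == 0 or any(len(row) != k for row in table):
        raise ValueError("table must be square and non-empty")
    p_o = sum(table[i][i] for i in range(k)) / n          # observed agreement
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(r * c for r, c in zip(row_tot, col_tot)) / n ** 2  # chance agreement
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (p_o - p_e) / (1 - p_e)
    # Assumed approximation: SE = sqrt(p_o(1 - p_o) / (n (1 - p_e)^2)).
    se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    return kappa, (kappa - z * se, kappa + z * se)


# Hypothetical table for 20 patients (rows: method A, columns: method B);
# categories are partial response, stable disease, progressive disease.
example = [[8, 2, 0],
           [1, 7, 0],
           [0, 0, 2]]
kappa, ci = cohen_kappa(example)
```

With perfect agreement the observed agreement is 1, so kappa is 1 and the approximate interval collapses to a single point.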

Results

Comparison of Responses

We found no difference in response at 2 months and at 6 months in 93.33% of patients (n = 28) across all five groups. Most patients had either a partial response or stable disease. With multiple-lesion measurements (two or more lesions), 43% of patients (n = 13/30) had stable disease (between a 30% decrease and a 20% increase in the sum of greatest diameters) at 2 months, and 47% (n = 14/30) had stable disease at 6 months. With single-lesion measurement, 50% (n = 15/30) had stable disease at 2 months, and the same percentage had stable disease at 6 months. With multiple-lesion measurements, 47% of patients (n = 14/30) showed partial response (at least a 30% decrease in the sum of the greatest diameters) at 2 months, and 43% (n = 13/30) showed partial response at 6 months. When single-lesion measurement was performed, 40% (n = 12/30) had partial response at 2 months, and the same percentage had partial response at 6 months. Only three patients (10%) had progressive disease (at least a 20% increase in the sum of the diameters) at 2 months and at 6 months. The radiologists were blinded to the patient details, and the details of the drugs given to individual patients were unknown to the evaluators. Intraobserver agreement was 100% at 2 months and at 6 months with a kappa value of 1.00 and a 95% CI of 1.00-1.00.

Statistical Analysis

The results of our study are shown in Table 1 along with the percentage agreement and the measure of concordance (kappa value) between each measurement set and the five-lesion measurement. Agreement with the five-target-lesion measurement was high, with kappa values of 1.0 for the multiple-lesion measurements (two, three, and four lesions) and 0.88 for the single-lesion measurement at 2 months. The concordance values were the same at 6 months. Both values exceed the 0.75 threshold for very satisfactory concordance.
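TABLE 1: Comparison of RECIST Category and Number of Target Lesions in 30 Patients

Values in the four response columns are No. (%) of patients.

| Target Lesions | Complete Response, 2 Mo | Complete Response, 6 Mo | Partial Response, 2 Mo | Partial Response, 6 Mo | Stable Disease, 2 Mo | Stable Disease, 6 Mo | Progressive Disease, 2 Mo | Progressive Disease, 6 Mo | Agreement (%)(a), 2 Mo | Agreement (%)(a), 6 Mo | Kappa Value (95% CI)(a), 2 Mo | Kappa Value (95% CI)(a), 6 Mo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Five | 0 | 0 | 14 (47) | 13 (43) | 13 (43) | 14 (47) | 3 (10) | 3 (10) | | | | |
| Four | 0 | 0 | 14 (47) | 13 (43) | 13 (43) | 14 (47) | 3 (10) | 3 (10) | 100 | 100 | 1.00 (1.00-1.00) | 1.00 (1.00-1.00) |
| Three | 0 | 0 | 14 (47) | 13 (43) | 13 (43) | 14 (47) | 3 (10) | 3 (10) | 100 | 100 | 1.00 (1.00-1.00) | 1.00 (1.00-1.00) |
| Two | 0 | 0 | 14 (47) | 13 (43) | 13 (43) | 14 (47) | 3 (10) | 3 (10) | 100 | 100 | 1.00 (1.00-1.00) | 1.00 (1.00-1.00) |
| One | 0 | 0 | 12 (40) | 12 (40) | 15 (50) | 15 (50) | 3 (10) | 3 (10) | 93.33 | 93.33 | 0.88 (0.73-1.03) | 0.88 (0.73-1.03) |

Note: RECIST = Response Evaluation Criteria in Solid Tumors, CI = confidence interval.
(a) With five measurements.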
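As a check on the single-lesion row at 2 months, the kappa value can be recomputed from the counts in Table 1 and the 28 of 30 concordant patients reported in the text. This is a sketch rather than the authors' calculation: the standard-error formula is a common large-sample approximation and is an assumption, because the formula used by StatView is not stated.

```latex
\begin{aligned}
p_o &= \frac{28}{30} \approx 0.933 \\
p_e &= \frac{14 \cdot 12 + 13 \cdot 15 + 3 \cdot 3}{30^2} = \frac{372}{900} \approx 0.413 \\
\kappa &= \frac{p_o - p_e}{1 - p_e} \approx \frac{0.933 - 0.413}{1 - 0.413} \approx 0.886 \\
\mathrm{SE}(\kappa) &\approx \sqrt{\frac{p_o\,(1 - p_o)}{n\,(1 - p_e)^2}}
  = \sqrt{\frac{0.933 \times 0.067}{30 \times 0.587^2}} \approx 0.078 \\
95\%\ \mathrm{CI} &\approx 0.886 \pm 1.96 \times 0.078 \approx (0.73,\ 1.04)
\end{aligned}
```

Under these assumptions the result is close to the reported 0.88 (0.73-1.03). An upper bound above 1 would be an artifact of the normal approximation, because kappa itself cannot exceed 1.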

Discussion

Modified RECIST criteria [4] have been most extensively used in oncologic drug trials. The standards set by the RECIST group are simple and were revised in 2000. This group uses the sum of unidimensional measurements to assess patient response [4, 7].
The revised RECIST criteria take into account up to five target lesions per organ and a maximum of 10 target lesions in total. The RECIST criteria were devised after extensive evaluation and research by the RECIST group, taking into account comments and suggestions from investigators and study groups around the world.
The criteria defined by the RECIST group have been validated and supported by various studies [8-13]. The scientific basis of the criteria has been detailed by Therasse et al. [4] in their outline on modified RECIST.
Our study raises the possibility that the measurement of five target lesions per organ may not be required to give an accurate disease response evaluation. The response provided by the measurement of the single largest lesion was similar to the response obtained by multiple-lesion measurements. This single largest lesion was a well-defined liver lesion, was nonnecrotic, and showed good enhancement of the edges on CT. When using multiple measurements, the disease response was found to be stable disease in 43% of patients at 2 months and 47% at 6 months, partial response in 47% at 2 months and 43% at 6 months, and progressive disease in 10% at both time points. In these patients, agreement in disease response was not related to the number of lesions measured. In our study protocol, treatment remained the same for patients with stable disease and partial response. So the difference in treatment response using single measurement and multiple measurements did not alter the treatment decision.
We did not encounter a situation among patients in the multiple lesion categories in which there was discordance in response between lesions—that is, one lesion grew while others shrank. So selecting a single target lesion did not grossly overestimate or underestimate response to treatment.
Several studies have tried simplification of the RECIST criteria. The issue of accuracy of unidimensional measurements when compared with bidimensional measurement has been studied by Prasad et al. [2] and James et al. [5]. Both studies emphasized the value of unidimensional measurements and postulated this type of measurement to be simple and cost effective. However, a study comparing the value of measuring a solitary well-defined lesion with multiple-lesion measurements has not been mentioned in the literature. All of these similar studies have used kappa statistics to compare the concordance between categories with unidimensional measurements and bidimensional measurements [1, 2, 5].
Our study simplifies the RECIST criteria further. This has significant clinical implications in the management of patients with cancer. Since 1981, oncologists and radiologists have depended on complicated methods devised by the World Health Organization. In 2000, the revised RECIST was defined and the simpler but validated criteria were devised. Our observation makes clinical evaluation of patients easier to perform. Clinical research trials can be made faster and more cost effective with a reduction in the number of lesions to be measured.
Current standards set by the RECIST group [4] state that all measurable lesions up to a maximum of five lesions per organ and 10 lesions in total, representative of all involved organs, should be identified as target lesions and recorded and measured at baseline. Target lesions should be selected on the basis of their size (those with the longest diameter) and their suitability for accurate repeated measurements. However, this standard of five target lesions per organ is an arbitrary one, and the authors do not explain why this number of lesions should be selected.
Some prior investigations have studied the number of lesions to be measured. Schwartz et al. [14] stated that measuring higher numbers of lesions will decrease the interobserver variability and intraobserver variability. In their study cohort, the variance decreased by at least 90% when six or more lesions were measured bidimensionally. However, the RECIST group has validated and incorporated unidimensional measurement into its new guidelines [4]. Our results depend on the validity of the RECIST criteria for complete response, partial response, stable disease, and progressive disease.
Recent studies have compared 18F-FDG PET to RECIST criteria. Choi et al. [15] suggested that 18F-FDG PET is more sensitive and specific than RECIST criteria in assessing early tumor response. They studied 173 lesions in 36 patients with gastrointestinal stromal primary tumors and found that changes in tumor size significantly underestimated the degree of treatment response compared with 18F-FDG uptake. In the patients who had stable disease according to RECIST criteria in their study (75% of patients), 70% had a 99% reduction in maximum standardized uptake value on 18F-FDG PET. There are conflicting data on the value of volumetric measurements. Prasad et al. [13] stated that volumetric measurements give disease response that is different from standard RECIST criteria in a large proportion of patients. However, Tran et al. [16] showed fair agreement between unidimensional and volumetric measurements. The sensitivity of 18F-FDG PET and volumetric measurements in comparison with RECIST needs to be validated by further studies.
The main limitation of our study was the low number of patients. The results of the study need to be further validated using a larger number of patients. Our radiologic response evaluation could not be compared with clinical patient response because the study is part of an ongoing double-arm oncologic trial involving conventional chemotherapeutic agents and an investigational drug in which the reviewers are blinded to clinical information of patients.
The majority of our patients did not show progression. A study of a similar nature needs to be performed in patients with progressive disease to verify whether our observations can be applied in the setting of a worsening clinical scenario.
This study suggests that the number of target-lesion measurements per organ proposed by the RECIST group can be reduced. Single-lesion measurements gave concordant disease response when compared with multiple-lesion measurements in 93.33% of evaluations. Response evaluation with two-large-lesion measurements gave 100% concordant results to response with five-target-lesion measurements. This fact has significant clinical implications and needs to be applied in routine oncologic practice and pharmaceutical trials.


Presented at the 2004 annual meeting of the Radiological Society of North America, Chicago, IL.
Address correspondence to T. T. Zacharia.


1. Park JO, Lee SI, Song SY, et al. Measuring response in solid tumors: comparison of RECIST and WHO response criteria. Jpn J Clin Oncol 2003; 33:533-537
2. Prasad SR, Saini S, Sumner JE, Hahn PF, Sahani D, Boland GW. Radiological measurement of breast cancer metastases to lung and liver: comparison between WHO (bidimensional) and RECIST (unidimensional) guidelines. J Comput Assist Tomogr 2003; 27:380-384
3. Tsuchida Y, Therasse P. Response evaluation criteria in solid tumors (RECIST): new guidelines. Med Pediatr Oncol 2001; 37:1-3
4. Therasse P, Arbuck SG, Eisenhauer EA, et al. New guidelines to evaluate the response to treatment in solid tumors: European Organization for Research and Treatment of Cancer, National Cancer Institute of the United States, National Cancer Institute of Canada. J Natl Cancer Inst 2000; 92:205-216
5. James K, Eisenhauer E, Christian M, et al. Measuring response in solid tumors: unidimensional versus bidimensional measurement. J Natl Cancer Inst 1999; 91:523-528
6. Altman DG. Practical statistics for medical research. London, England: Chapman and Hall, 1994:403-409
7. Trillet-Lenoir V, Freyer G, Kaemmerlen P, et al. Assessment of tumour response to chemotherapy for metastatic colorectal cancer: accuracy of the RECIST criteria. Br J Radiol 2002; 75:903-908
8. Rich JN, Reardon DA, Peery T, et al. Phase II trial of gefitinib in recurrent glioblastoma. J Clin Oncol 2004; 22:133-142
9. McHugh K, Kao S. Response evaluation criteria in solid tumours (RECIST): problems and need for modifications in paediatric oncology? Br J Radiol 2003; 76:433-436
10. Padhani AR, Ollivier L. The RECIST (Response Evaluation Criteria in Solid Tumors) criteria: implications for diagnostic radiologists. Br J Radiol 2001; 74:983-986
11. Warren KE, Patronas N, Aikin AA, Albert PS, Balis FM. Comparison of one-, two-, and three-dimensional measurements of childhood brain tumors. J Natl Cancer Inst 2001; 93:1401-1405
12. Saini S. Radiologic measurement of tumor size in clinical trials: past, present, and future. AJR 2001; 176:333-334
13. Prasad SR, Jhaveri KS, Saini S, Hahn PF, Halpern EF, Sumner JE. CT tumor measurement for therapeutic response assessment: comparison of unidimensional, bidimensional, and volumetric techniques initial observations. Radiology 2002; 225:416-419
14. Schwartz LH, Mazumdar M, Brown W, Smith A, Panicek DM. Variability in response assessment in solid tumors: effect of number of lesions chosen for measurement. Clin Cancer Res 2003; 9:4318-4323
15. Choi H, Charnsangavej C, de Castro Faria S, Tamm EP, Benjamin RS, Johnson MM. CT evaluation of the response of gastrointestinal stromal tumors after imatinib mesylate treatment: a quantitative analysis correlated with FDG PET findings. AJR 2004; 183:1619-1628
16. Tran LN, Brown MS, Goldin JG, et al. Comparison of treatment response classifications between unidimensional, bidimensional, and volumetric measurements of metastatic lung lesions on chest computed tomography. Acad Radiol 2004; 11:1355-1360

Information & Authors


Published In

American Journal of Roentgenology
Pages: 1067 - 1070
PubMed: 16554580


Submitted: January 8, 2005
Accepted: March 14, 2005


  1. cancer
  2. colon
  3. CT
  4. liver
  5. metastases
  6. oncologic imaging



T. Thomas Zacharia
Department of Radiology, Massachusetts General Hospital, Boston, MA 02114.
Present address: 4225 Ithaca St., Elmhurst, NY 11373.
Sanjay Saini
Department of Radiology, Massachusetts General Hospital, Boston, MA 02114.
Elkan F. Halpern
Department of Radiology, Massachusetts General Hospital, Boston, MA 02114.
James E. Sumner
Department of Radiology, Massachusetts General Hospital, Boston, MA 02114.
