Original Research
1 Department of Radiology, Massachusetts General Hospital, Boston, MA
02114.
2 Present address: 4225 Ithaca St., Elmhurst, NY 11373.
Received January 8, 2005;
accepted after revision March 14, 2005.
Presented at the 2004 annual meeting of the Radiological Society of North
America, Chicago, IL.
Abstract
MATERIALS AND METHODS. Thirty consecutive patients were recruited for this study. Patients were part of a multisite, randomized, double-arm, phase 3 clinical trial involving chemotherapy with an investigational drug for metastatic colon cancer. Patients were recruited from U.S. and international sites. Institutional review board approval was obtained, and informed consent was obtained from all patients. Our study included CT measurements of hepatic metastases. All patients (n = 30) had a minimum of five target lesions in the liver. Target-lesion size was defined by Response Evaluation Criteria in Solid Tumors (RECIST). We calculated the patient response at 2 months and at 6 months (complete response, partial response, stable disease, or progressive disease) using RECIST. Patient response was calculated from the percentage increase or decrease at 2 and 6 months in the greatest diameter of the single largest lesion and in the sum of the greatest diameters of the two, three, four, and five largest lesions, respectively. The concordance between five-target-lesion measurement and measurement of fewer lesions was analyzed using kappa statistics (StatView 5.0).
RESULTS. In 93.33% of patients (n = 28/30), there was agreement on patient response irrespective of the number of measurements made on CT. Of these 30 patients, 47% had a partial response (n = 14/30), 43% had stable disease (n = 13/30), and 10% had progressive disease at 2 months (n = 3/30). At 6 months, 43% had a partial response (n = 13/30), 47% had stable disease (n = 14/30), and 10% had progressive disease (n = 3/30). Agreement in response evaluation with the five-target-lesion measurement was high, with kappa values of 1.0 for multiple-lesion measurements and 0.88 for single-lesion measurements at 2 months. The concordance values were the same at 6 months.
CONCLUSION. In the majority of patients with hepatic metastases of colorectal cancer, measuring the maximal diameter of the single largest lesion yielded the same treatment-response classification as measuring up to five target lesions. This result suggests that it may be possible to reduce the number of lesions measured in clinical trials.
Keywords: cancer, colon, CT, liver, metastases, oncologic imaging
Prasad et al. [2] in a study of 86 breast cancer patients suggested that bidimensional measurements do not have added value when compared with unidimensional measurements in patient response evaluation.
Target lesions are the most important radiologic markers in oncologic evaluation. Target-lesion size criteria are defined by RECIST [4, 5]. The minimum target-lesion diameter should be at least double the CT slice thickness. This gives an accurate representation of the lesion diameter on CT, taking into account the potential errors caused by partial volume averaging.
RECIST defines a maximum of five target lesions per organ and a total of 10 target lesions per evaluation [4]. Nontarget lesions are those lesions that do not fulfill the size criteria and lesions such as bone metastases, pleural effusion, and ascites [4].
The criterion of five target lesions per organ on CT is an arbitrary one and is not supported by any objective evidence or study. We sought to evaluate this criterion by comparing the response at follow-up obtained from a single-lesion measurement with the response obtained from measurement of five or fewer lesions. Our goal was to test whether these measurements produce similar categoric results (stable disease, partial response, and so on).
Patient Categories
The patients were grouped into five categories based on the number of
lesions measured (single largest lesion, two largest lesions,
three largest lesions, and so on). The same 30 patients were placed in each
category, so each category had 30
patients. The responses obtained at 2 months and at 6 months from the
single-lesion measurement were compared with the responses obtained from
multiple-lesion measurements.
Imaging
These patients were evaluated retrospectively by CT for the presence of
hepatic and lymph node metastatic lesions. Our patient subgroup (n =
30) included patients with a minimum of five hepatic metastatic lesions. All
these lesions fulfilled the target-lesion size criterion. The lesions were
well defined, nonconfluent, and nonnecrotic. They were ranked by
greatest diameter, and the greatest diameters were summed cumulatively
(largest lesion diameter, sum of the two largest diameters, sum of the three
largest diameters, and so on). The same sums were calculated for the same
lesions at 2 months and at 6 months after the initiation of chemotherapy.
Measurements were independently performed by two radiologists who were blinded
to patient details. Intraobserver variability was tested at 2 months and at 6
months by giving test cases (n = 10) selected from the study
cohort.
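The cumulative summation described above can be sketched as follows. This is a minimal illustration, not the authors' spreadsheet: the function name `cumulative_diameter_sums` and the diameters are hypothetical, and the sketch assumes lesions are ranked once by baseline diameter and then followed unchanged at later time points.

```python
from itertools import accumulate


def cumulative_diameter_sums(baseline_mm, followup_mm, max_lesions=5):
    """Return (baseline_sums, followup_sums) for the 1..max_lesions largest lesions.

    Assumption: lesions are ranked once, by baseline greatest diameter,
    and the same lesions are summed again at follow-up.
    """
    if len(baseline_mm) != len(followup_mm):
        raise ValueError("each lesion needs a baseline and a follow-up diameter")
    # Indices of the lesions, largest baseline diameter first.
    order = sorted(range(len(baseline_mm)),
                   key=lambda i: baseline_mm[i], reverse=True)[:max_lesions]
    baseline_sums = list(accumulate(baseline_mm[i] for i in order))
    followup_sums = list(accumulate(followup_mm[i] for i in order))
    return baseline_sums, followup_sums


# Hypothetical greatest diameters in millimetres for one patient (not study data).
baseline = [22, 40, 15, 31, 18]
month2 = [15, 26, 14, 24, 16]
base_sums, m2_sums = cumulative_diameter_sums(baseline, month2)
print(base_sums)  # sums for the largest 1, 2, 3, 4, and 5 lesions at baseline
print(m2_sums)    # sums for the same lesions at 2 months
```

Element k of each list corresponds to the (k + 1)-lesion category, so the first element is the single-largest-lesion measurement and the last is the five-target-lesion sum.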
A standard CT protocol was followed at the multiple sites where these patients were evaluated. The images were quality checked twice before evaluation to make sure that study protocol was followed at each individual study site. Images with variability in technique were not included in the study. All 30 patients underwent triple-phase CT evaluation with MDCT, and images with a slice thickness of 5 mm were obtained. Mechanical injection of 150 mL of nonionic iodinated contrast medium was made at a rate of 5 mL/sec. For the hepatic artery phase imaging, the scanning delay was 25 sec after the initiation of a contrast bolus. For the portal venous phase imaging, the scanning delay was 65 sec after the initiation of a contrast bolus.
Image Analysis
The sum of greatest diameters at baseline was compared with that at 2
months and at 6 months. Patients were categorized into those with complete
response (total disappearance of the lesions), partial response (at least a
30% decrease in the sum of diameters of the lesions), progressive disease (at
least a 20% increase in the sum of diameters of the lesions), and stable
disease (neither a 30% decrease nor a 20% increase). Responses obtained using
the single-largest-lesion diameter were compared with those obtained using
multiple lesions.
Statistical Analysis
The reviewed data were collected on a Microsoft Excel spreadsheet
specifically designed to allow an automatic recalculation of each patient's
response according to the following variables: criteria selected for the
definition of progression, maximum number of lesions considered, the sums of
greatest diameters of the lesions measured, bidimensional measurements, and
threshold values (in percent) of tumor size variations for response
assessment. The overall response of each recalculation (i.e., five-lesion
measurement in comparison with measurement of fewer lesions [four, three, and
so on]) was then compared using kappa statistics. The measure of
concordance was calculated using StatView, 5.0 (SAS Institute). The 95%
confidence intervals for the kappa statistics were calculated. The kappa
values were interpreted on the basis of reports in the literature. Using this
test, concordance may be considered very satisfactory for values of 0.75 or
higher [6].
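For readers unfamiliar with the statistic, the sketch below computes an unweighted Cohen's kappa for two sets of response categories. It is an assumption that this is the variant used: the text names only "kappa statistics" computed in StatView. The function name `cohen_kappa` and the ten example classifications are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of patients given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal category frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both lists use a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical categories for 10 patients (PR, SD, PD), not study data.
five_lesions = ["PR"] * 5 + ["SD"] * 4 + ["PD"]
single_lesion = ["PR"] * 4 + ["SD"] * 5 + ["PD"]
print(round(cohen_kappa(five_lesions, single_lesion), 2))
```

Kappa corrects the raw percentage agreement for the agreement expected by chance, which is why it is reported alongside the percentage agreement rather than in place of it.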
Statistical Analysis
The results of our study are shown in
Table 1 along with the
percentage agreement and the measure of concordance (kappa value) between the
two measurement sets. Agreement in response evaluation between lesion groups
was high, with kappa values of 1.0 for
multiple-lesion measurements and 0.88 for single-lesion measurements at 2
months. The concordance values were the same at 6 months. Therefore, agreement
with the five-target-lesion measurements was very high for both
multiple-lesion measurements and single-lesion measurements.
The revised RECIST criteria take into account five target lesions per organ and a maximum of 10 target lesions. RECIST criteria have been devised after extensive evaluation and research by the RECIST group, taking into account comments and suggestions from various investigators and study groups from all over the world.
The criteria defined by the RECIST group have been validated and supported by various studies [8-13]. The scientific basis of the criteria has been detailed by Therasse et al. [4] in their outline on modified RECIST.
Our study raises the possibility that the measurement of five target lesions per organ may not be required to give an accurate disease response evaluation. The response provided by the measurement of the single largest lesion was similar to the response obtained by multiple-lesion measurements. This single largest lesion was a well-defined liver lesion, was nonnecrotic, and showed good enhancement of the edges on CT. When using multiple measurements, the disease response was found to be stable disease in 43% of patients at 2 months and 47% at 6 months, partial response in 47% at 2 months and 43% at 6 months, and progressive disease in 10% at both time points. In these patients, agreement in disease response was not related to the number of lesions measured. In our study protocol, treatment remained the same for patients with stable disease and partial response. So the difference in treatment response using single measurement and multiple measurements did not alter the treatment decision.
We did not encounter a situation among patients in the multiple-lesion categories in which there was discordance in response between lesions (that is, one lesion grew while others shrank). Selecting a single target lesion therefore did not grossly overestimate or underestimate response to treatment.
Several studies have tried simplification of the RECIST criteria. The issue of accuracy of unidimensional measurements when compared with bidimensional measurement has been studied by Prasad et al. [2] and James et al. [5]. Both studies emphasized the value of unidimensional measurements and postulated this type of measurement to be simple and cost effective. However, a study comparing the value of measuring a solitary well-defined lesion with multiple-lesion measurements has not been mentioned in the literature. All of these similar studies have used kappa statistics to compare the concordance between categories with unidimensional measurements and bidimensional measurements [1, 2, 5].
Our study simplifies the RECIST criteria further. This has significant clinical implications in the management of patients with cancer. Since 1981, oncologists and radiologists have depended on complicated methods devised by the World Health Organization. In 2000, the revised RECIST was defined and the simpler but validated criteria were devised. Our observation makes clinical evaluation of patients easier to perform. Clinical research trials can be made faster and more cost effective with a reduction in the number of lesions to be measured.
Current standards set by the RECIST group [4] state that all measurable lesions up to a maximum of five lesions per organ and 10 lesions in total, representative of all involved organs, should be identified as target lesions and recorded and measured at baseline. Target lesions should be selected on the basis of their size (those with the longest diameter) and their suitability for accurate repeated measurements. However, this standard of five target lesions per organ is an arbitrary one, and the authors do not explain why this number of lesions should be selected.
Some prior investigations have studied the number of lesions to be measured. Schwartz et al. [14] stated that measuring higher numbers of lesions will decrease the interobserver variability and intraobserver variability. In their study cohort, the variance decreased by at least 90% when six or more lesions were measured bidimensionally. However, the RECIST group has validated and incorporated unidimensional measurement into its new guidelines [4]. Our results depend on the validity of the RECIST criteria for complete response, partial response, stable disease, and progressive disease.
Recent studies have compared 18F-FDG PET to RECIST criteria. Choi et al. [15] suggested that 18F-FDG PET is more sensitive and specific than RECIST criteria in assessing early tumor response. They studied 173 lesions in 36 patients with gastrointestinal stromal primary tumors and found that changes in tumor size significantly underestimated the degree of treatment response compared with 18F-FDG uptake. In the patients who had stable disease according to RECIST criteria in their study (75% of patients), 70% had a 99% reduction in maximum standardized uptake value on 18F-FDG PET. There are conflicting data on the value of volumetric measurements. Prasad et al. [13] stated that volumetric measurements give disease response that is different from standard RECIST criteria in a large proportion of patients. However, Tran et al. [16] showed fair agreement between unidimensional and volumetric measurements. The sensitivity of 18F-FDG PET and volumetric measurements in comparison with RECIST needs to be validated by further studies.
The main limitation of our study was the low number of patients. The results of the study need to be further validated using a larger number of patients. Our radiologic response evaluation could not be compared with clinical patient response because the study is part of an ongoing double-arm oncologic trial involving conventional chemotherapeutic agents and an investigational drug in which the reviewers are blinded to clinical information of patients.
The majority of our patients did not show progression. A study of a similar nature needs to be performed in patients with progressive disease to verify whether our observations can be applied in the setting of a worsening clinical scenario.
This study suggests that the number of target-lesion measurements per organ proposed by the RECIST group can be reduced. Single-lesion measurements gave concordant disease response when compared with multiple-lesion measurements in 93.33% of evaluations. Response evaluation with two-large-lesion measurements gave 100% concordant results to response with five-target-lesion measurements. This fact has significant clinical implications and needs to be applied in routine oncologic practice and pharmaceutical trials.