DOI:10.2214/AJR.07.3508
AJR 2008; 191:313-320
© American Roentgen Ray Society


Original Research

Extraction of Recommendation Features in Radiology with Natural Language Processing: Exploratory Study

Pragya A. Dang¹, Mannudeep K. Kalra¹, Michael A. Blake¹, Thomas J. Schultz¹, Elkan F. Halpern¹ and Keith J. Dreyer¹

¹ All authors: Department of Radiology, Massachusetts General Hospital, 25 New Chardon St., Ste. 400E, Boston, MA 02114.

Received December 6, 2007; accepted after revision February 4, 2008.

 
K. J. Dreyer and T. J. Schultz receive royalties from patent licensing of the Leximer natural language processing engine to Nuance, which is the commercial vendor of the product. The other authors have no financial disclosure to make and had complete and independent access to the study data and the manuscript.

Address correspondence to K. J. Dreyer (kdreyer@partners.org).


Abstract
 
OBJECTIVE. The purposes of this study were to validate a natural language processing program for extraction of recommendation features, such as recommended time frames and imaging technique, from electronic radiology reports and to assess patterns of recommendation features in a large database of radiology reports.

MATERIALS AND METHODS. This study was performed on a radiology reports database covering the years 1995–2004. From this database, 120 reports with and without recommendations were selected and randomized. Two radiologists independently classified these reports according to presence of recommendations, time frame, and imaging technique suggested for follow-up or repeated examinations. The natural language processing program then was used to classify the reports according to the same criteria used by the radiologists. The accuracy of classification of recommendation features was determined. The program then was used to determine the patterns of recommendation features for different patients and imaging features in the entire database of 4,211,503 reports.

RESULTS. The natural language processing program had an accuracy of 93.2% (82/88) for identifying the imaging technique recommended by the radiologists for further evaluation. Categorization of recommended time frames in the reports with the 88 recommendations obtained with the program resulted in 83 (94.3%) accurate classifications and five (5.7%) inaccurate classifications. Recommendations of CT were most common (27.9%, 105,076 of 376,918 reports) followed by those for MRI (17.8%). In most (85.4%, 322,074/376,918) of the reports with imaging recommendations, however, radiologists did not specify the time frame.

CONCLUSION. Accurate determination of recommended imaging techniques and time frames in a large database of radiology reports is possible with a natural language processing program. Most imaging recommendations are for high-cost but more accurate radiologic studies.

Keywords: radiology practice • recommendations • recommended imaging techniques


Introduction
 
Increased use of high-cost imaging examinations and escalating health care costs have drawn attention to the frequency and types of recommendations in radiology practice and to radiologists' recommendations of additional imaging (self-referrals) [1–5]. A study [5] conducted with a small database showed frequent recommendations of expensive radiology examinations. Assessment of recommended techniques and time frames in a large database of radiology reports, however, can lead to more accurate estimation of recommendation practice. It also can provide solutions to several clinical and administrative concerns, such as the proportion of recommendations in radiology subspecialties, consistency between radiologists with regard to recommendation practices for similar patient and clinical attributes, increase or decrease in recommendation rates and patterns with evolving technology, type of recommended imaging techniques and time frames, and possible development of a decision support program to aid radiologists in making appropriate recommendations and monitoring compliance with the recommendations.

Analysis of recommendations by means of manual auditing of a large number of radiology reports would be time consuming and therefore most likely impractical. In this respect, results of a study [6] have validated the use of a natural language processing engine (Leximer, Nuance) for classification of electronic structured and unstructured radiology reports on the basis of the presence or absence of recommendations. In the version used in that study [6], the program was trained merely to determine the presence or absence of recommendations. The purposes of this study were to validate the natural language processing program for extraction of recommendation features, such as time frame and imaging technique, from electronic radiology reports and to assess patterns of recommendation features in a large database of radiology reports.


Materials and Methods
 
This study was approved by the human research committee of our institution and involved retrospective analysis of radiology reports. Because the data were deidentified, the institutional review board waived the requirement for informed consent from patients.

Validation Study
Results of a previous study [6] validated the natural language processing engine Leximer (Nuance) for categorization of radiology reports according to the presence of recommendations. Recommendations in radiology reports were defined as recommendations, requests, or suggestions for any further actions, such as imaging, clinical correlation, and surgical or pathologic correlation, in a specified or unspecified time frame.

In this validation study, the Leximer program was used to first categorize radiology reports for the presence or absence of recommendations in a database comprising 4,279,179 electronic radiology reports made from 1995 to 2004. Of these reports, 67,676 had incomplete or no text and therefore could not be processed with the program. These reports were excluded from data analysis. Thus a total of 4,211,503 reports were analyzed. This database comprised reports of all imaging techniques, including CT, MRI, radiography, fluoroscopy, nuclear medicine, sonography, angiography, special procedures, and unspecified imaging examinations.

From the database, one of the investigators selected 88 consecutive radiology reports with recommendations and 32 consecutive reports without recommendations from the year 2005. These 120 reports covered all of the aforementioned imaging techniques and were interpreted by 42 radiologists at our institution. Reports with and those without recommendations were randomized for evaluation by radiologists and with the natural language processing engine.

Report Analysis by Radiologists
To validate the accuracy of the natural language processing engine for classifying the recommendation features, two radiologists with 11 and 7 years of experience independently analyzed the 120 radiology reports. Each radiologist classified the reports into those with and those without recommendations. In addition, reports also were categorized according to recommended imaging technique and time frame. These two radiologists were not involved in the training of the natural language processing program, and they were not aware of the clinical records, previous radiology reports, or results of classification with the program.

Report Analysis with Natural Language Processing Engine
The natural language processing program was run to analyze the unstructured radiology reports by reducing the entropy or noise (data without much diagnostic value) and preserving the outcome or signal (data with meaning or intent) through use of natural language processing principles [6]. The program parsed specified signals or outcome, such as recommendations from other contents, through phrase-level extraction, text parsing (breaking text into smaller parts with punctuation-based phrase isolation through use of an internally developed parser), and syntactic algorithms (created to group phrases). The natural language processing principles are described in the report of the validation study [6].
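The internal parser is not described in detail here, so the following is only a minimal sketch of what punctuation-based phrase isolation could look like. The function name `split_phrases` and the choice of delimiters are assumptions made for illustration and are not taken from the Leximer engine.

```python
import re

def split_phrases(report_text):
    """Split free text into phrases at sentence and clause punctuation."""
    # Split on . ; : ? ! only when followed by whitespace or end of text,
    # so that decimals such as "3.5 cm" stay in one piece.
    parts = re.split(r"[.;:?!](?:\s+|$)", report_text)
    return [p.strip() for p in parts if p.strip()]

impression = ("3 cm rounded fluid-filled cyst in the right pelvis, likely "
              "physiologic ovarian cyst; recommend 6 week follow up ultrasound.")
for phrase in split_phrases(impression):
    print(phrase)
```

Splitting at the semicolon separates the finding from the recommendation, so that later steps can score each phrase on its own.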

In this study, the natural language processing engine was further trained to identify recommendation features such as recommended imaging technique (for example, CT, MRI) and time frame by further integrating the resulting signal phrases with syntactic extraction algorithms [6]. If the recommendation was "perform a CT in 3 months to assess for stability," then the recommended time frame was defined as 90 days, and the recommended imaging technique was defined as CT.
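The example above can be sketched in a few lines. This is an illustrative regular-expression approach, not the decision-tree method used by the program. The conversion factors are assumptions chosen to agree with the text (30 days per month gives 90 days for 3 months; 365 days per year gives the 1,825 days for 5 years reported in the Results), and the keyword table is hypothetical.

```python
import re

# Assumed conversion factors (see the note above).
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}

# Hypothetical keyword table mapping report terms to a technique label.
MODALITY_KEYWORDS = {"ct": "CT", "mri": "MR", "mr": "MR"}

# A number counts as a time frame only when a time unit follows it directly.
TIME_PATTERN = re.compile(r"(\d+)[\s-]*(day|week|month|year)s?\b", re.IGNORECASE)

def extract_time_frame_days(phrase):
    """Return the first stated time frame in days, or None if none is found."""
    match = TIME_PATTERN.search(phrase)
    if match is None:
        return None
    return int(match.group(1)) * DAYS_PER_UNIT[match.group(2).lower()]

def extract_modality(phrase):
    """Return the label of the first known technique keyword, or None."""
    for token in re.findall(r"[a-z]+", phrase.lower()):
        if token in MODALITY_KEYWORDS:
            return MODALITY_KEYWORDS[token]
    return None

phrase = "perform a CT in 3 months to assess for stability"
print(extract_time_frame_days(phrase), extract_modality(phrase))
```

Requiring a time unit immediately after the number is a deliberate choice in this sketch: a measurement such as "3 cm" is then never read as a time interval.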

The natural language processing program was run to identify reports with recommendations and isolated sentences that had high signal for the recommendation concepts. These sentences were used to generate another statistical histogram for obtaining terms describing recommendations. Terms such as "recommend," "suggest," "follow up," "CT," "MR," "one year," and "6 weeks" obtained by the histogram as strong signals for the recommendation features were checked and verified by radiologists. This final list of terms was used for forming decision trees.
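The statistical method behind the histogram is not specified in the text. As a rough illustration, a simple frequency count of single words and two-word terms over recommendation sentences could be built as follows. The three sentences are invented for the example and are not study data.

```python
from collections import Counter
import re

def term_histogram(sentences, max_ngram=2):
    """Count one-word and multi-word terms across a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = re.findall(r"[a-z0-9]+", sentence.lower())
        for n in range(1, max_ngram + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return counts

# Invented example sentences (assumption, for illustration only).
sentences = [
    "Recommend follow up CT in 6 weeks.",
    "Follow up MR is suggested in one year.",
    "Suggest follow up CT to assess for stability.",
]
hist = term_histogram(sentences)
print(hist.most_common(5))
```

Frequent terms such as "follow up" would then be candidates for review by radiologists before being added to the final term list.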

Separate rules were defined for this version of the program to extract specific recommendation features from the syntax of identified sentences. This classification was then validated with multiple iterations to achieve higher accuracy. One of the authors, who had not participated in the validation study, used approximately 2,000 radiology reports from examinations with different imaging techniques and 70 decision tree–optimizing iterations to train the program for extraction of recommendation features. These training sets of reports were selected to include all imaging techniques and body regions and were distinct from the set of reports used in the validation study.

With a computer program written in the C# programming language, a report was obtained from a text file and broken down into its composite elements or phrases (text parsing for phrase-level extraction), and these phrases were processed for signal extraction with the decision trees. The terms that matched the terms in the algorithms of the decision trees enhanced or diminished the likelihood that the phrase represented a recommendation signal. These algorithms were used to find matches between the phrases parsed with the natural language processing program (raw concepts) and leaf nodes in the decision tree. For example, if a report stated "MRI is recommended," "MRI" was the raw concept in the report that matched the leaf node "MRI" in the decision tree, which then rolled up to "MRI" (parent node), "Magnetic Resonance," and "imaging modality" (root node), in that order.

The tree is structured in such a way that generalized concepts such as "imaging modality" are at a higher level, "MR" is a middle-level concept, and specific concepts such as "MRI," "MR angiography," and "MR spectroscopy" are at a lower level, each level representing a branch. The tree is seven layers deep and has a base of 170 discrete nodes. The stemming algorithms increase the leaf-node hits within the radiology reports to more than 1,000. For example, terms such as "MR angiogram," "MR angiography," and "MR angiographic" may not match a specific leaf node but are mapped to "MR angio" with stemming algorithms. These stemming algorithms determine the hits at each leaf node while traversing upward and stopping at the root node.
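The roll-up from a leaf node to the root can be illustrated with a small sketch. The three-level fragment below is hypothetical and much smaller than the seven-layer, 170-node tree described; prefix matching stands in for the stemming algorithms, whose actual rules are not given.

```python
# Hypothetical fragment of the concept tree, stored as child -> parent.
PARENT = {
    "MRI": "MR",
    "MR angio": "MR",
    "MR spectroscopy": "MR",
    "MR": "imaging modality",
    "CT": "imaging modality",
}

# Crude stand-in for stemming: a known stem prefix maps to a leaf node.
STEM_TO_LEAF = {
    "mr angio": "MR angio",
    "mr spectroscop": "MR spectroscopy",
    "mri": "MRI",
}

def match_leaf(raw_concept):
    """Map a raw concept to a leaf node by its longest matching stem."""
    text = raw_concept.lower().strip()
    for stem in sorted(STEM_TO_LEAF, key=len, reverse=True):
        if text.startswith(stem):
            return STEM_TO_LEAF[stem]
    return None

def roll_up(node):
    """Return the path from a node up to the root of the tree."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

for term in ("MR angiogram", "MR angiography", "MR angiographic"):
    print(term, "->", roll_up(match_leaf(term)))
```

All three spelling variants land on the same leaf and therefore on the same path, which is the effect the stemming step is described as having.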

Decision tree logic was used to normalize these terms to standard ontologies for statistical analysis. For example, all time frames recommended were converted to days. Different concepts used for the same imaging technique were grouped together. For example, MRI, MR, magnetic resonance imaging, and MR imaging all were grouped under MR.

The radiology reports selected for the validation study were assessed with the natural language processing program first for classification into two categories, reports with and those without recommendations. The reports with recommendations then were classified into those containing imaging recommendations and those containing nonimaging recommendations. The recommended imaging techniques extracted included CT, MRI, radiography, sonography, fluoroscopy, nuclear medicine, mammography, angiography, special procedures (such as myelography, ERCP, imaging-guided biopsy, other imaging procedures, and arthrography) and other, unspecified imaging techniques. Nonimaging recommendations included unspecified nonimaging recommendation, surgery, clinical correlation, pathology, histopathology, and endoscopy. The time frame identified was converted to number of days. When a report had a recommendation for the same day or within hours, the time frame was classified as 0. When no time frame was specified, it was classified as –1.
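Because the code for an unspecified time frame is a number, it has to be set aside before any range or average is computed. The following sketch shows one way to handle that; the function names and the toy input are assumptions for illustration only.

```python
SAME_DAY = 0       # recommendation for the same day or within hours
UNSPECIFIED = -1   # no time frame stated in the report

def encode_time_frame(days):
    """Return the stored code for a time frame in days (None if absent)."""
    return UNSPECIFIED if days is None else int(days)

def summarize(codes):
    """Count unspecified codes and give the range of the specified ones."""
    specified = [c for c in codes if c != UNSPECIFIED]
    return {
        "unspecified": len(codes) - len(specified),
        "specified": len(specified),
        "range_days": (min(specified), max(specified)) if specified else None,
    }

# Toy input for illustration only, not study data.
codes = [encode_time_frame(d) for d in (0, None, 90, None, 720)]
print(summarize(codes))
```

Without the filter, the minimum of the coded values would be –1 rather than a real time frame.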

Recommendation Features in Entire Database
After the validation study, the entire database of 4,211,503 radiology reports from 1995–2004 was analyzed with the natural language processing program for recommendations, recommended imaging technique, and time frame. For construction of this database, all reports from the radiology information system (RIS) were transferred through a Health Level 7 link. This link is an automated interface for transferring all unstructured and structured radiology reports from the RIS to the database repository within 5 minutes of their entry in the RIS.

In 2001, radiologist order entry was introduced at our institution to assist physicians in ordering examinations. The radiologist order entry database has clinical indications and International Classification of Diseases 9 codes for the examinations performed. The radiologist order entry data were transferred to the database repository through an open database connectivity link, a standard database access method that allows access to data from any application. Therefore, the source of clinical indications for a study was the radiologist order entry component of our RIS.

The natural language processing program initially categorized the radiology reports on the basis of presence of imaging and nonimaging recommendations. The results of the analysis and other information from our RIS were stratified in the comprehensive database into data fields such as age group, patient type (inpatient or outpatient), referring physician, imaging technique, and clinical indication.

For reports with imaging recommendations, the patterns of the recommendation features, recommended imaging technique, and recommended time frame were determined for different age groups (0–9 years, 10–19 years, 20–29 years, 30–39 years, 40–49 years, 50–59 years, 60–69 years, and older than 70 years), radiology subspecialties (nuclear medicine, thoracic radiology, pediatric radiology, neuroradiology, emergency radiology, cardiac imaging, breast imaging, musculoskeletal imaging, and abdominal imaging), clinical indications (text obtained from the radiologist order entry, such as lung cancer and back pain), patient types (inpatient and outpatient), referring physicians (name obtained from radiologist order entry), and imaging techniques. Results of the natural language processing analysis were displayed and exported in the form of graphs and pivot tables with an online analytic processing server, which provided rapid results for various analytic multidimensional queries performed on the database. Temporal trends of the volume of imaging examinations and recommendation features such as recommended imaging technique and recommended time frame from 1995 to 2004 also were assessed.

Different radiologists in tertiary health centers often report different imaging techniques and examinations performed on different body regions. To ensure that the differences in recommended imaging techniques and time frames for different radiologists were not due to interpretation of different imaging techniques, we stratified the data for imaging techniques and body regions.

Data Analysis
SAS statistical software (version 9.1, SAS) and Excel spreadsheet software (Microsoft) were used to analyze recommendation features in the entire database of radiology reports. Interobserver agreement between the two radiologists was determined with the kappa test. The two radiologists' classifications were considered the standard of reference for determining the accuracy of classification of the recommendation features, such as recommended imaging technique and recommended time frame, with the natural language processing program. The program result was labeled inaccurate when the recommended technique or time frame was wrongly classified. For example, for a report with the recommendation "CT is recommended at 6 weeks," categorization of recommendation type as a non-CT technique or time frame other than 6 weeks was labeled inaccurate.
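The two measures named above can be written out directly. This is a plain-Python sketch of accuracy against a reference standard and of Cohen's kappa for two raters, not the SAS procedures used in the study. The label lists are constructed only to reproduce the counts reported for the validation sample (88 reports with recommendations, 32 without, and 82 of 88 techniques classified correctly).

```python
from collections import Counter

def accuracy(reference, predicted):
    """Fraction of items on which the prediction equals the reference."""
    correct = sum(r == p for r, p in zip(reference, predicted))
    return correct / len(reference)

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (observed - expected) / (1 - expected)

# Constructed labels that match the reported counts (not the actual reports).
rater_a = ["rec"] * 88 + ["none"] * 32
rater_b = list(rater_a)
reference = ["CT"] * 88
predicted = ["CT"] * 82 + ["MR"] * 6

print(cohen_kappa(rater_a, rater_b))
print(round(100 * accuracy(reference, predicted), 1))
```

With identical label lists the observed agreement is 1, so kappa is 1 regardless of how the categories are distributed.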

Statistical analysis of the data with multiple logistic regression tests was performed to determine the effect of predictor variables (age group, sex, imaging technique, radiology subspecialty, patient type, clinical indication, International Classification of Diseases 9 code) on the outcome variables (recommended time frame, recommended imaging technique). A Mantel-Haenszel chi-square trend test was used to determine significant differences in trends of radiology examination volumes and recommendations over the duration of the study. Bonferroni correction of the multiple logistic regression and Mantel-Haenszel trend tests was not performed because most p values were extremely small (p < 0.0001), possibly owing to exaggeration of the statistical difference in the large sample used in our study.


Results
 
Validation Study
Both radiologists found that among the 120 radiology reports used in the validation study, 88 reports had recommendations and 32 did not. On an individual report basis, there was perfect interobserver agreement (κ = 1, p < 0.05) between the two radiologists for presence and absence of recommendations, recommended imaging technique, and time frame associated with the recommendations. As reported in our previous study [6], the natural language processing engine had an accuracy of 100% for classifying the reports on the basis of the presence and absence of recommendations.

According to both radiologists, the range of time frames recommended in the reports was day 0 (same day) (2.3%, 2/88) to 720 days (2.3%, 2/88). For most (73.9%, 65/88) of the radiology reports in the validation sample, however, the time frame for recommendation, whether nonimaging or imaging, was not specified. After unspecified time frame, 3 months (4.5%, 4/88) and 1 week (3.4%, 3/88) were the most frequently recommended follow-up time frames in reports with imaging recommendations.

The natural language processing engine accurately classified the recommended time frame in 83 (94.3%) of the 88 reports with recommendations. The time frame was inaccurate in five (5.7%) of the reports. Inaccurate detection of recommended time frame resulted mainly from the presence of numbers other than time intervals in the impression section of radiology reports. Such errors occurred in reports such as one with an impression section that read "3 cm rounded fluid-filled cyst in the right pelvis, likely physiologic ovarian cyst; recommend 6 week follow up ultrasound." For such reports the program falsely classified the time frame recommended as 3 weeks instead of 6 weeks. It misclassified a time frame as 12 years instead of 12 months for a report stating "follow-up CT chest is suggested in 12 months to ensure stability and complete 2 years of surveillance."

For recommended imaging technique, the program accurately classified 93.2% (82/88) and inaccurately classified 6.8% (6/88) of the reports. When imaging tests in addition to the recommended imaging technique were mentioned in the impression section, the program falsely classified the other technique as the recommended technique. For example, a report with the statement "hepatocellular carcinoma can be PET negative, and therefore continued follow up with MRI is advised" was misclassified, and the recommended imaging technique was labeled nuclear medicine rather than MRI. In a report that stated "CTA is recommended for further evaluation of the PCA, basilar artery, and the ophthalmic arteries, which are not visualized on the MR angiography," the program wrongly classified the recommended technique as MRI rather than CT, resulting in inaccurate classification. For a report in which sonography was referred to as a scan, the program falsely categorized the technique as a nuclear medicine recommendation because of the use of the term "scan."

Recommendation Features in Entire Database
A total of 348,689 radiologic examinations were performed in 1995 and 547,310 examinations in 2004 with an average annual increase of 5.2% ± 1.6%. The average annual increase in the volume of CT examinations was 14.6% ± 2.5% (30,852 examinations in 1995, 103,390 examinations in 2004), in MRI examinations was 26.0% ± 6.9% (7,513 to 54,237), and in sonographic examinations was 9.8% ± 1.4% (28,482 to 65,770). Most (87.5%, 3,683,901/4,211,503) of the radiology reports had no recommendations of any sort, imaging or nonimaging. Only 12.5% (527,602/4,211,503) of the radiology reports had recommendations for subsequent action.
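As a rough plausibility check on these figures, the compound annual growth rate implied by the 1995 and 2004 endpoints can be computed. This is not the statistic reported above: the mean ± SD of the year-over-year increases requires the yearly volumes, which are not given here, so the compound rate is used as an assumption-laden stand-in and is expected to be close to, but not identical with, the reported means.

```python
def compound_annual_growth(first, last, intervals):
    """Constant yearly growth rate that turns `first` into `last`."""
    return (last / first) ** (1 / intervals) - 1

# Endpoint volumes (1995, 2004) as stated in the text.
volumes = {
    "all examinations": (348_689, 547_310),
    "CT": (30_852, 103_390),
    "MRI": (7_513, 54_237),
    "sonography": (28_482, 65_770),
}

# 1995 to 2004 spans nine year-to-year intervals.
rates = {
    name: compound_annual_growth(v1995, v2004, intervals=9)
    for name, (v1995, v2004) in volumes.items()
}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```

The compound rates come out near 5%, 14%, 25%, and 10% per year, in the same range as the reported averages of 5.2%, 14.6%, 26.0%, and 9.8%.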

Of the reports with recommendations, 71.4% (376,918/527,602) contained recommendations of further imaging and 28.6% (150,684/527,602) had nonimaging recommendations. Figure 1 summarizes the frequency of recommended imaging techniques. Patterns for recommended imaging techniques by age group (Fig. 2) showed that CT was the most frequently recommended imaging technique for patients older than 20 years and that radiography was most frequently recommended for younger patients. Among the reports of different imaging studies, there was a significant statistical difference between recommended imaging techniques (p < 0.001) (Table 1). Recommendations of nonimaging evaluation and CT were the most common types of recommendations regardless of the specialty of the ordering physician. Recommendations of different imaging techniques, including CT, MRI, nuclear medicine, radiography, and sonography, in reports from different radiology subspecialties are summarized in Table 2. Within each radiology subspecialty, there were differences between radiologists (p < 0.001) in rates of recommended imaging techniques and time frames, although the examination types reported were not different.


 
Fig. 1 Bar diagram shows percentage of reports (y-axis) with recommended imaging technique (x-axis) in 10-year database from 1995 to 2004. Most frequently recommended technique was CT (27.9% of reports) followed by MRI (17.8%), radiography (17.5%), and sonography (10.6%).

 

 
Fig. 2 Line graph shows distribution of imaging techniques recommended (y-axis) for different age groups (x-axis). Trend in recommendations of CT examinations increased with age. Recommendations of MRI remained stable among age groups. Recommendations of sonography showed decline after 40 years.

 


 
TABLE 1: Imaging Techniques Recommended in CT, MRI, Sonography, and Radiography Reports

 


 
TABLE 2: Frequency of Recommended Imaging Techniques by Radiologic Subspecialty

 

The most frequently recommended imaging techniques among inpatients and outpatients are summarized in Figure 3. The patterns for the recommendation features in radiology examinations performed for common presenting clinical indications are summarized in Table 3. The trends for recommended imaging techniques from 1995 to 2004 are summarized in Figure 4. There was a significant increase in recommendations of high-cost imaging examinations such as CT, MRI, and sonography (p < 0.0001).


 
Fig. 3 Pie charts show recommendation patterns in inpatient (A) and outpatient (B) radiology reports. Nonimaging recommendations were most frequent in both groups. Among imaging recommendations, radiography was more frequently recommended for inpatients than for outpatients.

 


 
TABLE 3: Commonly Recommended Imaging Techniques for Common Clinical Indications

 

 
Fig. 4 Line graph shows temporal trends for recommended imaging techniques (y-axis) from 1995 to 2004 (x-axis). Recommendations of CT, MRI, and sonography increased, but recommendations of radiography decreased.

 

In the reports containing imaging recommendations, the time frame recommended ranged from 0 to 1,825 days (same day to 5 years). Reports such as those on CT colonography had much longer intervals for performing follow-up imaging, such as "follow-up colonography at 3–5 years," than did other reports. There were also recommendations of mammographic screening in 5 years. However, recommended time frames were not specified in 85.4% (322,074/376,918) of the radiology reports with imaging recommendations. Only 14.6% (54,844/376,918) of the radiology reports contained a specified time frame, 6 months being the most frequent time frame recommended (28.6%, 15,703 of 54,844 reports with a specific recommended time frame). Irrespective of the age group, the time frames in most reports were not specified. Table 4 summarizes the commonly recommended time frames in the reports of different imaging examinations. Regardless of the radiologic subspecialty origin of the report, the most frequently recommended time frame was unspecified.
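The counts reported in this section form a cascade from all reports down to reports with a specified time frame, and the percentages can be rechecked from the counts alone. The sketch below uses only numbers stated in the text; the constant names are chosen for the example.

```python
TOTAL_REPORTS = 4_211_503
WITH_RECOMMENDATIONS = 527_602
IMAGING = 376_918
NONIMAGING = 150_684
CT_RECOMMENDED = 105_076
TIME_UNSPECIFIED = 322_074
TIME_SPECIFIED = 54_844
SIX_MONTHS = 15_703

def pct(part, whole):
    """Percentage of `whole` made up by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)

# Each pair of subgroups should add up to its parent group.
assert IMAGING + NONIMAGING == WITH_RECOMMENDATIONS
assert TIME_UNSPECIFIED + TIME_SPECIFIED == IMAGING

cascade = {
    "reports with recommendations": pct(WITH_RECOMMENDATIONS, TOTAL_REPORTS),
    "imaging recommendations": pct(IMAGING, WITH_RECOMMENDATIONS),
    "nonimaging recommendations": pct(NONIMAGING, WITH_RECOMMENDATIONS),
    "CT among imaging recommendations": pct(CT_RECOMMENDED, IMAGING),
    "time frame unspecified": pct(TIME_UNSPECIFIED, IMAGING),
    "time frame specified": pct(TIME_SPECIFIED, IMAGING),
    "6 months among specified": pct(SIX_MONTHS, TIME_SPECIFIED),
}
for label, value in cascade.items():
    print(f"{label}: {value}%")
```

The recomputed values agree with the percentages given in the text (12.5%, 71.4%, 28.6%, 27.9%, 85.4%, 14.6%, and 28.6%).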



 
TABLE 4: Frequently Recommended Time Frames for Imaging Studies

 


 

Discussion
 
Natural language processing is an automated technique that processes and converts text in unstructured narrative documents into coded forms appropriate for computer-based analysis to extract specific information and to produce summaries. Natural language processing techniques have evolved over the last few decades and are being extensively applied in the field of medicine. Previous reports [6–14] have described the use of this technique for extracting specific information from clinical patient records and from radiology reports. Natural language processing has been used for automatic detection and extraction of information related to specific clinical conditions, such as acute bacterial pneumonia from chest radiography reports [7], stroke findings in neuroradiology reports [8], suspected tuberculosis in chest radiography reports [9], and all findings in cancer-related radiology reports [10].

Natural language processing has been used to identify clinical information in radiology reports and to map it to structured representations containing medical terms [11]. It also has been used to automatically create an enriched document containing structured components obtained from free-text reports [11]. The document provided reliable and efficient access to clinical information in patient reports for a broad range of clinical applications. The natural language processing program Leximer has been used to assess the presence or absence of findings and recommendations in a database of radiology reports [6]. Extraction of clinical information from reports can help in a variety of applications, such as quality assessment, clinical research, and development of decision support guidelines. Because natural language processing makes it possible to analyze millions of documents in a matter of a few hours, it is a practical and appealing method of data mining for relevant clinical information.

Studies [6, 7, 11, 12] have shown that natural language processing is an accurate technique for assessing unstructured free-text clinical reports and documents with a positive predictive value (precision) of 76.0–99.4%, sensitivity (recall) of 70.0–98.2%, and specificity of 85.0–99.9%. The results of this study show that the current version of the natural language processing program Leximer is accurate for classifying radiology reports on the basis of recommended imaging techniques (accuracy, 93.2%) and recommended time frames (accuracy, 94.3%).
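For reference, the measures quoted above follow from the four cells of a two-by-two classification table. The sketch below uses the standard definitions. The worked call uses the presence-or-absence result from the validation sample, in which all 88 reports with recommendations and all 32 without were classified correctly; the function name is an assumption.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard measures from true/false positive and negative counts."""
    return {
        "precision": tp / (tp + fp),      # positive predictive value
        "sensitivity": tp / (tp + fn),    # recall
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Presence or absence of recommendations in the 120 validation reports.
metrics = diagnostic_metrics(tp=88, fp=0, tn=32, fn=0)
print(metrics)
```

Accuracy alone, as reported for technique and time frame, does not separate false positives from false negatives, which is why the cited studies report precision, sensitivity, and specificity separately.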

Results with the natural language processing program revealed that most radiology reports with imaging recommendations (85.4%) did not have a specified time frame irrespective of patient age, clinical indication, radiology subspecialty, and imaging technique. Many reports with unspecified time frames had suggestions for follow-up or repeated imaging based on clinical correlation or course of disease. As of this writing, the natural language processing program does not discriminate whether an unspecified time frame is for follow-up imaging to assess disease progression or for obtaining additional information to help with diagnosis. In the latter case it is conceivable that the radiologist intends to perform imaging at the earliest convenience. In the former scenario, it is unlikely that immediate imaging is intended, and some referring physicians may prefer that the radiologist specify the time frame in which follow-up would be relevant.

More than one fourth (28.6%) of the overall recommendations in our study were for nonimaging follow-up or evaluations. Nonimaging recommendations contributed to more than one third of all recommendations in reports of patients younger than 20 years and only one fourth of all recommendations for patients older than 50 years. This finding underscores the importance of the most recent version of the natural language processing program versus the previous version, which only discriminated presence or absence of recommendations. For cost studies of recommendation practices, the version with recommendation feature extraction can offer substantial advantages.

Among reports with imaging recommendations, fewer than 10% of radiology reports lacked recommendations of specific imaging studies. Approximately one fourth of all recommended imaging techniques were CT. Over the 10-year period of the database, there was an average annual increase of 14.6% in the volume of CT examinations performed at our institution and an 18.7% increase in recommendations of CT. Rates of recommendation of CT, however, were lower for children and younger adults (≈ 10%) compared with those for patients older than 50 years (≈ 20%). Nonetheless we noticed that radiology reports of CT examinations were accompanied by a high rate of follow-up CT. Compared with recommendation of most other imaging techniques, higher recommendation rates for CT were found in most radiology subspecialties and age groups. Substantial differences in recommending CT also were found between different radiologists in a given subspecialty. Use of the natural language processing program helped identify clinical indications with high rates of recommendations of CT, such as lung malignancy and abdominal or pelvic pain. High rates of recommendation of CT may have implications for risk associated with radiation dose because collective or cumulative radiation dose increases with multiple CT and radiation-based examinations.

High-cost imaging tests, such as CT, MRI, and sonography, constituted approximately 56% of all recommended imaging techniques and 40% of all recommendations in our database. Compared with 1995, in 2004, there was a 14.4% ± 1.5% increase in the volume of these examinations and an increase of 21.0% ± 4.2% in overall recommendations of CT, MRI, and sonography. An interesting finding was that there also was a decrease in the growth of recommendations of radiography starting in 1998. These patterns may be due to increasing perception or experience among radiologists about the role of CT, MRI, and sonography in obtaining additional information in view of remarkable technologic advances in these techniques. The findings raise concern, however, about possible inappropriate use of these expensive imaging techniques. The integral consulting role of radiologists and the desire and possibly the expectation of some referring physicians for clinical management guidance from radiology reports must be borne in mind. A similar trend toward an increase in use of CT studies and the effect on increasing radiation exposure has been described [3].

There were limitations associated with the natural language processing program used in our study. To the best of our knowledge, the program has not been validated for analysis of reports from multiple imaging centers. It also is not integrated with the clinical or health information system. Therefore, it cannot be used to gather critical clinical information to gauge the effect of imaging recommendations on patient care and management. Furthermore, the program lacks the ability to track reports by distinct medical record number. For example, for an individual patient, it does not determine how many examinations were performed or what the recommendation features were. This lack of individual information makes it difficult to infer whether a patient or physician has complied with a recommended technique and the time frame for follow-up or repeated imaging. Another limitation of the program is its inability to categorize reports with periodic or multiple follow-up recommendations. Multiple interval or periodic recommendations in an individual radiology report are counted as a single recommendation. This limitation might have led to underestimation of recommendation rates in our study.

A limitation of our study was that we included reports of all imaging techniques but did not perform power analysis to determine the number of cases necessary for accurate validation. In our study, the standard error of accuracy in analysis of 120 reports was 3.3%. Thus even at the lowest limit of the confidence interval (–3.3%), the accuracy would still be close to 90% for both time frame recommended and recommended imaging technique. Standard error of the accuracy estimated with 120 reports depends largely on the number of subjects and reports studied and not on the size of the population or number of reports to which it is applied. Thus an estimate based on analysis of 120 reports is valid regardless of the number of reports to which it was applied, such as numbers as large as the more than 4 million reports used in our study.
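The sample-size argument above can be illustrated with a minimal sketch that assumes the usual binomial formula for the standard error of a proportion. The article does not state how the 3.3% figure was computed, so the function names and the formula are illustrative assumptions and not the authors' method. The sketch shows only that the standard error depends on the number of reports reviewed and not on the size of the database to which the program is later applied.

```python
import math


def standard_error(accuracy: float, n_reports: int) -> float:
    """Binomial standard error of an accuracy estimated from n_reports.

    Assumption: each manually reviewed report is an independent
    correct/incorrect trial. The size of the full report database
    does not appear anywhere in the formula.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    if n_reports <= 0:
        raise ValueError("n_reports must be positive")
    return math.sqrt(accuracy * (1.0 - accuracy) / n_reports)


def lower_limit(accuracy: float, n_reports: int, z: float = 1.0) -> float:
    """Accuracy minus z standard errors.

    z = 1.0 mirrors the text, which subtracts one standard error;
    z is a hypothetical parameter added for illustration.
    """
    return accuracy - z * standard_error(accuracy, n_reports)


if __name__ == "__main__":
    # Hypothetical accuracy values, not results from the study.
    for acc in (0.85, 0.90, 0.95):
        se = standard_error(acc, 120)
        print(f"accuracy={acc:.2f}  n=120  SE={se:.4f}  "
              f"lower limit={lower_limit(acc, 120):.4f}")
```

Quadrupling the number of reviewed reports halves the standard error under this formula, whereas applying the program to 4 million rather than 4,000 reports leaves it unchanged.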

Another limitation of our study was that it was a retrospective analysis of radiology reports. Our study was limited in that it represented trends in radiology reports from a single institution. Another consideration may be that we did not determine the effect of the imaging recommendations. Whether an imaging recommendation materialized into an actual examination or was followed up in the recommended time was not assessed. We also did not determine how physicians interpreted a particular recommendation in cases of unspecified imaging requests.

A study of recommendation features for different clinical indications and patient demographic features may aid in assessment and establishment of recommendation policies for varied clinical and patient attributes. It also may help radiologists consistently follow recommendation guidelines for various clinical indications and patient demographic features.

The version of the natural language processing program used in this study is accurate for determining specific recommendation features, such as imaging technique and time frame, in large databases of radiology reports. Further studies can help to determine the best and most conclusive imaging technique as the first study in the presence of clinical indications associated with high recommendation rates to restrict, if possible, the risk and costs of multiple serial imaging studies. Assessment of recommendation features with the natural language processing program may help radiologists limit inconsistencies in recommendation practices regarding an imaging examination and its timing.


References

  1. Bodenheimer T. High and rising health care costs. Part 2. Technologic innovation. Ann Intern Med 2005; 142:932-937
  2. Bhargavan M, Sunshine JH. Utilization of radiology services in the United States: levels and trends in modalities, regions, and populations. Radiology 2005; 234:824-832
  3. Brenner DJ, Hall EJ. CT: an increasing source of radiation exposure. N Engl J Med 2007; 357:2277-2284
  4. Baumgarten DA, Nelson RC. Outcome of examinations self-referred as a result of helical CT of the abdomen. Acad Radiol 1997; 4:802-805
  5. Blaivas M, Lyon M. Frequency of radiology self-referral in abdominal CT scans and the implied cost. Am J Emerg Med 2007; 25:396-399
  6. Dreyer KJ, Kalra MK, Maher MK, et al. Application of recently developed computer algorithm for automated classification of unstructured radiology reports: validation study. Radiology 2005; 234:323-329
  7. Fiszman M, Chapman WW, Aronsky D, Evans RS, Haug PJ. Automatic detection of acute bacterial pneumonia from chest x-ray reports. J Am Med Inform Assoc 2000; 7:593-604
  8. Elkins JS, Friedman C, Boden-Albala B, et al. Coding neuroradiology reports for the Northern Manhattan Stroke Study: a comparison of natural language processing and manual review. Comput Biomed Res 2000; 33:1-10
  9. Jain NL, Knirsch CA, Friedman C, Hripcsak G. Identification of suspected tuberculosis patients based on natural language processing of chest radiograph reports. Proc AMIA Annu Fall Symp 1996:542-546
  10. Mamlin BW, Heinze DT, McDonald CJ. Automated extraction and normalization of findings from cancer-related free-text radiology reports. Proc AMIA Annu Fall Symp 2003:420-424
  11. Friedman C, Alderson PO, Austin JH, et al. A general natural-language text processor for clinical radiology. J Am Med Inform Assoc 1994; 1:161-174
  12. Meystre S, Haug PJ. Natural language processing to extract medical problems from electronic clinical documents: performance evaluation. J Biomed Inform 2006; 39:589-599
  13. Hripcsak G, Austin JH, Alderson PO, Friedman C. Use of natural language processing to translate clinical information from a database of 889,921 chest radiographic reports. Radiology 2002; 224:157-163
  14. Hripcsak G, Friedman C, Alderson PO, et al. Unlocking clinical data from narrative reports: a study of natural language processing. Ann Intern Med 1995; 122:681-688



This article has been cited by other articles:


C. L. Sistrom, K. J. Dreyer, P. P. Dang, J. B. Weilburg, G. W. Boland, D. I. Rosenthal, and J. H. Thrall
Recommendations for Additional Imaging in Radiology Reports: Multifactorial Analysis of 5.9 Million Examinations
Radiology, November 1, 2009; 253(2): 453 - 461.

