Special Reports |
1 From the Department of Radiology, MC 2026, University of Chicago, 5841 S Maryland Ave, Chicago, IL 60637 (S.G.A.). Affiliations for all other authors and the members of the Lung Image Database Consortium Research Group are listed at the end of this article. Received December 17, 2003; revision requested February 16, 2004; revision received March 11; accepted March 16. Supported in part by USPHS Grants U01CA091085, U01CA091090, U01CA091099, U01CA091100, and U01CA091103. Address correspondence to S.G.A. (e-mail: s-armato@uchicago.edu).
| ABSTRACT |
|---|
© RSNA, 2004
Index terms: Computers, diagnostic aid; Lung, CT, 60.1211; Lung, nodule, 60.31
| INTRODUCTION |
|---|
The investigation of CAD techniques for chest radiography has a long history (2). Throughout this time, special emphasis has been placed on automated analyses of lung nodules within the complex background of superimposed anatomic structures that results when the three-dimensional human body is projected onto a two-dimensional image. Computed tomography (CT) generates images in a manner that eliminates the superposition of anatomic structures; the trade-off, however, is that much more image data per patient are acquired for radiologist interpretation.
| MOTIVATION TO FORM A DATABASE |
|---|
The development of CAD for use in the evaluation of lung nodules on CT images has accelerated in recent years. These CAD methods may generally be divided into the categories of nodule detection and classification. A number of researchers around the world have been developing computerized nodule detection techniques (12–26). These groups have incorporated a variety of approaches, including gray-level threshold methods (12,16,20,23–26), fuzzy clustering algorithms (13), spatial filtering (15), template matching (22), object-based deformation procedures (17), morphologic analysis (18), and model-based techniques (14,21). The computerized classification of lung nodules as cancerous or noncancerous has also received attention from a number of groups that used various combinations of shape and gray-level distribution characteristics (27–37). Some of these groups use nodule features depicted on a single CT scan for classification, while others capture information regarding change in nodule features over time, as demonstrated with multiple CT examinations in the same patient.
Although the development of CAD methods for lung nodules on CT scans has accelerated, all of these diverse approaches have one common constraint: Access to well-characterized image data is limited. This deficiency represents a fundamental limitation to CAD research in this area. Consequently, it is difficult to develop and test CAD methods in a robust and reliable fashion.
CAD Requirements
Two fundamental requirements common to all CAD research are patient image data and a definition of "truth" for the specific task. For investigators not affiliated with medical centers, access to image data requires collaborative research agreements that may be hindered by federal regulations that govern the transmission of patient data, including images, to outside institutions (38). These investigators may choose to apply their talents to nonmedical problems, or they may confine their investigations to simulated image data with limited real-world application. Even for investigators associated with medical centers, the seemingly ready supply of clinical image data that might be reaped for research purposes is not without its barriers: The task of identifying and collecting appropriate images for any specific research activity can be a laborious process. In addition, the important need to secure patients' consent for use of their images and clinical data in research can consume additional time and resources.
Once a set of appropriate images has been assembled, a medical opinion concerning the task-specific truth must be rendered on the content of those images. For example, investigators developing automated lung nodule detection methods will require the opinion of an experienced radiologist regarding the presence and precise location of nodules on the CT scans. More appropriately, a panel of experienced thoracic radiologists would be used to establish the truth for the nodule detection task, since the variability among radiologists in the detection of lung nodules is substantial (39). Truth for other CAD tasks will require other data, such as follow-up CT scans that enable radiologists to evaluate nodule growth, pathology reports, or radiologist-drawn nodule outlines. The collection of this information can be time consuming. Furthermore, the notion of a single truth in any particular instance is a fallacy; differences of opinion, even among experienced subspecialty radiologists, are a reality, and the resulting variability in the truth must be understood and appreciated by CAD investigators.
The task of gathering image data sets along with associated truth is expensive, and when different research groups use separate databases, a reliable comparison of CAD methods reported in the literature is impossible. The increasing need for CAD in the clinical practice of radiology lends urgency to the creation of common image databases with established truth to facilitate a direct comparison of CAD methods.
Origins of the Lung Image Database Consortium
The role of imaging in science is expanding, and consequently, the need for appropriate image-archiving and image-sharing mechanisms is evolving (40,41). Recognizing that the development of CAD methods by the imaging research community would be facilitated and stimulated through access to a well-characterized repository of CT image data, the National Cancer Institute (NCI) released a request for applications in April 2000, entitled "Lung Image Database Resource for Imaging Research," as a U01 funding mechanism (also known as a cooperative agreement). The intent of this initiative was "to support a consortium of institutions to develop consensus guidelines for a spiral CT lung image resource, and to construct a database of spiral CT lung images" (42). Through the request for applications, the NCI recognized that "the generation of standardized databases requires the development of consensus on many issues related to database design, accessibility, metrics and statistical methods for evaluating image-processing algorithms" (43). From among the applicants who submitted grant applications in response to the request for applications, the following five institutions were selected through a peer-review process to form the Lung Image Database Consortium (LIDC): Cornell University; the University of California, Los Angeles; the University of Chicago; the University of Iowa; and the University of Michigan. The LIDC web site may be found at cip.cancer.gov.
The five-member consortium represents access to an expanded pool of CT data for CAD research. The LIDC's charge was not just to collect CT scans with lung nodules; this consortium has spent a great deal of effort thus far to harness the distinct experiences and divergent opinions of the member institutions and other participating individuals to provide a solid foundation for a robust database that will meet the anticipated needs of CAD investigators. Toward this end, the LIDC has identified a number of critical technical and clinical issues that must be addressed to ensure the effort of creating the database is properly focused. This foundation-laying process has been evolutionary in nature, as every issue raised has spawned multiple other issues for consideration. This article is intended to share with the community the breadth and depth of several of these key issues.
| MISSION STATEMENT |
|---|
To develop an image database as a web-accessible international research resource for the development, training, and evaluation of CAD methods for lung cancer detection and diagnosis using helical computed tomography (CT).
The database should enable the correlation of performance of CAD methods for detection and classification of lung nodules with spatial, temporal and pathological ground truth.
These two statements convey the intended public accessibility of CT image data and truth data contained within the database to facilitate all aspects of lung nodule CAD research.
To provide a true research resource, the database must contain more than images. Consequently, the database will consist of an image repository and an associated relational database in which nodule features (eg, radiologist outlines, subjective subtlety ratings, and lobar location); technical parameters of the scan available in the digital imaging and communications in medicine, or DICOM, header (eg, exposure rate, reconstruction algorithm, and scanner model); and patient information (eg, age, sex, smoking history, and any available diagnostic information, such as the results of follow-up studies or pathologic examinations) are recorded. The relational database component will give users of the database the ability to extract customized image subsets based on search results.
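As a rough illustration of how a relational component of this kind could return customized subsets, the following Python sketch builds a very small in-memory SQLite table and runs one search. The table name, column names, and sample rows are hypothetical placeholders invented for this example; they are not the LIDC schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding a hypothetical nodule table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE nodule (
            nodule_id     INTEGER PRIMARY KEY,
            scan_id       TEXT NOT NULL,
            diameter_mm   REAL NOT NULL,
            subtlety      INTEGER NOT NULL,  -- hypothetical five-point rating
            lobe          TEXT NOT NULL,
            scanner_model TEXT NOT NULL
        )
        """
    )
    # Invented rows, used only to exercise the query below.
    rows = [
        (1, "scan-A", 4.5, 2, "RUL", "model-X"),
        (2, "scan-A", 12.0, 5, "LLL", "model-X"),
        (3, "scan-B", 8.0, 1, "RUL", "model-Y"),
        (4, "scan-C", 25.0, 4, "RML", "model-Y"),
    ]
    conn.executemany("INSERT INTO nodule VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def scans_with_nodules(conn, lobe, max_diameter_mm):
    """Return sorted scan ids with a nodule in `lobe` no larger than the limit."""
    cursor = conn.execute(
        "SELECT DISTINCT scan_id FROM nodule "
        "WHERE lobe = ? AND diameter_mm <= ? ORDER BY scan_id",
        (lobe, max_diameter_mm),
    )
    return [row[0] for row in cursor]


conn = build_demo_db()
subset = scans_with_nodules(conn, "RUL", 10.0)
print(subset)  # ['scan-A', 'scan-B']
```

The list of scan identifiers returned by such a search would then be used to pull the matching images from the image repository.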
| INCLUSION CRITERIA |
|---|
A distribution of nodule size, radiologic pattern, subtlety, abnormalities, and anatomic location will be sought by means of monitoring case accrual. Similarly, we will attempt to maintain a distribution of patient demographics. Clearly, strict adherence to target distributions established a priori will limit the overall number of scans that may be accrued in the database; therefore, scan quantity will have priority over any target distributions of patient or nodule characteristics.
Scan Inclusion Criteria
CT scans from both diagnostic and screening studies will be included in the database. Ideally, we would prefer to include the complete radiologic history of a patient, beginning with the screening study at which any nodule was first demonstrated and encompassing all subsequent diagnostic follow-up studies. Scans acquired during and after any prescribed treatment would complete the history. While the screening study will certainly demonstrate the complete thoracic anatomy, subsequent studies may be limited to the anatomic location of a suspected nodule. These limited-anatomy scans will be included in the database; although such scans will be irrelevant for computerized detection methods, they will certainly benefit computerized classification methods. We recognize that including the complete history of a patient will often be difficult in practice; the database, therefore, may contain any combination of screening scans, full-thorax diagnostic scans, and limited-anatomy diagnostic scans for any specific patient.
In an effort to increase the yield of scans from the five institutions, the database will contain both prospective and retrospective cases. To maintain the technical relevance of the database, however, scans with reconstruction interval or section thickness greater than 5 mm will be excluded. No requirements with regard to scanner pitch, exposure, tube voltage, or reconstruction algorithm will be imposed. To achieve a robust database, the constituent scans will represent a variety of technical parameters from a variety of scanner models.
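The technical exclusion rule stated above can be written as a simple check. This is a minimal sketch; the function name, the dictionary keys, and the sample values are assumptions made for illustration only.

```python
MAX_MM = 5.0  # limit stated in the text for both parameters


def scan_is_eligible(section_thickness_mm, reconstruction_interval_mm):
    """Return False if either parameter is greater than 5 mm, True otherwise."""
    return (section_thickness_mm <= MAX_MM
            and reconstruction_interval_mm <= MAX_MM)


# Hypothetical candidate scans with invented acquisition parameters.
candidates = [
    {"scan_id": "scan-A", "thickness": 1.25, "interval": 1.25},
    {"scan_id": "scan-B", "thickness": 5.0, "interval": 5.0},
    {"scan_id": "scan-C", "thickness": 7.0, "interval": 5.0},
]
eligible = [
    c["scan_id"]
    for c in candidates
    if scan_is_eligible(c["thickness"], c["interval"])
]
print(eligible)  # ['scan-A', 'scan-B']
```

Pitch, exposure, tube voltage, and reconstruction algorithm are deliberately absent from the check, since no requirements are imposed on them.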
Image quality is an issue that has been difficult to define. Since image artifacts that are caused by patient factors, such as respiratory motion, and scanner factors, such as beam hardening, are a reality in medical imaging, CAD developers will need to contend with their presence. Consequently, while we will not deliberately fill the database with scans that contain severe artifacts, scans with high levels of noise or with streak, motion, or metal artifacts will be included. When such scans are included, however, they will be denoted by a "marginal" or "unacceptable" rating in the image quality field. The overall image quality of a scan, including the presence of artifacts and other factors, will be assessed by an LIDC radiologist and recorded in the relational database so that investigators may explicitly exclude or include images with marginal or unacceptable image quality from an image dataset.
Since the database is intended to reflect the realities of clinical practice, it will not comprise only pristine cases that contain only nodules. Nodule cases will be included despite the presence of other abnormalities on the scan, unless the other abnormality is spatially contiguous with the nodules and substantially interferes with visual interpretation. The database will be augmented by a collection of "normal" cases (eg, scans that do not contain nodules but may or may not contain other abnormalities), the number of which has yet to be determined.
Nodule Inclusion Criteria
The diameters of nodules (both calcified and noncalcified) included in the database will not exceed 30 mm, which is consistent with upper limits of nodule size found in the literature (44). Furthermore, the minimum effective diameter for included nodules has been set at 3 mm. Some investigators maintain that nodules smaller than 5 mm may be of limited clinical importance, while others contend that since the benefits of detecting small lung cancers are presently unknown, to exclude from consideration nodules in the 1–2-mm size range may limit the relevance of the database. The 3-mm lower limit for the database strikes a compromise between these views and also takes into consideration the practical issue that all lesions identified as nodules in the database will require effort to define spatial location and extent and to follow through subsequent examinations. As a further compromise, the presence of nodules smaller than 3 mm that are suspicious for cancer will be indicated, but the spatial extents of these small nodules will not be defined.
The nodules in the database may be primary lung cancers, metastatic disease, or a noncancerous process. An upper limit on the number of nodules per scan, however, has been set at six as a general guideline. Although this number is arbitrary, scans with more than six noncalcified nodules are much less likely to represent primary lung cancer (5). This upper limit on the number of nodules again takes into consideration the practical issue that all lesions identified as nodules in the database will require effort to define spatial location and extent and to follow through subsequent examinations.
Perhaps the most important nodule inclusion criterion is that the nodules on the scans must be nodules. Exactly what this means has been the subject of extensive discussions and will be described in the next section.
| NODULE DEFINITION |
|---|
While these words provide an idealized definition of a lung nodule demonstrated on CT scans, the natural complexities of biologic systems render the practical application of such a definition more difficult. For example, most would agree that the lesion shown in Figure 1 satisfies the previous definition of a nodule. Moreover, the lesion shown in Figure 2 appears to satisfy the previous definition of a nodule when that single section is considered in isolation. A lesion with an axial dimension greater than approximately the reconstruction interval used in the CT examination will, of course, appear in more than a single section. When the complete extent of the lesion demonstrated in the single section of Figure 2 is considered (Fig 3), the more complicated nature of this lesion becomes apparent. So the question remains, is this lesion a nodule?
| TRUTH ASSESSMENT |
|---|
Spatial Location and Extent Estimates
A qualifying nodule is defined, for the purposes of the LIDC database, as a nodule that satisfies the nodule inclusion criteria discussed previously; namely, it has an effective diameter of 3–30 mm and is considered to lie within the nodule spectrum. Each scan selected for the database (with the exception of the cohort of "normal" scans described in the Scan Inclusion Criteria section) will have at least one but, as a general rule, not more than six qualifying nodules. The outlines for each qualifying nodule will be obtained to record not only the position (eg, a centroid calculated from the radiologist outline) but also the spatial extent of the lesions that are the focus of the LIDC database. In addition, the approximate centers of nodules that are smaller than 3 mm and suspicious for cancer will be indicated by the LIDC radiologists; outlines for these small nodules will not be recorded.
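One way a centroid might be calculated from a radiologist outline on a single section is the area-weighted polygon centroid (the shoelace formula). The sketch below assumes, purely for illustration, that an outline is stored as an ordered list of (x, y) vertices in pixel coordinates; the text does not specify the storage format or the centroid method.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple closed polygon (shoelace formula).

    `vertices` is an ordered list of (x, y) pairs; the last vertex is
    implicitly joined back to the first.
    """
    if len(vertices) < 3:
        raise ValueError("an outline needs at least three vertices")
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate outline with zero area")
    # The signed area is twice_area / 2, so the 1 / (6 * area) factor
    # becomes 1 / (3 * twice_area); the sign cancels for either winding.
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


# Hypothetical square outline, 4 pixels on a side.
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(polygon_centroid(square))  # (2.0, 2.0)
```

A three-dimensional position could be obtained by combining such per-section centroids, weighted by outline area, although that step is not shown here.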
A panel of thoracic radiologists will independently construct outlines encompassing each qualifying nodule. The necessity of this task for the completeness of the database and for the benefit of investigators who will use it was debated by the LIDC Steering Committee on several occasions. The decision to supply this information generated two important issues for consideration: First, the manner in which nodule outlines would be obtained must be established. Second, the extent of interradiologist variability should be anticipated. Recent research has convinced the LIDC to expect substantial interreader variability in both the detection and the outlining of nodules (39).
The task of outlining each nodule in every section in which it appears for every CT scan selected for the database is certainly daunting. Placement of rectangular bounding boxes around each nodule was initially considered as a means by which spatial extent information could be obtained; however, while such an approach would be useful for investigators developing nodule detection techniques, the LIDC recognized that nodule classification and segmentation (and perhaps other research interests) would not be served by such coarse information. With the need for nodule outlines established by means of consensus, the next decision was whether these outlines must be constructed manually or whether semiautomated methods could be used. The trade-off is between time and bias: Manual outlines would require a great deal of radiologist time, but they would reflect the unbiased expertise of radiologists. Semiautomated outlines are expected to improve the reproducibility of nodule outlines and require a smaller investment of radiologist time, at least in some cases, but may exhibit bias introduced by the specific computer method used. Moreover, some may object to the development of computerized methods that are based on a truth established with the aid of another computerized method. The LIDC, through a series of pilot studies, is currently investigating the appropriateness of manual and semiautomated outlining techniques for the database.
The anticipated degree of interradiologist variability in the outlining task was a concern for the LIDC. Specifically, we were concerned that inconsistencies among the outlines constructed by different radiologists might render the resulting spatial extent estimates impractical for use by CAD researchers. Consider the nodule outlines in Figure 5. One radiologist constructed an outline that captured the core solid component of the nodule, while another radiologist independently constructed an outline that also encompassed the nonsolid component of the nodule. Both outlines, however, may be considered correct depending on the specified task, and both represent a valid interpretation of the nodule boundary.
An issue considered by the LIDC is the extent to which manual or semiautomated nodule outlines that are based on the CT scan actually represent the boundary of the physical nodule. This concern is greatest for outlines obtained from the upper- and lowermost sections that depict a nodule, since these sections exhibit the most pronounced partial volume effect because of the finite and anisotropic CT point spread function. Computerized nodule segmentation methods that might be developed by users of the database may be designed to compensate for these partial volume effects; the evaluation of such methods would then be affected adversely by spatial extent estimates that are based on radiologic appearance. Nevertheless, the consensus of the LIDC was that the most appropriate nodule outlines for the database would encompass all "abnormal pixels" that either belong to or are derived from a nodule. The opinion and experience of each radiologist will dictate how he or she accommodates adjacent vessels, airways, or pleura.
In addition to nodules, all abnormal lesions or foci of abnormality that are larger than 3 mm will be identified by recording the approximate spatial location of the center of the lesion. As discussed previously, in addition to nodules, scans selected for the database may contain other focal abnormalities that do not lie within the nodule spectrum or disease that may not even be considered a focal abnormality, such as emphysema. The coarse identification of such regions that do not correspond to normal anatomic structures will provide a more complete inventory of pulmonary abnormalities on each scan and allow for a more thorough evaluation of CAD methods. To illustrate the latter point, a nodule detection scheme may identify as a nodule a lesion that is actually a scar. While such a finding would be counted as false-positive with regard to the specific nodule detection task, it is distinctly different from false-positive findings caused by pulmonary vessels, since the scar is indeed an anomaly. In fact, investigators may wish to use the complete truth assessment that will be offered with the database to explicitly develop CAD algorithms that can be used to identify scars or classify abnormal findings as a scar or a nodule.
Verified Diagnosis
Every effort will be made to obtain a verified diagnosis for each qualifying nodule as either cancerous or noncancerous when it becomes available through chart review. This information will be essential for investigators developing automated lung nodule classification techniques. In addition, the pathologic subtype of nodules (both cancerous and noncancerous), which is based on the World Health Organization classification (46), will be extracted from patient charts and included with the database. To confirm the patient records, the pathology reports for all patients and any available histopathologic slides or surgical specimens for a subset of the patients (5%) will be independently reviewed by the LIDC panel of experienced pathologists. We recognize that many scans may not have available pathologic information; furthermore, we recognize that not all nodules on any one scan will have such information. The database will document the scans and nodules for which a verified diagnosis has been obtained with fine-needle aspiration biopsy, surgical resection, or extended radiologic observation in which no growth is demonstrated over a 2-year period (47).
| PROCESS MODEL |
|---|
The process model details the steps required to translate a CT scan acquired for the clinical evaluation of a patient into a viable element of the database. With appropriate local institutional review board approval, radiologists at each of the five institutions will identify thoracic CT scans from their clinical caseload or image archive, which may include images acquired as part of a research study such as the Early Lung Cancer Action Program (5) or the National Lung Screening Trial (48) that meet the previously discussed inclusion criteria established by the LIDC. Radiologists will then evaluate image quality and any artifacts that may be present. Informed consent procedures will also be followed, as required by each institutional review board.
Identification of a candidate scan will set in motion a sequence of events that begins at the local institution and then extends to the other four institutions. The scan will be transferred to the local research computer, where it will be catalogued either manually by a local data manager or automatically through a local DICOM receiver. This step is critical to ensure that the local institution has the ability to monitor any subsequent imaging or pathologic data acquired for that patient. Software will be applied so that all protected health information contained in the DICOM header of the image will be removed in accordance with Health Insurance Portability and Accountability Act, or HIPAA, guidelines (38).
This scan then will be made available to the other four institutions for assessment of spatial location and extent. Rather than a forced consensus panel approach, the LIDC has opted for a panel that uses a combination of blinded and unblinded reviews by multiple radiologists (eg, one radiologist at each of the other four institutions) to establish estimates of nodule spatial location and extent. The blinded and unblinded reviews are both part of the same process in which radiologists attempt to identify, as completely as possible, all nodules on a scan. In this approach, the designated radiologist at each site first performs a blinded review of the scan by identifying the spatial location and radiologic characteristics of all abnormalities on the scan that are larger than 3 mm, as measured with electronic calipers. Nodules smaller than 3 mm that the radiologist deems suspicious for cancer are also identified.
Other information collected for each lesion will include lesion type (eg, scar or nodule), the radiologist's subjective level of confidence that the lesion represents a focal abnormality in general or a nodule more specifically, radiologic texture (eg, solid, part solid, or nonsolid) if considered a nodule, a five-point lesion subtlety score (ranging from "obvious" to "extremely subtle"), presence of calcifications, and lobar location. Furthermore, outlines will be constructed for all qualifying nodules on calibrated monitors with magnification capabilities and an initial window and level setting of 1500 HU and −500 HU, respectively, which may then be adjusted by the radiologist on the basis of individual preference. Additional subjective assessments of characteristics such as shape and margin will be recorded. If the patient has one or more scans already in the database, a reconciliation of lesions will be performed to identify the same lesion across multiple scans. These data will be recorded in the fields of the relational database. During this initial review, each radiologist will interpret the scan independently from the radiologists at the other four institutions; hence, we use the term blinded review.
Once the four radiologists (each from a different institution) have performed the blinded review, the results of the blinded review of each radiologist will be made available to all of the other radiologists who reviewed the scan. Each radiologist will then perform an unblinded review of the scan with the additional information provided by the other radiologists. During this unblinded review, the radiologists will review all marked structures (eg, their own markings, as well as the markings of the other radiologists who reviewed the scan) and decide whether to include each marked structure as a nodule. It is important to note that a forced consensus will not be imposed; rather, all of the nodules indicated by the reviewing radiologists will be tallied and recorded in the database.
Information obtained from all radiologists during both blinded and unblinded reviews will be included in the database to provide a rich source of data for investigators. For example, nodules recorded by only two of the radiologists during the blinded review will constitute a different detection target than nodules initially identified by all radiologists during the blinded review. Even more interesting might be nodules recorded by only two of the radiologists during the blinded review and then recorded by only the same two radiologists during the unblinded review, which implies that other radiologists observed this structure and declined to consider it a nodule. As another example, the spatial extent of a nodule may be described in probabilistic terms on the basis of the number of radiologists' outlines that encompass each pixel.
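The probabilistic description of spatial extent can be sketched as a per-pixel agreement map: for each pixel, the fraction of readers whose outline encloses it. The example assumes that each outline has already been rasterized into a binary mask (a list of rows of 0 and 1) on a common grid; that representation and the function name are assumptions for illustration.

```python
def agreement_map(masks):
    """Return, for each pixel, the fraction of masks that include it.

    `masks` is a list of equally shaped binary masks, one per reader,
    each given as a list of rows containing 0 or 1.
    """
    if not masks:
        raise ValueError("at least one mask is required")
    rows, cols = len(masks[0]), len(masks[0][0])
    for mask in masks:
        if len(mask) != rows or any(len(row) != cols for row in mask):
            raise ValueError("all masks must share one shape")
    n_readers = len(masks)
    return [
        [sum(mask[r][c] for mask in masks) / n_readers for c in range(cols)]
        for r in range(rows)
    ]


# Four hypothetical readers outlining the same nodule on a 1 x 4 pixel strip.
reader_masks = [
    [[0, 1, 1, 0]],
    [[0, 1, 1, 1]],
    [[0, 1, 0, 0]],
    [[0, 1, 1, 0]],
]
print(agreement_map(reader_masks))  # [[0.0, 1.0, 0.75, 0.25]]
```

Thresholding such a map at different levels (for example, pixels enclosed by at least half of the readers) would yield different spatial extent estimates from the same set of outlines.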
| ASSESSMENT METHODS |
|---|
An unbiased estimate of algorithm performance and a meaningful comparison of CAD methods necessitate the consideration of a number of important factors. The LIDC research group will inherently establish case distribution, lesion subtlety, and truth assessment. Other factors, such as the effect of scoring methods on reported CAD performance (50), the appropriateness of different training and testing paradigms (51), and the proper use of various task-dependent evaluation metrics, will remain at the discretion of the investigators who use the database. An initial survey of some of these essential issues as they pertain to lung nodule detection in CT has been reported recently by Dodd et al (49); the LIDC intends to further explore the intricacies of these issues and document the consensus on the most appropriate approaches for a variety of CAD tasks. Moreover, the LIDC will provide suggestions, references, and pointers to publicly available software, when appropriate.
| DISCUSSION POINTS |
|---|
Not all scans in the database will be useful for all CAD research activities, given the reality that not all desired data will be available for all nodules or all scans. For investigators pursuing nodule classification techniques based on volumetric analysis of sequential scans, only qualifying nodules that appear in at least two scans from the same patient within the database and that include associated pathologic information or demonstrate stability over a defined period of time will prove useful. For investigators pursuing nodule classification techniques based on nodule features within a single scan, the requirement for at least two scans may be removed, but pathologic information is still required. Nodule segmentation techniques may make use of all qualifying nodules, regardless of whether pathologic information is available; however, nodules smaller than 3 mm in diameter will not have appropriate spatial extent estimates for such analyses. Nodule detection techniques will benefit mainly from all qualifying nodules on full-thorax scans, since qualifying nodules on limited-anatomy scans will not be appropriate for the detection task. The database will be organized in such a manner that the relevance of each scan or each nodule with regard to various CAD tasks will be specified.
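The relevance rules described in this paragraph can be summarized as a small decision function. The record fields and task labels below are hypothetical names chosen for this sketch; how the database will actually encode task relevance is not specified in the text.

```python
from dataclasses import dataclass


@dataclass
class NoduleRecord:
    qualifying: bool           # 3-30 mm and within the nodule spectrum
    scans_for_patient: int     # scans of the same patient showing the nodule
    has_pathology: bool        # associated pathologic information exists
    stable_over_time: bool     # stability shown over a defined period
    on_full_thorax_scan: bool  # False for limited-anatomy scans


def relevant_tasks(nodule):
    """Return the set of CAD tasks for which a nodule record is useful."""
    tasks = set()
    if not nodule.qualifying:
        return tasks  # no spatial extent estimate is recorded
    tasks.add("segmentation")
    if nodule.on_full_thorax_scan:
        tasks.add("detection")
    if nodule.has_pathology:
        tasks.add("classification_single_scan")
    if nodule.scans_for_patient >= 2 and (
        nodule.has_pathology or nodule.stable_over_time
    ):
        tasks.add("classification_sequential")
    return tasks


example = NoduleRecord(
    qualifying=True,
    scans_for_patient=1,
    has_pathology=False,
    stable_over_time=False,
    on_full_thorax_scan=True,
)
print(sorted(relevant_tasks(example)))  # ['detection', 'segmentation']
```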
The nodule outlining process truly has no reference standard. Adjudication among the radiologists is unnecessary, since we intentionally want to provide a range of outlines based on the reality of differences of opinion among experienced chest radiologists. The variation in nodule outlines will provide a statistical map of nodule spatial extent estimates. This variability will allow researchers to perform a variety of interesting analyses. For example, if a CAD algorithm generates an estimate of the nodule border that provides a measure of edge texture or, with follow-up scans, a measure of nodule growth that allows prediction of the pathologic status of a lung nodule better than the outlines produced by the panel of radiologists, then the performance of that algorithm may be considered to exceed, in some sense, the accuracy of the radiologists' border estimates. If the CAD algorithm generates an estimate of the nodule border that lies within the range of the radiologists' outlines, then the algorithm is at least consistent with the interobserver variability of the radiologist panel.
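One possible reading of "within the range of the radiologists' outlines" is that the CAD region contains every pixel that all readers enclosed and stays inside the area that at least one reader enclosed. The sketch below implements that interpretation only; it is an assumption made for illustration, not a criterion defined by the LIDC, and it again assumes outlines rasterized to binary masks on a common grid.

```python
def within_reader_range(cad_mask, reader_masks):
    """Check a CAD mask against the intersection and union of reader masks.

    Returns True when the CAD mask covers every pixel enclosed by all
    readers and marks no pixel that every reader left outside.
    """
    if not reader_masks:
        raise ValueError("at least one reader mask is required")
    for r, row in enumerate(cad_mask):
        for c, value in enumerate(row):
            votes = [mask[r][c] for mask in reader_masks]
            if value and not any(votes):
                return False  # pixel lies outside the union of outlines
            if not value and all(votes):
                return False  # pixel in the unanimous core was missed
    return True


# Two hypothetical reader masks on a 1 x 4 pixel strip.
readers = [
    [[0, 1, 1, 0]],
    [[0, 1, 1, 1]],
]
print(within_reader_range([[0, 1, 1, 1]], readers))  # True
print(within_reader_range([[1, 1, 1, 0]], readers))  # False
print(within_reader_range([[0, 1, 0, 0]], readers))  # False
```

Other definitions are equally defensible, for example comparing an overlap measure between the CAD region and each reader's region with the overlap observed between pairs of readers.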
A database such as this presents an opportunity for truly blinded evaluation of CAD techniques. As an example of such an evaluation paradigm for the registration accuracy of brain images, the Retrospective Image Registration Evaluation project at Vanderbilt University (52) allows research groups to download CT, magnetic resonance, and positron emission tomography image data sets, to which the groups may apply their own retrospective image registration techniques; the results are then submitted to the central project site for comparison with a prospective marker-based registration standard. Since results based on this standard are not revealed to the research groups, direct comparison of different registration techniques may be captured in a controlled manner. The analogy for the LIDC database would be the segregation of dedicated image training sets and test sets. In this scenario, investigators would only have access to designated training images for the development of their CAD techniques; the final technique would be applied to test images, which were not previously available to the investigators. The LIDC Steering Committee, however, decided against such a segregation of cases. The main reasons for this decision involved the limitations that would be imposed on investigators' use of the database and the inability of the LIDC to anticipate the full range of applications for which investigators will use the database.
The timing of the public release of a database such as this is an important issue. Ideally, the complete database containing all CT scans and verified diagnoses for all patients whose scans are included would be made available at the end of this project. We are aware that this approach, in practice, would severely limit the CAD research community; as the LIDC strives to compile the complete database, many researchers are eager to accept an initial collection of CT scans even without the follow-up scans and verified diagnoses that will eventually become available. To accommodate the larger goal of facilitating CAD research, we anticipate an initial release of approximately 100 CT scans by the end of 2004. These cases will be processed through the infrastructure we are creating so that a scan from the clinical workflow at a member institution can be placed into the central LIDC database. As previously described, this process includes identification of a qualifying scan, deletion of information that could be used to identify the patient, multiple reader and site estimates of the spatial location and extent of all nodules with blinded and unblinded reviews, acquisition of information for all database fields, and final cataloging.
The complete contents of the database will become openly available without restriction to the medical imaging research community and will be stored at and managed by the NCI. Image data sets will probably be provided on digital video disks (DVDs) at the request of investigators. Although search requests and database descriptors will be available through the Web, Web-based retrieval of image data may be impractical because of the size of the image files. Details of the distribution process, including potential fees, remain to be evaluated by the LIDC. The burden of monitoring the use of the database will then shift to individual investigators and the scientific community through the peer-review process, both for publications and for grants. Investigators and journal editors will be responsible for disclosing, and grant review study sections will be responsible for demanding, details on the precise manner in which image and clinical data were used. For example, the training and/or testing paradigm should be described so that other investigators may repeat the study on the same scans with their own methods. The LIDC intends to provide guidance but does not intend to impose restrictions on what should be reported in terms of important training and test set characteristics, such as size and subtlety. We fully expect that the success of the database and its effect on the community will become evident through the literature.
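One way an investigator might make a training and/or testing paradigm repeatable is to derive each scan's assignment deterministically from its identifier and to publish the resulting lists. The sketch below is a hypothetical illustration and not an LIDC procedure; the scan identifier format, the `salt` label, and the 30% test fraction are all assumptions of this example.

```python
import hashlib
from typing import Dict, Iterable, List


def assign_split(scan_id: str, test_fraction: float = 0.3,
                 salt: str = "study-v1") -> str:
    """Assign one scan to 'train' or 'test' from a hash of its identifier.

    The assignment depends only on scan_id, salt, and test_fraction, so any
    group that knows these three values can reproduce it exactly.
    """
    digest = hashlib.sha256(f"{salt}:{scan_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    return "test" if bucket < test_fraction else "train"


def split_manifest(scan_ids: Iterable[str], test_fraction: float = 0.3,
                   salt: str = "study-v1") -> Dict[str, List[str]]:
    """Build a manifest of training and test scans that can be reported."""
    manifest: Dict[str, List[str]] = {"train": [], "test": []}
    for scan_id in sorted(scan_ids):  # sorting makes the output order-independent
        manifest[assign_split(scan_id, test_fraction, salt)].append(scan_id)
    return manifest
```

Reporting the manifest, or simply the salt and test fraction, alongside the size and subtlety characteristics of each set would let other investigators apply their own methods to the same scans.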
| SUMMARY |
|---|
Author affiliations: University of Chicago, Ill (H.M.); University of Iowa, Iowa City (G.M., E.A.H.); University of California, Los Angeles (M.F.M.G., D.R.A.); University of Michigan, Ann Arbor (C.R.M., E.A.K.); Cornell University, New York, NY (D.Y., C.I.H., A.P.R.); National Cancer Institute, Bethesda, Md (B.Y.C., L.P.C., L.E.D.); University of Pittsburgh, Pa (D.G.).
Members of the LIDC Research Group: Lori E. Dodd, PhD, David Gur, ScD, Nicholas A. Petrick, PhD, Edward Staab, MD, Daniel C. Sullivan, MD, Robert F. Wagner, PhD, Peyton H. Bland, PhD, Keith Brautigam, BA, Matthew S. Brown, PhD, Barry DeYoung, MD, Roger M. Engelmann, MS, Andinet A. Enquobahrie, MS, Carey E. Floyd, Jr, PhD, Junfeng Guo, PhD, Aliya N. Husain, MD, Gary E. Laderach, BS, Charles E. Metz, PhD, Brian Mullan, MD, Richard C. Pais, BS, Christopher W. Piker, BS, James W. Sayre, DrPH, Adam Starkey.
| FOOTNOTES |
|---|
Abbreviations: CAD = computer-aided diagnosis, LIDC = Lung Image Database Consortium, NCI = National Cancer Institute
| REFERENCES |
|---|