Radiology
Published online before print October 21, 2004, 10.1148/radiol.2333031782
(Radiology 2004;233:868-877.)
© RSNA, 2004


Neuroradiology

Repeatability of Motor and Working-Memory Tasks in Healthy Older Volunteers: Assessment at Functional MR Imaging¹

Ian Marshall, PhD, Enrico Simonotto, PhD, Ian J. Deary, PhD, Alasdair Maclullich, MRCP, Klaus P. Ebmeier, MD, Emma J. Rose, MSc, Joanna M. Wardlaw, MD, Nigel Goddard, PhD and Francesca M. Chappell, MA

1 From the Depts of Medical Physics (I.M.), Psychiatry (E.S., K.P.E., E.J.R.), Psychology (I.J.D.), Geriatric Medicine (A.M.), Clinical Neurosciences (J.M.W., F.M.C.), and Informatics (N.G.); SHEFC Brain Imaging Research Ctr (I.M., I.J.D., K.P.E., J.M.W., N.G.); Ctr for Functional Imaging Studies (I.M., N.G.); and Ctr for the Study of the Ageing Brain (I.J.D., J.M.W.), Univ of Edinburgh, Western General Hosp, Edinburgh EH4 2XU, Scotland. Received Nov 5, 2003; revision requested Jan 27, 2004; revision received Feb 13; accepted Mar 23. Supported in part by a SHEFC research development grant and a small project grant from the Chief Scientist Office of the Scottish Executive. I.J.D. supported by a Royal Society-Wolfson Research Merit Award. Address correspondence to I.M. (e-mail: ian.marshall@ed.ac.uk).


    ABSTRACT
PURPOSE: To prospectively determine the repeatability of functional magnetic resonance (MR) imaging brain activation tasks in a group of healthy older male volunteers.

MATERIALS AND METHODS: Local research ethics committee approval and informed consent were obtained. Sixteen men with a mean age of 69 years ± 3 (standard deviation) performed finger-tapping and N-back (number of screens back) working-memory tasks. Each subject underwent MR imaging three times in weekly intervals. Within-subject task repeatability was analyzed in terms of the number of voxels classified as activated (activation extent), the mean activation amplitude, and (for finger tapping) the center of the mass of the activated region. A repeatability index was calculated to compare test-retest repeatability between subjects and between functional MR imaging tasks. Within-session, between-session, and between-subject variability was assessed by using analysis of variance testing of activation amplitude and extent.

RESULTS: Nine of the 16 subjects generated useful data at all three MR imaging–functional task sessions. At single-subject, single-session analysis, cortical activation was identified in most subjects and at most sessions. The centers of the masses of motor cortex activation were highly reproducible (within 3 mm). Patterns of activation were qualitatively repeatable, but there was substantial variability in the amplitudes and extents of activated regions. Within-session coefficients of variation (CVs) for left- versus right-hand and right- versus left-hand finger tapping were, respectively, 65% and 43% for activation amplitude and 75% and 121% for activation extent. The between-session CVs for activation amplitude were similar to the within-session values, whereas between-session CVs for activation extent were much greater than within-session values, up to 206%.

CONCLUSION: The generally poor quantitative task repeatability highlights the need for further methodologic developments before much reliance can be placed on functional MR imaging results of single-session experiments.


Index terms: Brain, function • Brain, MR, 13.121411, 13.121412, 13.121413, 13.121416 • Magnetic resonance (MR), functional imaging, 13.121411, 13.121412, 13.121413, 13.121416


    INTRODUCTION
There is substantial enthusiasm to apply functional magnetic resonance (MR) imaging to the noninvasive investigation of brain function in both basic and clinical neuroscience studies. One of the attractions of functional MR imaging is that it may be able to yield meaningful single-imaging-session results. However, to prove that single-session results are valid, it must be demonstrated that the patterns of brain activation that develop in response to the same task are repeatable between imaging sessions. In clinical settings, the day-to-day repeatability (also known as test-retest reliability) of results for individual subjects is important for validating the evaluation of patients in serial examinations. It must be possible to distinguish between the variations inherent to the examination method and the genuine day-to-day changes in subject brain activation. The degree of repeatability must also be known if quantitative measures of activation are to be used as correlating variables.

However, relatively little has been published on the repeatability of the single-session results of functional MR imaging (1–7), and the data reported in the literature are cause for concern. Although patterns of activation appear to be qualitatively repeatable, reported quantitative measures of repeatability are poor (2–7). Even the number of voxels that are activated during elementary visual and motor tasks has ranged between 0 and several thousand for repeated runs of a task performed by a given subject (5,6). Moser et al (2) reported standard deviations of voxel counts typically in the range of 20%–50% but with values of up to 100%. Some have suggested that amplitudes of brain activation are more stable than voxel counts (2,4).

As disappointing as these results are, the literature refers to results obtained almost exclusively in healthy, relatively young subjects. These individuals are in contrast to many of the actual patients who are likely to be recruited for functional MR imaging investigations of psychiatric and neurologic diseases, who are older. To our knowledge, the repeatability of functional MR imaging results in such subjects has not been investigated systematically. Thus, the purpose of our study was to prospectively determine the repeatability of brain activation tasks in a group of healthy older male volunteers.


    MATERIALS AND METHODS
Subjects and MR Imaging
On the basis of data published in the literature (2–7), we expected the effect size of brain activation to be large enough for us to perform a meaningful study of intraindividual task repeatability. We selected a modest subject sample size to investigate differences in task repeatability between individuals. The study was approved by the local research ethics committee, and all subjects gave informed written consent.

Potential participants were recruited from a cohort maintained by the Centre for the Study of the Ageing Brain with the aid of local general practitioners, who on our behalf recruited men aged 65 years and older who were known not to be taking medication regularly and not to have poor health. We then conducted further screening of the men in this group who expressed interest in participating in the study. This screening was performed by analyzing the detailed medical histories and the blood pressure measurements obtained by an experienced physician (A.M.). To minimize problems in interpreting activation patterns, we recruited only right-handed native English–speaking subjects.

Of the 20 participants screened, four were excluded at study screening. The remaining 16 subjects, who had a mean age of 69 years ± 3 (standard deviation), had normal cognitive ability according to Raven’s Standard Progressive Matrices Plus test results: Their mean test score was 33.2, which is approximately in the 48th percentile for 70-year-old individuals in the population of healthy persons in the United Kingdom (8). The lowest score, 27, is in approximately the 23rd percentile.

Each subject underwent three MR imaging sessions. At each session, localizer and structural MR images were obtained first, followed by functional imaging during a motor task, a working-memory task, and a repeat of the motor task. The subjects were trained to perform the functional MR imaging tasks, which they also rehearsed immediately before undergoing each MR imaging examination to minimize any residual learning effects. MR imaging was performed with a 1.5-T unit (Signa; GE Healthcare, Slough, Berks, United Kingdom) fitted with Echospeed (GE Healthcare) gradients and by using a standard head coil. Each subject underwent the three MR imaging–functional task sessions at the same time of day, 1 week apart, to minimize any natural diurnal variations. Subjects were instructed to maintain their usual diet and not to consume any alcohol the night before undergoing imaging or any caffeine-containing drinks on the morning of the examination.

We minimized effects due to gross differences in head positioning between imaging-task sessions by measuring the positions of the external auditory meatus and the inner canthus of the eye relative to the head coil at the first session. On subsequent sessions, the imagers used these measurements to assist with accurate repositioning.

The subjects first underwent standard transverse T2-weighted fast spin-echo brain MR imaging for radiologic reporting (by J.M.W.), as required by our ethics committee, with use of the following parameters: 6300/102 (repetition time msec/echo time msec), 240-mm field of view, 256 x 256 matrix, and 5-mm sections with 1.5-mm intersection gaps. This examination was followed by transverse three-dimensional gradient-echo MR imaging with inversion-recovery preparation to yield T1 weighting. The sequence parameters used for this examination were 8/3.4/600 (repetition time msec/echo time msec/inversion time msec), a 15° flip angle, a 220-mm field of view, a 192 x 256 matrix, and a block of 128 contiguous 1.7-mm sections.

Next, the subjects performed the cognitive tasks, which consisted of finger-tapping and N-back (number of screens back) working-memory (9) paradigms. Finally, the finger-tapping task was repeated. The paradigms were programmed in E-Prime software (Psychology Software Tools, Pittsburgh, Pa), with the instructions and stimuli presented visually on a liquid crystal display screen mounted on the head coil (IFIS; Psychology Software Tools). The subjects were provided with left- and right-hand push-button units (IFIS) so that their responses could be logged by the software. Each ergonomically shaped push-button unit had a push button beneath the thumb and beneath each fingertip.

Finger-tapping Task
The finger-tapping task was a self-paced activity in which the subjects pressed the push buttons of a hand unit in sequential order, alternating between the left and right hands in 30-second blocks. The total duration of the task was 3 minutes (for six blocks). Performing this simple task involves use of the motor cortex, and concentration is required to maintain the correct finger sequence. Subjects were instructed to use a pace that was comfortable to them.

N-Back Working-Memory Task
For the working-memory task, the subjects had to watch a changing screen display. The display showed four boxes, with a colored marker appearing randomly in one of them. The subjects had to respond by pressing the button that corresponded to the position of the marker that appeared a given number of screens (N) ago (9). We used a block design in which N = 0 and N = 1 were alternated for four blocks and then N = 0 and N = 2 were alternated for four blocks. Each one-back (ie, one-screen-back) and two-back block consisted of 14 stimuli, whereas the zero-back blocks consisted of nine stimuli. Each stimulus lasted 3 seconds, yielding a total paradigm duration of 10 minutes.

Blood Oxygenation Level–dependent Echo-planar MR Imaging
Contiguous blood oxygenation level–dependent (BOLD) echo-planar MR images were acquired at a rate of 10 images per second throughout each task, giving whole-brain coverage (a "volume") once per 2500-msec repetition time. The first four volumes were discarded to ensure that brain magnetization had reached a steady state before each task began. The other parameters used to acquire these images were a 64 x 64 matrix, a 3.8 x 3.8-mm in-plane pixel size, a 5-mm section thickness, and an echo time of 40 msec. The IFIS system logged the push-button responses to the tasks simultaneously with imaging. Each imaging session lasted a total of approximately 60–70 minutes, which included the time spent positioning the subject.
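The acquisition figures in this paragraph imply a few derived quantities. The following is a rough sketch of that arithmetic in Python, not something stated by the authors; it assumes that the 3-minute finger-tapping run and its 30-second blocks are counted after the four discarded volumes, and the function names are ours.

```python
# Acquisition parameters stated in the text.
IMAGES_PER_SECOND = 10       # echo-planar sections acquired per second
TR_SECONDS = 2.5             # repetition time: one whole-brain volume
SECTION_THICKNESS_MM = 5.0   # contiguous sections

def sections_per_volume(rate_hz, tr_s):
    """Number of sections that fit into one repetition time."""
    return round(rate_hz * tr_s)

def volumes_in_task(task_seconds, tr_s):
    """Number of complete volumes acquired during a task of the given length."""
    return int(task_seconds // tr_s)

n_sections = sections_per_volume(IMAGES_PER_SECOND, TR_SECONDS)
coverage_mm = n_sections * SECTION_THICKNESS_MM          # head-foot extent covered
tapping_volumes = volumes_in_task(3 * 60, TR_SECONDS)    # whole finger-tapping run
volumes_per_block = volumes_in_task(30, TR_SECONDS)      # one 30-second block
```

Under these assumptions each volume contains 25 sections spanning 125 mm, and each 30-second finger-tapping block contributes 12 volumes to the time series.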

Imaging Data Transfer and Statistical Analyses
Imaging data were transferred to a multiprocessor computer (Compaq Alpha ES40; Hewlett-Packard, Bracknell, Berks, United Kingdom) at the Centre for Functional Imaging Studies. Analyses were performed by using the SPM99 Statistical Parameter Mapping package (available at: www.fil.ion.ucl.ac.uk/spm, accessed January 2000). The images were realigned to correct for head motion, normalized to the SPM99 echo-planar imaging template, and then smoothed with an isotropic Gaussian 6-mm full width at half maximum filter. Volume-to-volume global signal intensity normalization, together with temporal filters (low-pass-filter full width at half maximum of 4 seconds; high-pass-filter cutoff of 120 seconds for finger tapping and of 149 seconds for N-back task), was applied.
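The smoothing step is specified by its full width at half maximum (FWHM) rather than by a standard deviation. As a minimal sketch (not SPM99 code), the conversion and a normalized one-dimensional Gaussian kernel can be written as below. The voxel size used in the example is a hypothetical value, because the voxel size after normalization is not given in the text.

```python
import math

def fwhm_to_sigma(fwhm):
    """Convert a Gaussian FWHM to its standard deviation (same units)."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def gaussian_kernel_1d(fwhm_mm, voxel_mm, radius_sigmas=3.0):
    """Normalized 1D Gaussian kernel sampled on the voxel grid."""
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_mm
    half = max(1, math.ceil(radius_sigmas * sigma_vox))
    weights = [math.exp(-0.5 * (k / sigma_vox) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

sigma_mm = fwhm_to_sigma(6.0)                       # 6-mm FWHM filter from the text
kernel = gaussian_kernel_1d(6.0, voxel_mm=2.0)      # 2-mm voxels: hypothetical
```

An isotropic three-dimensional filter is separable, so the same kernel would be applied along each axis in turn.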

Motion estimates were included in the design matrix of the paradigms, and statistical parametric maps of brain activation were generated. Comparison images showing the left- versus right-hand and right- versus left-hand differences in finger tapping were generated. Voxels that had significantly higher signal intensities during left-hand finger tapping than during right-hand finger tapping (P < .05) were included on the left-versus-right comparison image, and vice versa for the right-versus-left comparison image. In addition, parametric comparison images were generated for increasing N-back task difficulty (from zero [screen] back to one back to two back). All single-session comparison images were thresholded at P = .05, corrected for multiple voxels, with no cluster threshold.

For each subject and task, a second-level random-effects analysis (5) consisting of a one-sample t test for the individual-session comparison images yielded estimates of mean brain activation. For this purpose, we treated each pair of runs of the finger-tapping task as separate runs. A third-level t test analysis of the individual-subject mean activation maps yielded a population estimate of brain activation.

Statistical Analyses of Finger-tapping and N-Back Tasks
In a more detailed analysis of the finger-tapping task, the number of activated voxels in the contralateral motor cortex was determined from the comparison MR images obtained at each MR imaging session and in each subject. To reduce the effects of any artifactual activation in nonmotor areas, voxel counting was restricted to the superior image sections (ie, those with z-axis coordinates of ≥38 mm) of the appropriate hemisphere. The relative activation amplitude was calculated by normalizing the mean amplitude of motor cortex activation (the β parameter in the SPM99 general linear model) according to the global steady-state amplitude.

Repeatability (ie, overlap) maps (10) of the motor cortex showing the number of times (out of a maximum of six) that a given voxel was classified as activated were generated. In addition, centers of the masses of the activated regions were determined by calculating weighted means of the activation amplitudes across the regions. The default Montreal Neurological Institute coordinates were converted to Talairach coordinates by using a nonlinear transformation program (available at: www.mrc.cbu.cam.ac.uk/Imaging, accessed December 2002). Statistical analyses of the activation results, including analysis of variance and Bland and Altman (11) tests, were performed by one of the authors (F.M.C.) by using SAS, version 8 (available at: www.sas.com, accessed May 2003).
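The three region measures described above (activation extent in the superior sections, the overlap map, and the amplitude-weighted center of mass) can be sketched with NumPy as follows. This is an illustration under our own assumptions, not the authors' code: the array layout, the function names, and the toy data are hypothetical, and the restriction to the appropriate hemisphere is omitted for brevity.

```python
import numpy as np

def count_activated(active, z_mm, z_min_mm=38.0):
    """Count activated voxels in sections at or above z_min_mm."""
    return int(np.count_nonzero(active[:, :, z_mm >= z_min_mm]))

def overlap_map(run_masks):
    """Per-voxel number of runs in which the voxel was classified as activated."""
    return np.sum(np.stack(run_masks).astype(int), axis=0)

def center_of_mass(amplitude, active, coords_mm):
    """Amplitude-weighted mean coordinate (x, y, z) of the activated voxels."""
    w = amplitude[active]
    return (coords_mm[active] * w[:, None]).sum(axis=0) / w.sum()

# Toy example with hypothetical data (not from the study).
z_mm = np.array([30.0, 38.0, 46.0])                 # z coordinate of each section
run1 = np.zeros((4, 4, 3), dtype=bool)
run1[0, 0, 0] = run1[1, 1, 1] = run1[3, 1, 1] = True
run2 = np.zeros((4, 4, 3), dtype=bool)
run2[1, 1, 1] = True

extent = count_activated(run1, z_mm)                # the voxel at z = 30 mm is excluded
overlap = overlap_map([run1, run2])

coords_mm = np.stack(
    np.meshgrid(np.arange(4.0), np.arange(4.0), z_mm, indexing="ij"), axis=-1
)
amplitude = np.ones((4, 4, 3))
amplitude[3, 1, 1] = 3.0
superior = run1 & (z_mm >= 38.0)                    # broadcast over the z axis
com = center_of_mass(amplitude, superior, coords_mm)
```

With six runs per subject, the values in the overlap map range from 0 to 6, which is the quantity that the repeatability maps display.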

A repeatability index (RI) was calculated to compare the test-retest repeatability for different subjects and different functional MR imaging tasks. We used the following equation to determine the RI:

$$\mathrm{RI} = \frac{1}{N-1}\left(\frac{\sum_{i=1}^{N} i\,n_i}{\sum_{i=1}^{N} n_i} - 1\right)$$

where N is the total number of runs of a task, n_i is the number of voxels that were classified as activated during exactly i runs, and i n_i is the product of i and n_i. The RI is essentially the mean number of times that voxels were classified as activated, normalized to a range of 0–1. A value of 0 indicates that no voxels were activated during more than one run, whereas a value of 1 indicates that any voxels that were classified as activated were activated during all runs.
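As a minimal sketch, the RI can be computed from a histogram of overlap counts. The code assumes the form RI = (mean number of activations per voxel - 1) / (N - 1), which is our reading of the verbal definition above; the function name and the handling of the empty case are our own choices. The example counts are the approximate values quoted for subject 3 in the Discussion.

```python
def repeatability_index(counts_by_runs, n_runs):
    """Repeatability index from overlap counts.

    counts_by_runs maps i (number of runs in which a voxel was activated,
    1..n_runs) to n_i (number of voxels activated in exactly i runs).
    """
    total_voxels = sum(counts_by_runs.values())
    if total_voxels == 0 or n_runs < 2:
        return 0.0  # assumption: the RI is reported as 0 when it is undefined
    mean_times = sum(i * n for i, n in counts_by_runs.items()) / total_voxels
    return (mean_times - 1.0) / (n_runs - 1)

# Approximate overlap counts for subject 3, left-hand finger tapping, six runs.
subject3 = {1: 2130, 2: 1000, 3: 1000, 4: 1000, 5: 1000, 6: 2019}
ri = repeatability_index(subject3, n_runs=6)
print(round(ri, 2))  # 0.49
```

The result agrees with the RI of 0.49 reported for this subject, which supports this reading of the definition.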

A similar statistical analysis was performed for the N-back task; however, centers of the masses of activation were not determined because widespread activation was found.


    RESULTS
Subjects Analyzed
Three of the 16 subjects recruited were unable to tolerate the imaging procedure and thus withdrew from the study. Their ages (mean age, 70 years ± 3 [standard deviation]) were not significantly different from those of the remaining 13 subjects (mean age, 69 years ± 3). An additional two subjects had minor brain abnormalities—one characteristic of dural calcification and the other characteristic of an old asymptomatic cerebral microhemorrhage—which, although not infrequent in this age group (12), caused susceptibility artifacts on the functional MR images that precluded further analysis. Another subject was excluded from analysis because the image registration estimates indicated that the peak-to-peak subject head motion during both tasks was greater than 3 mm, even after linear detrending.

These exclusions left 10 subjects. Subject 5 was unable to perform the finger-tapping task properly. Thus, nine subjects (mean age, 70 years ± 3) were left for finger-tapping analysis. Subject 10 was excluded from the N-back analysis because of excessive head motion, again leaving nine subjects (mean age, 69 years ± 4).

Finger-tapping Analysis Results
Subject 6 did not perform the second finger-tapping paradigm during his first MR imaging–functional task session. Otherwise, this task generated the expected activation in the contralateral motor cortex and in the ipsilateral cerebellum during most runs in most subjects, as shown in Table 1. The overall mean amplitude of activation was 0.78% of the at-rest signal intensity.


TABLE 1. Finger-tapping Task: Numbers of Activated Voxels and Mean Activation

 
The results for subject 3, who exhibited good task repeatability (RI for left- vs right-hand finger tapping, 0.49), are shown in Figure 1, whereas the results for subject 7, who exhibited poor task repeatability (RI = 0.13), are shown in Figure 2. Note that the diagrammatic representations in Figures 1a and 2a show all activated voxels throughout the brain; however, only those voxels with z-axis coordinates of 38 mm or greater were included in the analysis. Statistical analysis results for the finger-tapping task are summarized in Tables 1–3. The group repeatability map generated for all nine subjects is shown in Figure 3.



Figure 1. (a) Diagrammatic representation of brain activation induced by left-hand finger tapping. All six results (three MR imaging sessions times two task runs) are shown for subject 3, who showed good task repeatability (RI = 0.49). Within each block, sagittal (top left image), coronal (top right image), and transverse (bottom left image) projections are shown. (b) Corresponding transverse MR image sections overlaid with a reliability map show numbers of times (out of maximum of six) voxels in the motor cortex were classified as activated during six runs of the finger-tapping task. Colors range from dark blue (indicating voxels were classified as activated one time) to red (six times). Background images are a template derived from the mean of 20 normalized inversion-recovery-prepared gradient-echo MR images (8/3.4/600, 15° flip angle) obtained in a group of healthy elderly men. Images are shown in neurologic convention (ie, image left corresponds to subject left).

 


Figure 2. (a) Diagrammatic representation of brain activation induced by left-hand finger tapping. All six results (three MR imaging sessions times two task runs) are shown for subject 7, who showed poor task repeatability (RI = 0.13). Within each block, sagittal (top left image), coronal (top right image), and transverse (bottom left image) projections are shown. (b) Corresponding transverse MR image sections overlaid with a reliability map show numbers of times (out of maximum of six) voxels in the motor cortex were classified as activated during six runs of the finger-tapping task. Colors range from dark blue (indicating voxels were classified as activated one time) to red (six times). Background images are a template derived from the mean of 20 normalized inversion-recovery-prepared gradient-echo MR images (8/3.4/600, 15° flip angle) obtained in a group of healthy elderly men. Images are shown in neurologic convention (ie, image left corresponds to subject left).

 


TABLE 2. Finger-tapping Task: Overlap Statistics and RI Values for Motor Cortex

 

TABLE 3. Standard Deviations and Coefficients of Variation for Repeated Measurements in a Given Subject at Two-Way Analysis of Variance

 


Figure 3. Twenty-four transverse MR image sections on the group repeatability map generated for nine subjects who participated in a total of 53 task runs show numbers of times (out of maximum of 53) voxels in the motor cortex were classified as activated. Colors range from dark blue (indicating voxels were classified as activated one time) to orange (28 times) during 53 runs of the finger-tapping task. Background images are a template derived from the mean of 20 normalized inversion-recovery-prepared gradient-echo MR images (8/3.4/600, 15° flip angle) obtained in a group of healthy elderly men. Images are shown in neurologic convention (ie, image left corresponds to subject left).

 
To assess within-session effects, a Bland and Altman analysis (11) of the paired differences (between runs 1 and 2) in each of the three sessions was performed for both left-hand and right-hand finger tapping. All of the 95% confidence intervals of the differences contained the zero point, indicating that the true mean differences could be zero; however, the width of the 95% confidence intervals (and the magnitude of the standard deviations) relative to the magnitude of the measurements implies that these measurements were not repeatable to a useful degree.
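For readers unfamiliar with this kind of analysis, the sketch below computes the mean paired difference, its standard deviation, and 95% limits of agreement (mean difference ± 1.96 SD), which is the usual Bland and Altman construction. Whether the intervals reported here were computed in exactly this way is an assumption, and the voxel counts in the example are hypothetical, not study data.

```python
import statistics

def bland_altman(run1, run2):
    """Mean difference, SD of differences, and 95% limits of agreement.

    run1 and run2 are paired measurements (eg, voxel counts from the first
    and second runs of the same session).
    """
    diffs = [b - a for a, b in zip(run1, run2)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)      # sample SD (n - 1 denominator)
    half_width = 1.96 * sd_diff
    return mean_diff, sd_diff, (mean_diff - half_width, mean_diff + half_width)

# Hypothetical voxel counts for four subjects in one session.
run1_counts = [1200, 800, 2500, 400]
run2_counts = [1500, 600, 3900, 500]
mean_diff, sd_diff, limits = bland_altman(run1_counts, run2_counts)
contains_zero = limits[0] <= 0 <= limits[1]
```

In this toy example the interval contains zero, yet it is wide relative to the counts themselves, which is the same pattern that the paragraph above describes.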

Thus, at overall mean voxel counts of 1547 (left vs right hand) and 1145 (right vs left hand), the mean differences between run 1 and run 2 ranged from –911 to +63, depending on the day (ie, session) of the imaging-task and whether the left or right hand was used to perform the task. The standard deviations ranged from 846 to 1644, so the tendency for the voxel counts to be greater during the second run of the task than during the first was not significant. There was a tendency for the larger voxel counts to be less repeatable.

Activation amplitudes were slightly more repeatable: At mean activations of 0.80% (left vs right hand) and 0.76% (right vs left hand), the mean differences between run 1 and run 2 ranged from –0.44% to +0.08%, with standard deviations of 0.19%–0.92%. RIs ranged from 0.00 (subject 6, left vs right hand) to 0.49 (subject 3, left vs right hand), with a mean value of 0.16. Finger-tapping frequency, as determined from the logged push-button responses, was not significantly correlated with activation amplitude or activation extent.

Since on a given day the grouped run 1 and run 2 results were shown not to be systematically different (11), they were treated identically and used in a two-way analysis of variance, with the lack of independence of measurements obtained in the same subject taken into account (13). No significant correlation between given subject and day of the task was found. Summary results are shown in Table 3. This analysis yielded within-session CVs of 75% and 121% for voxel counts at left- and right-hand finger tapping, respectively. Within-session CVs of 65% and 43% were calculated for activation amplitudes at left- and right-hand finger tapping, respectively. The between-session CVs for activation amplitudes were similar to the within-session results. However, the between-session variability in voxel counts was much greater than the within-session variability, especially at left- versus right-hand finger tapping, for which the CV was 206%.
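A coefficient of variation expresses a standard deviation as a percentage of the mean. The sketch below estimates a within-session CV directly from pairs of repeated runs by using the within-subject standard deviation for duplicate measurements. This is a simplified stand-in for the two-way analysis of variance used in the study, not a reproduction of it, and the data are hypothetical.

```python
import math

def within_session_cv(pairs):
    """Within-session CV (%) from (run 1, run 2) pairs measured in one session.

    Uses s_w = sqrt(sum(d^2) / (2n)) for n pairs with differences d,
    divided by the grand mean of all measurements.
    """
    n = len(pairs)
    s_w = math.sqrt(sum((a - b) ** 2 for a, b in pairs) / (2 * n))
    grand_mean = sum(a + b for a, b in pairs) / (2 * n)
    return 100.0 * s_w / grand_mean

# Hypothetical activation extents (voxel counts) for three sessions.
pairs = [(100, 140), (200, 180), (50, 90)]
cv_percent = within_session_cv(pairs)
```

A CV above 100%, as reported for several of the voxel-count measures, means that the standard deviation of repeated measurements exceeds the mean measurement itself.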

The centers of the masses of contralateral cortical activation at right-hand finger tapping closely mirrored those at left-hand finger tapping. The centers of the masses were highly reproducible, with intrasubject standard deviations of approximately 2 mm at each coordinate. The corresponding standard deviations for the group were approximately 3 mm, thus showing that the finger-tapping task was also repeatable between subjects when individual results were spatially normalized. The centers of the masses—that is, the Talairach coordinates—were 36, –24, and 57 mm for left- versus right-hand finger tapping and –37, –26, and 56 mm for right- versus left-hand finger tapping.

N-Back Analysis Results
During the N-back task, many regions (including the superior frontal, middle frontal, occipital, and parietal gyri, and the cerebellum) showed increased activity (overall mean amplitude, 2.5% of at-rest signal intensity) with increasing difficulty (Fig 4). On the other hand, the medial temporal gyrus and the cingulate gyrus were more active with decreasing task difficulty. N-back repeatability statistical analysis results are shown in Tables 3 and 4. RIs ranged from 0.00 (subject 8) to 0.16 (subject 1), with a mean value of 0.09.



Figure 4. Thirty transverse MR image sections on a random effects group analysis map generated for nine subjects who performed the N-back working-memory task. Brain activation increased with increasing task difficulty, as indicated by the red, orange, and yellow areas. Brain activation also increased with decreasing task difficulty, as indicated by the blue areas. Background images are a template derived from the mean of 20 normalized inversion-recovery-prepared gradient-echo MR images (8/3.4/600, 15° flip angle) obtained in a group of healthy elderly men. Images are shown in neurologic convention (ie, image left corresponds to subject left).

 

TABLE 4. N-Back Working-Memory Task: Numbers of Activated Voxels, Mean Activation, Overlap Statistics, and RI Values

 
N-back behavioral scores—that is, the percentages of responses that were correct—correlated weakly with the numbers of activated voxels (R = 0.47, P = .01) but did not correlate with activation amplitudes. Two-way analysis of variance yielded between-session CVs of 122% for voxel counts and 34% for activation amplitudes (Table 3). Owing to the limited sample size, it was not meaningful to look for correlations between the N-back and finger-tapping results (11).


    DISCUSSION
Intolerance to MR imaging, minor asymptomatic abnormalities, and excessive head motion reduced the study group size from 16 recruited participants to nine participants with useful data sets. This attrition rate may at first appear rather high, but we have no reason to believe that it is not representative of those in comparable studies with elderly subjects. Clinical studies involving patients are likely to have an even higher "dropout" rate, and this must be taken into account when designing studies.

Our study results show that patterns of activation are qualitatively robust for a given subject. The centers of the masses of activation had standard deviations of around 2 mm for individuals. However, the mean activation amplitudes and the numbers of activated voxels in particular were highly variable during both the finger-tapping and the N-back working-memory tasks. For example, even in the subject who had the highest task repeatability for left-hand finger tapping (subject 3), there was a 2:1 ratio for the number of activated voxels from run to run. Furthermore, only 2019 voxels were consistently activated across all six runs, with 2130 activated only once and approximately 1000 activated on two, three, four, and five runs. The resulting RI was 0.49. Ideally, voxels would be either always activated or never activated for a task familiar enough to the subject that learning effects were negligible—that is, a task that yields an RI of 1.0. No RI came even close to this ideal value.

Our finding of CVs for voxel counts on the order of 100% was somewhat disappointing, but it confirmed the consensus view reported in the literature, which is based on findings mainly in younger people. Various studies (2,3,6) of visual activation in younger subjects have revealed that the numbers of activated voxels vary greatly, even within subjects—for example, between 10 and more than 1000 voxels in one study (6). In another study, the proportion of voxels activated between two runs (ie, the overlap), averaged across all subjects, was 0.74 between two runs within a session and 0.64 between two runs on different days (3).
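The overlap values quoted above can be illustrated with a short sketch. It assumes the commonly used form overlap = 2 x V_both / (V_1 + V_2), where V_1 and V_2 are the numbers of voxels activated in each run and V_both is the number activated in both; the cited studies may define the measure differently. The function name and the toy voxel sets are hypothetical.

```python
def overlap_ratio(run1, run2):
    """Overlap of two sets of activated voxel indices: 2*|A & B| / (|A| + |B|)."""
    a, b = set(run1), set(run2)
    if not a and not b:
        return 0.0  # no activation in either run; overlap is defined as 0 here
    return 2 * len(a & b) / (len(a) + len(b))


# Toy example (hypothetical voxel indices, not study data).
run1 = {1, 2, 3, 4, 5, 6}
run2 = {4, 5, 6, 7}
print(overlap_ratio(run1, run2))  # 2*3 / (6+4)
```

With this form, two runs that activate entirely different voxels give 0.0 and two identical activation maps give 1.0.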

In the Miki et al study (6), some subjects had overlap values of 0.00. This was also true in the McGonigle et al study (5), in which a single volunteer was imaged 33 times over 2 months while performing motor, visual, and cognitive tasks. There was a wide range of activation, from 0 to more than 1000 voxels for the motor task. Specht et al (14) showed that the proportion of voxels activated between two sessions was modulated by attention level, with an overlap of 0.42 for a passive viewing task and of 0.69 for an active task. Other measures of repeatability (intraclass correlation coefficients and correlations of t test scores between sessions) yielded generally consistent results.

Interestingly, we found that the activation amplitudes during the N-back working-memory task were greater than those during finger tapping (mean amplitude, 2.54% vs 0.80%). One might have expected the higher-level cognitive task (N-back working memory) to generate less activation than the task that directly stimulates primary motor cortex activity (finger tapping). It may be that higher concentration was required to perform the working-memory task, and this is reflected in the higher amplitude observed.

Both Moser et al (2) and Cohen and DuBois (4) found that the relative amplitude of activation was more repeatable than the number of activated voxels. This was confirmed by our results: CVs of motor activation amplitudes were in the range of 42%–85%, as compared with CVs of 75%–206% for voxel counts. For the N-back task, we calculated a between-session CV of 34% for activation amplitude, as compared with a CV of 122% for activation extent. Mattay et al (1) examined eight volunteers aged 23–37 years, each during three imaging sessions with a finger-tapping task, and found qualitatively reproducible patterns. Within the primary sensorimotor cortex, 70% of voxels were activated during all three sessions; this is higher than the percentage of activated voxels determined in the present study with older subjects. The centers of the masses of activated regions were displaced between 3 and 9 mm between sessions. Our results for the centers of the masses were more consistent, with typical standard deviations of 2 mm in each direction (less than 4 mm total) after spatial normalization.
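The "less than 4 mm total" figure is consistent with combining the per-direction standard deviations in quadrature. A minimal check, assuming that the three directions vary independently and that each has a standard deviation of 2 mm:

```python
import math

sd_per_axis_mm = 2.0  # typical per-direction SD quoted in the text
# Root sum of squares over the x, y, and z directions (independence assumed).
total_sd_mm = math.sqrt(3 * sd_per_axis_mm ** 2)
print(round(total_sd_mm, 2))  # 3.46
```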

Our definition of RI enables one to compare results, regardless of the number of task runs, instead of calculating an average overlap across all possible pairs of runs, as has been done by using the simple overlap parameter described in the literature. The RI that we used is more conservative in that consistent activation across all runs is necessary to achieve a high value.

Poor repeatability of voxel counts has been attributed to voxel-wise noise levels that vary considerably between runs and thus affect the statistical significance of the comparison results (4). Subject head motion may be a contributing factor, but we found no correlation between activation results and motion estimates. A subject’s attention level or performance may contribute to variability, but the only significant correlation that we found was that between N-back score and voxel count (R = 0.47, P = .01).

To avoid relying on null conditions with poorly specified brain states, we calculated the differences between well-defined states. These were differences between left- and right-hand finger tapping and differences seen with parametric increases in N-back task difficulty. To reduce the effect of any artifactual activation in nonmotor brain areas during the finger-tapping task, we restricted voxel counting to the superior image sections in the appropriate hemisphere.

Deciding how to classify voxels as activated remains a problem in functional MR imaging analysis. Essentially, a threshold has to be chosen for use in converting a continuous variable, such as signal intensity level at BOLD MR imaging, to a binary parameter. Most authors have used either a correlation coefficient of around 0.5 or the t value that corresponds to a corrected P value of less than .05. Genovese et al (10) developed a statistical framework for selecting the optimum threshold that is based on multiple runs with a given subject and task. Noll et al (15) illustrated the technique of Genovese et al by using motor and working-memory tasks. Unfortunately, it is generally not possible to conduct multiple sessions in a clinical context or with clinical research volunteers because they may be unable to tolerate prolonged imaging times, so this method is not generally applicable.
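The thresholding step described above can be sketched in a few lines. The statistic values and the two cutoffs are invented purely for illustration and are not taken from the study; the point is only that the count of "activated" voxels can change substantially with the chosen threshold.

```python
def count_activated(stat_values, threshold):
    """Binarize a continuous statistic map and count voxels at or above threshold."""
    return sum(1 for value in stat_values if value >= threshold)


# Hypothetical t values for ten voxels (not study data).
t_values = [0.4, 1.2, 2.9, 3.1, 3.3, 4.8, 5.0, 5.2, 6.7, 7.5]

for threshold in (3.0, 5.0):
    print(threshold, count_activated(t_values, threshold))
```

In this toy map, seven voxels pass the lower cutoff and four pass the higher one, which is why voxel counts are sensitive both to the threshold and to run-to-run changes in noise.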

Given that wide variability of results appears to be unavoidable, at least for the time being, Waldvogel et al (7) investigated whether a standard simple task could be used to calibrate a task of interest. They conducted six separate sessions in which a group of six volunteers performed finger tapping and a visual activation task at each session. For three subjects, there was a strong correlation between the activation amplitudes, but not the numbers of activated voxels, during the two tasks that could not be explained by image noise or head motion. The authors attributed these variations to physiologic factors, the attention level, and the alertness of the subjects. Unfortunately, there was little activation in one or the other of the tasks among the three other subjects, ruling out the possibility of task calibration. In the present study, our small sample size ruled out a statistically meaningful investigation of this issue (11).

We examined older subjects so that the findings would be more relevant to clinical research (for example, on aging and degenerative disease) than if we had simply recruited subjects from among staff members and students. Our results, like those reported in the literature for young volunteers, are probably "best case" outcomes obtained with carefully controlled simple experiments and motivated subjects.

Our study results demonstrate that functional MR imaging studies are possible with older subjects. However, minor asymptomatic abnormalities, which are more likely to be present in older than in younger subjects, may complicate the interpretation of image findings and demand neuroradiologic assessment. Incidental calcification can confound image processing. The influence of periventricular areas of white matter hyperintensity, which are common in older persons, has not been addressed.

We found that extra care was required in training the subjects to ensure that they understood the tasks and were able to respond correctly. Even with these provisos, useful results were obtained at the majority of the MR imaging sessions. The results with more complex cognitive tasks and for patients are unlikely to be as good. For example, Machielsen et al (16) studied episodic memory in 10 volunteers who were imaged twice within one session and then once again several days later. They found poor correspondence of voxel counts between the sessions (mean overlap statistic, 36%), although it was slightly better (overlap statistic, 49%) within the first session.

In the Manoach et al (17) study, the findings in volunteers and in individuals with schizophrenia who performed a working-memory task during two sessions conducted several weeks apart were compared. The individuals with schizophrenia showed less reliable activation in brain regions controlling cognitive tasks than did the control subjects, although the two groups had similar results in regions controlling motor tasks. Unfortunately, the study was confounded because of the poorly matched subject groups: Compared with the control subjects, the subjects with schizophrenia were 10 years older and had IQs more than 10 points lower.

Group studies will also be confounded when different subjects use different cognitive strategies, thereby activating different brain regions. For example, Miller et al (18) studied individual differences in episodic memory processing in six subjects on two occasions. The activation patterns in the different subjects were visually different, with a mean correlation across subjects of only 0.20. The mean correlation between sessions for a given subject was 0.48.

There were limitations in the present study. Subjects were trained to perform the tasks before participating in the first MR imaging–functional task session. They also practiced the tasks immediately before undergoing each imaging examination to minimize any residual learning effects. Familiarity with the tasks in the imaging environment was expected to increase with each session, and this learning effect might have led to differences in brain activation between sessions. However, no significant learning effects were observed. In a future full-scale study, it would be interesting to fully investigate the possible influence of learning effects by using larger numbers of subjects and imaging-task sessions.

The finger-tapping task may not have been optimal for producing repeatable results. Self pacing in particular may have introduced an avoidable confounding effect. Nevertheless, we found no correlation between finger-tapping frequency and measures of activation. It should also be noted that by investigating only left- versus right-hand and right- versus left-hand differences we ignored any activation associated with an at-rest condition and thus may have weakened the repeatability results.

MR imaging units with higher field strengths (3.0 T and greater) might be expected to provide more consistent results. However, Tegeler et al (19) found that repeatability did not appear to be any better at 4.0 T than at the more usual field strength of 1.5 T. This finding suggests that physiologic rather than physical factors limit the signal-to-noise ratio at BOLD MR imaging. The fact that the use of higher field strengths leads to increased susceptibility artifacts is also a problem, making the study of certain brain regions technically more difficult. As more and more high-field-strength MR imaging units are installed, we can expect more functional MR imaging studies to further address such issues in the near future. In the meantime, it appears unsound to place too much reliance on the functional MR imaging results of single-session experiments.

In summary, a group of healthy older volunteers successfully performed motor and cognitive tasks. Within-subject task repeatability was similar to that reported in the literature for healthy young subjects, with generally consistent patterns of activation but poor quantitative repeatability. Within-session CVs were on the order of 100% for activation extents and up to 65% for activation amplitudes. The centers of the masses of motor cortex activation were consistently within 2–4 mm. It is evident that further methodologic improvements in task design and data analysis must be made before single-session functional MR imaging activation results can be relied on. Meanwhile, we believe that the most legitimate use of functional MR imaging is in group-averaged studies of cognitive function. For clinical studies, these investigations will require carefully chosen homogeneous patient cohorts.


    ACKNOWLEDGMENTS
 
This work was performed at the SHEFC Brain Imaging Research Centre for Scotland (BIRCS: www.dcn.ed.ac.uk/bic). We are indebted to Kristin Haga, PhD, for final training of the subjects and to the radiographers of the SHEFC BIRCS for performing subject imaging.


    FOOTNOTES
 
Abbreviations: BOLD = blood oxygenation level dependent, CV = coefficient of variation, RI = repeatability index

Authors stated no financial relationship to disclose.

Author contributions: Guarantor of integrity of entire study, I.M.; study concepts and design, I.M., E.S., I.J.D., K.P.E., E.J.R., J.M.W., N.G.; literature research, I.M., E.S.; clinical studies, A.M., J.M.W.; data acquisition, I.M., E.S., A.M.; data analysis/interpretation, I.M., E.S., K.P.E., J.M.W.; statistical analysis, F.M.C.; manuscript preparation, I.M., A.M., J.M.W.; manuscript definition of intellectual content and final version approval, I.M., I.J.D., K.P.E., E.J.R., A.M., J.M.W.; manuscript editing, I.M.; manuscript revision/review, I.M., E.S., I.J.D., K.P.E., E.J.R., J.M.W.


    REFERENCES
 

  1. Mattay VS, Frank JA, Santha AK, et al. Whole-brain functional mapping with isotropic MR imaging. Radiology 1996; 201:399-404.
  2. Moser E, Teichtmeister C, Diemling M. Reproducibility and postprocessing of gradient-echo functional MRI to improve localisation of brain activity in the human visual cortex. Magn Reson Imaging 1996; 14:567-579.
  3. Rombouts SA, Barkhof F, Hoogenraad FG, Sprenger M, Scheltens P. Within-subject reproducibility of visual activation patterns with functional magnetic resonance imaging using multisection echo planar imaging. Magn Reson Imaging 1998; 16:105-113.
  4. Cohen MS, DuBois RM. Stability, repeatability, and the expression of signal magnitude in functional magnetic resonance imaging. J Magn Reson Imaging 1999; 10:33-40.
  5. McGonigle DJ, Howseman AM, Athwal BS, Friston KJ, Frackowiak RS, Holmes AP. Variability in functional MR imaging: an examination of intersession differences. Neuroimage 2000; 11:708-734.
  6. Miki A, Raz J, van Erp TG, Liu CS, Haselgrove JC, Liu GT. Reproducibility of visual activation in functional MR imaging and effects of postprocessing. AJNR Am J Neuroradiol 2000; 21:910-915.
  7. Waldvogel D, van Gelderen P, Immisch I, Pfeiffer C, Hallett M. The variability of serial functional MR imaging data: correlation between a visual and a motor task. Neuroreport 2000; 11:3843-3847.
  8. Raven J, Raven JC, Court JH. Manual for Raven’s progressive matrices and vocabulary scales. Section 3, The standard progressive matrices. Oxford, England: Oxford Psychologists Press, 1998.
  9. Casey BJ, Cohen JD, O’Craven K, et al. Reproducibility of fMRI results across four institutions using a spatial working memory task. Neuroimage 1998; 8:249-261.
  10. Genovese CR, Noll DC, Eddy WF. Estimating test-retest reliability in functional MR imaging. I. Statistical methodology. Magn Reson Med 1997; 38:497-507.
  11. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986; 1:307-310.
  12. Kato H, Izumiyama M, Izumiyama K, Takahashi A, Itoyama Y. Silent cerebral microbleeds on T2*-weighted MRI: correlation with stroke subtype, stroke recurrence, and leukoaraiosis. Stroke 2002; 33:1536-1540.
  13. Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res 1999; 8:135-160.
  14. Specht K, Willmes K, Shah NJ, Jäncke L. Assessment of reliability in functional imaging studies. J Magn Reson Imaging 2003; 17:463-471.
  15. Noll DC, Genovese CR, Nystrom LE, et al. Estimating test-retest reliability in functional MR imaging. II. Application to motor and cognitive activation studies. Magn Reson Med 1997; 38:508-517.
  16. Machielsen WC, Rombouts SA, Barkhof F, Scheltens P, Witter MP. FMRI of visual encoding: reproducibility of activation. Hum Brain Mapp 2000; 9:156-164.
  17. Manoach DS, Halpern EF, Kramer TS, et al. Test-retest reliability of a functional MRI working memory paradigm in normal and schizophrenic subjects. Am J Psychiatry 2001; 158:955-958.
  18. Miller MB, van Horn JD, Wolford GL, et al. Extensive individual differences in brain activations associated with episodic retrieval are reliable over time. J Cogn Neurosci 2002; 14:1200-1214.
  19. Tegeler C, Strother SC, Anderson JR, Kim SG. Reproducibility of BOLD-based functional MRI obtained at 4 T. Hum Brain Mapp 1999; 7:267-283.



