Biophysical Considerations in the Precision of Quantitative 18 F-FDG PET/CT DISSERTATION


Biophysical Considerations in the Precision of Quantitative 18F-FDG PET/CT

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By Katherine Mary Binzel
Graduate Program in Biophysics
The Ohio State University
2013

Dissertation Committee: Michael V Knopp, Advisor; Nathan C Hall; Michael Tweedle; Edward Martin


Copyright by Katherine Mary Binzel 2013

Abstract

As positron emission tomography (PET) is increasingly being used as a functional marker of disease states, both adequate image quality and robust quantification are critical components of such imaging studies. A variety of both biological and physical factors affect these qualities in significant ways. As PET technology continues to evolve, an understanding of these effects becomes essential to the continued usefulness of PET in clinical oncology.

A detailed investigation of physical elements impacting PET quantification was completed. By directly comparing the results of data sets reconstructed in various ways, the true impact of reconstruction settings on quantification was revealed. It was found that changes made to factors such as the number of iterations and subsets per reconstruction or the time-of-flight kernel width had a limited impact on PET quantification. However, the addition of a system point spread function correction in the reconstruction algorithm resulted in significant changes in quantitative measurements. When comparing this state-of-the-art correction method to data including cutting-edge time-of-flight data acquisition, further differences in quantitative measurements were found, with time-of-flight data further improving the accuracy with which activity concentrations are recovered. These results show that although there remains a need for improvement in the acquisition and reconstruction of PET data, with each system upgrade the results of PET studies are indeed coming closer to a true estimation of physiological activity distributions.

With this in mind, a re-evaluation of the PET radiopharmaceutical dose required for adequate and robust imaging was conducted. Simulations of low-count density images showed that reducing the average image count levels by as much as 66% did not significantly impact the clinical usefulness of PET data. Therefore, maintaining current emission scan durations, a significant decrease in the standard amount of administered radioactivity could be implemented.

By thoroughly detailing the quantitative impact of improving PET/CT image acquisition and reconstruction methods, it was found that there is potential for decreasing required radioactivity doses in PET/CT imaging. With the current focus on reducing patient radiation exposure in medical imaging procedures, these findings are both timely and critical in terms of the continued utilization of PET/CT imaging in oncology.

This work is dedicated to my family.

Acknowledgments

Many people deserve thanks for their aid in the completion of this research. Most important is my advisor, Dr. Michael Knopp, who has been instrumental in guiding not only my development as a scientist but also my growth on a personal level. Without your constant support and encouragement through the past years I would not be where I am today. Thank you for this immense opportunity.

I also wish to thank my family, who have likely suffered more through this process than I ever did. Thank you for your patience and guidance, love and laughter. Your support and perspectives on life are appreciated more than I could ever say.

Many thanks also go to the faculty and students with whom I have worked, and played, over these years. Within the research environment, the spirit of teamwork and camaraderie among our group is wonderful and I thank each person for their contribution to my completed work, however large or small. In particular, thank you to my committee members and to Dr. Jun Zhang. Within the university setting I have found some truly great friends for whom I am incredibly thankful. Weekly interdisciplinary excursions have relieved much stress and created ample memories. Thank you!

Vita

May ... B.S. Engineering Physics, Miami University
September 2007 to present ... Graduate Research Associate, Biophysics Graduate Program, The Ohio State University

Field of Study

Major Field: Biophysics
Studies in: Positron Emission Tomography

Table of Contents

Abstract
Acknowledgments
Vita
Table of Contents
List of Tables
List of Figures
Chapter 1: Background
Chapter 2: Evaluation of the Consistency of PET/CT Data Acquisition and PET Quantification with Variations in Reconstruction Methods
Chapter 2.1: Introduction
Chapter 2.2: Variability of Administered 18F-FDG Doses and Uptake Times in Clinical PET/CT Imaging
Chapter 2.3: Consistency of SUV with Variation in PET Reconstruction Parameters
Chapter 2.4: Quantitative Impact of a Technology Upgrade on PET/CT Imaging
Chapter 2.5: Conclusion
Chapter 3: Impact of Time-of-Flight Acquisition and System Point Spread Function Corrections on PET/CT Quantification
Chapter 4: Impact of Acquisition Time and Radiopharmaceutical Dose on PET Quantification
Chapter 5: Summary and Outlook
References
Appendix A: Index of Abbreviations

List of Tables

Table 2.1. Number of FDG PET/CT Studies Completed with Different Referral Indications
Table 2.2. Number of Studies with Injection to Scan Times within the Acceptable Range of 75±10 Minutes
Table 2.3. Number of Studies Completed with Injected Doses within the Acceptable Range
Table 2.4. Average Injection to Scan Time per Hour of the Day of Injection
Table 2.5. Average Injected Activity per Time of Day of Injection
Table 2.6. Parameters of Philips GEMINI TF "Speed" Reconstruction Settings
Table 2.7. Average of the Absolute Value of the Percent Difference in SUVmax for Each Tissue Type Measured with Varied Reconstruction Settings
Table 2.8. Average SUVmax of Each Tissue Type Evaluated
Table 2.9. Average Volume of ROIs by Tissue Type
Table Siemens Biograph 16 and Biograph 64 mCT PET/CT Reconstruction Protocol Settings
Table Percent Difference between 3D OSEM and TrueX SUV by Tissue Type
Table 3.1. Total Activity and Hot-to-Background Ratios for Each Sphere in Both Separate Experiments
Table 4.1. Phantom Hot Sphere to Background Ratios for Reduced PET Acquisition Durations

List of Figures

Figure 1.1. Sample CT, PET and PET/CT Images
Figure 1.2. Positron Emission, Resulting Annihilation, and Photon Release
Figure 1.3. PET Images Acquired with Free Breathing, A & B, and with a Breath Hold Technique, C & D [34]
Figure 1.4. True, Random, and Scattered Photon Detection by PET System
Figure 1.5. PET Coincident Event Detection with Time-of-Flight Adjusted Line of Response
Figure 1.6. Fused PET/CT Image of Phantom with Hollow Spheres Illustrating the Results of the Partial Volume Effect
Figure 1.7. Philips GEMINI TF 64 PET/CT Scanner
Figure 2.1. Distribution of the Time from FDG Injection to Scanning for the Whole Patient Population
Figure 2.2. Distribution of Studies by Absolute Injected Activity Levels in Half mCi Increments for Adult Patients
Figure 2.3. Distribution of Injected Activity per kg Bodyweight for Adult Patients
Figure 2.4. Distribution of Average Injection to Scan Times by Time of Day at Which Scanning was Completed
Figure 2.5. Distribution of Average Injection to Scan Times by Patient Age
Figure 2.6. Distribution of Average Injected Activity by Time of Day of Injection
Figure 2.7. Injection to Scan Times and Injected Dose for Five Patients with Multiple Studies
Figure 2.8. Representative Philips GEMINI TF Images with Varied Reconstruction Settings
Figure 2.9. Average Absolute Value Percent Difference from Reference SUVmax for All ROIs Evaluated with Varied Reconstruction Settings
Figure Average Percent Difference from Reference SUVmax of Each Tissue Type with Varied Reconstruction Settings
Figure Average and Range of SUVmax among Each Relaxation Parameter within Sharpness Settings, by ROI SUVmax
Figure Average and Range of SUVmax among Each Relaxation Parameter within Sharpness Settings, by ROI Volume
Figure Absolute Value of Lesion Percent Difference, by ROI SUVmax
Figure Average and Range of Lesion SUVmax Percent Difference, by ROI SUVmax
Figure Absolute Value of Lesion Percent Difference, by ROI Volume
Figure Average and Range of SUVmax Percent Difference, by ROI Volume
Figure Representative Images Reconstructed with the Previous OSEM and the Current TrueX Reconstruction Settings
Figure Distribution of 3D OSEM and TrueX SUVmax and SUVavg Percent Differences for All ROIs by ROI SUV
Figure Distribution of 3D OSEM and TrueX SUVmax and SUVavg Percent Differences for All ROIs by ROI Volume
Figure Lesion 3D OSEM and TrueX SUVmax and Percent Difference Between the Two Measurements, Ordered by Increasing Activity
Figure Lesion 3D OSEM and TrueX SUVmax and Percent Difference Between the Two Measurements, Ordered by Increasing Surrounding ROI Volume
Figure Comparison of OSEM and TrueX Lesion to Background Contrast Ratios
Figure 3.1. Sample Images of Phantom Data Acquired on the Philips GEMINI 64 TF and the Siemens Biograph 64 mCT PET/CT
Figure 3.2. Maximum kBq/mL per Volume Measured on the Siemens Biograph During the First Experiment
Figure 3.3. Maximum kBq/mL per Volume Measured on the Philips GEMINI During the First Experiment
Figure 3.4. Maximum kBq/mL per Volume Measured on the Siemens Biograph During the Second Experiment
Figure 3.5. Maximum kBq/mL per Volume Measured on the Philips GEMINI During the Second Experiment
Figure 3.6. Average Recovery Coefficients for all Spheres Measured on Both the Siemens Biograph and Philips GEMINI
Figure 3.7. Time Curves from the Philips GEMINI and Siemens Biograph, respectively, Depicting the Decrease in Measured Activity Concentrations of Each Sphere Over Time
Figure 3.8. Time Curves from the Philips GEMINI Showing the Disagreement Between Measured and Actual Activity Concentrations of Each of the Smallest Spheres Over Time
Figure 4.1. Representative Image of Phantom with Hollow Spheres Containing Varied Activity Concentrations
Figure 4.2. Phantom Percent Differences from Reference Activity Concentrations with Variations in Emission Scan Acquisition Times
Figure 4.3. Average Phantom Sphere ROI Maximum Pixel Value Measurements for Each Emission Acquisition Duration
Figure 4.4. Average of Maximum Pixel Value Measurements from Real and Clipped Phantom PET/CT Data
Figure 4.5. Representative Clinical PET Images for Each Emission Acquisition Duration
Figure 4.6. Average and Range of Reviewer Image Quality Scores per PET Emission Acquisition Duration
Figure 4.7. Whole Body PET Image with Overlay of Acquisition Volumes and Count of Prompt Coincident Events per Volume for Each Varied Acquisition Duration
Figure 4.8. Average Percent Difference from Reference SUVmax of Each Tissue Type with Varying Acquisition Duration
Figure 4.9. Average Percent Difference from Reference SUVavg of Each Tissue Type with Varying Acquisition Duration
Figure Correlation Plots of Each Lesion SUVmax for Each Acquisition Duration as Compared to 90 Second per Emission Volume Measurements
Figure Bland Altman Plots of Each ROI's SUVmax Agreement with Reference Values for Each Acquisition Duration, Including 95% Confidence Intervals

Chapter 1: Background

Within the practice of medicine, the role of imaging is a critical one. The ability to visualize processes and changes occurring within the body guides physicians in diagnostic and treatment planning decisions in ways that physical exams cannot. Since Wilhelm Roentgen's discovery of x-rays in 1895 [1], medical imaging has developed into a vast field leading into a future where personalized medicine, improved detection of disease and more effective treatment choices are all well within reach [2]. At the leading edge of these developments is combined positron emission tomography (PET) and computed tomography (CT) [3]. Fully realized in the 1970s and becoming clinically available in the 1990s [1], PET brought a new perspective to medical imaging, allowing for detailed visualization of metabolic processes. Molecular imaging combined with anatomic imaging provides a complete picture of many diseases, such as those related to cancer and beyond. Over the last decade, PET/CT has been proven as a useful tool in cancer diagnosis, staging, treatment planning and therapy response assessment [4].

For these reasons, PET/CT imaging also plays a key role in biophysical and biomedical research. Whether investigating new radiopharmaceuticals for better targeting of specific diseases [5, 6] or seeking to improve current imaging practices [7, 8], the capabilities of this imaging modality are constantly changing. Increasingly, PET with 18F-fluorodeoxyglucose (FDG) is being used as a biomarker of disease [9, 10]. Though somewhat limited in resolution, PET has strength in allowing truly whole body imaging of radiopharmaceutical distributions with a very high level of sensitivity. When combined with the high quality anatomic images produced with CT, this sensitivity and accuracy in detection and evaluation of disease is further increased. An example of this combined imaging is seen in Figure 1.1. Here CT alone shows details of anatomic structures while PET highlights the increased metabolic activity of a suspected tumor. Combined, they give a clear picture of the suspected disease. FDG PET/CT now comprises part of the standard work-up for staging in many of the most common forms of cancer [4, 10, 11]. It has also been validated as an early measure of the effectiveness of therapy [12, 13], enabling physicians to better tailor treatment plans in a more patient friendly and cost effective manner [14].

Figure 1.1. Sample CT, PET and PET/CT Images

The underlying principle of PET imaging is the utilization of the radioactive decay of a radioisotope for imaging purposes. 18F is an ideal radionuclide for imaging for several reasons. It has a half-life of 109 minutes, which is long enough to be useful in biological imaging. Additionally, it is almost exclusively a positron emitter, and the energy of those positrons, and of their resultant annihilation photons, is readily detectable [15]. 18F can also be well substituted into biomolecules without significantly impacting the chemical function of those molecules. 2-(18F)-fluoro-2-deoxy-D-glucose (FDG) is transported into cells in the same way as endogenous glucose, via the membrane transporter GLUT1 [16]. It is taken up by most normal tissues, in particular in the brain and myocardium, with the exception of adipose tissue [17].

Cancer cells produce about 60% of their ATP through glycolysis, as opposed to mitochondrial oxidative phosphorylation [18]. This shift in metabolism, the Warburg phenomenon, is made in order for cancer cells to better meet their unique needs. Glycolysis is useful in low oxygen environments, produces lactate that is used elsewhere in the cell, and yields intermediates that can be used for several anabolic reactions [19]. With an increase in glycolysis, cancer cells then have an increased need for and usage of glucose. Thus, in addition to increases in protein and DNA synthesis, blood flow and amino acid transport, in general cancer cells also exhibit overexpression of GLUT1 [16]. Once FDG has been transported inside a cell, it is phosphorylated by hexokinase II, becoming 18F-FDG-6-phosphate [16]. In this state, the phosphorylated FDG undergoes no significant reactions and is polar, so that it does not readily cross the cell membrane again until it is later dephosphorylated [17]. The increased blood flow to cancer cells and the generally hypoxic environment further upregulate FDG uptake.

Following emission from a parent 18F nuclide, a positron will undergo annihilation with a nearby electron, releasing two 511 keV photons, as is shown in Figure 1.2. This occurs near the location of the cells having taken up the FDG. The recorded increase in the PET signal is then proportionate to the number of cancer cells at that physiological location, making FDG PET quite sensitive [17]. PET is however limited in specificity, due to non-target uptake, particularly in inflammatory cells present following many forms of treatment or other illnesses.

Figure 1.2. Positron Emission, Resulting Annihilation, and Photon Release

In this way PET and PET/CT serve as key imaging tools in the diagnosis and initial staging of cancer. Conventional techniques that until recent years have been state of the art in the detection and definition of disease are now seeing enhanced results with the incorporation of PET/CT studies, or in some cases are being replaced entirely by them [4]. PET has been shown to have high diagnostic accuracy in lymphoma and cancers of the GI tract and lung, particularly in distinguishing between malignant and benign solitary lung nodules and masses [20].

In a study of the staging of locally advanced or inflammatory breast cancer, FDG PET/CT was compared to the conventional locoregional work-up [11]. This typically includes a physical examination, bilateral mammography and sonography or breast MRI exams. Here PET/CT showed important prognostic capabilities by detecting all known primary tumors as well as identifying new, distant metastases in many patients. With bone being the most common site of metastasis in breast cancer, it is also important to note that PET/CT outperformed conventional planar bone scanning in this study. This work highlights PET/CT's key characteristic in whole body imaging, which aids in the diagnosis and staging of disease by its potential to visualize distant, unknown sites of disease in ways that traditional, more focused approaches do not. Current research shows that while PET/CT alone may not be sufficient in the full diagnosis of locally advanced breast cancer, it certainly adds pertinent information to the staging process.

A study of PET's role in the staging of non-small cell lung cancer (NSCLC) describes its ability to impact treatment planning as compared to conventional contrast enhanced CT imaging [10]. Here the addition of PET data changed initial staging in 42% of patients, with 33% of all patients being upstaged. Nearly all of those patients who were upstaged had results confirmed by pathology. The few that were not pathologically confirmed were also incorrectly staged by conventional methods. In this study 61% of all patients had treatment plans changed following the evaluation of PET data, many of which were moved from curative to palliative approaches. This avoidance of unnecessary treatment, as well as several cases of moving to more aggressive treatment, shows PET's importance leading into treatment. While it may initially be a greater expense than conventional techniques, the ability to better determine an appropriate course of treatment makes PET/CT a key tool, both from a patient comfort and a long-term economic viewpoint [14].

While PET/CT has been shown to hold high potential in the diagnosis and staging of cancer, it perhaps has a greater capacity for impacting the early evaluation of tumor response to therapy and prediction of long-term outcomes. Most cancer treatment drugs are effective in greater than 60% of cases [21]. Yet this leaves the potential for many nonresponders to any given drug, and many diseases now have several viable treatment options [22]. Thus the early determination of the efficacy of treatment can be a very important factor in overall outcomes of many cancers for which survival rates remain low. Not only can early identification of ineffective treatment spare patients any ill-effects of continuing that course, it also creates the opportunity to employ a new and perhaps more effective approach. Though the response evaluation criteria in solid tumors (RECIST) and other techniques such as the measurement of specific serum markers remain the predominant evaluation methods in therapy response assessment, there is a continuing shift toward the inclusion of PET data at this stage of cancer care. In lymphoma, breast and lung cancer, PET has been shown to have high (greater than 90%) accuracy in predicting nonresponders very early in the course of treatment [21]. In particular, it is PET's ability to detect changes in metabolic characteristics before physical changes in tumor size occur that gives it this advantage.
In some cases, disease stabilization is considered to be a favorable outcome as opposed to eventual tumor shrinkage [22]. In these situations, PET measures of metabolic activity can be quite useful. PET has been shown to be more effective in differentiating residual tumor mass from treatment induced necrosis or fibrosis than CT alone. For these reasons many research studies continue to evaluate the best use of PET in therapy response assessment [12, 23, 24].

Thus, in recent years PET/CT has gained popularity and clinical importance in all stages of cancer treatment and care. With improvements in the image quality and resolution of the CT component, PET/CT has a high level of sensitivity in detecting disease [25]. However, on a strictly visual basis, the specificity is limited in terms of distinguishing between malignant and benign tumors. Although visual analysis of PET data has traditionally been sufficient [26], in seeking the improved use of PET/CT studies, the quantitative evaluation of metabolic data has also become a point of interest.

There are several well developed methods for quantitative analysis of PET data. However, the measurement of standardized uptake values (SUV) adjusted to total body weight remains the most commonly used approach [27]. As a somewhat simple ratio of the activity concentration in a tissue to the injected activity in the total body size [21], SUV is only a semi-quantitative measure of actual tumor metabolism, from a fixed timepoint [25]. Despite these and other limitations, it is the simplicity of the measurement, as opposed to more complex kinetic modeling, that keeps SUV appealing to clinicians. The use of quantitative measures has been shown to complement qualitative analysis of PET/CT studies as well as to reduce inter-observer variability in cases of serial scans over the course of treatment and follow-up [25]. In the merging of anatomic and physiological data, there is great potential for the evaluation of diseases on a microscopic level [26]. In fact, it is the drive to use PET not only for determining whether a tumor may be malignant or benign but also for the early evaluation of therapy response that has led to the increased use of quantitative analytic methods. Several studies highlight the improved use of PET quantitative data in the diagnosis and staging of cancer.
In the determination of whether an FDG avid lymph node is malignant or benign in non-small cell lung cancer, the maximum SUV pixel value within a defined region of interest (SUVmax) as measured in early and delayed PET studies has been proven useful [28]. By comparing absolute changes in SUVmax in two series completed one hour apart, more than 80% of all lymph nodes studied were correctly classified as either malignant or benign, as confirmed by pathology. Another study in non-small cell lung cancer showed that the combined measurement of PET SUVmax and CT data allowed for the distinction of adenocarcinoma with bronchioalveolar carcinoma over other types of non-small cell lung cancer [29]. As this distinction may be critical in the planning of treatment, the careful use of PET quantification can prove vital.

In discussing the increased use of quantitative analytic methods with PET/CT studies, the many factors affecting final measurements of SUV should also be noted. SUV, being a semi-quantitative measure of the activity concentration of any given tissue, is dependent upon many levels of the PET/CT study, from acquisition of the data through the analysis methods used [30]. Firstly, there are several biological factors affecting the distribution of FDG within a patient. Primarily, it is a weight-dependent distribution that of course varies based upon the actual dose of radiopharmaceutical administered. While weight-based dosing is employed in some cases [31], frequently one standard dose is used for each patient in a population. Patients with a smaller body weight will consequently have a higher available activity level in a tumor that may be identical to one in a heavier patient [30]. This is an important characteristic in the case of serial scanning over the course of treatment and long-term follow-up care. If a patient is likely to lose or gain any significant amount of weight, then relative SUVs for the same dosing level must be taken into account.

Additionally, a patient's overall body composition, dependent upon weight or not, is also a contributing factor in the distribution of FDG. White body fat is less metabolically active than muscle, so for a patient with low muscle mass and a higher BMI, there will be more free FDG than in a patient with a higher amount of muscle [30]. This may result in an identical tumor having a lower SUV in a leaner patient than in a heavier individual. This is where several proposed measures of SUV have been developed. SUV corrected for lean body mass or for body surface area are potential calculations [27].
These methods are complicated, however, by the fact that there are several ways to measure both lean body mass and body surface area. They have also been shown to give results similar to SUV adjusted simply to body weight, meaning that the latter remains the current standard measure.

Another biological factor affecting FDG distribution is a patient's blood glucose level (BGL) at the time of injection [30]. FDG competes directly with endogenous glucose for uptake into any cell. Thus, if a patient has a high BGL at the time of injection, FDG uptake may be inhibited and consequently SUV in a target lesion measured artificially low. For this reason it is recommended by the NCI that patients not complete a PET study if their BGL is greater than 200 mg/dL, with special consideration for diabetic patients [32]. Corrections for blood glucose levels in calculating SUV have also been proposed [33]. The results of such normalization efforts are mixed, providing improved accuracy for some patients but potentially introducing new calculation errors in those with greater fluctuations in BGL over a series of scans [30]. Thus, BGL is a factor which should be kept in mind but for which most patients probably do not require correction [32].

Additional physiological factors exist which affect the accuracy of PET quantification. These include the presence of inflammatory cells, which tend to take up FDG at rates greater than normal cells [30]. Breathing motion during emission scanning affects the areas surrounding the lungs and upper abdomen. Misregistration between PET and CT images due to breathing can lead to count dropping around the diaphragm, and blurring of the PET data can make accurate identification and quantification of small lung nodules difficult, as is seen in Figure 1.3 [17, 34]. This figure shows the difference in image quality in the lung region with and without a gated-breath hold technique, with the solid arrows pointing to a tumor lesion in the lung. Among these and many other forms of physiological uptake which can impact the quality of a PET scan, none can be completely accounted for or corrected. However, during scanning every effort should be made to minimize these effects by standardizing the study protocol [32].
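
The body-weight SUV ratio and the blood glucose guideline discussed above can be illustrated with a minimal sketch. It assumes the commonly used form of SUV in which the injected activity is decay-corrected to the scan time and tissue density is taken as 1 g/mL; the function names and the example numbers are hypothetical and are not taken from this work.

```python
# Illustrative sketch only: body-weight SUV with decay-corrected injected dose.
# Function names and example values are hypothetical.

F18_HALF_LIFE_MIN = 109.0  # 18F half-life as quoted in the text


def decay_factor(elapsed_min, half_life_min=F18_HALF_LIFE_MIN):
    """Fraction of the injected activity remaining after elapsed_min minutes."""
    return 0.5 ** (elapsed_min / half_life_min)


def suv_body_weight(tissue_kbq_per_ml, injected_mbq, weight_kg, uptake_min):
    """Body-weight SUV, assuming 1 mL of tissue weighs 1 g."""
    remaining_kbq = injected_mbq * 1000.0 * decay_factor(uptake_min)
    weight_g = weight_kg * 1000.0
    # Tissue concentration divided by the average whole-body concentration.
    return tissue_kbq_per_ml / (remaining_kbq / weight_g)


def glucose_within_limit(bgl_mg_per_dl, limit_mg_per_dl=200.0):
    """True if the blood glucose level does not exceed the 200 mg/dL guideline."""
    return bgl_mg_per_dl <= limit_mg_per_dl


# Hypothetical patient: 370 MBq injected, 74 kg, scanned 60 minutes later.
print(suv_body_weight(10.0, 370.0, 74.0, 60.0))
print(glucose_within_limit(150.0))
```

Because the dose is divided by body weight, the same measured tissue concentration yields a different SUV when only the weight changes, which is the dependence on patient size described in the text.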

Figure 1.3. PET Images Acquired with Free Breathing, A & B, and with a Breath Hold Technique, C & D [34]

Stemming from the understanding of physiological factors affecting PET/CT results are factors involved in the acquisition of PET data. These include the injection to scan time and the choice of emission scan duration. The waiting time between FDG injection and the commencement of emission scanning is a choice that has been thoroughly debated, though it still varies from clinic to clinic and may vary greatly in dynamic and research related studies [8]. Most studies fall within a ... minute waiting period, however, with 60 minutes being the most common [32], as it has been shown that at least a 55 minute wait gives improved contrast ratios as compared to shorter durations [8].

In the interest of dual time point imaging, several studies have explored the impact of varied waiting times on PET quantification. In clinical scans, which can take between 15 and 30 minutes to cover the whole body, a delay in imaging different anatomic positions is always present and all emission data are corrected to the start time of the initial bed position [8]. This means that tissue in the lower body can have an uptake time upwards of half an hour longer than that in the upper part of the body, for scanning in a head to toe direction. It has been shown that imaging at 60 and at 120 minutes post-injection gives rise to an average of ±18% difference in SUVmax measurements in both benign and malignant FDG-avid lesions [28]. Imaging at 45 and 90 minutes post-injection shows similar variation, with malignant tumors increasing in SUVmax by between 15.8% and 22.6% and benign lesions having between 1.7% and 10.9% decreases in SUVmax [35]. These studies demonstrate the variability of FDG uptake over time: it is not a static process but one that in fact varies greatly over the course of hours following injection. This means that visual reviews, to some degree, and any quantitative data taken from a PET study will vary as well, dependent upon the waiting period. For this reason it is well understood that consistent waiting times, whatever they may be, must be employed, especially in serial scanning over the course of treatment and follow-up [28].

Additionally, when designing a PET/CT protocol, the choice of the emission scan duration has an impact on the resulting visual and quantitative data. Two possibilities for increasing the number of coincident count events in a PET/CT image are to increase the injected activity or to increase the scan duration. As with injection to scan time, both dosing and acquisition duration have been well studied; however, there is still variability among standard protocols [7]. In optimizing the injected dose, both visual quality and count statistics come into consideration. Through the analysis of the noise-equivalent counting rate it has been shown that the improvement in PET signal detection plateaus at certain dose and count levels [36]. This comes from the fact that at the same time as the number of coincident events is increasing, so too are random and scattered noise events. The random event rate is proportional to the square of the activity in the scanner FOV; therefore above a given threshold the random noise signal can overwhelm the true coincident events [16]. So in addition to factors such as the detector energy resolution and timing window, the amount of activity present can improve or limit image quality and quantification.
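
The plateau described above can be sketched numerically. The example below assumes one common form of the noise-equivalent count rate, NEC = T²/(T + S + R), with true and scattered rates proportional to activity and the random rate proportional to its square. The proportionality constants are hypothetical, the activity is in arbitrary units, and detector dead time is ignored, so the model saturates rather than reproducing the peak seen on real scanners.

```python
# Illustrative sketch only: why noise-equivalent counts plateau with activity.
# All rate constants are hypothetical; dead time is deliberately ignored.


def count_rates(activity, k_true=1.0, k_scatter=0.5, k_random=0.02):
    """Return (trues, scatter, randoms) rates for a given activity."""
    trues = k_true * activity            # linear in activity
    scatter = k_scatter * activity       # linear in activity
    randoms = k_random * activity ** 2   # quadratic in activity
    return trues, scatter, randoms


def nec_rate(activity, k_true=1.0, k_scatter=0.5, k_random=0.02):
    """Noise-equivalent count rate, NEC = T^2 / (T + S + R)."""
    trues, scatter, randoms = count_rates(activity, k_true, k_scatter, k_random)
    total = trues + scatter + randoms
    return trues * trues / total if total > 0 else 0.0


for activity in (10, 20, 100, 200, 1000, 2000):
    trues, _, randoms = count_rates(activity)
    print(activity, round(nec_rate(activity), 2), round(randoms / trues, 2))
```

In this simplified model the NEC rate approaches k_true² / k_random as activity grows, so each doubling of activity buys progressively less, while the ratio of randoms to trues keeps rising.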
While one study suggests that a dose of at least 8 MBq of FDG per kilogram of body weight produces the best image quality in PET scans without CT for attenuation correction [7] this dosing level is both cost prohibitive and concerning in terms of radiation exposure, particularly in larger patients. Therefore the choice to increase the acquisition duration of each emission volume becomes a way to increase useable counts without also increasing present noise levels. In the evaluation of the impact of the number of true coincident events and random events on SUV measurements, it was shown that within a range of true to random count ratios of

to 0.96 there was no significant impact on PET quantification [37]. This was over a wide range of total counts, from 5 to 120 million per data set. This study thus shows that quantification is stable once an adequate image quality has been reached. However, as always, it is best to aim for consistency among scan protocols, in order to ensure the best results are achieved for comparison between multiple examinations.

There are several other technical factors which have an influence on measured PET SUVs as well. These include the physical configuration of each scanner as well as available software options on each imaging system. Calibration between scanners and dose calibrators is critical to ensuring accurate quantification [38]. In the American College of Radiology Imaging Network's qualification of 100 scanners, a 6% variability in measurements among identical scanners was reported [39]. Even with good quality control practices, shifts in calibration between and among systems exist and should be monitored and corrected for when needed. Additionally, the configuration of the scanner, and its ability to accurately detect and record coincident photons, also plays a key role in eventual measurements of any emission scan data.

The spatial resolution of any imaging system is an important physical characteristic. It is defined by the detector size, off-axis detector penetration and detector Compton scatter, among other factors involving the choice of radioisotope [40]. Improvements in available scintillation crystals have recently allowed for improved spatial resolution as well as reduced scatter fractions, increased sensitivity and count rate performance, and improved accuracy of count loss and randoms corrections [41]. Figure 1.4 diagrams the differences in detection of true, random and scatter events.
A true event consists of two coincident photons being detected by two detector crystals in line with one another, creating a line of response (LOR) between them along which the positron annihilation event actually occurred. A random event comes about when two unpaired photons are detected simultaneously by two crystals, creating a misaligned LOR and therefore incorrect localization of the positron annihilation event. Lastly, scattered events occur when a photon is set off its true trajectory by interacting with anatomic structures while leaving the body. This also results in inappropriate detection and thus LOR placement.

Figure 1.4. True, Random, and Scattered Photon Detection by PET System

These improvements in detection come with the upgrade from bismuth germanate (BGO) based scanners to those with lutetium oxyorthosilicate (LSO) scintillator detectors. LSO allows for singles count rates of approximately 1 million counts per second with an acceptance window between 350 and 650 keV, due to a short decay time and high light output and stopping power, as compared to BGO [41, 42]. Alongside this improvement in detection is the move toward time-of-flight (TOF) based scanning, made possible in part by the improved performance of the photomultiplier tubes used as well [42]. TOF capabilities allow for more precise detection of the difference in arrival times of the two photons in a prompt coincident event [43]. For annihilation events happening at the center of the LOR depicting the travel of the resulting photons, the photons will arrive and be detected approximately simultaneously. For events away from the center of the scanner field of view (FOV), however, the more precise detection of the difference in arrival times allows for the LOR to be weighted by a Gaussian curve, so that the signal is concentrated toward where the annihilation event, and thus the original positron, was located. This effect is illustrated in Figure 1.5. The improved localization may even be exact enough to isolate the event to within the single voxel in which it occurred [43].
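As a rough illustration of the localization described above, the sketch below converts an arrival-time difference into a position along the LOR using the standard relation x = c·(t2 − t1)/2, and converts a timing resolution into a spatial uncertainty using Δx = c·Δt/2. The function names and the 600 ps example value are assumptions chosen for illustration, not figures taken from the text.

```python
C_MM_PER_PS = 0.299792458  # speed of light in millimetres per picosecond

def tof_offset_mm(t1_ps, t2_ps):
    """Offset of the annihilation point from the LOR midpoint.

    Positive values lie toward detector 1, which sees its photon earlier
    when the event happens on its side of the midpoint.
    """
    return C_MM_PER_PS * (t2_ps - t1_ps) / 2.0

def position_fwhm_mm(timing_resolution_ps):
    """Width of the Gaussian localization kernel along the LOR."""
    return C_MM_PER_PS * timing_resolution_ps / 2.0

# Hypothetical example: a 600 ps coincidence timing resolution.
kernel_mm = position_fwhm_mm(600.0)
```

With these assumptions a timing resolution of a few hundred picoseconds corresponds to a localization kernel several centimetres wide, which is why TOF weights each LOR rather than pinpointing the event.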

Figure 1.5. PET Coincident Event Detection with Time-of-Flight Adjusted Line of Response

Although TOF does not allow for a direct improvement in spatial resolution, the better localization along each LOR reduces noise in the image data, thereby increasing the image signal to noise ratio (SNR), and leading to an eventually improved resolution [43]. TOF data are particularly useful in the imaging of larger patients where traditional PET imaging suffers due to increased attenuation, scatter, and loss of true counts [42]. The gain in SNR follows the equation $\mathrm{SNR}_{\mathrm{gain}} \sim \sqrt{2D/(c\,\Delta t)}$ [16], where $D$ is the patient diameter, $c$ is the speed of light and $\Delta t$ is the timing resolution with which the difference in arrival times of the two coincident photons is measured. Thus, the larger a patient is, the greater the potential gain in final image quality when using TOF PET imaging. The use of TOF data has been shown to improve lesion detection in clinical data from a qualitative standpoint [43]. Depending upon reconstruction settings, it also shows an improvement of up to 43% in measuring lesion to background activity ratios, with the largest gains being for the heaviest patients [42]. This improved contrast recovery comes alongside the faster and more uniform convergence during reconstruction, an additional benefit of TOF acquisition.
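The dependence of the TOF gain on patient size can be checked numerically. The sketch below assumes the commonly quoted form of the relation, gain ≈ sqrt(2D / (c·Δt)), with Δt taken as the system timing resolution. The patient diameters and the 600 ps value are hypothetical examples.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in metres per second

def tof_snr_gain(diameter_m, timing_resolution_s):
    """Approximate SNR gain of TOF over non-TOF reconstruction.

    Uses gain ~ sqrt(2 * D / (c * dt)), i.e. the square root of the ratio
    between the object diameter and the TOF localization width c * dt / 2.
    """
    return math.sqrt(2.0 * diameter_m / (C_M_PER_S * timing_resolution_s))

# Hypothetical patients of 20 cm and 40 cm diameter at 600 ps resolution.
gain_small = tof_snr_gain(0.20, 600e-12)
gain_large = tof_snr_gain(0.40, 600e-12)
```

Doubling the diameter raises the estimated gain by a factor of sqrt(2), consistent with the observation that the heaviest patients benefit most.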

In addition to improved detection of coincident photons, PET/CT imaging has the potential for improved qualitative and quantitative results based on the inclusion of a model of the system point spread function (PSF). This gain stems from the impact of the partial volume effect (PVE) on both qualitative and quantitative readouts. The PVE occurs to the greatest degree in measuring small objects within a PET image. It consists of two distinct components, image blurring and image sampling [44]. Image blurring is caused by the limited spatial resolution of the imaging system. However good it may be, the resolution is still finite. This blurring causes spillover of measured activity concentrations between adjacent tissue types. The end result is that a small, hot source will appear larger and appear to contain less activity than it truly does. Image sampling has to do with the fact that each object in the image space is sampled on a voxel grid. The exact contours of any object are unlikely to match the grid pattern, and so voxels at the border between two objects contain parts of both. The combination of these two effects can impact PET imaging to a large extent, especially when trying to image small, hot lesions in a much colder background. An example of this is seen in Figure 1.6, where a set of hollow spheres containing 18F is imaged. Here the CT image shows each sphere's true dimensions, while the represented PET activity in the smallest of the spheres is seen to extend beyond the true volume. Although the use of CT data can help clarify discrepancies in the apparent size of an object, errors in quantitative measures require further correction.
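The blurring component of the partial volume effect can be reproduced in one dimension. The sketch below blurs a unit-height box (a stand-in for a hot lesion in a cold background) with a Gaussian of a given FWHM and reports the fraction of the true peak that survives. The object widths and the 6 mm FWHM are hypothetical values, and the voxel sampling component is deliberately left out.

```python
import numpy as np

def blurred_peak_recovery(object_width_mm, fwhm_mm, step_mm=0.1):
    """Peak value of a unit-height 1D box after Gaussian blurring."""
    sigma = fwhm_mm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    x = np.arange(-50.0, 50.0 + step_mm, step_mm)
    box = (np.abs(x) <= object_width_mm / 2.0).astype(float)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()  # normalise so total activity is preserved
    blurred = np.convolve(box, kernel, mode="same")
    return float(blurred.max())

# Hypothetical lesions of 4, 10 and 30 mm imaged at 6 mm FWHM resolution.
recovery = {w: blurred_peak_recovery(w, 6.0) for w in (4.0, 10.0, 30.0)}
```

The smallest object loses the largest share of its peak value while the largest is essentially unaffected, which mirrors the behaviour described for the smallest sphere in Figure 1.6.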

Figure 1.6. Fused PET/CT Image of Phantom with Hollow Spheres Illustrating the Results of the Partial Volume Effect

There are several methods for correction, ranging from involvement at the reconstruction level through applying correcting factors to image measurements. Focusing here on improving spatial resolution and reducing image noise levels, modeling of the imaging system PSF holds potential in helping correct for PVE [45]. The part of limited spatial resolution which a PSF can help account for is in the arrival and detection of coincident photons. Photons arriving from the edges of the scanner FOV are more likely to pass through the appropriate detector crystal and into an adjacent one because of the angle of the LOR [45]. This results in the incorrect LOR being applied to the photon pair and causes distortion in the final image. The PSF can be measured at many points in a given scanner FOV and applied during reconstruction, helping better position LORs and thus reducing image noise and improving spatial resolution. By describing how a point source is rendered by the imaging system, the PSF then helps account for degradation of the PET signal [46]. The application of the PSF can occur either in sinogram space or at the imaging level. Within sinogram space it evaluates the resolution degradation

on the acquisition data purely dependent on the width of the sinogram beds. Here, the PSF is not affected by the scanner FOV and includes no other factors from the reconstruction process. At the image level, the correction is dependent upon the reconstruction used, allowing for a closer match between the image pixel size and the precision in the PSF modeling. This type of correction has been shown to be most appropriate for TOF list-mode acquisitions [46]. Both phantom studies and clinical data have shown improvement in image quality and quantification with the application of the system PSF during reconstruction. Phantom studies demonstrate improved spatial resolution and recovery of activity distributions, while clinical images have increased spatial resolution as well [46]. It has also been shown that for reconstructions with the same number of iterations, including the PSF reduces noise effects and improves the standard deviation of the image in phantom data [47]. These improvements allow for increased detection of small metastatic lymph nodes, as tested in a small number of patients [48].

A final technical factor affecting both image quality and quantification of PET/CT images is the choice of the reconstruction algorithm and of settings within that algorithm. As previously mentioned, the incorporation of TOF and PSF information into the reconstruction stage of PET/CT data acquisition has an impact on the final results. But so too does the simpler choice of how to reconstruct the data. Most current PET systems utilize an iterative reconstruction method, but the first systems employed filtered back projection (FBP) approaches, due to early limitations in computing capabilities. This method involves taking information from each LOR determined during the emission scan. The data are precorrected for scattered and random events as well as attenuation, via the CT data [16].
The goal of reconstruction is to recover tracer activity concentrations from each point in the image space, which is done by modeling each LOR as line integrals containing the radial position and angle of each detector pair comprised in the LOR. Next, each line integral along a given angle forms a projection of that angle over the entire detector FOV [1]. By first performing a Fourier transform of each projection, an appropriate filter can be applied, then the inverse Fourier transform is calculated to obtain a filtered projection profile [1]. Back projection is then applied in order to reconstruct an image showing

activity from any given slice of the data set using projections obtained from all angles covering that slice. Back projection of each projection converts the information into a 2D array [15]. With noise-free data this process would return an exact depiction of the true activity distribution. However, in FBP high frequency noise is also amplified, degrading the image SNR and possibly leading to edge enhanced ringing artifacts.

Some of the shortcomings of FBP reconstruction techniques can be overcome by making use of improved computing capabilities and approaching reconstruction in an iterative way [49]. Using a more general linear model allows for a better description of the blurring and attenuation occurring in the PET image. With an iterative technique, the estimated image can be progressively improved through each iteration. Beginning with an initial estimate of the activity distribution of the image, a projection step is applied to the current image estimate, yielding a set of projection values that would result if the estimate were true. These predicted projections are then compared to the actual data set, creating a set of projection-space error values. The error values are mapped onto the image space by back projection to create an image-space error value, which is used to update the image estimate. This process is repeated over a chosen number of iterations until an acceptable image estimate is produced. The addition of ordered subset expectation maximization (OSEM) allows better refinement at each iteration by breaking the full set of projection data into mutually exclusive subsets and applying statistical methods for approximating the most likely activity distribution that would have created the sequential data in each subset. This requires the same amount of time per iteration to compute, but arrives at convergence faster [15], meaning that fewer iterations are then needed [49].
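The project, compare, back project and update cycle described above can be sketched on a toy problem. The example below uses the multiplicative expectation maximization update, in which the comparison step is a ratio of measured to predicted projections. This specific update rule, the random system matrix and all names are assumptions for illustration and are not the vendor algorithms discussed in the text. Setting `n_subsets=1` gives plain MLEM, while larger values split the projection rows into interleaved, mutually exclusive subsets as in OSEM.

```python
import numpy as np

def osem(system_matrix, measured, n_iterations=20, n_subsets=1):
    """Toy OSEM reconstruction; n_subsets=1 reduces to plain MLEM."""
    n_bins, n_voxels = system_matrix.shape
    estimate = np.ones(n_voxels)  # uniform initial image estimate
    subsets = [np.arange(s, n_bins, n_subsets) for s in range(n_subsets)]
    for _ in range(n_iterations):
        for rows in subsets:
            a = system_matrix[rows]
            predicted = a @ estimate           # forward projection
            ratio = measured[rows] / predicted  # projection-space comparison
            correction = a.T @ ratio            # back projection of the ratio
            estimate = estimate * correction / a.sum(axis=0)  # image update
    return estimate

def poisson_log_likelihood(system_matrix, measured, estimate):
    """Poisson log-likelihood up to a constant; MLEM never decreases it."""
    predicted = system_matrix @ estimate
    return float(np.sum(measured * np.log(predicted) - predicted))

# Hypothetical strictly positive system matrix and noise-free projections.
rng = np.random.default_rng(0)
A = rng.uniform(0.1, 1.0, size=(48, 8))
truth = rng.uniform(1.0, 5.0, size=8)
y = A @ truth

mlem_image = osem(A, y, n_iterations=50)
osem_image = osem(A, y, n_iterations=5, n_subsets=4)
```

With four subsets each pass over the data performs four image updates instead of one, which is the source of the faster convergence, although the subset version no longer carries the monotonic likelihood guarantee of plain MLEM.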
As compared to FBP, OSEM iterative reconstruction reduces streak artifacts in the PET image and results in better SNR in areas of low uptake [16]. However, the use of more subsets, while increasing reconstruction speed, will also introduce more noise into the image [49]. Therefore the choice of reconstruction methods and settings has an important impact on PET/CT image quality and quantification. It has been shown that quantitative measures from iteratively reconstructed images are increased over those from identical images reconstructed using FBP [50]. Although these measures are more accurate than

those from FBP images, at the time when iterative reconstruction was being introduced adjustments in the evaluation of PET images needed to be made to accommodate these new ranges. Even within the same reconstruction method, however, the number of subsets and iterations to be used and the application of filters during or after reconstruction are known to affect the quantification. Changes made in the number of subsets and iterations have been shown to dramatically affect measured SUVs [37]. Comparing two protocols for the same phantom data reconstructed with 2 through 8 iterations, differences in SUVs ranged from 6 to 70% and varied by 2-25% when the number of subsets was changed from 2 to 32 for the same number of iterations. Although these constitute drastic changes in reconstruction protocols, they serve to highlight how simple changes can significantly affect quantification of PET data. Thus this study, as well as many others, stresses the importance of standardizing PET/CT study protocols, from beginning to end [4, 30, 32].

Despite the many ways in which PET/CT quantitative measures can be varied, it has been shown that SUV can be a highly reproducible measurement. When careful consideration is given to maintaining consistent protocols, in every aspect from patient preparation to acquisition and reconstruction to ROI analysis methods, SUVmean has been measured with a correlation of 0.99 and a difference of 0.01±0.27 SUV, and SUVmax has been measured with a -0.05±1.14 SUV difference, for patients scanned twice in a short period of time [51]. Additional studies have shown similar reproducibility [25], supporting the capability of PET/CT as both a qualitative and quantitative tool in medical imaging.
Following this discussion of technical factors affecting PET/CT image quality and quantification, it should be pointed out that differently manufactured imaging systems involve different physical configurations as well as different standard acquisition and reconstruction settings. As will be discussed in detail in later chapters, two scanners are used within the scope of this research. They are within the same clinical environment and therefore operate with the same dosing levels and acquisition durations, averaging 15 mCi of 18F-FDG per patient with emission scanning beginning 75 minutes post-injection and comprising 90 seconds per PET acquisition volume. However, certain fundamental differences exist between the two systems. The Siemens Biograph 64 mCT consists of a

78 cm bore with a 21.8 cm axial FOV, containing lutetium oxyorthosilicate scintillator crystals in 4 rings each with 48 detector blocks [52]. It has TOF capabilities; however, that acquisition mode is not used during ordinary clinical studies. The TrueX HD reconstruction algorithm that is used for such studies is a fully 3D algorithm with an ordinary Poisson OSEM that accounts for the system PSF. The standard reconstruction settings utilize 3 iterations with 24 subsets per iteration and a 5 mm full-width at half maximum Gaussian filter. Additionally, during PET acquisition there is a 33% overlap between emission volumes.

The second imaging system within the clinical environment is a Philips GEMINI TF 64 PET/CT, pictured in Figure 1.7. This scanner is also a whole body imaging system with a 71.7 cm bore and an 18 cm axial FOV. It contains lutetium-yttrium oxyorthosilicate detector crystals arranged in 28 flat modules of a 23x44 array [53]. This system does include TOF acquisition for all clinical scans. The standard reconstruction algorithm is a 3D list-mode OSEM including 3 iterations with 33 subsets each and a 14.1 cm TOF kernel width. Reconstruction on this scanner does not account for the system PSF with the current software. Also, the acquisition utilizes a 50% overlap between emission volumes.

These differences highlight some of the ways that different manufacturers configure each system in a unique way. It should also be noted that even for the same systems, software and hardware upgrades can include important changes as well. Examples of such changes include upgrading detector elements, making use of new reconstruction software and adding components to emission scanning such as TOF. As has been discussed, all such changes impact PET/CT image quality and quantification. So care must be taken whenever comparing two differently acquired data sets, however minor those differences may seem.

Figure 1.7. Philips GEMINI TF 64 PET/CT Scanner

A final introductory topic concerning the following work is that of the radiation dose involved in PET/CT studies. As previously discussed, the image quality and accuracy of quantification in any PET/CT exam is dependent upon the number of useful coincident count events detected during the emission scan. This statistic is then inherently related to the injected dose of radioactivity. However, at the same time that optimal imaging data is being sought, patient safety must also be considered. With the increasing usefulness of nuclear medicine studies in the monitoring of cancer and other diseases, the effective dose of radiation exposure to the general population has more than doubled in the last twenty years [54]. This increase in exposure ultimately may lead to an increased risk of secondary diseases for patients undergoing the greatest number of studies. One study of the effective doses of several PET/CT protocols found that the average total effective dose was around 6.2 mSv from the PET component and the dose from the CT component ranged from 7 to 25 mSv depending upon the tube potential and current settings [55]. This means that CT contributes between 50 and 80% of the total dose during a PET/CT scan. And thus efforts

have been made in recent years to limit this dose as much as possible through the use of low-dose CT protocols when it is needed primarily for attenuation correction and full diagnostic resolution is not necessarily required [56]. Yet the effective dose from the injection of FDG remains a point of interest as well. With recent advances in scanner technology and reconstruction methods, perhaps previously optimized dosing strategies now achieve count levels beyond those needed for adequate imaging. In the era of growing concern over nuclear medicine studies overexposing populations of patients [57], now is the appropriate time to investigate new and improved dosing strategies in PET/CT imaging, based upon each factor previously discussed.

Since its inception, PET and PET/CT have held great potential in the field of medical imaging. Providing a unique metabolic perspective of disease states, PET is now used in oncology for many applications, from initial diagnosis to long-term follow-up. In the past twenty years advancements in detection and reconstruction of PET data as well as the fusion with anatomic imaging have put PET at the forefront of personalized medicine. From the metabolic changes cancer cells undergo that result in increased radiotracer uptake to the potential sources of nonspecific uptake, understanding physiological factors affecting PET/CT imaging allows for detailed qualitative review of images. Technical factors which impact PET/CT imaging, such as the development of improved detector crystals for the incorporation of TOF information and upgrades in software for more accurate image reconstruction, do so not only in a qualitative sense but in a quantitative one as well. Constant efforts are made on the part of manufacturers and clinicians to ensure the accuracy and reproducibility of all aspects of any PET/CT examination. In this spirit the following chapters report the results of research conducted with these ideas in mind.
Aiming to produce consistent results through further analysis of factors affecting PET/CT, quantifying the true accuracy with which PET/CT systems measure radioactivity distributions and evaluating how radiation exposure may be limited by taking advantage of state of the art PET/CT technologies, this work seeks to prove the accuracy and reproducibility of PET/CT data in clinical settings, further advancing its important and essential role in oncology.

Chapter 2: Evaluation of the Consistency of PET/CT Data Acquisition and PET Quantification with Variations in Reconstruction Methods

Chapter 2.1: Introduction

In oncology, 18F-fluorodeoxyglucose (FDG) PET/CT is increasingly being used for biomarker-like assessments at each stage of patient care [4]. Metabolic imaging combined with anatomic imaging has become part of the standard protocol in the diagnosis, staging and therapy response assessment of many types of cancers [9]. Therefore both adequate image quality and robust quantitative measures are essential. Visual assessment of PET/CT data has until recently been the primary method of analysis [26]; however, improvements in detection and reconstruction of PET/CT data have allowed for the use of quantitative measures as well. Such quantitative results have been applied in determining whether a tumor is malignant or benign [28, 29] and have been shown to reduce inter-observer variability when reviewing multiple scans over the course of treatment and follow-up [25]. Thus the need for consistent quantitative accuracy is critical in order for PET/CT data to maintain an effective role in oncologic imaging.

It is well known that many factors affect the measurement of activity concentrations in PET/CT imaging [30]. These include patient-specific qualities such as body weight and composition, blood glucose levels at the time of injection as well as patient compliance during scanning so as to avoid motion. There are also scanner-specific effects, such as the scintillator detector configuration [41], reconstruction algorithm used [37], and the calibration of the scanner and the dose calibration equipment [38]. Additionally, patient preparation and acquisition factors impact quantification, such as the administered dose and the choice of uptake time and emission scan duration. Standard recommendations have been established for these factors, though

a large amount of variability still exists across the field of PET/CT imaging [58]. With variation in any of the above imaging protocol options, the accuracy and robustness of PET/CT quantification may be compromised. Thus it is important to evaluate the true impact of any such changes. This is especially relevant when these changes are either implemented with standard equipment and software upgrades or are unknowingly taking place in the clinical setting.

The following work is aimed at evaluating some of these effects in clinical data sets. The true variation in some critical patient preparation factors was analyzed. These included an evaluation of the range of injection to scan times and administered doses seen in a busy clinical environment. Recommendations for standard uptake periods and radiopharmaceutical doses have been made by several groups over time [8, 32, 56]. It is recommended that general PET/CT studies be completed around 60 minutes following the injection of FDG, with most studies being initiated within a range of minutes postinjection [32, 58]. Studies utilizing dual time point imaging have shown that quantitative measures vary by an average of ±18% when imaging is completed at 60 and 120 minutes after FDG injections [28]. Another study measured changes in tumor uptake of about 20% when imaging at 45 and 90 minutes post-injection [35]. It is known that uptake in both malignant and benign tumors, as well as normal tissues, continues to change for hours following FDG injection. A study of the changes in SUV of untreated locally advanced breast cancer showed that between 27 and 75 minutes after injection of radioactivity the measured uptake changed approximately linearly [59]. These results stress that when using PET/CT data for quantitative analysis consistent uptake times are essential, particularly when imaging is being done repeatedly over treatment and follow-up periods.
In this work, the data were taken from clinical imaging studies in which the target uptake time was 75 minutes. Also of interest is any potential variation in the injected dose of FDG. Among several major institutions previously studied, dosing levels ranged from 7-20 mCi of 18F-FDG [58]. Most sites chose a standard dosing level for all patients, while some opted for weight-based dosing, administering between mCi per kilogram of body weight. As with uptake time, the amount of administered radioactivity has an obvious impact on

PET/CT quantification and whatever dosing level is chosen at a given site should be maintained with the highest possible accuracy. Within this data, a standard 15 mCi dose was planned for all patients older than 18. Those patients younger than 18 received doses based on body weight. A long term analysis of real clinical data for a specific PET/CT imaging system was conducted in order to better understand how standard protocols are, and may be better, implemented.

Secondly, a study of the impact of reconstruction protocol settings on PET/CT quantification was completed. Through the improvement of computing capabilities and photon detection, PET/CT image quality and quantification have been improved over recent years. Moving from filtered back projection reconstruction to fully 3D list-mode-based ordered subset methods has allowed enhanced image signal to noise ratios, the reduction of artifacts [16], as well as increased quantitative accuracy [50]. Such iterative reconstruction methods introduce new variables to the reconstruction process, however. The number of iterations and subsets per iteration as well as other parameters are available for clinicians to change, each impacting the image quality, quantification and reconstruction speed. For example, using a greater number of subsets per iteration will increase the reconstruction speed but has been shown to introduce more noise into the final image [49]. Previous studies have shown that changing such reconstruction parameters can vary PET quantification by as much as 70% [37]. PET/CT imaging has also been improved by the addition of time-of-flight (TOF) detection capabilities into emission data acquisition. TOF allows for more accurate detection of the difference in arrival times of two coincident photons, providing more precise localization of each positron annihilation event and thus activity distributions throughout the body [43].
This results in improved lesion detection on a qualitative level as well as increased target uptake to background ratios, thereby improving quantification [42]. When applying TOF data there is an additional option within reconstruction which is the TOF kernel width (KW). This factor affects the region in which each coincident event is positioned [60]. A larger KW allows for more accurate localization of photon pairs, producing improved accuracy in representing activity distributions on PET images. The

kernel is taken as the actual measured timing resolution of the imaging system [61]. This is then modeled as a Gaussian function of some known width, which is dependent upon the count rate during emission scanning. It has been shown that a KW larger than the actual timing resolution of the system improves contrast recovery ratios of PET/CT images. This comes at the cost of increased reconstruction times, however. Equally important is the finding that if a KW smaller than the timing resolution is applied, event mispositioning can lead to significant degradation of PET data. The standard reconstruction settings set up by each system manufacturer are such that they optimize image quality and quantification while keeping clinical time constraints in mind. Therefore the choice to deviate from standard reconstruction protocols is one which should be carefully made. This work thus aimed to evaluate the differences in quantification of data reconstructed using three vendor recommended protocols. As changes such as the number of subsets per iteration, TOF KW size and iteration relaxation parameter were made, it remained critical that quantification stayed consistent as well. By using clinical patient data, this study was then able to evaluate the real world effects that reconstruction parameter variations have on PET/CT quantification.

Lastly, this work focused on the effects that upgrading imaging systems and their software may have on PET/CT quantification. This project was brought about when our clinical site received an upgrade from a Siemens Biograph 16 PET/CT to a Biograph 64 mCT PET/CT system. The change in detector systems as well as the accompanying reconstruction algorithms began producing clinical results outside the ranges previously experienced by reviewing clinicians and so an evaluation of the changes in PET quantification caused by changing to a new reconstruction method was undertaken.
Although similar to the previous discussion, this work differs in that here the changes to reconstruction are being applied to all patient studies despite previous acquisition parameters and are being applied at the recommendation of the system manufacturer. In the previous analysis, changing from standard reconstruction protocols is only recommended to achieve specific goals in specific patients, such as one needing improved image quality for enhanced analysis of a certain lesion. Here, the standard recommended

reconstruction is what was changed. It would be hoped that this change would have only a minimal impact on quantification; however, clinical feedback suggested otherwise. It is well known that consistent protocols produce the most useful results in PET/CT imaging, especially when using absolute changes in tumor measurements to evaluate response to therapy or recurrence of disease [32, 62, 63]. Thus the results of this analysis were aimed at characterizing how changes made to standard reconstruction methods affected the quantification of PET/CT data.

In review, the multiple parts of this work aim to quantitatively evaluate the ways in which variations in reconstruction of PET/CT data truly impact the results of clinically performed studies. This was done in two scenarios, the first being when changes to the reconstruction method are made electively in order to achieve either improved results or to increase the speed of reconstruction. The second circumstance is when changes are made as part of a system upgrade where new methods will be applied to all future patient studies. In either case, it is expected that any impact on PET quantification will be minor in order to preserve the accuracy of each study. Additionally, variation in patient preparation procedures within a busy clinical environment was analyzed over a long time range. Although the quantitative impact was not evaluated, it has been previously established that factors such as the uptake time following FDG injections have a clear effect on PET/CT measurements. Thus accuracy and consistency at this point of PET/CT acquisition is also critical and an evaluation of any variation from target doses and uptake times is of interest.

Chapter 2.2: Variability of Administered 18F-FDG Doses and Uptake Times in Clinical PET/CT Imaging

Introduction

In FDG PET/CT imaging, many factors impact both the image quality and the accuracy of quantitative analysis of PET data [30]. These include everything from each patient's individual characteristics to preparation prior to imaging to the actual acquisition of emission data. It has been well established that consistency among this multitude of factors is critical in order to achieve accurate and robust results in any PET/CT examination. Therefore, guidelines and standard recommendations have been set forth by several different groups [8, 32, 56]. However, the accuracy with which these recommendations are implemented remains unclear. While variations in standard patient preparation procedures exist among institutions [58], it should also be known what variations exist within any given clinical environment. When chosen standards for quantities such as FDG dose and uptake time are enacted in a busy clinical environment, what percentage of patients actually complete imaging according to these designated procedures? Simple delays in patient positioning, or the need to repeat imaging due to patient motion, impact not only that patient's study but perhaps others as well. This evaluation therefore aimed to quantify the range of administered radiotracer doses and uptake times over a long time frame involving thousands of standard clinical procedures. Such an analysis allows for a detailed account of how well these protocols are being implemented. Trends in this data set allowed for the identification of areas which most often lead to deviations from prescribed protocols.

Materials and Methods

Information regarding patient preparation for 18F-FDG PET/CT imaging on a Siemens Biograph 16 PET/CT was collected over a range of three years (March 8, 2005 through March 18, 2008) within our institution. Data pulled from the institution PACS

system included adjunctive data forms (ADFs) and scan information from 9488 consecutive patients within this time frame. At the time of imaging, ADFs were filled in manually by the technologist completing the scan, with required information also being entered into the imaging system by hand. The imaging system recorded the initiation time of both the CT and PET imaging components. Data queried from the two sources included the study date, the study indication, patient status as inpatient or outpatient, patient weight, sex and date of birth, FDG injection site, injected volume and activity level, the injection time and the scan time. Here, scan time was taken as the time at the start of the first PET emission volume scan. Data were primarily pulled from image DICOM information.

The injected activity was calculated by the technologist by subtracting the syringe residual activity following injection from the syringe initial activity, as measured in a dose calibrator synchronized to the imaging system. The injection time was taken from the dose calibrator clock, which was the time used for all subsequent decay corrections. The injection time and activity were recorded by the technologist on the ADF and on the imaging system, along with patient weight, sex and date of birth.

According to standard protocol, the target dose was 15 mCi of FDG ±10% for all patients older than 18. Patients younger than this had doses adjusted by body weight. The target uptake time was 75 minutes, with variations of ±10 minutes being considered within an acceptable range. Once all patient data were entered into a single spreadsheet, the injection to scan time (IST) was calculated by subtracting the injection time from the scan start time. Additionally, activity per kilogram of body weight was calculated for all patients, in order to gain an impression of the variation resulting from standard dosing for all patients.

A portion of the data required manual correction in the final spreadsheet. This resulted primarily from errors made when entering patient information in the imaging system, such as mislabeling the injection time, or from data fields being left blank within the image DICOM information. Errors also existed in the retrieval of the scan time from the PACS information; these were manually corrected by directly referencing the images.
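The per-study quantities described above (net injected activity, IST, activity per kilogram, and whether each falls inside the protocol window) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary layout, and the example values are hypothetical and are not taken from the original spreadsheet, and the clock times are assumed to fall on the same day.

```python
from datetime import datetime

TARGET_DOSE_MCI = 15.0     # adult target dose stated in the protocol
DOSE_TOLERANCE = 0.10      # +/-10% of the target dose
TARGET_IST_MIN = 75.0      # target uptake time in minutes
IST_TOLERANCE_MIN = 10.0   # +/-10 minutes


def injected_activity(initial_mci, residual_mci):
    """Net injected activity: syringe initial activity minus residual."""
    return initial_mci - residual_mci


def injection_to_scan_minutes(injection_time, scan_time, fmt="%H:%M"):
    """IST in minutes; assumes both clock times are on the same day."""
    delta = datetime.strptime(scan_time, fmt) - datetime.strptime(injection_time, fmt)
    return delta.total_seconds() / 60.0


def summarize_study(initial_mci, residual_mci, weight_kg, injection_time, scan_time):
    """Hypothetical helper returning the derived values for one study."""
    dose = injected_activity(initial_mci, residual_mci)
    ist = injection_to_scan_minutes(injection_time, scan_time)
    return {
        "dose_mci": dose,
        "dose_per_kg": dose / weight_kg,
        "ist_min": ist,
        "dose_in_range": abs(dose - TARGET_DOSE_MCI) <= DOSE_TOLERANCE * TARGET_DOSE_MCI,
        "ist_in_range": abs(ist - TARGET_IST_MIN) <= IST_TOLERANCE_MIN,
    }


# Hypothetical example: an 80 kg adult injected at 08:05 and scanned at 09:25.
example = summarize_study(15.2, 0.7, 80.0, "08:05", "09:25")
```

Keeping the tolerances as named constants makes it straightforward to repeat the same tally for a site with a different local target dose or uptake time.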

Several study types were excluded from this analysis. These included brain and cardiac studies, because these protocols utilize a different target uptake time. Scans conducted for patients with melanoma were also given special consideration. These patients undergo PET imaging of the legs followed by whole body imaging from the skull to the thighs. Leg only imaging is generally completed prior to the 75 minute target time, and so such studies were excluded from this analysis. Other reasons for exclusion were incomplete data or patients who had multiple scans on the same day. The final number of studies reviewed was 7824.

Results

Of the 7824 patient studies evaluated, 53.4% were male and 46.6% were female patients. The average weight of all patients was 80 ± 21 kg and the average age was 59 ± 15 years. The number of studies completed for each indication is given in Table 2.1. As shown, most studies were specifically labeled; however, many included only a general indication for whole body imaging or radiation treatment planning. In addition, 694 (8.9%) studies were designated only as part of the National Oncology PET Registry (NOPR) imaging. 11.8% of patients were of inpatient status at the time of imaging.

Exam                                                          # Studies
LYMPHOMA STAGE/RESTAGE                                        1597
HEAD & NECK STAGE/RESTAGE                                     1502
LUNG CA STAGE/RESTAGE                                         1066
COLORECTAL STAGE/RESTAGE                                      711
NOPR/OTHER BODY IMAGING                                       694
PULMONARY                                                     687
BREAST CA STAGE/RESTAGE                                       471
MELANOMA STAGE/RESTAGE                                        371
ESOPHAGEAL STAGE/RESTAGE                                      337
RAD THER TREATMENT PLANNING/ F18 DOSE/ WHOLE BODY IMAGIN      216
THYROID CA STAGE/RESTAGE                                      172
Table 2.1. Number of FDG PET/CT Studies Completed with Different Referral Indications

The average IST was 80 ± 15 minutes, ranging from 48 to 331 minutes. Figure 2.1 shows the distribution of ISTs as the number of studies at 1 minute intervals. The distribution has been truncated at 125 minutes, due to the low number of studies beyond this point, in order to show greater detail of the spread of the majority of the data points. As can be seen, the majority of studies were initiated within the 75 ± 10 minute window. Table 2.2 lists the number of studies within this range as well as the numbers beyond the acceptable range of 65 to 85 minutes. Of all studies completed, 64.6% were done in the target time frame, leaving 35.4% of scans begun either before or after the target times. Most of the studies outside the target range were completed later rather than earlier than planned. Although not shown in Figure 2.1, 104 studies were begun at or later than 125 minutes following FDG injection.

Figure 2.1. Distribution of the Time from FDG Injection to Scanning for the Whole Patient Population

Table 2.2. Number of Studies with Injection to Scan Times within the Acceptable Range of 75±10 Minutes

The average injected activity was 14.5 ± 1.4 mCi, excluding the 6 pediatric patients who were given weight-based doses. The administered doses ranged from 6.1 to 21.2 mCi. Table 2.3 gives the distribution of doses within and beyond the 10% range of the targeted 15 mCi dose. 76.3% of studies were completed with doses within this target range. Of the 23.7% that were outside this window, 98% were below 13.5 mCi. Figure 2.2 shows the full distribution of administered doses, in half mCi increments, including the 6 pediatric patients.

Dose (mCi)    # studies    % studies
Table 2.3. Number of Studies Completed with Injected Doses within the Acceptable Range

Figure 2.2. Distribution of Studies by Absolute Injected Activity Levels in Half mCi Increments for Adult Patients

In calculating the dose per kg body weight, the average was 0.20 ± 0.06 mCi/kg, ranging from 0.07 to 0.58 mCi/kg, again excluding pediatric patients. Figure 2.3 illustrates this range in activities per kg body weight. Although 43.0% of all studies had activity concentrations within ±15% of the average, there was still wide variability, which could have a significant impact on both image quality and quantification.

Further analysis of injection to scan times and injected activities revealed several trends. The first was that IST was related to the time of day at which scanning occurred. Table 2.4 shows the average IST of all patients injected at each hour during the day. As also seen in Figure 2.4, the average IST increased throughout the day, with the exception of studies begun after 16:00. There were only 10 such patients, and so this average time is less robust than the others. Figure 2.4 illustrates the variability in IST with time of day, highlighting that studies completed earlier in the day are closer to the 75 ± 10 minute target range than those performed later, particularly after 14:00.

Figure 2.3. Distribution of Injected Activity per kg Bodyweight for Adult Patients

As suggested by Table 2.4, ISTs of scans before 10 am were significantly closer to the target time than those after 10 am (p<0.001), as determined by ANOVA. Additionally, the average IST was impacted by the day of the week on which the scan was performed. 188 of the 7824 studies were completed on Saturdays, and the average IST of these scans was 74.9 minutes. On these days fewer scans were performed than during regular weekdays, thereby potentially accounting for this increased accuracy.

Hour of Day of Injection    Average IST
Table 2.4. Average Injection to Scan Time per Hour of the Day of Injection
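For readers unfamiliar with the test named above, the following is a minimal sketch of the one-way ANOVA F statistic in plain Python. It is an assumption that the compared quantity was each study's absolute deviation from the 75 minute target; the text does not state the exact variable or software used, and the sample values below are hypothetical. The p-value, which requires the F distribution, is deliberately not computed here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        # Variation of the group mean around the grand mean.
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        # Variation of individual values around their own group mean.
        ss_within += sum((v - group_mean) ** 2 for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical absolute deviations from the 75 minute target, in minutes.
before_10am = [2.0, 5.0, 3.0, 6.0]
after_10am = [9.0, 14.0, 11.0, 20.0]
f_stat, df_b, df_w = one_way_anova_f([before_10am, after_10am])
```

A large F relative to the F distribution with `df_b` and `df_w` degrees of freedom indicates that the group means differ by more than the within-group scatter would explain.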

Figure 2.4. Distribution of Average Injection to Scan Times by Time of Day at Which Scanning was Completed

IST was also found to be significantly related to the age of the patient, with the average IST increasing slightly for each year of age, as seen in Figure 2.5. Along with this trend, the youngest and oldest patients showed wider variability in IST than did median aged patients. This, however, is likely due to the smaller number of patients at these ages, as the entire distribution of IST according to age showed the most variable times spread across all age ranges. IST was not significantly associated with patient weight, inpatient or outpatient status, or injection site. The exam indication did, however, have a relationship to IST, with melanoma studies having significantly longer ISTs than other study types, averaging 85 minutes. Head and neck related exams also had longer than average ISTs.

Figure 2.5. Distribution of Average Injection to Scan Times by Patient Age

The amount of activity administered to each patient was also found to be related to the time of day at which the injection was completed. Table 2.5 lists the average injected activity at half hour intervals through the day. Figure 2.6 depicts the distribution of average injected dose throughout the day. As is evident, injected activity decreases with time over each day, on average. It should be noted that in general two dose shipments are received each day, one in the morning and another for the afternoon. This explains the increase in injected activities seen for afternoon exams as opposed to those from late morning. Again, ANOVA showed that injected activities from before 10 am were significantly closer to the target than those from between 10 am and noon (p<0.001). Afternoon activity levels were also closer to the target level than the 10 am to noon group, though with lower significance (p<0.05). This was, however, likely affected by the lower number of late afternoon studies.

Injection Time (half hour intervals, 6:30 to 16:30)    Average Activity (mCi)
Table 2.5. Average Injected Activity per Time of Day of Injection

Figure 2.6. Distribution of Average Injected Activity by Time of Day of Injection

A specific analysis of patients with repeated studies over the three year period highlighted the true variability of both IST and dose. Figure 2.7 depicts these variations for the five patients with the greatest number of studies completed. Although a specific reason for each point of variation is not known, this figure shows that for any given patient the dose per kg of body weight and the IST can vary quite dramatically between consecutive studies. The average IST of each patient's scans is also noted here. While all but one of the five patients had averages within ±10 minutes of the 75 minute target time, the variation of individual times can clearly be seen.

Figure 2.7. Injection to Scan Times and Injected Dose for Five Patients with Multiple Studies


Discussion

FDG PET/CT continues to gain utility in medical imaging and oncology, with a primary application being the evaluation of tumor response to therapy. For this purpose, quantitative measures of PET radiotracer distributions at baseline and throughout treatment necessarily require a high level of accuracy and repeatability. While PET SUVs depend upon many factors that cannot be controlled, the patient preparation procedures are one variable which can and should be carefully monitored and conducted. Although there is no one standard FDG dosing level and uptake time, both are known to affect PET quantification. While the true impact on quantitative measures caused by the variations found here was not analyzed, the chosen local standard IST and dose should be maintained with the highest level of accuracy possible. This study thus evaluated how well the standard local doses and uptake times are executed within a busy clinical environment.

The average injection to scan time of all patients was found to be 80 ± 15 minutes. This average was within the acceptable ±10 minute range of the 75 minute target time. 64.6% of the 7824 studies evaluated were conducted in the 65 to 85 minute window, showing that a majority of patients were well managed in the preparation for PET/CT imaging. There was, however, a large portion of studies which were begun well before or after the target time, with the ISTs of the entire population ranging from 48 to 331 minutes, as seen in Figure 2.1. IST appeared to be most dependent upon the time of day during which the study was planned. ISTs of studies begun before 10 am were significantly closer to the target time than those completed after 10 am. This is explainable given that patients are often scheduled one directly after another, and the next two patients have typically received an FDG injection before the previous study is completed. Thus a delay with one patient will likely translate into delays for the patients following them that day. Delays may be caused by a patient taking longer than expected using the restroom immediately prior to scanning, needing extra time for patient positioning, or needing to repeat imaging due to patient motion. These types of delays are not out of the ordinary and cannot be completely

planned for. Any unexpected equipment issues may further compound these types of delays. In a busy clinical environment it is not always favorable to plan extra time between patients as a buffer against these delays; however, good time management on the part of the technicians can help increase the number of studies begun on time.

IST was also associated with the study indication. Patients being evaluated for melanoma had average ISTs significantly longer than the general population. This is explained by the protocol utilized at this site, wherein the lower body is imaged prior to the whole body for patients with melanoma. While these leg only scans were not included in this analysis of IST, they had a clear impact on the IST for the whole body imaging of these patients. The addition of a leg only scan, even if begun before 75 minutes following injection, can easily push back the start time of the whole body scan by taking longer than planned or by requiring extra time to plan the second imaging protocol. Although the average IST of these patients was still within the ±10 minute range, extra care must be paid to these patients in order to ensure the most accurate image acquisition.

IST was also found to be slightly dependent upon patient age, with IST increasing for older patients, on average. This likely relates to the possible reasons for delays throughout the day. Elderly patients are more likely to need additional time during final preparations and positioning prior to scanning. This suggests that, by identifying the patients most likely to be delayed, extra time can be planned between the injection and the commencement of scanning to increase compliance with target ISTs.

The average administered FDG dose was found to be 14.5 ± 1.5 mCi, with 76.3% of all studies receiving a dose within ±10% of the 15 mCi target, as seen in Figure 2.2. This was in general a good result, with the vast majority of patients outside this range receiving lower than desired doses. Again, injected dose was most significantly related to the time of day for which the injection was planned. Doses administered before 10 am were significantly nearer to the target dose than those between 10 am and noon. Afternoon doses were also closer to the 15 mCi target, on average. The decrease in injected dose with time also relates to delays in scan times, in that injection of a patient may be delayed when a previous patient is known to be running behind schedule. In these cases while the IST may be more accurate

for the second patient, injected activity will be diminished due to decay during the delay time. This trend is most significant for patients between 10 am and noon, in part due to the delivery of a second, afternoon shipment of doses. With the arrival of this second amount of radiopharmaceutical, any activity lost throughout the morning may be recovered for afternoon injections, resulting in the slight increase in injected activity recorded for these patients.

An additional note should be made regarding the accuracy of data entry by technicians. About 1000 instances of erroneous data entries were encountered in this analysis. Errors in DICOM headers were manually corrected by referencing ADFs. However, this highlights the need for accurate entry of patient information into the imaging system. Mistaken doses or patient weights can propagate errors through the quantitative analysis of PET data, and so care must be taken to avoid such issues.

The calculation of activity per kg body weight also points to the actual range of doses caused by standard dosing for all patients. With a range of 0.07 to 0.58 mCi/kg, there was wide variability for this population of patients. The amount of radioactivity available for uptake within each individual patient's body composition is known to affect both image quality and quantitative measurements of activity distributions [26]. This effect on quantification becomes particularly important for patients receiving multiple scans over the course of treatment and follow-up, especially if significant weight loss is experienced during this time frame. Although outside the scope of this study, this true range in activity per kg was an interesting secondary finding.

In general, the results of this study show that target ranges of both IST and injected activity are being achieved. However, there was a wide variety of both ISTs and administered doses for the population studied.
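Two of the effects discussed above can be made concrete with a short numerical sketch: the loss of activity when a pre-drawn 18F dose waits through a delay, and the way a mis-entered patient weight scales a body-weight SUV. The half-life of 18F (about 109.77 minutes), the conversion 1 mCi = 37 MBq, and the body-weight SUV definition are standard; the function names, the 20 minute delay, and the tissue concentration below are hypothetical illustrations and are not values from this study.

```python
import math

F18_HALF_LIFE_MIN = 109.77  # physical half-life of fluorine-18, in minutes
MCI_TO_MBQ = 37.0           # 1 mCi = 37 MBq


def decayed_activity(activity_mci, elapsed_min, half_life_min=F18_HALF_LIFE_MIN):
    """Activity remaining after elapsed_min of physical decay."""
    return activity_mci * math.exp(-math.log(2) * elapsed_min / half_life_min)


def suv_body_weight(tissue_kbq_per_ml, injected_mbq, weight_kg):
    """Body-weight SUV: tissue concentration / (injected activity / weight).

    kBq/mL divided by MBq/kg is numerically g/mL, conventionally treated as
    unitless. The injected activity is assumed to be already decay-corrected
    to the scan time.
    """
    return tissue_kbq_per_ml / (injected_mbq / weight_kg)


# Hypothetical: a 15 mCi dose whose injection is held back by 20 minutes.
delayed_dose_mci = decayed_activity(15.0, 20.0)

# Hypothetical: the same measurement with the weight entered 10% too high.
suv_correct = suv_body_weight(5.0, 15.0 * MCI_TO_MBQ, 70.0)
suv_wrong_weight = suv_body_weight(5.0, 15.0 * MCI_TO_MBQ, 77.0)
```

Because the SUV is directly proportional to the recorded weight and inversely proportional to the recorded dose, a fractional error in either entry carries through to the SUV in the same proportion.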
While these goals are being met on average, the accuracy of each and every study should be the overall goal. Through better identification of patients most likely to be delayed, either due to patient demographics or acquisition protocols, the overall accuracy of IST and administered dose may be improved. Deviations in both IST and dose can be avoided if these patients are better managed from the outset. Although many opportunities for unexpected issues still exist, it is through

the controlling of variables which can be managed that the overall effectiveness of PET/CT studies can best be ensured. These long term results are likely not unique to this clinical environment and so serve to offer suggestions for improvement at any imaging site.

Conclusion

In conclusion, the target IST and dose for patients are being reached on average. However, variability is still common and is significantly related to identifiable factors such as the time of day of imaging, the study indication, and certain patient demographics such as age. In order for accurate and robust PET quantitative measures to be achieved, imaging procedures must be improved so that all patients fall within target IST and dose ranges.

Chapter 2.3: Consistency of SUV with Variation in PET Reconstruction Parameters

Introduction

With the increasingly popular use of PET/CT as a quantitative marker of disease states, accurate and robust quantitative measurements remain at the forefront of acquiring and reconstructing PET data [62]. The standardized uptake value (SUV) is the most common measure of PET activity within any given pixel [7], although it is known to be susceptible to variation caused by a wide range of factors [30]. Among the many parameters which have an effect on PET quantitative readouts, the choice of specific reconstruction settings, such as the number of subsets and ordered-subset iterations or the time-of-flight kernel width, is known to have a direct effect on quantification [37]. Therefore, minimizing these effects is a priority when designing and commercially producing an imaging system. Manufacturers will thus limit the available reconstruction protocols from which clinicians may choose. In particular, the Philips GEMINI TF PET/CT has three such protocols, each with four additional variables, to utilize on a daily basis [60]. Although the vast majority of clinical studies are completed using one standard group of reconstruction settings, a different choice may be made in specific situations. For example, decreasing the number of subsets per OSEM iteration while maintaining the same number of iterations will slightly increase the reconstruction speed [49]. Also, increasing the time-of-flight kernel width will improve quantitative accuracy and image quality for detection of small target lesions; however, this will require a longer reconstruction time [61]. Whatever the reason for applying a different vendor-provided reconstruction protocol, it should be expected that any resulting impact on PET quantification will be minimal. This is to ensure that, if two different sets of reconstruction settings are applied to two separate studies, the direct comparison of quantitative measures may still be as accurate as possible. In cases of determining response to therapy [25] or detection of

recurrent disease [65], where preliminary and follow-up scans are compared, such accuracy and robustness are essential. Thus this study aimed to evaluate the actual changes in SUV measurements for clinical PET/CT data reconstructed with all possible parameter variations, as made available by the system manufacturer. Through this work the explicit need to maintain the same reconstruction protocol for all scans completed for any individual patient could be determined.

Materials and Methods

Clinical PET/CT scans were conducted for 12 patients using a Philips GEMINI TF 64 (Cleveland, OH). 3D time-of-flight (TOF) scans were completed 72.7 ± 9.9 minutes following injection of 13.6 ± 2.5 mCi of 18F-FDG. There were 4 female and 8 male patients, with an average age of 50 years and an average weight of 70 ± 15 kg. CT images were acquired with 120 kVp, 163 mAs and a slice thickness of 4 mm. PET data were acquired with 90 seconds per emission volume, from skull to thighs, with a 50% overlap between volumes. These settings were all according to the standard local protocol.

Following standard PET/CT reconstruction with 3D list-mode OSEM, the list-mode data were further reconstructed with each setting clinically available on the Philips system. Parameters available for each reconstruction consisted of speed, sharpness and TOF kernel width (KW). Within speed settings, fast, normal and high quality (HQ) were the available options, varied by the number of subsets per iteration in the OSEM reconstruction algorithm and the TOF KW, as shown in Table 2.6.

Setting         Iterations    Subsets    Kernel Width
Normal
Fast
High Quality
Table 2.6. Parameters of Philips GEMINI TF "Speed" Reconstruction Settings

As the TOF acquisition is applied, the system can better detect the difference in arrival times of the two coincident photons [60]. Changing the reconstruction KW limits the region in which each event is positioned, with

a larger KW allowing for more accurate positioning. This occurs at the cost of increasing the reconstruction time, however. Therefore, the HQ setting may produce more accurate activity distributions in the PET images, but it is best applied to images with better than standard count statistics. Changing the reconstruction sharpness involves adjusting the reconstruction relaxation parameter (RP). This in effect controls the magnitude of the changes that each iteration can make to the current image estimate during reconstruction. A smaller RP allows less refinement with each iteration, resulting in less noise being introduced to the image and producing a smoother final image. Sharpness settings were called normal (RP of 1.0), smooth (RP of 0.7), smootha (RP of 0.6), and smoothb (RP of 0.5). Normal speed and normal sharpness are the default reconstruction settings for clinical patients. With a combination of each speed and sharpness setting, 12 total image sets were produced for each patient.

Using the normal settings for reference, 3D regions of interest (ROIs) were drawn over any present target lesions as well as in background tissues including the bladder, heart, liver, kidneys and spleen. ROIs were drawn such that the entire volume of the tissue exhibiting an increased FDG concentration was encompassed. The SUVmax of each ROI was recorded, and a percent difference from reference values was determined for each subsequent set of SUVs.

Results

Figure 2.8 illustrates the qualitative characteristics of each reconstruction setting. As can be seen, the fast speed, smoothb image appears the smoothest, as it involved the fewest subsets per iteration and the smallest RP. Images reconstructed with the larger RP and KW (HQ speed and normal sharpness) appear to have better definition between tissues and at the edges. The image quality of each reconstructed set was adequate for review, with all target lesions identified on the reference images being well visualized on each additional set of images. It should also be noted that no truly significant differences in reconstruction times were experienced; all images were reconstructed within a clinically acceptable time frame.
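The role of the relaxation parameter described in the methods can be illustrated with a toy example. The sketch below applies a generic relaxed update, x_new = x + RP × (x_EM − x), to a single-subset MLEM step on a tiny noise-free system. It is an assumption that the vendor reconstruction follows this general form; the actual list-mode TOF OSEM implementation on the Philips system is not described in the text, and the system matrix, data, and function names here are hypothetical.

```python
def mlem_step(x, A, y):
    """One standard MLEM update for system matrix A (rows are measurements)."""
    n_meas, n_pix = len(A), len(x)
    # Forward-project the current image estimate.
    forward = [sum(A[i][j] * x[j] for j in range(n_pix)) for i in range(n_meas)]
    # Ratio of measured to estimated counts for each measurement.
    ratio = [y[i] / forward[i] for i in range(n_meas)]
    # Sensitivity of each pixel (column sums of A).
    sens = [sum(A[i][j] for i in range(n_meas)) for j in range(n_pix)]
    # Back-project the ratios and apply the multiplicative correction.
    back = [sum(A[i][j] * ratio[i] for i in range(n_meas)) for j in range(n_pix)]
    return [x[j] * back[j] / sens[j] for j in range(n_pix)]


def relaxed_step(x, A, y, rp):
    """Move only a fraction rp of the way toward the full MLEM update."""
    x_em = mlem_step(x, A, y)
    return [xj + rp * (ej - xj) for xj, ej in zip(x, x_em)]


# Hypothetical two-pixel object [4, 1] seen through three measurements.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [4.0, 1.0, 5.0]
x0 = [1.0, 1.0]

full_update = relaxed_step(x0, A, y, 1.0)   # RP of 1.0, as in "normal" sharpness
half_update = relaxed_step(x0, A, y, 0.5)   # RP of 0.5, as in "smoothb"
```

With an RP of 0.5 the first step covers exactly half the distance of the unrelaxed step, which is the sense in which a smaller RP "allows less refinement with each iteration".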

Figure 2.8. Representative Philips GEMINI TF Images with Varied Reconstruction Settings

In the quantitative analysis of the images, the majority of SUVmax values measured fell within ±15% of reference values, with very few measures lying beyond this range. The greatest single variation was 20.7% below the reference value. The average variation of each tissue type for each reconstruction speed setting is outlined in Table 2.7.

Tissue Type    Fast Speed Average %Difference    Normal Speed Average %Difference    HQ Speed Average %Difference
Bladder
Heart
L. Kidney
Lesion
Liver
R. Kidney
Spleen
Table 2.7. Average of the Absolute Value of the Percent Difference in SUVmax for Each Tissue Type Measured with Varied Reconstruction Settings

Here it can be seen that the average of the absolute value of the percent difference for fast, lower subset settings ranged between 2.6 and 9.7% from reference values, regardless of RP setting. Normal speed settings varied by RP had average percent differences ranging from 1.2 to 6.3% for each tissue type measured. The HQ, larger KW

settings produced variation between 3.1 and 11.2% different from standard measures. While each reconstruction speed had a similar average range of measured SUVmax values, they produced different variations from the reference measures. SUVmax was found to be highest for images reconstructed with the HQ, larger KW and RP settings. These data sets had SUVmax values which measured an average of 9.4% greater than reference values for all ROIs evaluated. The range of the individual values was from -2.8% to 17.2% different from reference values. Fast settings with fewer subsets and a smaller RP had the lowest SUVmax values, averaging 8.4% below reference measurements, ranging from to 2.8% different. Figure 2.9 shows the average of the absolute values of the percent difference for all ROIs according to each reconstruction setting.

Figure 2.9. Average Absolute Value Percent Difference from Reference SUVmax for All ROIs Evaluated with Varied Reconstruction Settings

This figure illustrates the trend where the normal speed settings had quantitative measures closest to reference values, on average, and HQ and fast speeds resulted in the greatest deviations. While smoothb sharpness reconstructions, with a low RP, produced the greatest variation among fast settings with fewer subsets per iteration, normal sharpness, with the largest RP, had the greatest average percent difference among HQ settings.
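The results above report both signed percent differences (for example, "9.4% greater than reference") and averages of the absolute value of the percent difference. A brief sketch may help show why the two are not interchangeable. The speed and sharpness names and RP values are those given in the methods; the SUVmax numbers and helper names below are hypothetical and chosen only for illustration.

```python
from itertools import product

SPEEDS = ["fast", "normal", "HQ"]
SHARPNESS_RP = {"normal": 1.0, "smooth": 0.7, "smootha": 0.6, "smoothb": 0.5}

# Every speed/sharpness combination: the 12 image sets per patient.
SETTINGS = list(product(SPEEDS, SHARPNESS_RP))


def percent_difference(value, reference):
    """Signed percent difference of value from reference."""
    return 100.0 * (value - reference) / reference


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Hypothetical SUVmax readings for ROIs whose reference SUVmax is 10.0.
reference_suvmax = 10.0
trial_suvmax = [11.0, 9.0, 10.5, 9.5]

diffs = [percent_difference(v, reference_suvmax) for v in trial_suvmax]
signed_mean = mean(diffs)                     # positive and negative cancel
absolute_mean = mean(abs(d) for d in diffs)   # magnitude of deviation only
```

In this made-up case the signed mean is zero although every ROI deviates from its reference, which is why the absolute-value average is the better indicator of overall variation while the signed average indicates the direction of any bias.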

Figure 2.10 further highlights these variations, showing the average percent difference per reconstruction setting of each tissue type ROI. Here the zero point, corresponding to the normal speed and normal sharpness reference, is used as the reference point for each actual average percent difference. Fast settings resulted in SUVmax values below reference values for each tissue type. Additionally, among fast settings, decreasing the RP can be seen to decrease the measured SUVmax, increasing the variation from reference values, on average. Thus lowering the number of subsets per iteration as well as lowering the RP results in lowered measurements of activity concentration, independent of tissue type. A similar result is seen in normal speed settings, where changing only the RP again shows decreased SUVmax measurements as the allowed image estimate refinement per iteration decreases. In HQ, larger KW reconstruction settings, SUVmax is measured higher than reference values in nearly all tissue types, at all RP settings. Although the effect is slightly diminished and more dependent on tissue type, lowering the RP does still decrease measured SUVmax values, decreasing the variation from reference values for the HQ speed reconstructions.

Figure 2.10 also demonstrates the impact of tissue FDG uptake on quantification. While the general effects of changing the RP and KW are seen across each ROI location, typically low uptake tissues, such as the liver and spleen, showed greater variation among speed settings than normally higher activity concentration tissues, such as the bladder. The average liver measurement increased in variation from reference values from -6.5% to -12.1% for settings with fewer subsets. Meanwhile the bladder SUVmax values only decreased by 2%, on average, with changing RPs. Among HQ settings, the liver again had an average decrease in variation of 6%, whereas the average bladder SUVmax values were within 1.5% of each other over all RPs utilized.

Figure Average Percent Difference from Reference SUVmax of Each Tissue Type with Varied Reconstruction Settings

The range of SUVmax values measured in background tissues varied greatly, from an average reference SUVmax of 2.02 in the spleen to in the bladder, with 72.7 being the highest. The average SUVmax of each tissue type can be seen in Table 2.8.

Tissue Type    Average SUVmax
Bladder        44.9
Heart          8.1
L. Kidney      18.4
Lesion         9.7
Liver          2.3
R. Kidney      27.4
Spleen         2.0
Table 2.8. Average SUVmax of Each Tissue Type Evaluated

For normal and fewer subset settings with the smaller KW there was no significant trend in the activity concentration impacting the SUVmax variation from reference values, as seen in Figure This is to say that low uptake ROIs displayed similar variation from reference values as high SUVmax tissues, for each reconstruction setting. Also, among speed settings, variations in the percent difference of each SUVmax were similar over the range of activity

distributions measured in background tissues. HQ reconstructions using the larger KW did show a slight bias, in that background tissues with an SUVmax of 5 or lower had measurements closer to reference values than tissues with an SUVmax greater than 5. Also, among normal and fewer subset settings, ROIs with an SUVmax of less than 5 had noticeably greater variation in the SUVmax values measured for each RP setting than did those with an SUVmax of 5 or higher. Conversely, it was the ROIs measured below 5 that had less variation with RP for the HQ, larger KW settings. These results again highlight the different ways in which RP and KW changes affect PET quantification.

Regarding the size of the tissue of interest and resulting ROI, background tissues ranged from 5.1 cm³ to cm³. Table 2.9 lists the average size of all ROIs drawn over each tissue type. As with SUVmax, there was a trend that a smaller range of SUVmax values was measured with decreased RPs as ROI sizes increased, beginning most notably above volumes of 100 cm³, for normal and fast reconstruction settings. This trend is seen in Figure For HQ, larger KW settings, ROIs with volumes less than about 100 cm³ showed less variation with changing RP than ROIs of larger volumes. This is only a general trend, however, as there were no ROIs measured between 20 and 100 cm³ in volume.

Tissue Type Average ROI volume (cm³) Bladder Heart L. Kidney Lesion 55.1 Liver 13.2 R. Kidney Spleen 9.4
Table 2.9. Average Volume of ROIs by Tissue Type

Focusing on target lesions, Figure 2.13 shows the absolute value of each lesion's percent difference for each RP setting within the three speed settings. The measured SUVmax values ranged from 3.3 to 50.4 (averaging 9.7) and showed results similar to those of background tissues. With respect to SUVmax, lesions with SUVs greater than 8 in general had less variation with decreased RPs than those measuring below 8, for fast and normal

reconstructions with the smaller KW. This result was less evident in HQ, larger KW reconstructions, where larger yet more consistent variation in SUVmax was seen across the range of RPs utilized. Figure 2.14 depicts the average and range of percent differences for each lesion per speed setting. Here it can be seen that while the average percent difference of each tumor lesion varied from reference values by a similar amount over the range of activities measured, the variation caused by decreasing the RP was much more limited for lesions with an SUVmax greater than 8, most notably for fast and normal speed settings.

In relation to the volume of the tumor lesion, the results were again similar to those of background tissues, yet were more specific due to the range of volumes evaluated. Figure 2.15 shows the percent difference from reference values for each target lesion, arranged from smallest to largest volume, over the variety of RPs utilized in reconstructing the data. As with activity concentration, the smaller lesions, below 8 cm³, experienced the greatest variation in SUVmax measurements for fast and normal speed settings. HQ reconstructions also appeared less affected by lesion size, as with tumor uptake values, although the smallest lesions more commonly had wider variations in percent differences with changing RPs. In Figure 2.16, the average and range of the percent difference of each lesion for the three speed settings can be seen, again arranged by size. This figure also depicts the decrease in variation among RPs with increased lesion volume, although it is again apparent that the average percent difference was fairly consistent among all lesions evaluated, particularly for normal and fast speed settings. It should also be noted from Figures 2.14 and 2.16 that, for normal and fast reconstructions, tumors with either larger volumes or higher activity concentrations had SUVs measured closer to reference values than those with less activity or a smaller volume.

Figure Average and Range of SUVmax among Each Relaxation Parameter within Sharpness Settings, by ROI SUVmax

Figure Average and Range of SUVmax among Each Relaxation Parameter within Sharpness Settings, by ROI Volume

Figure Absolute Value of Lesion Percent Difference, by ROI SUVmax

Figure Average and Range of Lesion SUVmax Percent Difference, by ROI SUVmax

Figure Absolute Value of Lesion Percent Difference, by ROI Volume

Figure Average and Range of SUVmax Percent Difference, by ROI Volume

Discussion

PET/CT imaging analysis for staging or therapy response assessment depends to a degree upon the choice of parameters used within reconstruction algorithms. While settings with a greater number of subsets are meant to return a better quality image by reaching convergence faster, the impact on quantification must also be considered. Changes in the time-of-flight kernel width and the reconstruction relaxation parameter likewise affect both image quality and quantification. A smaller RP introduces less noise into PET images by allowing the image estimate at each iteration during reconstruction to be improved by a smaller amount. A larger TOF KW allows for more accurate placement of each coincident photon pair, enhancing image quality as well as the measurement of activity distributions over the image.

Within this work the true impact of changing reconstruction protocols was explored using clinical PET/CT data and all available vendor issued reconstruction settings. As these reconstruction settings are readily available, the quantification of each set of results should fall within some narrow range in order to ensure that accurate and robust quantification can be performed, independent of any variation in the choice of reconstruction settings. In these results, the impact on quantitative measures was found to be minor. The greatest deviation from standard SUVmax measurements was 20%, while the majority of the data fell within 15% of SUVs measured on reference images. The average percent difference from reference values was less than 10% for sets of data which were reconstructed with either a larger KW or fewer subsets per iteration and a smaller RP than is standard. These results held not only for the wide variety of background tissues evaluated but also for all target tumor lesions identified within the patient data sets. 
They also show that, for the majority of patients and lesions being evaluated, minor changes to the reconstruction method still produce acceptable quantitative results while potentially improving either image quality or reconstruction speed. To some extent, however, the measured variation was dependent upon the activity present within an evaluated tissue, with low activity tissues showing greater variability among the range of RP settings, as seen in Figure This effect was also demonstrated

with lesions with SUVmax values lower than 8 having greater variation over the RPs used for reconstruction than those with larger SUVs. As seen in Figures 2.14 and 2.16, however, this effect was predominantly seen for the smaller KW, normal and fast reconstruction speeds. Here a smaller RP produced greater deviation from reference values for low SUV target lesions.

The size of the target tumor was also seen to have an effect on the impact of the reconstruction on quantification. While the range of background tissue sizes measured limited these results, with no ROIs being drawn between 20 and 100 cm³, the evaluation of target lesions covered a broader range. As Figure 2.15 shows, lesions below 8 cm³ were more susceptible to variation based on changes made to the RP during reconstruction. This result is likely due to partial volume effects occurring through data acquisition and reconstruction. The spatial resolution of these images was 4x4x4 mm, thus any object being measured which was smaller than twice this size, around 7 cm³, is susceptible to spill in and spill out effects, limiting the inherent accuracy of quantification for these objects [44]. Thus when lowering the RP and producing a smoother, although less noisy, image, it would be expected that quantification would suffer slightly. Nevertheless, the average percent difference in SUVmax was still within ±15% for these low RP reconstruction settings.

In evaluating the effects of lowering the number of subsets per iteration or increasing the TOF KW, opposite results were found. Fast speed settings, which utilized a smaller number of subsets per iteration but the same KW as standard reconstructions, consistently produced SUVmax values below reference values, as seen in Figure This has to do with the breakup of the list-mode data into fewer subsets over the same number of iterations during reconstruction. Generally, adding more subsets to a reconstruction will require fewer iterations, as convergence is reached faster [15, 49]. 
Therefore, fewer subsets per iteration, while introducing less noise into the image [49], applied with the same number of iterations means that the final image is less likely to have reached the same level of convergence as with normal settings. Although these settings are faster, it is apparent that quantification is affected by these changes, with fast speed and normal RP reconstructions having SUVmax values on average 3% less than normal speed and RP reconstructed images. Increasing the TOF KW also impacted quantification of PET data, by increasing measured SUVmax values of ROIs

evaluated within this work. HQ speed settings utilized a larger KW than normal ones, for the same number of iterations and subsets. This should then increase the accuracy with which each coincident event is localized. This translates into more accurate measures of activity distributions on PET images and is also likely why a smaller dependence on lesion SUVmax and size was seen in calculating the percent difference from reference values. Therefore, while HQ speed and normal RP reconstructions averaged 9.4% greater than reference values in measuring SUVmax, this reconstruction setting likely gives a more accurate measure of the true lesion activity distribution.

The results of this work thus show that the choice of reconstruction settings has a distinct impact on PET/CT quantification. However, the resultant deviation from SUVs measured on standard images is minimal. In evaluating therapy response, changes in SUV of greater than 40 or 50% from baseline values are common cutoff ranges [62]. So the variation of about 15% from reference values seen here, while non-negligible, is well below this range and therefore may be considered clinically acceptable.

Conclusion

In conclusion, this evaluation of variation in SUV measurements dependent upon reconstruction methods shows that minimal variations, most often less than 15%, occur when utilizing vendor prescribed reconstruction settings. However, when using PET/CT data for the evaluation of response to therapy, or in any repeated manner, consistent reconstruction protocols should be used in order to minimize any potential variation which may impact clinical decision making.

Chapter 2.4: Quantitative Impact of a Technology Upgrade on PET/CT Imaging

Introduction

As previously discussed, PET/CT image quality and quantification are dependent upon many factors involved in the patient preparation and image acquisition process [30], with the choice of reconstruction settings being a significant one. Within the scope of image acquisition, the hardware and software used to detect and reconstruct emission data have undergone much improvement in recent years. The development of improved scintillator detectors [41] and computing capabilities [49] has resulted in technology upgrades being implemented within already functioning clinical imaging sites. When such upgrades occur, clinicians already operating in a framework of average results are presented with a new normal, so to speak. In order to continue to allow direct comparisons of imaging results taken from two unique imaging systems, the results must then be comparable. At our clinical site, the PET/CT system was upgraded both in the detector configuration and by the implementation of a new standard reconstruction algorithm. Upon collection of data with the new system, clinicians reported improved image quality but also began measuring background activity levels outside the normal range previously established. An evaluation of the true quantitative impact of the system upgrade was therefore conducted. Identical data sets reconstructed according to both the new and previous recommendations were quantitatively analyzed. Any resulting differences in quantification were then identified and could be adjusted for in the clinical setting.

Materials and Methods

PET/CT studies were completed for 100 consecutive patients on a recently installed Siemens Biograph 64 mCT, upgraded from the Biograph 16 PET/CT. Patients had a mean age of 59 years and a mean weight of 78 kg (ranging from 36 to 127 kg). 
Emission scans were performed at 73.5 ± 6.3 minutes following injection of 13.7 ± 1.3 mCi 18F-FDG. The PET raw data were reconstructed twice, using both the current vendor recommended TrueX protocol and the previous standard 3D OSEM protocol, simulating

the results of a study performed with the prior system. The parameters of each reconstruction setting are listed in Table In addition to these changes in subset and iteration numbers, the new TrueX reconstruction also included a correction for the system point spread function (PSF). The FWHM setting is that of a Gaussian filter applied following image reconstruction.

           Iterations   Subsets   FWHM
3D OSEM                           mm
TrueX                             mm
Table Siemens Biograph 16 3D OSEM and Biograph 64 mCT TrueX PET/CT Reconstruction Protocol Settings

Following reconstruction, identical and matched regions of interest (ROIs) were drawn over any target tumor lesions present, using the TrueX reconstructed images for initial identification. ROIs were drawn so as to fully encompass tissue exhibiting increased FDG uptake and were copied directly to the 3D OSEM images. ROIs were also placed in background regions of the liver, heart, and bladder. The maximum and average SUV of each ROI were recorded for 435 ROIs, from which a percent difference from TrueX values was calculated for each 3D OSEM value. An approximation of each lesion's contrast ratio was also calculated according to the equation below

$$\text{Contrast} = \frac{\text{Signal}}{\text{Background}}$$ [52]

In this equation the lesion SUVmax was taken as the signal and the corresponding liver SUVavg was used as the background value for each ROI.

Results

In reviewing the image quality of each separately reconstructed data set, an appreciable increase in sharpness was seen for TrueX images as compared to 3D OSEM. Figure 2.17 shows a comparison of the image quality and quantitative measures of 3D OSEM and TrueX reconstructed data. As seen on the TrueX image, there is improved definition of represented activity levels over that in the 3D OSEM image.
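For reference, the contrast approximation defined in Materials and Methods can be sketched in a few lines. This is a hedged illustration that assumes the equation reads as the simple ratio of signal to background, with lesion SUVmax as the signal and liver SUVavg as the background; the function name and example values are hypothetical.

```python
def contrast_ratio(lesion_suv_max, liver_suv_avg):
    """Approximate lesion-to-background contrast.

    Assumption: contrast is the simple ratio Signal / Background, where the
    signal is the lesion SUVmax and the background is the liver SUVavg of
    the same study.
    """
    if liver_suv_avg <= 0:
        raise ValueError("background SUV must be positive")
    return lesion_suv_max / liver_suv_avg
```

Computing this ratio for the same lesion on the 3D OSEM and the TrueX images gives the paired contrast values that the comparison in the Results relies on.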

Figure Representative Images Reconstructed with the Previous OSEM and the Current TrueX Reconstruction Settings

Quantitative analysis of the data revealed significant discrepancies between the two reconstructions. TrueX reconstructed images had overall higher SUVmax and SUVavg measurements, as compared to OSEM images. The TrueX SUVmax of all ROIs drawn was on average 32.8 ± 15.0% greater than the OSEM SUVmax, ranging from 1.2 to 77.5% greater. The TrueX SUVavg of all ROIs was on average 10.0 ± 10.2% greater than the OSEM SUVavg. Figure 2.18 shows the distribution of SUVmax and SUVavg percent differences for all ROIs evaluated, arranged in order of increasing ROI OSEM SUV. Figure 2.19 shows the same percent differences in SUVmax and SUVavg, arranged by ROI volume. It is interesting to note that for ROIs smaller than 10 cm³, the percent differences between OSEM and TrueX SUVmax and SUVavg are significantly greater than those of ROIs greater than 10 cm³ in volume (p<0.001). The average percent difference in SUVmax of smaller ROIs was 54.9 ± 11.0%, while it was 29.0 ± 12.0% for larger ROIs. Similarly, the average percent difference in SUVavg for ROIs less than 10 cm³ was 26.7 ± 9.8%, significantly more than the average 7.2 ± 7.1% for larger ROIs.
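The volume-based grouping described above can be sketched as follows. This is an illustrative outline only: the dictionary keys, the function name, and the handling of ROIs of exactly 10 cm³ (assigned here to the larger group) are assumptions, and the statistical test behind the reported p-values is not specified in the text, so none is included.

```python
from statistics import mean, stdev


def summarize_by_volume(rois, threshold_cm3=10.0):
    """Mean and sample SD of percent difference for small and large ROIs.

    `rois` is a list of dicts with hypothetical keys "volume_cm3" and
    "pct_diff". ROIs below the threshold form the "small" group; all
    others (including exactly 10 cm^3, an assumption) form "large".
    """
    def summary(values):
        if not values:
            return None
        sd = stdev(values) if len(values) > 1 else 0.0
        return {"n": len(values), "mean": mean(values), "sd": sd}

    small = [r["pct_diff"] for r in rois if r["volume_cm3"] < threshold_cm3]
    large = [r["pct_diff"] for r in rois if r["volume_cm3"] >= threshold_cm3]
    return {"small": summary(small), "large": summary(large)}
```

The same grouping applies to either SUVmax or SUVavg percent differences, since only the values stored under "pct_diff" change.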

Figure Distribution of 3D OSEM and TrueX SUVmax and SUVavg Percent Differences for All ROIs by ROI SUV

Figure Distribution of 3D OSEM and TrueX SUVmax and SUVavg Percent Differences for All ROIs by ROI Volume

Reviewing each ROI by tissue type, ROIs placed on tumor lesions had the greatest SUVmax and SUVavg percent differences, with average differences of 47.0% and 21.0%, respectively. Background tissues had smaller percent differences, as shown in Table For tumor lesions specifically, the percent difference was found to relate to both lesion uptake intensity and volume. Figure 2.20 displays the OSEM and TrueX SUVmax of each lesion ROI, as well as the corresponding percent difference between the two measurements.

These data are in order of increasing OSEM SUVmax. Here it can be seen that there is a downward bias in percent difference with increasing SUVmax. In particular, for tumors with an OSEM SUVmax less than 5, the percent difference between TrueX and OSEM measurements was significantly greater than for those with an OSEM SUVmax greater than 5 (p<0.001). SUVavg had a similar but less significant trend.

Tissue Type    Average SUVmax %Difference    Average SUVavg %Difference
Lesions        47.0%                         21.0%
Bladder        23.0%                         4.5%
Heart          33.0%                         5.7%
Liver          20.2%                         2.5%
Table Percent Difference between 3D OSEM and TrueX SUV by Tissue Type

Figure Lesion 3D OSEM and TrueX SUVmax and Percent Difference Between the Two Measurements, Ordered by Increasing Activity

Additionally, the percent difference between measures was also related to the tumor, and resulting ROI, volume, as seen in Figure This figure again depicts the OSEM and TrueX SUVmax of each ROI, along with the corresponding percent difference, arranged in order of increasing ROI volume. Again a decrease in percent difference is seen with increasing volume, with ROIs less than 10 cm³ in volume having significantly larger percent differences than larger ROIs for both SUVmax and SUVavg (p<0.001).

Figure Lesion 3D OSEM and TrueX SUVmax and Percent Difference Between the Two Measurements, Ordered by Increasing Surrounding ROI Volume

In addition to overall lesser variability between SUV measurements, some significant trends relating to size and uptake were also seen in background tissues. In the bladder, the difference between OSEM and TrueX SUVmax values increased with increasing SUVmax but on average decreased with increasing volume. The heart and liver measurements showed no significant dependency on either tissue uptake or ROI size. Neither did SUVavg measurements for any of the background tissues.

In calculating the contrast ratios of lesion SUVs, the contrast in TrueX images was always greater than that of OSEM images. Figure 2.22 compares the contrast values of OSEM and TrueX measurements. The difference between the two contrast values was not significantly related to tumor SUV or size.

Figure Comparison of OSEM and TrueX Lesion to Background Contrast Ratios

Discussion

Based upon clinical feedback following a PET/CT system upgrade, this study was conducted in order to quantify the differences in SUVs produced using the new and the previous standard reconstruction methods. Both the imaging system hardware and software were upgraded at this time. This study, however, only analyzes the impact of the software changes. Because the raw data were acquired on the new system, changes in detection are not accounted for. Only once the PET data were reconstructed twice could a comparison between the old 3D OSEM and the new TrueX methods be made.

The changes in the standard reconstruction method involved several facets. Firstly, increasing the number of iterations and subsets per iteration in the new TrueX settings would in theory lead to both a faster reconstruction time and a higher contrast image, but one which also potentially had greater noise levels [37]. Secondly, the addition of a point spread function correction in the reconstruction algorithm further improves image quality through increased spatial resolution, as well as improved quantification through better recovery of activity distributions, particularly for small objects [46]. The PSF correction has also been shown to reduce image noise. Therefore both improved image quality and more accurate

quantitative measures were expected with the implementation of the new reconstruction algorithm alone. However, as has been well established, consistency among reconstruction settings plays a large role in the accuracy and usefulness of PET quantitative measures. Thus any major discrepancies revealed by reconstructing data with both the new and old methods would be cause for concern.

In the 100 patient studies evaluated, 148 target tumor lesions were identified and evaluated, in addition to several reference tissues. For images reconstructed with OSEM and TrueX settings, the average percent difference between the SUVmax of all tissues was 32.8 ± 15.0%. The largest percent difference measured was a TrueX SUVmax 77.5% greater than the corresponding OSEM SUVmax. In general the difference between the two measurements was related to the ROI volume. On average, ROIs which were smaller than 10 cm³ had greater differences in the two measured SUVs. Reference ROIs placed over the bladder, heart and liver showed less dependence on size than did target tumor lesions, but also generally consisted of larger ROIs than did target tumors.

Specifically reviewing lesions, a relation to both ROI uptake and volume was revealed. While lesions had an overall higher percent difference between TrueX and OSEM SUVs compared to background tissues, the percent difference was also significantly greater in lesions less than 10 cm³ in volume or with an OSEM SUVmax below 5, as seen in Figures 2.20 and These trends relate back to the inherent differences in the two reconstruction algorithms. By adding the PSF correction, the TrueX reconstruction improves the representation of activity distributions in smaller objects. This is done by better accounting for partial volume effects at sizes below twice the resolution of the system. 
Here the image resolution was 4x4x4 mm, so lesions at or below about 10 cm³ in volume would be expected to have the greatest improvement in quantitative accuracy, although by increasing the number of iterations and subsets the quantification of all tumors should be improved. This impact on quantification is again seen in calculating the contrast ratios of each target lesion compared to the background, here the average liver SUV. Again the contrast levels of TrueX images were always greater than those of OSEM images. This improved

contrast, while apparent in quantitative measures, can also lead to enhanced lesion detection from a qualitative point of view. This effect was not specifically evaluated here, although image quality is generally improved in TrueX images, as seen in Figure

The comparison of these two measurements revealed an upward bias in measured activity levels in TrueX reconstructed images, as compared to the previous OSEM method. Although these increased measurements may be more accurate representations of true uptake levels, they are still significantly different from those previously encountered. Background tissues as well as tumor lesions had significantly different SUVs. Therefore a direct comparison between any SUV measured on OSEM and TrueX images should be made with care. Also, ratios of tumor to background levels showed an upward bias with the application of the new reconstruction settings. For any patients for whom imaging was completed on both systems, absolute changes in quantitative measures may not be the optimal method for evaluating changes in the disease state.

Conclusion

In conclusion, PET/CT imaging is constantly striving toward the most accurate representation of activity distributions in the body, both qualitatively and quantitatively. However, as standard image acquisition and reconstruction methods evolve, potential variations in the way PET results are evaluated must be identified. Here, reconstruction changes had a significant impact on SUV measurements, possibly influencing clinical decision making. Therefore rigorous qualification of any new imaging system implementation should be made part of any system upgrade to minimize the potential impact on patient care.

Chapter 2.5: Conclusion

In review, patient preparation, emission data acquisition, and PET reconstruction methods are all known to impact both qualitative and quantitative aspects of PET/CT imaging studies. This work therefore evaluated the variations in PET results caused by changes at several levels of the imaging process. In terms of the implementation of standard patient preparation procedures, it was found that the majority of evaluated patient studies fell within the acceptable range of both administered FDG dose and IST. However, there were still many completed studies which fell outside both the desired 15 mCi (±10%) and 75 minute (±10) dose and IST targets. Both of these factors are known to impact PET quantification and so should be kept as consistent as possible for all patients. Both administered doses and ISTs were found to be related to the scheduled scan time and examination indication, as well as patient age. Thus, through improved management of timing throughout the day and better identification of patients likely to cause delays, the accuracy with which both of these targets are met may be improved.

Secondly, an analysis of how electively changing reconstruction settings impacts PET quantification showed only minor variations in SUV measurements. For PET images reconstructed with each clinically available reconstruction protocol, an average difference of less than 15% between SUV measurements was found. For changes such as the number of subsets per iteration or the TOF KW, a minimal and acceptable deviation from standard quantitative measures occurred. However, it remains advisable that the same reconstruction settings be used for each study completed for a single patient. PET/CT imaging is influenced by such a variety of factors that any one which can be controlled for, such as reconstruction methods, should be kept constant whenever possible. 
When such choices are beyond the control of the user, however, the impact becomes all the more important. In the case of a system upgrade implementing a new standard reconstruction algorithm, the effect of these changes on PET quantification was found to be significant. More significant changes to both iteration and subset numbers than in the

prior evaluation, as well as the addition of PSF corrections to the reconstruction, led to an average increase in tumor lesion SUVmax of 47.0%. This difference is well beyond the range of acceptable and normal deviations in PET measurements. This analysis showed that although the new reconstruction method was intended to improve both image quality and quantitative accuracy, newly acquired studies could no longer be directly compared to previously completed ones. Again, consistency in each aspect of PET/CT imaging proved critical.

In conclusion, this multifaceted evaluation of PET/CT acquisition and reconstruction procedures highlights the need for consistency through each stage of PET/CT imaging. To best counteract the factors beyond normal control during PET studies, manageable ones such as injected FDG dose, uptake time, and reconstruction methods should be kept as consistent as possible. This will help to ensure the accuracy of each study's qualitative and quantitative results, and thus the usefulness of PET/CT in the clinical setting.
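The dose and IST targets summarized in this chapter lend themselves to a simple consistency check. The sketch below assumes the stated tolerances mean 15 mCi ± 10% for the administered dose and 75 ± 10 minutes for the injection-to-scan time; the function and parameter names are hypothetical.

```python
def within_targets(dose_mci, ist_min,
                   dose_target_mci=15.0, dose_tol_fraction=0.10,
                   ist_target_min=75.0, ist_tol_min=10.0):
    """Return (dose_ok, ist_ok) for one patient study.

    Assumptions: the dose tolerance is a fraction of the target dose and
    the IST tolerance is an absolute number of minutes.
    """
    dose_ok = abs(dose_mci - dose_target_mci) <= dose_tol_fraction * dose_target_mci
    ist_ok = abs(ist_min - ist_target_min) <= ist_tol_min
    return dose_ok, ist_ok
```

Applied to each completed study, a check like this would flag scans whose preparation deviated from protocol before their SUVs are compared with those of earlier studies.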

Chapter 3: Impact of Time-of-Flight Acquisition and System Point Spread Function Corrections on PET/CT Quantification

Introduction

18F-FDG positron emission tomography (PET) has become a key component in the diagnosis, staging and restaging of a variety of cancers [9, 20]. Within the scope of oncologic imaging, image quality and accurate quantification are critical to the assessment of disease [66, 67], particularly in the case of repeated imaging for therapy response assessment [24]. Metabolic imaging has a potentially great impact on the overall treatment and outcomes of cancer patients when applied as a standalone study or alongside conventional diagnostic methods [14]. Additionally, PET has been shown to be highly accurate in the early prediction of tumor response to therapy [21]. The quantitative accuracy of such studies depends upon analysis methods to a degree [27], but the activity concentrations recovered and reconstructed by the imaging system also play a large role in quantitative readouts. The ability of a given system to recover activity distributions from within the body can greatly impact the clinical results of a PET study [68]. This accuracy then becomes of interest when patients receive multiple scans on more than one imaging system, even within the same institution. It has been shown that repeated quantitative measures in FDG PET/CT can be performed accurately for serial scans acquired and reconstructed within the same imaging system [51]. In our clinical environment, however, two PET/CT scanners from different manufacturers are used for routine clinical scanning. Given that they are in the same network, patients receiving multiple scans through the course of treatment and during follow-up are commonly imaged on both systems. The standard protocols on each scanner are similar in terms of patient preparation procedures; however, they differ in data acquisition

and reconstruction settings, specifically with and without time-of-flight (TOF) information and correction for the system point spread function (PSF). The inclusion of TOF capabilities has been shown to improve photon counting and contrast recovery [42] as well as signal to noise ratios, particularly in larger patients [69]. Only one of our two imaging systems uses TOF acquisition as part of the standard protocol. The other, however, includes a reconstruction algorithm which makes use of the system PSF. With PET's limited spatial resolution, and thus partial volume effects, impacting the detection and quantification of small tumor lesions, the incorporation of PSF modeling into the PET reconstruction has a positive effect on image quality, reducing noise and enhancing lesion signal to noise ratios [47]. The application of both TOF data and PSF-based reconstructions has been shown to significantly improve image quality [45]. However, with each system utilizing a unique arrangement of detector characteristics and standard reconstruction methods [52, 53], the comparison of quantitative measurements from one scan to another is a potential limitation in metabolic imaging. Previous studies have evaluated the accuracy of a single system [39] and have assessed calibration accuracy and techniques to improve it [38, 70]. However, a direct correlation of the quantitative results of multiple scanners using a single phantom prepared with a clinically used radioisotope has yet to be developed. The aim of this study was therefore to investigate the recovery of activity concentrations by the two imaging systems, by making use of two identically prepared phantoms including hollow spheres with varying volumes and known activity concentrations. The recovery coefficient (RC) of each scanner was measured, allowing a direct comparison of the impact of varied acquisition and reconstruction methods. 
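As a point of reference for the recovery coefficient named above, a minimal sketch follows. It assumes the common definition of RC as the ratio of the activity concentration measured in the image to the known (true) concentration in the sphere; the function name, units, and sphere labels are hypothetical.

```python
def recovery_coefficient(measured_kbq_per_ml, true_kbq_per_ml):
    """Recovery coefficient for one sphere.

    Assumption: RC = measured activity concentration / known activity
    concentration, so that RC = 1.0 indicates perfect recovery.
    """
    if true_kbq_per_ml <= 0:
        raise ValueError("true activity concentration must be positive")
    return measured_kbq_per_ml / true_kbq_per_ml


def recovery_by_sphere(measured, true):
    """RC per sphere, given dicts keyed by a sphere label (e.g. volume)."""
    return {label: recovery_coefficient(measured[label], true[label])
            for label in true}
```

Evaluating `recovery_by_sphere` separately for each scanner, with the same true concentrations, yields paired RC values per sphere volume that can be compared directly between systems.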
Materials and Methods

We evaluated the accuracy of the two on-site PET/CT scanners, a Siemens Biograph 64 mCT PET/CT and a Philips GEMINI TF 64 PET/CT, using an 18F filled Jaszczak phantom with hollow sphere inserts. On two separate occasions, two identical phantoms were filled with the activity levels described in Table

Table 3.1. Total Activity and Hot-to-Background Ratios for Each Sphere in Both Separate Experiments
Columns: Background, 16 ml, 8 ml, 4 ml, 2 ml, 1 ml, 0.5 ml
Rows: Exp. 1 Hot-to-Background Ratio; Exp. 1 Total activity (kBq/ml); Exp. 2 Hot-to-Background Ratio; Exp. 2 Total activity (kBq/ml)

Activity concentrations were calculated from known dilution measurements and from dose calibrator measures. These two measurements showed that both phantoms contained the same amount of activity in each set of hollow spheres. This method of filling the spheres with activity concentrations that increase proportionately as sphere volume decreases was meant to test scanner accuracy over a range of sizes and concentrations that may be encountered in clinical situations. Once filled, a phantom was placed on each of the two imaging systems. Each scan was performed according to the standard local protocol. For the Siemens Biograph this included a 3D non-TOF scan followed by TrueX HD OSEM reconstruction using 3 iterations, each with 24 subsets. This reconstruction included a correction for the system PSF. On the Philips GEMINI, TOF scans were performed with 3D list-mode OSEM reconstruction, utilizing 3 iterations with 33 subsets, which does not yet include a PSF correction in the current software. Scan durations were such that the first run of experiment 1 included five time points, at each of which a 15 second, 30 second, 45 second, 60 second, 90 second, 180 second and 360 second scan was completed. This set of scans was thus performed once and repeated four additional times, requiring 129 minutes. It was set up such that each of the scans of the first time point was exactly one half-life before those of the fifth. Efforts were made to synchronize the clocks of both scanners, and we additionally ensured that each scan was performed at the same time at both locations by coordinating over the phone. Upon completion of the first run on each scanner, the two phantoms were switched and repositioned for imaging during a second run.
This process took approximately one half hour. In the first experiment only four time points were included in the second set of image acquisitions; however, each of the original scan durations was still included at each time point. In the second experiment, both the first and second run included a total of three time points.

Following reconstruction, all images were analyzed by placement of 3D regions of interest (ROIs) over each hot sphere and in the background centered within the phantom cylinder at the level of the spheres. ROIs over the spheres were placed within the volume visible on the CT scan, although this did not always contain the entirety of the activity visualized on PET images. The maximum, minimum, average and standard deviation of the activity concentration (in kBq/ml) per ROI were recorded for each scan duration at each time point. Image activity concentrations were then compared to known concentrations by means of calculating the RC as follows [71]:

$$\mathrm{RC} = \frac{\text{maximum measured sphere activity} \,/\, \text{average measured background activity}}{\text{known sphere activity} \,/\, \text{known background activity}}$$

Results
Although a variety of emission scan durations were included in the acquisition for further analysis, the results from the 90 s scans follow the standard local protocol of each imaging system. Therefore they represent the truest approximation of coincident count events in a clinical study and are presented here as the image sets analyzed. Upon visual inspection, the PSF-based reconstruction appeared to better confine recovered activity distributions within the margins of each sphere than did the strictly TOF acquired data. Figure 3.1 shows sample images from each scanner. For the TOF images, particularly in the three smallest spheres, the activity on the PET images appears blurred beyond the outlines visible on the CT images. The TOF, non-PSF corrected images do however appear to have a more homogeneous background signal.
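The recovery coefficient defined in the methods can be sketched in a few lines of Python. This is a minimal illustration that assumes the ratio-of-ratios reading of the RC definition (measured sphere-to-background ratio divided by the known ratio); the function name and the numerical values are hypothetical and are not taken from the study.

```python
def recovery_coefficient(max_sphere, mean_background, known_sphere, known_background):
    """Measured sphere-to-background ratio divided by the known ratio.

    Assumption: RC is a ratio of ratios. All inputs share one unit (e.g. kBq/ml).
    """
    if min(mean_background, known_sphere, known_background) <= 0:
        raise ValueError("background and known activities must be positive")
    measured_ratio = max_sphere / mean_background
    known_ratio = known_sphere / known_background
    return measured_ratio / known_ratio


# Hypothetical example values in kBq/ml (not measurements from the study).
rc = recovery_coefficient(max_sphere=36.0, mean_background=5.0,
                          known_sphere=40.0, known_background=5.0)
print(f"RC = {rc:.2f}")
```

Under this definition an RC of 1 indicates that the measured contrast matches the known contrast, values above 1 indicate overestimation, and values below 1 indicate underestimation.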

Figure 3.1. Sample Images of Phantom Data Acquired on the Philips GEMINI 64 TF and the Siemens Biograph 64 mCT PET/CT

The quantitative results further highlighted the differences between acquisitions over a range of hot sphere volumes and activity concentrations. Figures 3.2 and 3.3 show the measured kBq/ml of each sphere from the PSF-corrected and the TOF acquired data for the first experiment, respectively. Figures 3.4 and 3.5 show the results from the second experiment for the PSF-corrected and TOF acquired measurements, respectively. In each of these figures the black line represents the true activity concentrations in each sphere and the background volume, while the colored lines represent the decay corrected measured activity values over the various time points imaged during each experiment. Firstly, the PSF-based reconstructed image results show accurate measures for the three largest spheres, while exhibiting overestimation of activity concentrations in the 1 and 2 ml spheres and slight underestimation for the smallest 0.5 ml sphere. This trend was maintained over all repeated measurements from both experiments. Secondly, the TOF acquired image results show greater accuracy in measuring the activity concentrations of the four larger spheres; however, the results differed substantially for spheres less than 2 ml in volume. Here the ability to distinguish between varying activity levels is diminished. Again, these results were consistent among repeated measurements.
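The decay correction applied to the plotted measurements can be illustrated with a short sketch. It assumes the physical half-life of 18F (109.77 minutes) and a single reference time from which elapsed time is counted; the text does not state which reference time was used, and the function names and example values are hypothetical.

```python
import math

F18_HALF_LIFE_MIN = 109.77  # physical half-life of fluorine-18, in minutes


def expected_at(reference, elapsed_min, half_life_min=F18_HALF_LIFE_MIN):
    """Concentration expected after elapsed_min of physical decay."""
    return reference * math.pow(0.5, elapsed_min / half_life_min)


def decay_correct(measured, elapsed_min, half_life_min=F18_HALF_LIFE_MIN):
    """Scale a measured concentration back to the reference time."""
    return measured * math.pow(2.0, elapsed_min / half_life_min)


# Hypothetical sphere filled to 40 kBq/ml at the reference time.
one_half_life_later = expected_at(40.0, F18_HALF_LIFE_MIN)
corrected = decay_correct(one_half_life_later, F18_HALF_LIFE_MIN)
print(one_half_life_later, corrected)
```

If a scanner recovered activity perfectly, the decay corrected value at every time point would coincide with the known concentration at the reference time.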

Figure 3.2. Maximum kBq/ml per Volume Measured on the Siemens Biograph During the First Experiment
Figure 3.3. Maximum kBq/ml per Volume Measured on the Philips GEMINI During the First Experiment

Figure 3.4. Maximum kBq/ml per Volume Measured on the Siemens Biograph During the Second Experiment
Figure 3.5. Maximum kBq/ml per Volume Measured on the Philips GEMINI During the Second Experiment

In calculating the recovery coefficients from each scan, the above results were further investigated. Figure 3.6 shows the average RC of each of the two experiments for both the PSF-corrected and TOF acquired data sets. The average RC for all PSF-corrected

measurements was between 1.12 and 1.16 for the three larger spheres and varied as high as 1.41 for the smaller spheres in which activity concentrations were overestimated. The average RC of the smallest, 0.5 ml sphere was 0.86, confirming an average underestimation of activity in this sphere for the given concentration level. The TOF acquisition results showed, on average, more accurate recovery in spheres between 2 and 16 ml in volume, with the average RC being 0.98 ± In the two smallest spheres, however, the average RC fell to 0.61 for 1 ml and 0.44 for 0.5 ml volumes.

Figure 3.6. Average Recovery Coefficients for all Spheres Measured on Both the Siemens Biograph and Philips GEMINI

To give further insight into the quantitative results, time curves were constructed for each experiment. Figure 3.7 shows the curves for each imaging system, the GEMINI and the Biograph, respectively, in the second experiment. These curves illustrate changes in measured activity levels over time. Given that the three larger spheres had hot-to-background ratios doubling as the volume was halved, the expected result is seen with

Figure 3.7. Time Curves from the Philips GEMINI and Siemens Biograph, respectively, Depicting the Decrease in Measured Activity Concentrations of Each Sphere Over Time

measured activity levels being similar with each progressive half-life of decay time for these three spheres, for both TOF acquired and PSF-corrected data measurements. For the three smaller spheres, however, the curves do not appear as expected. In these spheres, activity levels were not measured accurately on either system. The PSF-corrected data

consistently overestimated activity concentrations in the 1 and 2 ml spheres, while slightly underestimating the 0.5 ml concentration. However, the TOF acquired data significantly underestimated both the 1 and 0.5 ml sphere activities. Figure 3.8 shows the time curves for the small spheres, from the first experiment only, from the TOF acquired data. This figure shows the discrepancies in measurements of the small spheres. The system can be seen to underestimate activity concentrations, although the amount by which it does so appears to decrease with time. This highlights the fact that the true decay in radioactivity is not being reflected in activity concentration measurements, indicating that the errors in measurements are likely more related to blurring than to count-statistics-related effects.

Figure 3.8. Time Curves from the Philips GEMINI Showing the Disagreement Between Measured and Actual Activity Concentrations of Each of the Smallest Spheres Over Time

Discussion
In any clinical environment where multiple imaging systems are utilized for everyday patient scanning, calibration between system quantitative measures is essential, as is the maintained accuracy of each system's measurements. This study was undertaken in order to fully explore how well each of two imaging systems recovered activity concentrations of known amounts.

The phantoms were prepared and filled with hollow spheres of varying volumes and activity concentrations. Imaged over two full half-lives' worth of decay, these sessions provided detailed insight into each system's ability to accurately recover activity distributions. Upon visual analysis of each system's images several differences were noted. One observation was that the TOF non-PSF corrected images appeared to have a more uniform, less noisy background. However, the PSF-corrected non-TOF images appeared to have less blurring of the PET activity distributions, particularly in the smallest spheres. These differences in image quality relate to the differences in emission data detection and image reconstruction between the two systems. While the GEMINI data here included TOF data, the Biograph data did not. However, the Biograph reconstruction algorithm did include a correction for the system PSF. Quantitative analysis of the data provided further insight into the differences between the two systems. For the larger spheres of 16, 8, and 4 ml volumes both the Biograph and GEMINI measured activity concentrations with a good level of accuracy. For the smaller spheres containing higher activity levels, the systems differed greatly. The TOF acquired data were unable to measure the 1 and 0.5 ml spheres with any accuracy, quantifying their activity levels at the same level as the 2 ml sphere. In contrast, the PSF-corrected data overestimated the activity in the 1 and 2 ml spheres by an average of about 20%, while underestimating the 0.5 ml sphere by an average of 10%. These quantitative results are also representative of the differences in detection and reconstruction methods used by the two systems. The utilization of TOF data allows for better recovery of activity in larger objects. However, the PSF correction proves essential in quantifying any object smaller than 2 ml in volume.
The quantitative effects of these two characteristics, TOF acquisition and PSF modeling, are also reflected in the calculation of RCs for each imaging system, as seen in Figure 3.6. In these experiments, both the sizes and the activity concentrations of each sphere were varied within the same imaging session. While we sought a range of activities that may be encountered clinically and would also give sufficient detectable counts during acquisition, we also looked at a range of sizes of the hot spheres which may be encountered

in a patient scan. With an image resolution of 4 x 4 x 4 mm, the smallest sphere, having a 9.89 mm inner diameter, remains within the detectable limits of both systems. However, suspected partial volume effects were seen in the TOF non-PSF corrected results. Therefore the construction of time curves provided insight as to whether these effects were related to the chosen hot-sphere-to-background ratios. These curves, as seen in Figure 3.7, show that for large spheres the decay in activity levels was accurately accounted for. As seen in Figure 3.8, however, the measured activity amounts in the small spheres do not correspond well with the known decay levels. This suggests that the errors in activity measurements of the small spheres are likely related to partial volume effects rather than the activity ratios. Even though these small spheres had quite large activity concentrations, they are not so large as to be affecting the quantitative measures. Rather, it appears that the spill-in and spill-out caused by partial volume effects, which produce visual blurring and inaccurate quantification, are the predominant factors here. This study is limited by the fact that both sphere volumes and activity concentrations were varied simultaneously during both experiments. Thus both of these characteristics were impacting each imaging system's recovery of activity distributions. Future work should involve imaging of a phantom prepared with spheres of varied volumes but the same activity concentrations per sphere, and of a phantom with the same volumes but varied concentrations. This will then isolate the true quantitative impact of partial volume effects and thus of the application of the PSF correction and TOF data acquisition.

Conclusion
In conclusion, this study revealed significant differences in the way that two unique PET/CT systems recover and quantify various activity concentrations in spheres of varied volumes.
The systems are unique in their use of either TOF PET acquisition or a system PSF correction. The quantitative results herein show that TOF data better measure activity in large objects while a PSF correction remains key to accurately quantifying activity in smaller objects. While both systems work toward providing the most accurate quantitative results, care must be taken when directly comparing quantitative results from two separately acquired imaging data sets.

Chapter 4: Impact of Acquisition Time and Radiopharmaceutical Dose on PET Quantification

Introduction
Positron emission tomography (PET) is increasingly being used as a functional biomarker of disease states, especially in oncology [9]. The ability to visually and quantitatively represent metabolic changes in disease states makes PET well suited to a unique imaging role. PET allows for early [25] and accurate measures of changes in disease, providing increased guidance in the treatment planning of many forms of cancer [14, 72]. Excellent visual quality and accurate quantification are then critical components of clinical PET studies, both in the initial staging of disease [4] and in therapy response assessment [73]. In the last 60 years, the frequency of nuclear medicine studies in the United States has increased 10-fold [54]. In the last 20 years, the worldwide annual per capita effective radiation dose from medical imaging has doubled. Although PET/CT studies account for only a fraction of all studies performed, the cumulative effect of achieving high qualitative and quantitative accuracy over many scanning sessions is an increased radiation burden for patients [54, 55]. Through the course of treatment and follow-up this may lead to a greater potential risk of cancer incidence. Well established recommendations for image acquisition and analysis have been made [32, 67] based on the known limits of required count rates and noise levels needed in order to provide accurate quantification in earlier generations of PET/CT imaging systems. Thus standard protocols in terms of radiopharmaceutical dosing and emission scan durations have also been evaluated and implemented.

In recent years, improvements in scanner technology and photon detection [41], reconstruction methods [47, 69] and the addition of time-of-flight capabilities [43, 74] have improved PET spatial resolution limits and quantitative accuracy. These factors have served to increase the usefulness of PET in clinical settings, particularly in oncology [21, 75, 76]. Despite such improvements, standard imaging protocols have not been reevaluated in terms of required radiopharmaceutical doses and acquisition times needed for adequate image quality and robust quantification. Although dosing for optimal image quality has been studied [7], radiation exposure for patients being imaged once or repeatedly remains a critical concern [54, 55, 77]. With this in mind, we aimed to evaluate the impact of significantly lowering the number of coincident events included in PET emission data. Making use of phantom and clinical data, the result of decreased count statistics in PET studies was explored, both qualitatively and quantitatively. PET raw data were clipped to simulate short duration scans including only a fraction of the original number of coincident counts. This method was validated in phantom studies prior to being applied to clinical data. In clinically acquired data sets, image quality and variation in quantitative measures were evaluated.

Materials and Methods
Phantom Study
A Flangeless Deluxe Jaszczak phantom with 16, 8, 4, 2, 1 and 0.5 ml hollow spheres was filled with 7.22 mCi of 18F such that the hot-sphere-to-background activity ratios were varied proportionately with the sphere volume. The activity ratios are listed in Table 4.1.

Table 4.1. Phantom Hot Sphere to Background Ratios for Reduced PET Acquisition Durations
Columns: 16 ml, 8 ml, 4 ml, 2 ml, 1 ml, 0.5 ml
Row: Hot Sphere to Background Ratio

The phantom was imaged on a 64-slice Philips GEMINI TF PET/CT system (Cleveland, OH). A standard low-dose CT image was completed with 120 kVp, 163 mAs and a 4 mm slice thickness. PET data were then acquired with one bed position encompassing the entire phantom volume. Acquisitions were conducted with varied durations, including 15, 30, 60, 90, 180 and 360 seconds per emission volume. The PET acquisitions were repeated multiple times per imaging session. The images were reconstructed using the standard OSEM algorithm with 3 iterations, 30 subsets and a 14.1 cm time-of-flight kernel width. The PET/CT system allows for the storage and separate reconstruction of the original emission data via list-mode. Subsequently the 360 second data were clipped to match the other acquired durations and reconstructed with the inclusion of only those data. These simulated data were then compared to actual emission data acquired at the various durations. Image analysis included the placement of regions of interest (ROIs) over the six hot spheres and in the phantom background volume for each image produced at each duration, real and simulated. The ROIs were defined by the outlines of the spheres as visualized on the CT images. The maximum and average activity concentration of each ROI was recorded. All values were compared directly to the 360 second value. Absolute differences and percent differences from this reference were calculated for each duration, in order to define any differences caused by detecting fewer counts over the shortened scan durations. The real acquisition duration differences were then compared to those from simulated durations, in order to validate the accuracy of the data clipping reconstruction method prior to its use with clinical data.

Clinical PET/CT Imaging
PET/CT imaging was conducted for 31 patients, also on the Philips GEMINI TF. Patients fasted at least six hours prior to imaging.
Scanning was initiated 66 ± 15 minutes following intravenous injection of 7.94 ± 2.66 mCi/kg 18F-FDG. A low-dose CT scan was completed, and used for attenuation correction, with 120 kVp, 163 mAs and a 4 mm slice

thickness. The PET data included 180 seconds per acquisition volume from the skull to the thighs or legs with a 50% overlap between emission volumes. As with the phantom data, the emission scan data were reconstructed with 3 iterations, 30 subsets, and a 14.1 cm TOF kernel width. The list-mode data were also clipped to include 15, 30, 60, 90, and 120 seconds of the original 180 seconds of data per PET emission volume prior to completing the additional reconstructions. The total number of prompt coincident counts per emission volume was recorded and ROIs were placed on 46 target tumor lesions. ROIs were drawn such that the entire volume of tissue exhibiting increased FDG uptake was included. ROIs were also placed over reference areas in the heart, liver, kidneys, bladder, cerebellum, aorta, lungs, spleen, muscle and bone marrow. These ROIs were drawn so as to fully encompass the tissue of interest. The PET images from each of the acquisition durations were evaluated for visual quality. In a blinded review, two experienced nuclear medicine reviewers scored each image on a 5-point scale, assigning 1 if the image was completely unusable, 3 if it was adequate for clinical purposes, and 5 if it was of excellent visual quality. The average of the two scores was taken and applied later in evaluating dosing levels. An average score of 3 or better was taken to mean that a given duration would be considered acceptable moving forward with real dosing variations. PET images were also quantitatively evaluated, measuring the maximum and average SUV per ROI. A percent difference from the 90 s reference value was calculated for all other durations. Correlation plots and Bland-Altman plots were also constructed. In evaluating whether the SUV percent difference from standard values for each of the acquisition durations was significant, a 10% change or less was considered acceptable.
This cutoff was chosen given that a 20% decrease in SUV, or greater, is often considered a metabolic response in the evaluation of tumors post-therapy [78]. We thus determined that a 10% or less change in SUV with changes in acquisition times was below the level of normal variations expected in PET imaging [9].
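The percent difference criterion and the Bland-Altman analysis described above can be sketched as follows. This is a minimal illustration only: the text does not state how the Bland-Altman limits were computed, so the conventional bias ± 1.96 standard deviations is assumed here, and the function names and SUVmax values are hypothetical.

```python
from statistics import mean, stdev


def percent_difference(value, reference):
    """Signed percent difference of a measurement from its reference."""
    return 100.0 * (value - reference) / reference


def within_cutoff(value, reference, cutoff_pct=10.0):
    """True if the absolute percent difference is at or below the cutoff."""
    return abs(percent_difference(value, reference)) <= cutoff_pct


def bland_altman(test_values, reference_values):
    """Return (bias, lower limit, upper limit), assuming bias +/- 1.96 SD."""
    diffs = [t - r for t, r in zip(test_values, reference_values)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical SUVmax values: reference duration vs. a shortened duration.
reference_suv = [4.0, 8.0, 12.0, 6.0]
short_scan_suv = [4.2, 7.6, 12.6, 6.9]

flags = [within_cutoff(s, r) for s, r in zip(short_scan_suv, reference_suv)]
bias, lower, upper = bland_altman(short_scan_suv, reference_suv)
print(flags, round(bias, 3), round(lower, 3), round(upper, 3))
```

In this made-up example the first three ROIs fall within the 10% cutoff and the fourth, at roughly 15%, does not.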

Results
Phantom Study
Phantom data with known activity concentrations were used for the validation of the PET data clipping process. Two results could be taken from these data. The first is the comparison of SUV measurements between actually acquired data sets of varying durations. Figure 4.1 shows a sample phantom image with a 360 second acquisition in the axial orientation. This PET/CT image shows the clear outline of each sphere from the CT image in a dark grey ring. The resulting PET activity distribution is also visible as the orange colored distribution within and surrounding the spheres. While partial volume effects were noted, absolute changes in PET quantification remained usable. In comparing the maximum kBq/ml measurement of each sphere ROI for each acquisition duration, minimal variations were found.

Figure 4.1. Representative Image of Phantom with Hollow Spheres Containing Varied Activity Concentrations

Figure 4.2 shows the distribution of the average percent difference in activity concentration measures for each duration as compared to the full 360 second data set. Here it can be seen that 15 and 30 second durations exhibited wider variation in the largest volume, lowest activity sphere. However, the average percent difference of all durations for all spheres, except this largest one, was about 10% or less. The background measurements had larger variations, but these are expected when measuring a single pixel value per ROI. Figure 4.3 shows the average absolute measurements of activity concentrations for each duration. This figure shows that while decreasing the scan duration generally leads to decreased activity measures, all measurements were within a reasonable range.

Figure 4.2. Phantom Percent Differences from Reference Activity Concentrations with Variations in Emission Scan Acquisition Times

Figure 4.3. Average Phantom Sphere ROI Maximum Pixel Value Measurements for Each Emission Acquisition Duration

The second result of the phantom validation involved the comparison of actually acquired data to that of the raw data clipping. Figure 4.4 shows the direct comparison of these two data types for each acquisition duration. Here the maximum kBq/ml measurement of each ROI for the actually scanned data at each time point is plotted with the measurements taken from each clipped data set. The clipped data sets were created by using only the counts in a given time frame, in increments of that duration throughout the original data set. That is, for the 15 second images the first 15 seconds created one image set, then the second 15 seconds, from 15 to 30 seconds in real time, created another, and so on. In Figure 4.4 it can be seen that the clipped data produced maximum kBq/ml measurements which were on average very similar to those taken from actually acquired data of each duration. These values are the average of nine measurements taken from sequential time points of several phantom scans. The results verified that the data clipping method produced images with quantitative readouts similar to those of real image data, and therefore was sufficient in simulating shortened scan durations for the clinical data sets as well.
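The windowing logic of the data clipping described above can be sketched as follows. This is a simplified illustration, not the scanner's implementation: real list-mode files are vendor-specific, so each event is represented here only by a hypothetical timestamp in seconds, and the reconstruction of each clipped set is not shown.

```python
def clip_listmode(event_times_s, window_s, total_s):
    """Split event timestamps into consecutive, non-overlapping windows.

    Window 0 holds events from 0 to window_s, window 1 from window_s to
    2 * window_s, and so on. Events outside 0..total_s are discarded.
    """
    n_windows = int(total_s // window_s)
    windows = [[] for _ in range(n_windows)]
    for t in event_times_s:
        index = int(t // window_s)
        if 0 <= index < n_windows:
            windows[index].append(t)
    return windows


# Hypothetical 360 second acquisition with one event every 0.5 seconds.
events = [i * 0.5 for i in range(720)]
windows = clip_listmode(events, window_s=15, total_s=360)
print(len(windows), [len(w) for w in windows[:3]])
```

Each resulting window would then be reconstructed separately, giving several simulated short scans from a single long acquisition.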

Figure 4.4. Average of Maximum Pixel Value Measurements from Real and Clipped Phantom PET/CT Data


Clinical PET/CT Imaging
Sample images with varying reconstruction acquisition durations are shown in Figure 4.5, with representative ROIs placed on a target tumor lesion, the bladder and the heart. Here it is seen that the shortest duration images, for 15 and 30 seconds per emission volume, appear noisier than images including more count data. However, it should be noted that the small, 5.11 cm3 volume target lesion is still visualized even on these lowest quality images. Using the standard 90 second reference images to identify target lesions, all lesions were identifiable on all subsequent images, even in very low count image sets. In then applying the raw data clipping method to 31 clinical patient studies, five image sets were created for each patient study, in addition to the original full length scan.

Figure 4.5. Representative Clinical PET Images for Each Emission Acquisition Duration

Qualitative analysis of all reconstructed images revealed that a duration of 60 seconds per PET acquisition volume produced images of clinically acceptable image quality. Figure 4.6 shows the average score of each evaluated acquisition duration with error bars representing the range of the scores. Durations longer than 60 seconds were deemed adequate while the 15 and 30 second images had average scores below the cutoff value of 3. In proceeding with applying lower doses in real acquisitions, dosing levels below 2/3 the standard 15 mCi dose would be considered with caution. It should be noted

however that 120 second images ranked only slightly above 90 second, standard duration images, despite including 30% more counts, on average.

Figure 4.6. Average and Range of Reviewer Image Quality Scores per PET Emission Acquisition Duration

In recording the number of prompt coincident events included in each clipped data set, it was found that limiting the time of each scan had a linear impact on the number of prompt counts in each resulting data set. Although there were substantial differences noted for varying anatomic regions throughout the body, consistent count rates per time frame were noted for each data set. Figure 4.7 shows representative count levels per PET acquisition volume for one patient study. Here the different bed positions are overlaid on the whole body image, alongside a chart of the corresponding counts per acquisition volume. The trend of a reduction in the time per frame decreasing the number of counts by the same fraction can be observed.
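The linear relationship between clipped duration and prompt counts, and the way a shortened duration is read as a reduced dose, can be written out as a small sketch. It assumes counts scale linearly with both acquisition time and injected activity, which ignores dead time and changes in the randoms fraction; the function names and count values are hypothetical.

```python
def expected_counts(full_counts, full_duration_s, clipped_duration_s):
    """Prompt counts expected in a clipped data set under linear scaling."""
    return full_counts * clipped_duration_s / full_duration_s


def equivalent_dose(standard_dose, standard_duration_s, clipped_duration_s):
    """Dose giving the same counts at the standard duration.

    Assumption: counts are proportional to injected activity.
    """
    return standard_dose * clipped_duration_s / standard_duration_s


# Hypothetical bed position with 18 million prompts in 180 seconds.
counts_60s = expected_counts(18_000_000, 180, 60)
# A 60 second frame against the 90 second standard, with a 15 mCi dose.
dose_mci = equivalent_dose(15.0, 90, 60)
print(counts_60s, dose_mci)
```

Under these assumptions a 60 second frame carries two thirds of the counts of the 90 second standard, which corresponds to the 2/3 of the standard 15 mCi dose mentioned in the text.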

Figure 4.7. Whole Body PET Image with Overlay of Acquisition Volumes and Count of Prompt Coincident Events per Volume for Each Varied Acquisition Duration

Quantitative measurements showed that SUV was not consistently impacted by varying the acquisition duration. Figure 4.8 depicts the average absolute value of the percent difference in SUVmax per acquisition time for each ROI placement. Anatomic regions containing lower count levels were the most susceptible to wide variations in SUVmax, in particular the liver, heart, and muscle. Regions with a higher count density had relatively consistent measurements. Applying a 10% change in SUVmax cutoff, 15 second acquisitions in the region of the bladder, 30 seconds at the level of the heart and cerebellum, and 60 seconds in the liver, kidneys, aorta, lung, spleen, muscle and bone marrow revealed quantitative data that were on average within the limit.

Figure 4.8. Average Percent Difference from Reference SUVmax of Each Tissue Type with Varying Acquisition Duration

SUVavg proved a more consistent measure, as seen in Figure 4.9. Here the average of the absolute value of the percent difference of SUVavg measurements for each duration and tissue ROI type can be seen. The average percent difference was consistently below 10% for all tissue types except very low uptake ones such as the lungs and muscle, regardless of the scan duration. The 46 target tumor lesions evaluated were of varying sizes (ranging 2.17 to cm3), locations within the body, and activity levels (SUVmax ranged 1.92 to 25.09). This provided a wide perspective of possible tumor types which may be encountered in any clinical setting. Looking exclusively at tumor lesions, the average SUVmax for all lesions was within 12% of the reference value, for all acquisition durations. The correlation of the SUVmax measured from varying emission durations to reference values for each lesion ROI can be seen in Figure While there is generally good correlation for all acquisition times, the strongest correlation was for 60 seconds and longer. These plots also illustrate the few outlying lesions measured. While the average SUVmax percent difference was within 12% of reference values, several lesions had much larger variations. These lesions