Abstract

Objectives. In patients with prostate cancer (PC) receiving prostate-specific membrane antigen- (PSMA-) targeted radioligand therapy (RLT), higher baseline standardized uptake values (SUVs) are linked to improved outcome. Thus, readers deciding on RLT must have certainty on the repeatability of PSMA uptake metrics. As such, we aimed to evaluate the test-retest repeatability of lesion uptake in a large cohort of patients imaged with 18F-DCFPyL. Methods. In this prospective, IRB-approved trial (NCT03793543), 21 patients with a history of histologically proven PC underwent two 18F-DCFPyL PET/CTs within 7 days (mean 3.7, range 1 to 7 days). Lesions in the bone, lymph nodes (LN), and other organs were manually segmented on both scans, and the following uptake parameters were assessed: maximum (SUVmax) and mean (SUVmean) SUVs, PSMA-tumor volume (PSMA-TV), and total lesion PSMA (TL-PSMA, defined as PSMA-TV × SUVmean). Repeatability was determined using Pearson’s correlations, within-subject coefficient of variation (wCOV), and Bland-Altman analysis. Results. In total, 230 pairs of lesions (177 bone, 38 LN, and 15 other) were delineated, demonstrating a wide range of SUVmax (1.5–80.5) and SUVmean (1.4–24.8). Including all sites of suspected disease, SUVs had a strong interscan correlation (), with high repeatability for SUVmean and SUVmax (wCOV, 7.3% and 12.1%, respectively). High SUVs showed significantly improved wCOV relative to lower SUVs (), indicating that high SUVs are more repeatable relative to the magnitude of the underlying SUV. Repeatability for PSMA-TV and TL-PSMA, however, was low (wCOV, ≥23.5%). Across all metrics for LN and bone lesions, interscan correlation was again strong (). Moreover, LN-based SUVmean achieved the best wCOV (3.8%), which was significantly lower than that of osseous lesions (7.8%, ). This was also noted for SUVmax (wCOV, LN 8.8% vs. bone 12.0%, ). On a compartment-based level, wCOVs for volumetric features were ≥22.8%, demonstrating no significant differences between LN and bone lesions (PSMA-TV, P = 0.63; TL-PSMA, P = 0.9). Findings on an entire tumor burden level were also corroborated in a hottest lesion analysis investigating the SUVmax of the most intense lesion per patient (, 0.99; wCOV, 11.2%). Conclusion. In this prospective test-retest setting, SUV parameters demonstrated high repeatability, in particular in LNs, while volumetric parameters demonstrated low repeatability. Further, the large number of lesions and wide distribution of SUVs included in this analysis allowed for the demonstration of a dependence of repeatability on SUV, with higher SUVs having more robust repeatability.

1. Introduction

Positron emission tomography (PET) with ligands targeting the prostate-specific membrane antigen (PSMA) is being increasingly utilized, with applications including treatment planning in patients with metastatic prostate cancer (PC) [1, 2]. The accessibility of the PSMA active site to high-affinity ligands, combined with rapid internalization, allows for accurate, noninvasive high-contrast imaging [3]. Given their facile synthesis without need for a cyclotron, 68Ga-labeled radiotracers have been widely used to date. However, recent years have also witnessed an increased use of 18F-labeled radiotracers, initially with 18F-DCFBC [4] and other first-generation compounds, and later with more widely available radiotracers such as 18F-PSMA-1007, 18F-rhPSMA-7 [5, 6], and 18F-DCFPyL (piflufolastat F18, PYLARIFY®) [7]. The latter agent has been extensively investigated in major clinical trials [8, 9], including the multicenter phase 3 CONDOR and phase 2/3 OSPREY trials [10, 11], demonstrating positive predictive values of 78-91% for detecting PC in both pelvic lymph nodes (LN) and distant metastases. Based on these encouraging results, 18F-DCFPyL recently received approval from the U.S. Food and Drug Administration (FDA) [12]. Because it is a nationwide, commercially available, 18F-labeled PSMA PET agent [12], increased use of this radiotracer can be anticipated both in clinical routine and in trials.

Understanding the repeatability of 18F-DCFPyL uptake features is important for response assessment, e.g., in a theranostic setting or in men starting abiraterone or enzalutamide [8, 13]. If rigorously executed, standardization of imaging protocols and continuously calibrated PET devices allow for high test-retest repeatability [14], but biological aspects or interpatient and intrapatient variability can have a significant impact on quantitative features in repeated imaging studies [15].

In this regard, a recent study has reported high repeatability for 36 lesions in 12 patients using 18F-DCFPyL [16]. In this prospective clinical trial, we aimed to elucidate the repeatability of quantitative parameters on 18F-DCFPyL PET in a test-retest cohort by enrolling 21 men with PC with a total of 230 visible lesions. This relatively large cohort, with a correspondingly large number of disease sites, enabled evaluation of repeatability among different organ compartments, such as LN or osseous lesions, and across a wide range of SUVs. In addition, such an approach also allowed us to assess the dependence of repeatability on SUV in original and relative units (in %) and to determine whether higher SUVs have improved repeatability when compared to lower SUVs. This may be of importance for response assessment studies, where the percentage change in SUV between baseline and follow-up scans is used to define progressive disease [17] or to follow response in patients receiving PSMA-targeted radioligand therapy (RLT). Of note, higher SUVs from PSMA PET are linked to better early biochemical response [18] and overall survival [19] in patients under PSMA-directed treatment. Thus, the reader deciding on the appropriateness of RLT must have certainty on the reliability of these semiquantitative parameters.

2. Materials and Methods

This study was registered at ClinicalTrials.gov (NCT03793543) and was carried out under a United States FDA Investigational New Drug Application (IND121064). The Institutional Review Board of the Johns Hopkins Hospital approved this prospective study (IRB00174393).

2.1. Patients

Patient characteristics are displayed in Table 1. 21 patients with mean age years with history of PC were included in this trial. Among others, required inclusion criteria for the study were as follows: (1) years, (2) history of histologically or cytologically confirmed adenocarcinoma of the prostate without neuroendocrine differentiation, (3) patients with metastatic castration-sensitive or castration-resistant prostate cancer (CRPC) with evidence of metastatic disease on conventional imaging with computed tomography (CT) and/or bone scan, and (4) Eastern Cooperative Oncology Group performance status of ≤2 [14].

The exclusion criteria were as follows: (1) serious or uncontrolled coexistent nonmalignant disease, including active and uncontrolled infection; (2) administration of a physical half-lives prior to the first PET/CT; and (3) administration of an intravenous X-ray contrast medium ≤24 hours or oral contrast medium ≤120 hours prior to the first PET/CT.

2.2. Imaging Protocol

18F-DCFPyL was synthesized as previously described [7]. The imaging protocol followed current guidelines [20]. Patients were scanned in the supine position from the mid-thigh to the vertex of the skull (whole body protocol) at approximately 60 min postinjection. PET/CT was obtained using a 128-slice Biograph mCT (Siemens Healthineers, Erlangen, Germany) with low-dose CT for attenuation correction (no contrast, 120 kV, 40 effective mAs, 0.5 s tube rotation time, and 0.8 pitch). Standard ordered-subset expectation maximization reconstructions with time-of-flight were used. A subsequent near-term 18F-DCFPyL PET/CT follow-up scan with an identical imaging protocol was conducted to assess test-retest repeatability. No change in therapy occurred between the scans.

2.3. Image Analysis

A consensus central review was carried out with all images analyzed by three physicians with experience in the interpretation of PSMA-targeted PET/CT (BH, RAB, and RAW, having at least 3 years of experience in reading scans) who were blinded to clinical data. Images were analyzed using the InterView Fusion software (Version 3.08.005.0000, Mediso Medical Imaging Ltd., Budapest, Hungary) for lesion identification and segmentation.

As described in [21], the entire volume of all 18F-DCFPyL-avid tumor lesions (i.e., tumor burden) was manually segmented using volumes of interest. Mean and maximum standardized uptake values (SUVmean, SUVmax) were assessed. In addition, tumor volume (TV) was computed, which allowed for calculation of total-lesion PSMA (TL-PSMA, defined as PSMA-TV × SUVmean) [22].
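As a minimal sketch of the volumetric parameter described above, the following Python snippet computes TL-PSMA per lesion and for the whole tumor burden. It assumes the usual definition TL-PSMA = PSMA-TV × SUVmean (by analogy to total lesion glycolysis); the function name, the unit (mL), and the lesion values are hypothetical and are not taken from the study data or from the segmentation software used here.

```python
def total_lesion_psma(psma_tv_ml, suv_mean):
    """Return TL-PSMA, assuming TL-PSMA = PSMA-TV x SUVmean.

    psma_tv_ml: segmented PSMA-avid tumor volume (unit assumed to be mL)
    suv_mean:   mean SUV inside that volume
    """
    if psma_tv_ml < 0 or suv_mean < 0:
        raise ValueError("volume and SUVmean must be non-negative")
    return psma_tv_ml * suv_mean


# Hypothetical lesions as (PSMA-TV in mL, SUVmean); not study data.
lesions = [(2.0, 5.0), (0.5, 12.0)]

per_lesion = [total_lesion_psma(tv, suv) for tv, suv in lesions]
whole_body = sum(per_lesion)  # summed over all segmented lesions

print(per_lesion, whole_body)
```

Because TL-PSMA is a product that includes the segmented volume, any variability in manual delineation carries over directly into this parameter.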

2.4. Statistical Analysis

Corresponding uptake parameters were compared between both scans. Scatter diagrams were plotted, and linear regression analysis was performed. Bland-Altman plots were created for both absolute and relative differences of these data (the latter expressed as a percentage), including upper and lower limits of agreement [23, 24]. For correlation of uptake, Pearson correlation was performed (providing ). Kendall’s tau (τ) was also used for correlational analyses with indicating strong correlation [25, 26]. The within-subject coefficient of variation (wCOV, in %) was assessed [27]. For comparison of different wCOVs, the method of Forkman was used [28]. A lesion-based head-to-head comparison including LN, osseous, and other lesions was conducted. Moreover, to assess for a dependence of the repeatability on different parameters, all lesions were subdivided into a group below (“< median”) vs. above (“> median”) the corresponding median value. In addition, the hottest lesion per patient (defined as the metastatic site of disease with the highest SUVmax among all lesions) was also analyzed. A P value <0.05 was considered statistically significant. Statistical analysis was performed with MedCalc software (Version 19.6, MedCalc Software Ltd., Ostend, Belgium) and Microsoft Excel 2016 (Microsoft Corporation, Redmond, WA, USA).
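The two repeatability measures named above can be sketched as follows. This is an illustration under stated assumptions, not the MedCalc implementation used in the study: the wCOV is computed here with the root-mean-square approach over per-pair coefficients of variation, the Bland-Altman limits are taken as bias ± 1.96 SD of the differences, and relative differences are expressed as a percentage of the pair mean. Function names and input values are hypothetical.

```python
import math
import statistics


def within_subject_cov(test, retest):
    """wCOV in %, root-mean-square of per-pair CVs (assumed variant)."""
    if len(test) != len(retest) or not test:
        raise ValueError("need two equally long, non-empty sequences")
    squared_cvs = []
    for a, b in zip(test, retest):
        pair_mean = (a + b) / 2.0
        pair_var = (a - b) ** 2 / 2.0  # sample variance of two replicates
        squared_cvs.append(pair_var / pair_mean ** 2)
    return 100.0 * math.sqrt(sum(squared_cvs) / len(squared_cvs))


def bland_altman(test, retest, relative=False):
    """Return (bias, lower limit, upper limit) of retest minus test.

    With relative=True, each difference is expressed in % of the pair mean.
    """
    diffs = []
    for a, b in zip(test, retest):
        d = b - a
        if relative:
            d = 100.0 * d / ((a + b) / 2.0)
        diffs.append(d)
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical SUVmax values for three lesions; not study data.
test_scan = [10.0, 20.0, 30.0]
retest_scan = [11.0, 19.0, 33.0]

wcov = within_subject_cov(test_scan, retest_scan)
bias, lower, upper = bland_altman(test_scan, retest_scan)
print(round(wcov, 1), bias, round(lower, 2), round(upper, 2))
```

A lower wCOV corresponds to better repeatability, and narrower limits of agreement indicate smaller scan-to-scan differences.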

3. Results

3.1. Patients

Between March 2019 and March 2020, 21 patients each underwent two scans with a median time between scans of days (range, 1 to 7 days). For the test scan,  MBq (range, 310.8–326.7 MBq) were administered. For the retest scan,  MBq (range, 310.1–328.6 MBq) were injected. A total of 230 PSMA-avid lesions were delineated, with 177/230 (77%) located in the skeleton, 38/230 (16.5%) in LN, and 15/230 (6.5%) in other soft tissue sites. Figure 1 shows the test-retest scans of a patient with low tumor burden, and Figure 2 those of a patient with high tumor burden. An overview of uptake parameters including SUVmax, SUVmean, TL-PSMA, and PSMA-TV can be found in Table 2.

3.2. Analysis of Repeatability Parameters

For the entire tumor burden on the test scan, SUVmax was (range, 1.6–66.1) and SUVmean was (range, 1.4–23.8), with almost identical results on the retest scan (SUVmax, (range, 1.6–80.5); SUVmean, (range, 1.4–24.8)). The values were ≥0.99 (Figure 3, first column; , SUV, ≥0.87; volumetric parameters, ≥0.83, , respectively). Regardless of which correlative analysis was applied, SUVmean demonstrated the best correlation among all parameters (Table 2). Repeatability was high for SUVmean (wCOV, 7.3%) and SUVmax (wCOV, 12.1%). For PSMA-TV and TL-PSMA, repeatability was lower (wCOV, 23.5% and 24.0%, respectively). Bland-Altman plots for all lesions are displayed in Figure 3, second and third columns. For both SUVmax and SUVmean, no systematic increase or decrease between the scans was noted (+/-1.96SD: 3.3/-4.5, 0.9/-1.1, respectively). Of note, higher SUVs had more robust repeatability, in particular for relative SUVmax values in % (Figure 3, top right). On Bland-Altman plots for PSMA-TV and TL-PSMA, wider limits of agreement were recorded when compared to SUV (+/-1.96SD: 5.9/-6.9, 34.4/-41.5, respectively).

Lesions were subdivided into a group below vs. above the respective median value. Regardless of the investigated parameter, SUVs derived from lesions above the median demonstrated more robust repeatability, in particular for SUVmean (wCOV: SUVmean, >median, 4.1% vs. <median, 8.7%; SUVmax, >median, 8.8% vs. <median, 16.6%; , respectively; Supplementary Table).
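The median split can be illustrated with a short sketch. It assumes that lesions are grouped by their test-scan value and that wCOV is computed with the root-mean-square approach; neither detail is specified in the text, and the lesion pairs are hypothetical. The example uses the same absolute test-retest difference (0.5 SUV units) in every pair, which shows why a constant absolute variation translates into a smaller relative variation at higher SUVs.

```python
import math
import statistics


def wcov_percent(pairs):
    """wCOV in % (root-mean-square of per-pair CVs; assumed variant)."""
    squared_cvs = [
        ((a - b) ** 2 / 2.0) / (((a + b) / 2.0) ** 2) for a, b in pairs
    ]
    return 100.0 * math.sqrt(sum(squared_cvs) / len(squared_cvs))


def split_by_median(pairs):
    """Split (test, retest) pairs by the median test-scan value.

    Assumption: grouping uses the test scan; pairs exactly at the
    median fall into neither group.
    """
    median_value = statistics.median(a for a, _ in pairs)
    below = [p for p in pairs if p[0] < median_value]
    above = [p for p in pairs if p[0] > median_value]
    return below, above


# Hypothetical (test, retest) SUV pairs, all differing by 0.5 units.
pairs = [
    (2.0, 2.5), (3.0, 2.5), (4.0, 4.5),
    (20.0, 20.5), (30.0, 29.5), (40.0, 40.5),
]

below, above = split_by_median(pairs)
print(round(wcov_percent(below), 1), round(wcov_percent(above), 1))
```

In this toy example the group above the median yields the lower wCOV, mirroring the direction of the effect reported for SUVmean and SUVmax.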

Findings were similar with just the hottest lesion in each patient, with an value of 0.99 for SUV (SUVmax: wCOV, 11.2%; , 0.97; SUVmean: wCOV, 1.2%; , 0.97). No systematic increase or decrease was noted on Bland-Altman plots (SUVmax, +/-1.96SD: 5.6/-8.5; SUVmean, +/-1.96SD: 0.47/-0.21; Supplementary Figure).

3.3. Repeatability Parameters on a Compartment-Based Level

When investigating different types of lesions, comparable values were achieved for both LN (≥0.984) and lesions in the skeleton (≥0.988), which were slightly higher for SUVmean (≥0.996). was ≥0.78. SUVmean and SUVmax of LN and osseous lesions yielded high to intermediate repeatability, with significantly lower wCOV calculated for LN sites of disease (SUVmax: LN 8.8% vs. skeleton 12.0%, ; SUVmean: LN 3.8% vs. skeleton 7.8%, ). TV-based features again demonstrated low repeatability, with no significant differences between LN and osseous lesions (PSMA-TV: LN, 24.1% vs. bone, 22.8%, ; TL-PSMA: LN, 23.5% vs. skeleton, 23.3%, ; Table 2). Owing to their small number, visceral lesions were not analyzed further.

4. Discussion

A total of 230 lesion pairs from 21 patients, each imaged twice with 18F-DCFPyL PET/CT, were utilized to demonstrate overall high repeatability of uptake. Volumetric features revealed relatively lower repeatability, while SUVmean not only demonstrated the highest correlative indices (, 0.92-0.95) but also the best repeatability, in particular for LN (wCOV 3.8%). For SUVmax, robust correlations along with at least intermediate repeatability were noted in LN and osseous lesions, suggesting SUV as a reliable metric for quantitative assessments. For 18F-DCFPyL PET, SUV-based parameters might be an acceptable alternative to volumetric parameters [8]. Importantly, we observed an improved repeatability for higher SUVs when considered relative to the level of uptake (relative units).

18F-DCFPyL is a U.S.-wide, FDA-approved, PSMA-targeted, radiolabeled imaging agent for patients with PC [8, 9, 12], and wider worldwide use can be anticipated, underscoring the importance of a thorough understanding of this agent. The high repeatability of uptake parameters, both overall and based on metastasis type, is of importance, as it suggests that 18F-DCFPyL may be useful for therapy response assessment and also that manual and automated (e.g., artificial intelligence) methods for lesion detection should be repeatable and reliable [13, 29, 30].

Previous studies have revealed comparable correlations and repeatability, but differences relative to the present trial must be noted. For instance, in a preceding analysis based on 68Ga-PSMA PET in a test-retest setting [31], the authors reported substantially higher wCOV, e.g., for SUVmean derived from LN. Further, no significant differences between lesion type were observed with the 68Ga-labeled PSMA imaging agent [31]. This may be partially explained by the improved diagnostic accuracy of radiotracers labeled with 18F [32]. Intrinsic physical factors of 68Ga may contribute to the partial volume effect, which in turn has an impact on semiquantitative values such as SUV [33], potentially explaining such different wCOVs.

A recent study by Jansen et al. also reported on test-retest properties for 18F-DCFPyL, including a total of 36 lesions [16]. Similar to our findings, SUVmean had a better repeatability when compared to SUVmax [16]. However, no significant differences between LN and osseous lesions were identified in the previous trial, but a trend towards significance was noted () [16]. In our study, significant differences between lesions located in the skeleton and LN were determined, possibly due to the increased number of subjects and lesions [16]. In this regard, relative to the investigation of Jansen et al. [16], more lesions were included (230 vs. 36) providing a broad range of SUV (1.4–80.5). This allows us to demonstrate a dependence of repeatability on SUV, with higher SUVs having a higher repeatability, in particular for relative SUVmax values (Figure 3, third column). This observation is of importance, as absolute SUVs have different ranges depending on their normalization schemes, whereas relative differences allow for intra- and interindividual comparisons [34]. In addition, this marked dependence of SUV on relative units may be clinically relevant, e.g., for response assessment studies, where it is common to indicate percentage change in SUV by comparing baseline and follow-up scans, as recently demonstrated for 18F-DCFPyL [17]. Assessment of delta % has also been recently suggested by the PSMA PET Progression Criteria, with an increase in PSMA uptake of 30% indicating progressive disease [35]. As such, the observed improvement of relative repeatability at the higher SUVs may be important for future multicenter trials, e.g., for 18F-DCFPyL-based therapy response monitoring [8] or for patients scheduled for RLT.
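The percentage change discussed above can be written out as a small sketch. It covers only the uptake criterion as stated in the text (an increase of 30% between baseline and follow-up), not the full PSMA PET Progression Criteria; the function names, the use of "greater than or equal to" at the threshold, and the SUV values are assumptions made for illustration.

```python
def percent_change(baseline_suv, followup_suv):
    """Relative change in SUV from baseline to follow-up, in %."""
    if baseline_suv <= 0:
        raise ValueError("baseline SUV must be positive")
    return 100.0 * (followup_suv - baseline_suv) / baseline_suv


def exceeds_uptake_threshold(baseline_suv, followup_suv, threshold_percent=30.0):
    """True if the uptake increase reaches the threshold (assumed >=)."""
    return percent_change(baseline_suv, followup_suv) >= threshold_percent


# Hypothetical SUVmax values; not study data.
print(percent_change(10.0, 13.0))            # increase of 3 units on 10
print(exceeds_uptake_threshold(10.0, 12.0))  # 20% increase
```

Since such a threshold is expressed in relative units, the repeatability in % at a given SUV level determines how much of an observed change can be attributed to measurement variability.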

In this regard, no study to date has explored the predictive potential of 18F-labeled PSMA PET for subsequent outcomes in patients with PC scheduled for PSMA-directed therapy [36]. The repeatability of SUVmax demonstrated in this study may lay the foundation for future investigations of the utility of 18F-DCFPyL PET in monitoring 177Lu-based RLT. These considerations are further fueled by the fact that in patients scheduled for PSMA-targeted RLT, high average SUVs on baseline PSMA PET are frequently observed (up to 73.4) and that increased baseline SUVmax values were linked to improved early biochemical response (cut-off, >19.8) [18] and overall survival (cut-off, >14.3) [19]. In this analysis, the highest SUVmax was 80.5, and thus, the results are relevant to the patient population undergoing RLT. The higher repeatability at higher SUVmax may be of importance in the theranostic setting, as the reader deciding on RLT can be confident that such findings are not related to measurement variability, suggesting SUVmax is a reliable imaging biomarker to identify high-risk patients prone to treatment failure. This applies regardless of whether lesions are located in the skeleton (Figure 2) or LN (Figure 1), as repeatability of SUVmax was high to intermediate among metastases allocated to different organ compartments (Table 2).

Both 68Ga- and 18F-labeled, PSMA-directed radiotracers demonstrate that the best repeatability is found with SUV, whereas values for TV may have to be interpreted with caution [16, 31]. As a possible explanation, the latter parameter may be subject to an operator-dependent bias of manual segmentation. Fully automated delineation software may increase repeatability, e.g., when artificial intelligence such as deep learning is applied [37]. Moreover, state-of-the-art reconstruction algorithms such as point-spread function (PSF) may also recategorize lesions as more definitive sites of disease attributable to PC, as recently demonstrated for 18F-DCFPyL [38]. However, the effect of PSF on repeatability in patients scheduled for 18F-DCFPyL has also been reported, with PSF reconstruction significantly having a negative impact on repeatability for SUV, but not for TV [16]. Given these contradictory results of increased interpretative certainty and decreased repeatability by implementing PSF, future studies should explore the impact of novel and advanced reconstruction algorithms on test-retest metrics.

This study has several limitations. Although providing the largest cohort of patients and lesions to date, some patients had a disproportionate number of lesions, and clustering effects from that lesion distribution may have affected the results. Therefore, a hottest lesion analysis investigating the metastatic site with the highest SUVmax per subject was also performed. Again, a high repeatability with no systematic increase or decrease was noted (Supplementary Figure), further corroborating the findings including all suspected sites of disease. Moreover, lesion size, dose, and patient factors including interpatient and intrapatient variability can have a significant impact on semiquantitative assessments using this radiotracer [15, 39]. Therefore, future studies should also consider controlling for such day-to-day variables [16]. Partial volume effects are almost certainly a factor in repeatability in small lesions, and future test-retest studies might exclusively enroll patients with extensive tumor burden. Such an approach would then corroborate our present findings across a broad spectrum of tumor burden. Despite enrolling the largest cohort of patients in a prospective test-retest setting for 18F-DCFPyL to date, the number of patients with different therapies was too small to provide reliable results for a subanalysis focusing on prior therapeutic regimens. This should also be addressed in future studies.

5. Conclusion

Our results demonstrate that 18F-DCFPyL has highly repeatable uptake parameters in PC lesions. Further, the large number of lesions and wide distribution of SUVs included in this analysis allowed for the demonstration of a dependence of repeatability on original and relative SUVs, with higher SUVs having more robust repeatability. This observed improvement of repeatability at increased SUVs may be important for future multicenter trials, e.g., for 18F-DCFPyL-based response monitoring in patients under antihormonal treatment.

Data Availability

The data are not publicly available because, due to the European regulations regarding data protection, we cannot make data available online or disburse them. However, all data are available for revision on-site.

Ethical Approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The Institutional Review Board of the Johns Hopkins Hospital approved this prospective study (IRB00174393).

Informed consent was obtained from all individual participants included in the study.

Disclosure

Results of this work are part of the doctoral thesis of BH, planned to be submitted at the Medical Faculty of Bonn University. This publication was supported by the Open Access Publication Fund of the University of Wuerzburg.

Conflicts of Interest

MGP is a coinventor on a US patent covering 18F-DCFPyL and as such is entitled to a portion of any licensing fees and royalties generated by this technology. This arrangement has been reviewed and approved by the Johns Hopkins University in accordance with its conflict of interest policies. SPR is a consultant for Progenics Pharmaceuticals Inc., the licensee of 18F-DCFPyL. MAG has served as a consultant to Progenics Pharmaceuticals. All other authors declare that there is no conflict of interest as well as consent for scientific analysis and publication.

Authors’ Contributions

All authors contributed to writing, critically reviewing, and approving the paper. Specific author contributions are as follows: conceptualization was done by SPR, MGP, RAW, TH, MAG, KJP, MAE, MCM, and MAL; methodology was done by RAW, MGP, SPR, LB, RAB, TD, BH, TH, RA, and AS; software was acquired by RAW, BH, SL, PH, and SES; validation was done by MGP, AKB, TD, CL, ME, KJP, MAL, and LS; formal analysis was done by RAW, RAB, BH, SL, MAL, and SPR; investigation was done by RA, AS, and SPR; visualization was done by TH, RAB, and LB; supervision was done by MGP, MAL, SPR, KJP, and MCM; project administration was done by RA, AS, and SPR; funding acquisition was done by MGP, SPR, TH, RAW, and TD. Rudolf A. Werner, Bilêl Habacha, Ralph A. Bundschuh, and Steven P. Rowe equally contributed to this work.

Funding

Funding for this study was received from the Prostate Cancer Foundation Young Investigator Award, Movember Foundation, and National Institutes of Health grants CA134675, CA184228, EB024495, and CA183031. This work was supported by HiLF at Hannover Medical School (TD, RAW) and “RECTOR” Program at Okayama (TH). A KAKENHI grant (21K19450) has been provided for Dr. T. Higuchi from the Japan Society for the Promotion of Science (JSPS). MAG and SPR have received research funding from Progenics Pharmaceuticals.

Supplementary Materials

Supplementary Figure: hottest lesion analysis in a test-retest setting, with correlation of (A) maximum standardized uptake values (SUVmax), (B) mean standardized uptake values (SUVmean), and (C, D) corresponding Bland-Altman plots. An excellent correlation between test and retest scans along with a considerably low magnitude of limits within standard deviations (SD) was noted. Supplementary Table: differences in within-subject coefficients of variation (wCOVs, in %) for all parameters, divided into a group below (<) vs. a group above (>) the corresponding median value ( per group). Regardless of the investigated parameter, lesions above the median had more robust repeatability, which was markedly better for standardized uptake value (SUV), in particular for SUVmean. SUVmax: maximum SUV; PSMA-TV: PSMA tumor volume; and TL-PSMA: total lesion PSMA. The P value has been derived from comparison of wCOV from lesions below vs. above the respective median. SD: standard deviation.