Urinary biomarkers for the detection of prostate cancer in patients with high‐grade prostatic intraepithelial neoplasia

High‐grade prostatic intraepithelial neoplasia (HGPIN) is a recognized precursor stage of prostate cancer (PCa). Men who present with HGPIN in a first prostate biopsy face years of active surveillance, including repeat biopsies. This study aimed to identify non‐invasive prognostic biomarkers that differentiate early on between indolent HGPIN cases and those that will progress to PCa.


INTRODUCTION
Prostate cancer (PCa) is the most commonly diagnosed cancer and a major cause of cancer-related death among men in economically developed countries [1]. The current screening method to diagnose PCa is based on the measurement of serum prostate specific antigen (PSA) levels and a digital rectal examination (DRE), whereas the decisive diagnosis is based on the result of the transrectal ultrasound-guided prostate biopsies.
Adoption of widespread PSA-based screening during the late 1980s has substantially reduced mortality rates. However, the use of PSA screening remains controversial due to the high rates of overdiagnosis [2]. Nowadays, a significant proportion of men are diagnosed with a PCa that would have remained undetected in the absence of screening. In these cases, by definition, treatment does not improve health outcomes (overtreatment) [3]. Instead of applying an aggressive curative treatment, active surveillance is gaining acceptance as an alternative initial management strategy for men with low-risk PCa [4].
It has been proven that men who have an initial non-cancerous biopsy diagnosis remain at risk of PCa, especially if the initial diagnosis included suspicious lesions [5]. There is evidence that many prostate cancers are preceded by or accompanied with a pre-malignant change in the epithelial cells, known as prostatic intraepithelial neoplasia (PIN). This condition is characterized by progressive proliferation of cytologically atypical or dysplastic epithelial cells within architecturally benign-appearing glands and acini [6]. PIN is recognized as a continuum between low-grade and high-grade forms, with high-grade PIN (HGPIN) thought to represent a likely immediate precursor of early invasive carcinoma [7]. HGPIN resembles PCa in its cytological appearance and also at the molecular level, showing a pattern of gene and protein expression for various PCa biomarkers that is either similar to PCa or "intermediate" between tumoral and benign prostate tissue [8].
These premalignant lesions are strongly predictive of the presence of carcinoma. It has been shown that PCa discovered after an initial HGPIN diagnosis on biopsy are more likely to be organ confined, yet of similar grade, compared with cases diagnosed as PCa on the first biopsy. These findings likely reflect a scenario in which the PCa were missed on the initial biopsy as a result of smaller size [9]. Recent studies estimated that around 22% of all patients diagnosed with HGPIN in the first biopsy will be diagnosed with PCa in consecutive repeat biopsies [10]. This percentage has decreased over recent years, mainly due to increased needle biopsy core sampling, which detects many associated cancers on initial biopsy, such that re-biopsy, even with good sampling, does not detect many additional cancers. However, this percentage is still slightly higher than the risk reported in the literature for repeat biopsy following a benign diagnosis, and clinical and pathological parameters do not help stratify which men with HGPIN are at increased risk for a cancer diagnosis [11]. Consequently, men with HGPIN usually undergo close clinical follow-up over several years, including measurement of serum PSA, DRE, ultrasound and repeat biopsies [12]. Evidently, the majority of the patients (78%) will have negative results year after year; thus, many repeat biopsies could be avoided if clinicians were provided with an accurate, preferably non-invasive, predictive test for determining PCa presence. For this purpose, biomarkers that distinguish between pre-malignant lesions such as HGPIN and PCa are urgently needed.
Several attempts have been made in the past to improve the current management of HGPIN patients. For instance, the number of positive HGPIN cores at the moment of diagnosis has been associated with the risk of cancer, suggesting that patients with unifocal HGPIN should be managed expectantly, whereas those with multifocal HGPIN could benefit from a more aggressive surveillance including repeat biopsies [13]. Moreover, overexpression of certain molecules in HGPIN tissue has been found to correlate with the likelihood of finding PCa in subsequent biopsies. One of these predictors is the TMPRSS2:ERG gene fusion. Park et al. assessed the presence of this molecular rearrangement by immunohistochemistry on prostate biopsies, showing that patients with ERG overexpression were more likely to develop PCa [14]. Moreover, prostate tumor overexpressed 1 (PTOV1) and alpha-methylacyl-CoA racemase (AMACR) have also been found overexpressed in HGPIN lesions adjacent to PCa in comparison with isolated HGPIN [15,16]. Markers in biological fluids have also been described. For example, an increased serum level of early prostate cancer antigen (EPCA) has been associated with a higher cancer risk in men with isolated HGPIN [17].
Previous research from our group showed that PCA3, PSGR (OR51E2), and PSMA (FOLH1) gene expression in urine sediment could be useful biomarkers for the detection of PCa in benign prostatic hyperplasia (BPH) cases. Although with a lower efficacy, we observed in these studies that PCA3 could also detect PCa in patients with a previous diagnosis of HGPIN [18,19].
Among all the first prostate biopsies performed in the Vall d'Hebron Hospital, 42.8% are diagnosed with HGPIN (data from the years 2007-2010). Those patients undergo an intensive follow-up including one or more repeat biopsies. In order to prevent unnecessary biopsies, our objective in this study was to find a gene profile in urine sediment that accurately identifies "true" HGPIN cases, separating them from HGPIN cases with undetected PCa found in repeat biopsies. For this purpose, a list of markers previously associated with PCa was analyzed in a unique set of urine samples from 90 patients who were diagnosed with HGPIN in a first biopsy and subsequently underwent clinical follow-up for several years until PCa presence was confirmed or regarded as absent.

Patients
This study was approved by the institutional review board of the Vall d'Hebron Hospital. All urine samples were obtained from the Department of Urology of the Vall d'Hebron Hospital in Barcelona between 2008 and 2013 and were taken from patients subjected to a repeat prostate biopsy because of a previous biopsy result of HGPIN. Their first biopsy was recommended due to increased serum PSA levels (>4 ng/ml) and/or an abnormal diagnostic DRE. Patients with other known tumors and/or previous PCa therapies were excluded from the study. Written informed consent was obtained from all the study participants and samples were coded to ensure sample tracking and confidentiality on patient/donor identity.
The diagnosis of all patients was achieved by transrectal ultrasound (TRUS)-guided prostate biopsy. Biopsies were performed using an end-fire ultrasound transducer Falcon 2101 (BK Medical, Herlev, Denmark) and an automatic 18 gauge needle (Bard, Covington, GA). The minimum number of cores removed in every procedure was 10, and between 1 and 8 additional cores were obtained based on age and total prostate volume, according to the Vienna nomogram.
In the scope of this study, a total of 1,056 first prostate biopsies were performed. In 402 cases, PCa was detected and, of the remaining 654 PCa-free cases, 280 presented HGPIN (42.8%). Of these 280 HGPIN cases, 173 agreed to a repeat biopsy within a period of 1-3 years, independently of the evolution of their serum PSA levels. It was possible to collect urine samples before the second prostate biopsy in 124 of these patients. The final study population analyzed consisted of 114 men with a first biopsy result of HGPIN who underwent at least one additional repeat prostate biopsy. In 24 cases, the second or subsequent biopsies revealed the presence of PCa, whereas in the rest of the cases the patients were diagnosed with a benign pathology (Table I). All biopsy materials were evaluated by the same experienced uropathologist.

Sample Preparation
Urine samples (30-50 ml) were collected after DRE within days before a repeat biopsy. Urine was collected in urine collection cups, kept on ice, transported to the lab and processed within 2 hr of its

RNA Extraction and RT-qPCR
Total RNA of the cellular fraction was extracted using the QIAamp Viral Kit (Qiagen, Hilden, Germany), following the manufacturer's instructions.
Reverse transcriptase PCR of extracted RNA was conducted to determine expression of six endogenous genes (Table II) and 14 target genes (Table III). cDNA obtained from the reverse transcription was pre-amplified using RealTime ready cDNA Pre-Amp Master, in combination with RealTime ready Pre-Amp Primer Pools (Roche Applied Science, Indianapolis, IN).
All RT-qPCR reactions were carried out in triplicate on RealTime ready custom qPCR plates (Roche Applied Science) and fluorescent signals were measured in a LightCycler 480 II (Roche Applied Science). Data analysis was carried out using the LightCycler 480 software (v. 1.5).

Differential Expression Analysis
From the initial set of 114 samples, samples with KLK3 Ct values > 35 and/or a geometric mean of all Ct values > 33 were excluded due to a low amount of cDNA. In the final cohort, outlier detection was performed by computing the Kolmogorov-Smirnov statistic Ka between each sample's distribution and the distribution of the pooled data. Relative gene expression was calculated by the ΔΔCt method. The endogenous reference gene for the data normalization was selected from a list of six commonly used housekeeping genes: Hypoxanthine Phosphoribosyltransferase 1 (HPRT1), Glyceraldehyde-3-Phosphate Dehydrogenase (GAPDH), Delta-Aminolevulinate Synthase 1 (ALAS1), TATA Box Binding Protein (TBP), Beta-2-microglobulin (B2M), and Kallikrein-3/Prostate Specific Antigen (KLK3). The selection criteria were the lowest coefficient of variation, the lowest Ct geometric mean, no differences between groups (calculated by Mann-Whitney test), and an area under the receiver-operating characteristic (ROC) curve (AUC) close to 0.5, that is, close to the no-discrimination line.
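The quality filter and the ΔΔCt calculation described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names, the example Ct values, and the choice of calibrator (for instance, the mean ΔCt of the benign group) are assumptions, and the formula assumes an amplification efficiency of 100%.

```python
from statistics import geometric_mean

KLK3_CT_MAX = 35.0      # exclusion threshold stated in the text
GEOMEAN_CT_MAX = 33.0   # exclusion threshold stated in the text


def passes_qc(ct_values):
    """Return False if KLK3 Ct > 35 and/or the geometric mean of all Ct values > 33."""
    if ct_values["KLK3"] > KLK3_CT_MAX:
        return False
    return geometric_mean(list(ct_values.values())) <= GEOMEAN_CT_MAX


def delta_ct(ct_values, target, reference="TBP"):
    """Delta Ct: target Ct minus reference-gene Ct (TBP by default)."""
    return ct_values[target] - ct_values[reference]


def relative_expression(ct_values, calibrator_delta_ct, target, reference="TBP"):
    """Fold change by the delta-delta Ct method: 2 ** -(dCt_sample - dCt_calibrator)."""
    ddct = delta_ct(ct_values, target, reference) - calibrator_delta_ct
    return 2.0 ** (-ddct)


# Hypothetical Ct values for one sample (not study data).
sample = {"KLK3": 28.0, "TBP": 30.0, "PCA3": 31.0}
if passes_qc(sample):
    # Hypothetical calibrator delta Ct of 2.0 for PCA3.
    print(relative_expression(sample, 2.0, "PCA3"))
```

A lower ΔCt than the calibrator gives a fold change above 1, which corresponds to overexpression relative to the calibrator.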

Statistical Analysis
Univariate tests and univariate and multivariate logistic regressions were used to examine associations between PCa diagnostic status and the tested genes. For this purpose, the Mann-Whitney test [20] and the Random Forest method [21] were used to measure variable importance. Both methodologies were evaluated by leave-one-out cross-validation (LOOCV) to correct for bias in their estimates. In addition, ROC analysis was used to assess the performance of each gene (with 95% confidence interval).
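The Mann-Whitney test and the ROC analysis are closely related: the AUC equals the Mann-Whitney U statistic divided by the number of case-control pairs. The sketch below illustrates that relation in plain Python; the function name and the example values are hypothetical, and it does not reproduce the confidence intervals or the LOOCV procedure of the study.

```python
def auc_mann_whitney(case_values, control_values):
    """AUC as U / (n_cases * n_controls), counting tied pairs as one half.

    Equivalent to the probability that a randomly chosen case has a higher
    value than a randomly chosen control.
    """
    if not case_values or not control_values:
        raise ValueError("both groups must be non-empty")
    u = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                u += 1.0
            elif case == control:
                u += 0.5
    return u / (len(case_values) * len(control_values))


# Hypothetical normalized expression values (not study data).
pca_group = [3.0, 4.0, 5.0]
benign_group = [1.0, 2.0, 3.0]
print(auc_mann_whitney(pca_group, benign_group))
```

An AUC of 0.5 corresponds to the no-discrimination line, which is the behavior sought in a reference gene, whereas a candidate biomarker should lie well above it.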
All possible multiplex models were created using combinations of the most significant genes obtained in the univariate analysis and multivariate logistic regression was applied to them. In order to obtain better models, Akaike information criterion [22] based backward selections were used to drop insignificant terms in all the resulting models via stepwise generalized linear models [23].
The percentage of biopsies potentially avoided by the use of the proposed biomarkers was calculated by adding up the number of true negatives and the number of false negatives, divided by the total studied population.
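As a minimal sketch of this calculation, every patient with a negative test (true negative or false negative) is counted as one avoided biopsy. The function name and the counts in the example are illustrative assumptions and are not taken from the study tables.

```python
def fraction_biopsies_avoided(true_negatives, false_negatives, total_patients):
    """(TN + FN) / N: the share of patients whose test is negative."""
    if total_patients <= 0:
        raise ValueError("total_patients must be positive")
    return (true_negatives + false_negatives) / total_patients


# Illustrative counts only: 41 true negatives and 1 false negative among 90 patients.
avoided = fraction_biopsies_avoided(41, 1, 90)
print(f"{avoided:.1%} of repeat biopsies avoided")
```

Note that false negatives enter the numerator because those patients would also skip the biopsy, so this figure should always be read together with the NPV.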

Samples Performance
All patients enrolled in the study were men undergoing repeat prostate biopsy to rule out PCa when HGPIN was previously identified. Urine was obtained directly after DRE, and urinary cells were pelleted and used for analysis of RNA expression levels. For the initial sample cohort studied, 90 out of the 114 specimens met the initial quality standards and yielded sufficient prostate-derived cells (KLK3 Ct value 29) or overall amount of RNA (high geometric mean of all Ct values 27) for further analysis, corresponding to an informative specimen rate of 78.9% (benign 78.9% and PCa 21.1%).
This final cohort was analyzed for outliers, showing that none of the remaining samples should be considered an outlier (Fig. 1).

Data Normalization
Urine contains a highly variable mixture of cells of different origins and, to date, there is no consensus about the best way to normalize gene expression data retrieved from this source. For that reason, we first sought to determine an optimal endogenous reference gene that can be used for normalization of target gene expression. For this purpose, we assayed five universal housekeeping genes ALAS1, B2M, GAPDH, HPRT1, and TBP, in addition to the prostate-specific KLK3, in the total final cohort of 90 samples.
Statistical analysis of the expression of the endogenous genes showed that the mRNA that best fit the stability criteria, with minimal differences between groups, was TBP (Suppl. Table SI). Therefore, this gene was chosen for the standardization in our samples. Notably, we observed in our cohort that the commonly used normalizing gene KLK3 actually behaved as a biomarker itself, as its levels seemed to be significantly increased in the confirmed PCa samples. For this reason, KLK3 was subsequently included in the group of target genes for further evaluation.
Before proceeding with the next steps of the study, all significant markers were cross-validated using the LOOCV method. The genes CDH1, PSMA, GOLM1, KLK3, PSGR, and PCA3 were selected for further characterization, under the criterion of all individual outcomes being significant. All of these genes appeared overexpressed in PCa urine samples when compared to urine from patients presenting isolated HGPIN (Fig. 2a). The AUCs for these markers individually ranged from 0.66 to 0.77. Fixing the sensitivity at 95%, the obtained specificities for the individual markers ranged between 24% and 37% (Table III).
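Reporting specificity at a fixed sensitivity amounts to choosing the highest cut-off that still classifies the required share of PCa cases as positive and then counting the benign cases that fall below it. The sketch below shows one way to do this; the function name, the "value at or above the cut-off is positive" convention, and the example values are assumptions made for illustration.

```python
import math


def specificity_at_sensitivity(case_values, control_values, min_sensitivity=0.95):
    """Return (cutoff, sensitivity, specificity) for the highest cut-off that
    keeps sensitivity >= min_sensitivity. A value >= cutoff is called positive.
    """
    ranked_cases = sorted(case_values, reverse=True)
    # Number of cases that must test positive; the small epsilon guards
    # against floating-point products such as 18.999999999999996 or 19.000000000000004.
    needed = max(1, math.ceil(min_sensitivity * len(ranked_cases) - 1e-9))
    cutoff = ranked_cases[needed - 1]
    sensitivity = sum(v >= cutoff for v in case_values) / len(case_values)
    specificity = sum(v < cutoff for v in control_values) / len(control_values)
    return cutoff, sensitivity, specificity


# Hypothetical expression values (not study data), with sensitivity fixed at 75%.
cases = [2.0, 5.0, 6.0, 8.0]
controls = [1.0, 3.0, 4.0, 7.0]
print(specificity_at_sensitivity(cases, controls, min_sensitivity=0.75))
```

Fixing the sensitivity first makes markers comparable on the quantity that matters for avoiding biopsies: the proportion of benign cases correctly left unbiopsied.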
Fig. 2. (a) Candidate biomarkers were analyzed in urinary sediment cDNA from patients referred for a repeat biopsy after a previous diagnosis of HGPIN. Only biomarkers that were significant predictors for PCa (see Table III) are shown. Boxplots represent the expression levels for each one of the genes. All genes present an increased expression in patients with PCa (dark gray) compared to patients with a benign condition (light gray). Target gene expression was normalized following the ΔΔCt method using TBP as endogenous reference. The indicated P-values correspond to the univariate analysis (Mann-Whitney test) results. Statistical significance is represented as *P < 0.05, **P < 0.01, ***P < 0.001. (b) Multivariate regression analysis resulted in a number of multiplex models (see Suppl. Table I), of which the best three were selected. These models consist of combinations of KLK3, PSMA, PSGR, CDH1, and GOLM1. ROC curves were generated according to the predicted probabilities derived from each one of the models. All the multiplex models (gray lines) present a higher AUC than PCA3 alone (black dotted line). (c) Same as (b), but LOOCV results were used to generate the curves. The AUCs of the LOOCV models are also greater than the AUC of LOOCV PCA3.
Then, a multivariate regression analysis was applied to test whether the variables could have a better performance when combined in a multiplex model. This analysis resulted in a total of 36 possible multiplex models that can distinguish between benign conditions and PCa better than any of the individual target genes (Suppl. Table II).
The three most promising models out of the 36 were selected based on overall performance values (Fig. 2b, c, and Table IV), and further evaluated. Each one of these three putative models greatly outperformed PCA3 (multiplex models AUC = 0.81-0.86 vs. PCA3 AUC = 0.70), as well as all the other assayed target genes when used alone, for the detection of PCa. When fixing the sensitivity at 95%, the obtained specificities ranged from 41% to 58%, which is considerably higher than the 30% of PCA3, the current gold standard for diagnosis in urine. The PPV and NPV ranged from 30% to 38%, and from 97% to 98%, respectively. PCA3 has a similar NPV, indicating that both tests are equally reliable for discarding PCa in case of a negative result. However, the lower PPV of PCA3 (27% vs. 38%) indicates that PCA3 misclassifies a higher number of benign cases by giving a positive result, which translates into a higher number of biopsies that need to be performed during follow-up.
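The link between specificity and predictive values can be checked with Bayes' rule. The sketch below is an illustration under one assumption: that the PCa prevalence in the analyzed cohort is the 21.1% reported for the cohort composition. The function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


PREVALENCE = 0.211  # assumption: PCa share reported for the cohort

# Sensitivity fixed at 95%; specificities as reported in the text.
for label, specificity in [("best multiplex model", 0.58), ("PCA3", 0.30)]:
    ppv, npv = predictive_values(0.95, specificity, PREVALENCE)
    print(f"{label}: PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

Under this assumption the calculation gives a PPV of about 38% for a specificity of 58% and about 27% for a specificity of 30%, in line with the reported values; at a fixed high sensitivity the NPV changes little, so the gain in specificity shows up mainly in the PPV.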

High-grade prostatic intraepithelial neoplasia (HGPIN) is associated with an increased probability of developing PCa or of co-existing undetected PCa. For this reason, a diagnosis of HGPIN automatically leads to intensive surveillance over several years, during which patients with suspected PCa must undergo multiple repeat biopsies. However, only a small fraction of these patients will eventually be diagnosed with PCa or require active treatment. Consequently, a great number of the invasive repeat biopsies performed are unnecessary, causing pointless discomfort to the patient and adding expense to health care systems that are already heavily burdened. Here we developed and tested a simple qPCR-based non-invasive approach to differentiate true HGPIN cases from those at risk of hidden (i.e., undetected) PCa. To this end, we analyzed the expression of 14 candidate mRNA markers in urine sediment samples of a fully annotated clinical cohort of 90 patients and compared this to the current gold standard (i.e., FDA-approved PCA3).
In the first place, we aimed to establish a good data normalization strategy. For this purpose, several housekeeping genes were analyzed alongside KLK3, which is commonly used to normalize urine data in PCa studies. It became apparent that KLK3 differs between groups and is able to differentiate PCa from HGPIN patients. One possible explanation for this unexpected result is that patients with a negative biopsy tend to have fewer cells of prostatic origin and therefore have less KLK3 in their post-DRE urine than their malignant counterparts. This may relate to prostate cancer biology, as loss of cell-cell contacts in developing cancer may facilitate shedding of prostate epithelial cells into the urine [34]. The increased levels of KLK3 in PCa patients may thus cause researchers to erroneously discard potential biomarkers that follow the same pattern of expression as KLK3. We propose that reference genes that are constitutively expressed by all cells, such as TBP, are better alternatives.
The core of this project consisted of the analysis of the expression levels of a list of 14 candidate biomarkers, of which PCA3, PSMA, PSGR, KLK3, GOLM1, CDH1, and SPINK1 were found to be overexpressed in PCa when compared to isolated HGPIN cases. In the context of PCa detection, the clinical utility of a PCA3 gene-based molecular assay in urine has been extensively demonstrated, and it is currently utilized in a commercially available test under the name PROGENSA PCA3, which was approved by the US Food and Drug Administration (FDA) in 2012 [35,36]. In this assay, PCA3 and KLK3 mRNAs are quantified, and the PCA3 Score is calculated as the ratio of PCA3 and KLK3 (PCA3 mRNA/KLK3 mRNA × 1000). In a recent study including 177 patients undergoing repeat biopsy, the reported sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of the PCA3 Score (cut-off 20) in PCa diagnosis were 91.7, 25.6, 31.5, and 89.5%, respectively. In this cohort, the use of PCA3 measurements could have avoided 21% of repeat biopsies [37]. In our hands, using a cohort of 90 patients, the specificity, PPV and NPV values of PCA3 at a fixed sensitivity of 91.7% were 30, 26, and 93%, respectively, while 25.1% of the repeat biopsies could have been avoided. Therefore, our results in an independent cohort are comparable to previously published data, underscoring the accuracy of our measurements. Notably, however, several of the genes we assayed in this study strongly outperformed PCA3 for the detection of PCa in repeat prostate biopsies. In particular, CDH1 and PSGR show higher AUC values compared to PCA3 (0.77 and 0.75 vs. 0.70).
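For reference, the PCA3 Score described above is a simple ratio. The sketch below only illustrates the formula and the cut-off of 20 quoted from the cited study; the function name and the input quantities are hypothetical and do not reproduce the commercial assay.

```python
PCA3_SCORE_CUTOFF = 20.0  # cut-off quoted in the text for the cited cohort


def pca3_score(pca3_mrna, klk3_mrna):
    """PCA3 Score = (PCA3 mRNA / KLK3 mRNA) x 1000."""
    if klk3_mrna <= 0:
        raise ValueError("KLK3 mRNA quantity must be positive")
    return pca3_mrna / klk3_mrna * 1000.0


# Hypothetical mRNA quantities (not study data).
score = pca3_score(50.0, 2000.0)
print(score, score >= PCA3_SCORE_CUTOFF)
```

Because KLK3 is the denominator, any systematic increase of KLK3 in PCa urine lowers the score for those patients, which is consistent with the concern about using KLK3 for normalization.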
Finally, multiplex models combining the genes overexpressed in PCa were developed, resulting in an improvement of the predictive power. It is worth noting that our multiplex models, if validated, could save between a third and almost half (33-47%) of the repeat biopsies currently performed, representing a substantial improvement over the FDA-approved PCA3, with only 21% of biopsies saved (Table IV). In data calculated for the United States, the incidence of isolated HGPIN averages 9% (range, 4-16%) of prostate biopsies, representing 115,000 new cases of HGPIN without cancer diagnosed each year [7]. This means that, according to the calculated percentages, it would be possible to save approximately between 37,950 and 54,050 repeat biopsies annually, a number that is only expected to increase due to an aging population. For this calculation, we used the formula: % of biopsies saved = [true negatives (test negative and biopsy negative) + false negatives (test negative and biopsy positive)]/all patients. Although this would imply that one could save a biopsy by incorrectly classifying a patient as not having PCa (test negative and biopsy positive), the number of false negatives at a sensitivity of 95% is negligible (in this study, one patient; NPV ≥ 97%).
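The annual estimate is a direct multiplication of the yearly number of isolated HGPIN diagnoses by the percentage of biopsies saved. A short check of that arithmetic, using integer percentages to avoid rounding artifacts (the function and variable names are illustrative):

```python
ANNUAL_HGPIN_CASES = 115_000  # new isolated HGPIN cases per year in the United States [7]


def biopsies_saved_per_year(percent_saved, annual_cases=ANNUAL_HGPIN_CASES):
    """Number of repeat biopsies saved for a given integer percentage."""
    return annual_cases * percent_saved // 100


low = biopsies_saved_per_year(33)   # lower bound of the multiplex models
high = biopsies_saved_per_year(47)  # upper bound of the multiplex models
print(low, high)
```

The calculation assumes one repeat biopsy per HGPIN case per year, so it should be read as an order-of-magnitude estimate.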
A possible limitation of our study is that the second biopsy outcome was used as the definitive diagnosis of the patient. However, there is still a small chance of missing a PCa in this second biopsy. In some cases, PCa is finally diagnosed after a third or even subsequent biopsies. For this reason, we cannot discard the possibility of having misclassified a small number of patients in our study. Furthermore, although very promising results have been obtained, a larger sample size would be needed to further validate the best predictive models.
In recent years, a variety of techniques have appeared for the optimization of prostate biopsies, in an attempt to minimize the percentage of false-negative results. TRUS-guided biopsy is still the standard approach; however, this technique has multiple limitations owing to the operator's inability in most cases to directly visualize and target prostate lesions. Magnetic resonance imaging (MRI) of the prostate can overcome many of these limitations by directly depicting areas of abnormality and allowing targeted biopsies [38]. It would be highly interesting to validate our results in a new set of patients who have also undergone an MRI-guided biopsy, in order to determine (i) whether these two methods can complement each other, (ii) whether our biomarker profile could be a useful tool to decide about performing an MRI, and (iii) the cost and specificity of these two approaches, always aiming to define a reliable diagnostic method that can help prevent unnecessary biopsies.
Alternative biomarker-based tests are also being developed and investigated, with the aim of providing a more accurate means of PCa detection in repeat prostate biopsy cohorts. A promising new test based on serum PSA is the Prostate Health Index (PHI), which has recently been approved in the United States, Europe, and Australia. PHI is a mathematical formula that combines total PSA, free PSA and the [-2] form of proPSA (the inactive precursor of PSA) into a single score that can be used to aid in clinical decision-making [39]. Several studies have documented the performance of PHI in large groups of patients, reporting AUC values ranging from 0.68 to 0.74 [40-45], which is lower than that of our multiplex models (AUC = 0.81-0.86) and similar to that of PCA3 (AUC = 0.70).
Epigenetic tests performed on biopsy material have also been reported as promising independent predictors of PCa risk to guide decision making for repeat biopsy. An assay involving the epigenetic profile of GSTP1, APC, and RASSF1 resulted in an NPV of 88%. As the multiplex panels presented in this study have a higher NPV (97-98%), combining them with the outcome of this epigenetic assay may help to further decrease the number of unnecessary repeat prostate biopsies [46].
Finally, it is also worth mentioning the use of nomograms for the prediction of PCa. Benecchi et al. described a predictive model that incorporates clinical data and PSA kinetic data, showing outstandingly good results (AUC = 0.856) [47]. This nomogram performs extremely well in the general population suspicious of PCa (with at least one negative biopsy), but its utility in the specific case of HGPIN patients remains to be elucidated. For evaluation of clinical utility of our results, a nomogram with performance statistics is planned to be generated in independent cohorts.
Porpiglia et al. performed a study comparing the predictive value of PCA3, MRI-guided prostate biopsy and PHI in the repeat biopsy setting. They found that the most significant contribution for PCa detection was provided by MRI-guided biopsies, with an AUC of 0.94 and a specificity of 57% (at a 95% sensitivity). In fact, the inclusion of PCA3 and/or PHI in models containing MRI-guided prostate biopsy did not substantially improve the net benefit. These results indicate that MRI-guided prostate biopsy has a high diagnostic accuracy in identifying patients with PCa in the repeat biopsy setting; however, its combination with minimally invasive biomarker tests such as PCA3 or PHI does not improve the overall performance [48]. It is clear that the prostate biopsy result will continue to be the gold standard for PCa diagnosis in the near future. However, implementation of an accurate non-invasive biomarker-based test such as the one evaluated here could significantly reduce the frequency of invasive and expensive procedures, once validated in independent cohorts. There is reason to be optimistic about successful validation, since in our cohort the currently approved PCA3 test showed a performance similar to that reported by others [37]. Given that our test can be easily implemented in daily care, many patients erroneously suspected of having PCa could benefit from this work.

CONCLUSIONS
In summary, we have shown that a multiplexed RT-qPCR assay on urine sediments from patients presenting for a repeat prostate biopsy due to a diagnosis of HGPIN has a significantly improved predictive ability when compared to PCA3 or any other assayed gene when used alone. Further evaluation and validation of these biomarkers in larger and independent cohorts is thus warranted. In the future, a multiplexed urine-based diagnostic test for PCa with a higher specificity but the same sensitivity as the serum PSA test could be used for an easier management of patients with HGPIN, aiding clinicians to select patients that will benefit from a repeat biopsy.