Abstract

Aims. To evaluate the applicability of Latent Class Analysis (LCA) and the accuracy of transient elastography (TE), aspartate aminotransferase-to-platelet ratio index (APRI), enhanced liver fibrosis (ELF), and liver biopsy (LB) for liver fibrosis assessment in a model without a gold standard. Methods. Significant fibrosis was defined as TE ≥ 7.1 kPa, APRI ≥ 1.5, ELF > 9.37, or LB METAVIR F ≥ 2. Cirrhosis was defined as TE > 12.4 kPa, APRI ≥ 2.0, ELF > 10.31, or LB METAVIR F = 4. Results. 117 patients with chronic hepatitis C were included. In the LCA, for significant fibrosis the sensitivities and specificities (95% CI) were 0.92 (0.86–0.98) and 0.79 (0.72–0.86) for TE; 0.47 (0.40–0.54) and 0.99 (0.95–1.00) for APRI; 0.81 (0.74–0.88) and 0.78 (0.71–0.85) for ELF; and 0.86 (0.68–1.00) and 0.91 (0.79–1.00) for LB. For cirrhosis, the sensitivities and specificities were 0.92 (0.76–1.00) and 0.94 (0.91–0.97) for TE; 0.57 (0.37–0.77) and 0.97 (0.93–1.00) for APRI; 0.94 (0.84–1.00) and 0.88 (0.82–0.94) for ELF; and 0.30 (0.12–0.48) and 1.00 for LB. Conclusion. LCA was useful for evaluating the accuracy of methods for liver fibrosis staging. Sensitivities and specificities of noninvasive methods were higher in LCA than when LB was used as the gold standard.

1. Introduction

Chronic hepatitis C remains a major public health issue and is one of the leading causes of cirrhosis worldwide [1]. The correct determination of liver fibrosis stage has implications for prognostic, therapeutic, and monitoring purposes [2], and the eradication of hepatitis C virus (HCV) has been associated with lower rates of liver-related complications [3]. Serological biomarkers, such as FibroTest, FibroMeter, aspartate aminotransferase-to-platelet ratio index (APRI), and enhanced liver fibrosis (ELF), or imaging methods, such as transient elastography (TE), have been recommended to stage liver fibrosis in chronic hepatitis C [4]. TE has been described as an accurate method for fibrosis assessment in HIV/HCV coinfected patients [5, 6]. The combination of platelet count and liver stiffness measurement by TE has been validated to predict large gastroesophageal varices in patients with compensated advanced chronic liver disease [7]. The high efficacy of direct-acting antiviral drugs (DAAs) has revolutionized the management of patients with chronic hepatitis C. However, detection of compensated cirrhosis leads to screening for hepatocellular carcinoma and might change the choice of therapeutic regimen for HCV eradication [8]. Most noninvasive methods for fibrosis assessment have been developed and validated using liver biopsy as a reference. However, the diagnostic performance of liver biopsy has been challenged by the length of the liver specimen [9], sampling error [10], and intraobserver variability [11]. Therefore, the diagnostic accuracy of noninvasive methods might have been underestimated, since liver biopsy might not be a perfect gold standard [12].

Latent Class Analysis (LCA) is a mathematical modeling approach, widely applied in social research, that can be used to evaluate the accuracy of tests in the absence of a gold standard [13]. In this methodology, a reference standard is constructed based on the combination of observed and estimated test results from each patient [14]. A potential limitation is that this combination might not fit the data, which may be due either to nonindependence between the tests (dependency) or to variability of the disease definition (within-class heterogeneity). In addition, LCA assumes that the liver biopsy result is dichotomous, whereas fibrosis staging uses a five-stage ordinal scale (from F0 to F4). So far, this methodology has been used in very few studies in hepatology [12, 15, 16]. The aims of the study were (i) to evaluate the applicability of LCA models in patients with chronic hepatitis C and (ii) to estimate the sensitivities and specificities of TE, APRI, ELF, and liver biopsy for fibrosis assessment in an approach without a gold standard.

2. Material and Methods

2.1. Study Design

Patients with chronic hepatitis C from two centers in Rio de Janeiro (University of the State of Rio de Janeiro and Bonsucesso Federal Hospital) were prospectively enrolled from April 2011 to July 2012 for this cross-sectional study. Patients older than 18 years with chronic hepatitis C, characterized by the presence of HCV-RNA in blood serum, were included. The exclusion criteria were hepatitis B or human immunodeficiency virus coinfection, self-reported excessive alcohol intake (>40 g/day in men and >20 g/day in women), chronic kidney disease, LB specimens with fewer than six portal tracts, and unreliable liver stiffness measurement (LSM).

Liver fibrosis staging was classified according to the METAVIR scoring system [17], and significant fibrosis and cirrhosis were defined as fibrosis stage F ≥ 2 and F = 4, respectively. Noninvasive tests were performed within 3 months of the date of liver biopsy. The study protocol was conducted in accordance with the Declaration of Helsinki and was approved by the local ethics committees. All patients signed an informed consent form upon enrollment in the study.

2.2. Transient Elastography (TE)

TE was performed with the M probe of FibroScan® (EchoSens, Paris, France) by an experienced operator (>500 exams) blinded to biomarker results, following a validated procedure [13]. TE was considered unreliable in the presence of any of the following criteria: (i) <10 successful measurements; (ii) an interquartile range (IQR) higher than 30% of the median value; or (iii) a success rate, defined as the ratio between the number of valid and total measurements, lower than 60% [18]. Liver stiffness was taken as the median of all valid measurements, and fibrosis staging was converted to the METAVIR scoring system as proposed by Castéra et al. [19]: <7.1 kPa as F0–F1; 7.1–9.4 kPa as F2; 9.5–12.4 kPa as F3; and >12.4 kPa as F4.
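The reliability criteria and the stiffness-to-stage conversion described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used in the study: the function names are hypothetical, the quartile method (Python's default "exclusive" method) is an assumption, the stage labels F0–F1 to F4 are taken as the usual METAVIR groupings for these cut-offs, and stiffness is assumed to be reported to one decimal place, so that no value falls between 9.4 and 9.5 kPa.

```python
from statistics import median, quantiles


def te_is_reliable(valid_kpa, total_attempts):
    """Return False if any of the three unreliability criteria is met."""
    if len(valid_kpa) < 10:                        # (i) fewer than 10 valid shots
        return False
    q1, _, q3 = quantiles(valid_kpa, n=4)          # quartile method is an assumption
    if (q3 - q1) > 0.30 * median(valid_kpa):       # (ii) IQR above 30% of the median
        return False
    if len(valid_kpa) / total_attempts < 0.60:     # (iii) success rate below 60%
        return False
    return True


def metavir_from_te(stiffness_kpa):
    """Map median liver stiffness (kPa) to a METAVIR stage group."""
    if stiffness_kpa < 7.1:
        return "F0-F1"
    if stiffness_kpa < 9.5:
        return "F2"
    if stiffness_kpa <= 12.4:
        return "F3"
    return "F4"
```

For example, ten identical valid readings out of twelve attempts pass all three criteria, whereas the same ten readings out of twenty attempts fail on success rate.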

2.3. Serological Biomarkers

Serum alanine aminotransferase (ALT) and aspartate aminotransferase (AST) were measured by enzymatic assay (ADVIA 1200, Siemens, IL, USA). The upper limit of normal (ULN) of aminotransferase values was 55 IU/L and 34 IU/L for ALT and AST, respectively. APRI was calculated according to the following formula: [AST level (/ULN) / platelet count (10⁹/L)] × 100. Liver fibrosis, estimated by APRI, was converted to the METAVIR scoring system as proposed by Wai et al. [20]: ≥1.5 as F ≥ 2 and ≥2.0 as F = 4. ELF was measured in frozen serum (−70°C) collected on the same day as TE. Serum amino-terminal propeptide of type III procollagen (PIIINP), hyaluronic acid (HA), and tissue inhibitor of matrix metalloproteinases-1 (TIMP-1) were measured in a random access automated clinical immunochemistry analyzer that performs magnetic separation enzyme immunoassay tests (ADVIA Centaur; Siemens Healthcare Diagnostics, Tarrytown, NY). The ELF score was calculated from these three markers using the algorithm recommended by the manufacturer (Siemens, NY, USA). Liver fibrosis, estimated by ELF, was converted to the METAVIR scoring system as proposed by Fernandes et al. [21]: >9.37 as F ≥ 2 and >10.31 as F = 4.
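As a worked illustration of the APRI calculation and its dichotomization, the following minimal Python sketch applies the formula and the Wai et al. cut-offs quoted above. The function and key names are hypothetical, and the input values in the usage note are invented for illustration, not study data.

```python
def apri(ast_iu_l, ast_uln_iu_l, platelets_1e9_per_l):
    """APRI = [AST (as multiples of ULN) / platelet count (10^9/L)] x 100."""
    return 100.0 * (ast_iu_l / ast_uln_iu_l) / platelets_1e9_per_l


def apri_calls(score):
    """Dichotomous calls using the cut-offs quoted in the text (>=1.5 and >=2.0)."""
    return {
        "significant_fibrosis": score >= 1.5,
        "cirrhosis": score >= 2.0,
    }
```

With the AST ULN of 34 IU/L used in this study, a hypothetical patient with AST of 102 IU/L and 120 × 10⁹/L platelets would have an APRI of 2.5 and be called positive for both endpoints, whereas AST of 68 IU/L with 200 × 10⁹/L platelets gives an APRI of 1.0 and two negative calls.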

2.4. Liver Biopsy

Percutaneous liver biopsies were performed under local anesthesia (FFF) using a 16-G Menghini needle guided by ultrasound, as a day-clinic procedure. All samples were fixed in a 10% neutral-buffered formalin solution and cut into 5 μm thick sections. Routinely, haematoxylin and eosin, Masson’s trichrome, and reticulin stains were performed. Biopsies were classified using the METAVIR scoring system [17] by the same experienced pathologist (FC), who was blinded to clinical characteristics and noninvasive test results.

2.5. Latent Class Analysis (LCA)

LCA is a mathematical modeling approach that estimates the diagnostic accuracy of tests in a scenario where there is no gold standard. The disease status of an individual is treated as a categorical latent variable with categories such as “disease” or “no disease,” which are named “latent classes.” Through maximum likelihood estimation, the modeling aims to obtain a unique solution for constructing a reference standard, from which sensitivities and specificities for each test can be estimated. In addition, the assumption of conditional independence among tests must be respected and the data must fit the model (likelihood-ratio goodness-of-fit statistic [likelihood-ratio chi-squared (L²)] with a p value >0.05) [22].
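For readers unfamiliar with the model, the standard two-class formulation under conditional independence can be written as follows. The notation is introduced here for illustration and is not taken from the original article: \(\pi\) is the prevalence of the latent "disease" class, \(Se_j\) and \(Sp_j\) are the sensitivity and specificity of test \(j\), \(y_j \in \{0,1\}\) is the dichotomous result of test \(j\), \(n_k\) and \(m_k\) are the observed and model-expected counts for result pattern \(k\), and \(N\) is the sample size.

```latex
% Probability of one pattern of J dichotomous test results
P(y_1,\dots,y_J) \;=\;
  \pi \prod_{j=1}^{J} Se_j^{\,y_j}\,(1-Se_j)^{\,1-y_j}
  \;+\; (1-\pi) \prod_{j=1}^{J} (1-Sp_j)^{\,y_j}\, Sp_j^{\,1-y_j}

% Expected count and likelihood-ratio goodness-of-fit statistic
m_k \;=\; N \, P(\text{pattern } k), \qquad
L^2 \;=\; 2 \sum_{k} n_k \,\ln\!\frac{n_k}{m_k}
```

With \(J = 4\) tests there are \(2^4 = 16\) patterns and \(2J + 1 = 9\) free parameters, so this model has \(16 - 1 - 9 = 6\) degrees of freedom.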

In the present study, the LCA models were constructed upon four conditionally independent tests: APRI, ELF, TE, and liver biopsy. APRI and ELF are both serological biomarkers, but they include different parameters in their respective formulas. TE estimates liver fibrosis from the propagation of a shear wave through the hepatic parenchyma, tracked by ultrasound, whereas liver biopsy stages fibrosis by histological analysis using a semiquantitative score. Two latent class (2LC) models were fitted, one for presence or absence of significant fibrosis (METAVIR F ≥ 2) and the other for presence or absence of cirrhosis (METAVIR F = 4). In each of the models, the patient’s status could be classified in two mutually exclusive groups: presence or absence of “disease.” Using four tests with a dichotomous result (i.e., positive or negative) in each patient, there were 16 possible combinations for each clinical endpoint. The likelihood of observing each pattern of test results was calculated according to the probability of a positive or negative test. The number of expected cases estimated by the LCA models was compared to the number of observed cases for each pattern of test results.
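To make the fitting step concrete, the sketch below estimates a two-class model with conditional independence using the expectation-maximization (EM) algorithm, which is one common way to obtain maximum likelihood estimates for latent class models. It is not the software used in the study (the authors used LEM), and the function names, starting values, and iteration count are all assumptions chosen for illustration.

```python
from itertools import product
from math import prod


def all_patterns(n_tests):
    """All 2**n_tests patterns of dichotomous results (16 patterns for 4 tests)."""
    return list(product((0, 1), repeat=n_tests))


def pattern_prob(pattern, prev, sens, spec):
    """Joint probabilities (diseased, non-diseased) of one result pattern."""
    p_dis = prev * prod(s if y else 1 - s for y, s in zip(pattern, sens))
    p_non = (1 - prev) * prod(1 - c if y else c for y, c in zip(pattern, spec))
    return p_dis, p_non


def fit_2lc(counts, n_tests, n_iter=2000):
    """EM fit; `counts` maps each 0/1 pattern tuple to its observed count."""
    obs = {p: c for p, c in counts.items() if c > 0}   # skip empty patterns
    total = sum(obs.values())
    # Starting values (an assumption): each test is better than chance.
    prev, sens, spec = 0.5, [0.8] * n_tests, [0.8] * n_tests
    for _ in range(n_iter):
        # E-step: posterior probability of disease for each pattern.
        post = {}
        for p in obs:
            d, n = pattern_prob(p, prev, sens, spec)
            post[p] = d / (d + n)
        # M-step: update prevalence, sensitivities, and specificities.
        n_dis = sum(obs[p] * post[p] for p in obs)
        n_non = total - n_dis
        prev = n_dis / total
        sens = [sum(obs[p] * post[p] * p[j] for p in obs) / n_dis
                for j in range(n_tests)]
        spec = [sum(obs[p] * (1 - post[p]) * (1 - p[j]) for p in obs) / n_non
                for j in range(n_tests)]
    return prev, sens, spec


def expected_counts(total, prev, sens, spec):
    """Model-expected count for every pattern, for comparison with observed."""
    return {p: total * sum(pattern_prob(p, prev, sens, spec))
            for p in all_patterns(len(sens))}
```

The expected counts returned by `expected_counts` are the quantities compared against the observed counts for each of the 16 patterns. Because the two class labels are interchangeable in the likelihood, the starting values determine which class is reported as "disease".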

2.6. Statistical Analysis

Continuous variables were reported as median [interquartile range, IQR] and discrete variables were reported as absolute and relative frequencies. Nonparametric tests, the Mann–Whitney test for quantitative and Fisher’s exact test for qualitative comparisons, were applied. Statistical significance was assessed using two-tailed tests. In the classical 2 × 2 analysis, the performance of TE, APRI, and ELF was assessed using the fibrosis stage obtained by liver biopsy, the classical gold standard. The area under the Receiver Operating Characteristic (AUROC) curve for diagnosis of significant fibrosis and cirrhosis was estimated by the empirical (nonparametric) method [23]. In the LCA models, the sensitivities and specificities of each test, including liver biopsy, were assessed without a gold standard. For estimation of test performance for significant fibrosis and cirrhosis by LCA, the 2LC model that assumed conditional independence among tests was compared to models with direct effects between tests. The best-fitting LCA model was chosen based on the following criteria: the p value of the likelihood-ratio chi-squared statistic (L²) had to be greater than 0.05 and the Bayesian information criterion (BIC) had to be the smallest among all competing models. Statistical analyses were performed using the STATA statistical package for Windows (2012; StataCorp LP, College Station, TX, USA) and LEM (log-linear and event history analysis with missing data, version 1.0; Tilburg, Netherlands).
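The two model-selection quantities can be illustrated with a short Python sketch. It assumes the L²-based form of the BIC (BIC = L² − df × ln N), which is one common definition for log-linear and latent class models; the source does not state which form was used. The function names are hypothetical, and the closed-form chi-squared tail probability shown here is valid only for an even number of degrees of freedom (such as the 6 degrees of freedom of a two-class model with four conditionally independent tests).

```python
from math import exp, factorial, log


def l_squared(observed, expected):
    """Likelihood-ratio chi-squared: 2 * sum(n_k * ln(n_k / m_k)).

    Patterns with zero observed count contribute nothing to the sum.
    """
    return 2.0 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable; even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form implemented for positive even df only")
    half = x / 2.0
    return exp(-half) * sum(half ** i / factorial(i) for i in range(df // 2))


def bic_from_l2(l2, df, n):
    """L²-based BIC (an assumed definition): lower values indicate better fit."""
    return l2 - df * log(n)
```

A model would then be retained when `chi2_sf_even_df(l2, df)` exceeds 0.05 and its BIC is the lowest among the candidates.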

3. Results

Among 131 eligible patients with chronic hepatitis C, 117 patients [34% male gender, median (IQR) age of 55 (48–62) years, and BMI of 26 (24–30) kg/m²] were included. Fourteen patients were excluded due to an inadequate liver specimen (fewer than 6 portal tracts) or unreliable TE. Serological biomarkers, APRI and ELF, were reliable in all patients. According to liver biopsy, the prevalence of significant fibrosis and cirrhosis was 46% and 7%, respectively. Table 1 summarizes clinical and demographic characteristics of included patients.

3.1. Classical (2 × 2) Analysis Using Liver Biopsy as a Gold Standard

Using liver biopsy as reference, for diagnosis of significant fibrosis the diagnostic performance [AUROC (95% CI)] of TE, APRI, and ELF was 0.874 (0.811–0.937), 0.810 (0.732–0.887), and 0.807 (0.725–0.889), respectively. In addition, the performance [AUROC (95% CI)] of TE, APRI, and ELF for cirrhosis was 0.942 (0.890–0.993), 0.767 (0.585–0.948), and 0.783 (0.555–1.000), respectively. Sensitivities, specificities, and positive likelihood ratio of noninvasive methods are summarized in Table 2.

3.2. Latent Class Analysis without a Gold Standard

The 2LC model that respected the conditional independence among the 4 tests (i.e., without direct effects between tests) was the model that best fitted the data for LCA. This model presented the lowest BIC among the competing models and a nonsignificant L² value for diagnosis of significant fibrosis and cirrhosis (Table 3). The observed and estimated distributions of patients according to the 4 test results for diagnosis of significant fibrosis and cirrhosis are described in Table 4. The tests were perfectly concordant in 54 (46%) patients (all positive in 20 and all negative in 34 patients) for diagnosis of significant fibrosis and in 77 (66%) patients (all positive in 4 and all negative in 73 patients) for cirrhosis diagnosis.

For diagnosis of significant fibrosis, the sensitivities (95% CI) were 0.92 (0.86–0.98), 0.47 (0.40–0.54), 0.81 (0.74–0.88), and 0.86 (0.68–1.00) for TE, APRI, ELF, and liver biopsy, respectively. In addition, specificities (95% CI) were 0.79 (0.72–0.86), 0.99 (0.95–1.00), 0.78 (0.71–0.85), and 0.91 (0.79–1.00) for TE, APRI, ELF, and liver biopsy, respectively. For cirrhosis, the sensitivities were 0.92 (0.76–1.00), 0.57 (0.37–0.77), 0.94 (0.84–1.00), and 0.30 (0.12–0.48); the specificities were 0.94 (0.91–0.97), 0.97 (0.93–1.00), 0.88 (0.82–0.94), and 1.00 for TE, APRI, ELF, and liver biopsy, respectively (Table 2).

Noninvasive methods performed better when analyzed by LCA in comparison to the classical analysis, using liver biopsy as a reference. For the cirrhosis diagnosis, sensitivity of liver biopsy was reduced and its specificity was similar in LCA compared to classical analysis when liver biopsy was the gold standard (i.e., sensitivity and specificity = 1.00) (Table 2).

4. Discussion

This study highlighted the reliability of LCA for assessment of liver fibrosis in a scenario without a gold standard. Sensitivities and specificities of noninvasive methods (APRI, ELF, and TE) were higher in LCA models than when liver biopsy was used as reference. Despite having a satisfactory specificity, the sensitivity of liver biopsy was fair for diagnosis of cirrhosis in LCA model. These results are in concordance with the hypothesis that liver biopsy might not be a perfect gold standard.

Serological biomarkers and new imaging technologies for liver fibrosis assessment have been developed in the last decade [24]. Overall, noninvasive methods for fibrosis staging have been validated using liver biopsy as the reference. However, liver biopsy has been challenged by potential adverse events and sampling error [10, 25]. Therefore, alternative methodologies to validate noninvasive markers for fibrosis assessment without the need for liver biopsy must be implemented and validated. LCA has been described as an accurate mathematical model to evaluate the performance of tests in the absence of a gold standard [26].

Using liver biopsy as the reference, we reported an accuracy of TE, APRI, and ELF for significant fibrosis and cirrhosis similar to those described in previous studies [27–29]. Positive and negative likelihood ratios reported in our study were fair and represented a slight effect on the posttest probability of presence of significant fibrosis or cirrhosis (Table 2). However, LR+ values were similar to those previously published [30], and the low LR− (<0.1) of TE for diagnosis of cirrhosis confirmed that this method might be used for exclusion of this condition in HCV patients. Considering the approach by LCA, the sensitivity of noninvasive methods for diagnosis of significant fibrosis increased when compared to classical analysis (Table 2). Similarly, better results were also reported for TE and FibroTest by authors who used the same mathematical model [12]. In the present study, regardless of the type of methodology used (classical analysis or LCA), TE and APRI were the most sensitive and the most specific tests for diagnosis of significant fibrosis, respectively (Table 2). In a sensitivity analysis considering the lower cut-off points proposed for APRI (i.e., ≥0.5 as F ≥ 2 and ≥1.0 as F = 4) [20], we reported similar results for the accuracy of TE, ELF, and liver biopsy (Supplementary Table 1 in Supplementary Material available online at https://doi.org/10.1155/2017/8252980).
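The relationship between sensitivity, specificity, likelihood ratios, and posttest probability discussed above follows from standard definitions, sketched here in Python. The function names are hypothetical. The example in the usage note plugs in the LCA estimates for TE in cirrhosis (sensitivity 0.92, specificity 0.94) and the biopsy-based prevalence of 7% purely as an illustration; the likelihood ratios actually reported by the study are those in Table 2, which come from the classical analysis and are not reproduced here.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity."""
    if specificity < 1.0:
        lr_pos = sensitivity / (1.0 - specificity)
    else:
        lr_pos = float("inf")   # no false positives: LR+ is unbounded
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pretest probability to a posttest probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)
```

With sensitivity 0.92 and specificity 0.94, LR− is 0.08/0.94, or about 0.085. Applied to a pretest probability of 0.07, a negative result would lower the probability of cirrhosis to below 1%, which illustrates why an LR− under 0.1 supports using a test to exclude a condition.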

For diagnosis of significant fibrosis, the sensitivity and specificity of liver biopsy decreased in LCA, yielding 14% false-negative and 9% false-positive results. More importantly, we observed a substantial decrease in the sensitivity of liver biopsy for cirrhosis diagnosis [0.30 (95% CI 0.12–0.48)]. Despite a decrease in sensitivity, the specificity of liver biopsy to detect significant fibrosis or cirrhosis did not change when LCA was applied. These results were aligned with previous studies: Poynard et al. reported a decrease in sensitivity of liver biopsy from 1.00 (as a gold standard) to 0.51 when LCA was applied [12]. We acknowledge that the two studies had different sample sizes and potentially distinct populations. We mostly included female patients with mild fibrosis (54%) and had a low prevalence of cirrhosis (7%). The French study included a more homogeneous population, mostly male, with 15% of cirrhotic patients [12]. In addition, the prevalence of liver fibrosis stages affects the diagnostic performance of biomarkers through spectrum bias, which might lead to discordant conclusions about test accuracy [31, 32].

The decrease in sensitivity of liver biopsy in LCA compared to classical analysis might reflect the limitations of liver biopsy. Previous studies reported a considerable discrepancy between liver biopsies performed in both hepatic lobes of the same patient [33], as well as a significant underestimation of the severity of liver disease with shorter liver specimens [9]. Bedossa et al. have described that even the “classical 20 mm length” sample could misclassify liver fibrosis in a quarter of patients [10]. Mehta et al. estimated the magnitude of the bias in the diagnostic performance of noninvasive methods due to liver biopsy limitations [34].

The diagnostic performance of a noninvasive method might never reach the maximal AUROC value using liver biopsy as the reference due to gold standard limitations [25]. Since noninvasive methods have been validated using histology as the reference, these tests might replicate the false-negative and false-positive results of liver biopsy, biasing their diagnostic accuracy. As a practical implication of LCA, the present study supports moving forward with the estimation of the diagnostic performance of noninvasive methods for fibrosis staging using methodologies without a gold standard.

Noninvasive methods also have important limitations that should be considered. TE performed with the M probe may be unreliable in 20% of patients [35], and this method might have a nonnegligible inter- and intraobserver variability [36, 37]. In addition, liver fibrosis staging by TE might be affected by the presence of necroinflammatory activity, obesity, extrahepatic cholestasis, hepatic congestion, and nonfasting status [38]. In the present study, false-positive rates were higher in obese patients (n = 30) compared to those with BMI < 30 kg/m² (Supplementary Table 3). ELF includes serum markers involved in the synthesis and breakdown of the extracellular matrix that might be elevated in other systemic diseases not related to liver fibrosis [39]. Finally, APRI has considerable variability due to differences in the laboratory upper limit of normal for AST and can be overestimated in the presence of necroinflammatory activity because its formula includes a transaminase [40].

We acknowledge that our limited sample (n = 117) with a low prevalence of cirrhosis (7%) and the histological analysis by a single blinded pathologist were the major limitations of our study. We included treatment-naïve patients with chronic hepatitis C who underwent fibrosis staging for therapeutic decision-making. The relatively low prevalence of cirrhotic patients affected both methodologies (classical analysis and LCA) used to assess the diagnostic accuracy of noninvasive methods. In addition, Rousselet et al. have demonstrated that a single pathologist specialized in liver histology can accurately stage fibrosis in a good-quality liver specimen [41]. In the present study, the histological analysis was performed by an experienced pathologist specialized in liver disease (FC) in specimens with a median length of 20 mm and a median of 10 portal tracts. We are aware that LCA models might not be considered per se as a new gold standard for estimation of sensitivities and specificities of noninvasive tests for fibrosis assessment. LCA should be interpreted as an estimator of the accuracy of diagnostic tests, with appropriate consideration of its limits and strengths [42].

The major strength of this study is that we respected the criteria for LCA and that the data fitted very well in the model without collinearity between noninvasive tests (2LC model). In our study, latent class rules were strictly respected by using four conditionally independent tests in the analysis. In addition, the 2LC model without direct effects between tests was the model that best fitted the data (lowest BIC among competing models and a nonsignificant L² value) (Table 3). In a sensitivity analysis, similar results for the accuracy of the four tests were observed in models with direct effects between noninvasive methods (Supplementary Table 2).

5. Conclusion

The application of LCA method was useful to evaluate diagnostic performance of noninvasive methods for liver fibrosis assessment. Sensitivities and specificities of noninvasive methods were increased in LCA compared to the use of liver biopsy as the gold standard. These results reinforced that liver biopsy might be an imperfect reference and that mathematical models without a gold standard should be considered in future studies for validation of noninvasive tests for liver fibrosis staging.

Disclosure

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Authors’ Contributions

Flavia F. Fernandes contributed study concept and design, data collection, interpretation of data, manuscript drafting, and critical revision; Hugo Perazzo contributed study concept and design, statistical analysis, interpretation of data, manuscript drafting, and critical revision; Luiz E. Andrade contributed data collection; Alessandra Dellavance contributed data collection and critical revision of the manuscript; Carlos Terra contributed study concept and design, analysis and interpretation of data, and critical revision of the manuscript; Gustavo Pereira contributed data collection, analysis and interpretation of data, and critical revision of the manuscript; João L. Pereira contributed analysis and interpretation of data and critical revision of the manuscript; Frederico Campos contributed histological analysis of liver biopsy and critical revision of the manuscript; Maria L. Ferraz contributed study concept and design, analysis and interpretation of data, critical revision of the manuscript, and study supervision; Renata M. Perez contributed study concept and design, analysis and interpretation of data, critical revision of the manuscript, and study supervision. All authors approved the final version of the manuscript.

Acknowledgments

This work was supported by funding from the Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro (FAPERJ) and by Siemens Healthcare, which provided kits for determination of the ELF score. The authors thank Professor Fátima A. F. Figueiredo, who passed away during the study, for her excellent technical assistance and support.

Supplementary Materials

Supplementary Table 1: Performance of tests as estimated by classical 2 x 2 analysis (liver biopsy as gold standard) and Latent Class Analysis (without gold standard) using APRI's cut-off > 0.5 and > 1.0 for diagnosis of significant fibrosis (F≥2) and cirrhosis (F=4), respectively.

Supplementary Table 2: Sensitivities and specificities of tests for diagnosis of significant fibrosis (F≥2) and cirrhosis (F=4) as estimated by Latent Class Analysis in models with co-linearity between non-invasive methods.

Supplementary Table 3: Diagnostic performance of non-invasive tests for diagnosis of significant fibrosis (F≥2) and cirrhosis (F=4) in obese patients (BMI ≥ 30 kg/m²) (n=30).

Supplementary Figure 1: Area under the ROC curve (AUROC) for diagnosis of significant fibrosis (F≥2) of (A) transient elastography (TE), (B) Aspartate-to-Platelet-Ratio-Index (APRI) and (C) Enhanced Liver Fibrosis (ELF) using liver biopsy as the reference.

Supplementary Figure 2: Area under the ROC curve (AUROC) for diagnosis of cirrhosis (F=4) of (A) transient elastography (TE), (B) Aspartate-to-Platelet-Ratio-Index (APRI) and (C) Enhanced Liver Fibrosis (ELF) using liver biopsy as the reference.
