BACKGROUND: Large-scale epidemiological studies of primary biliary cirrhosis (PBC) have been hindered by difficulties in case ascertainment.

OBJECTIVE: To develop coding algorithms for identifying PBC patients using administrative data, a widely available data source.

METHODS: Population-based administrative databases were used to identify patients with a diagnosis code for PBC from 1994 to 2002. Coding algorithms for confirmed PBC (two or more of the following: antimitochondrial antibody positivity, cholestatic liver biochemistry, and compatible liver histology) were derived using chart abstraction data as the reference. Patients with a recorded PBC diagnosis but insufficient confirmatory data were classified as ‘suspected PBC’.

RESULTS: Of 189 potential PBC cases, 113 (60%) had confirmed PBC and 28 (15%) had suspected PBC. The optimal algorithm, which required two or more uses of a PBC code, had a sensitivity of 94% (95% CI 71% to 100%) and positive predictive values of 73% (95% CI 61% to 82%) for confirmed PBC and 89% (95% CI 82% to 94%) for confirmed or suspected PBC. Sensitivity analyses revealed greater accuracy among women, and with the use of multiple data sources and one or more years of data. Inclusion of diagnosis codes for conditions frequently misclassified as PBC did not improve algorithm performance.

CONCLUSIONS: Administrative databases can reliably identify patients with PBC and may facilitate epidemiological investigations of this condition.
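
As an illustration only, a rule of the form "two or more uses of a PBC code" and its validation against a chart-review reference might be sketched as follows. This is a minimal sketch under stated assumptions, not the study's implementation: the function names, the record layout, and the toy records are hypothetical, and the code set (ICD-9-CM 571.6, ICD-10 K74.3) is assumed because the abstract does not list the codes used.

```python
from collections import Counter

# Assumed code set (not stated in the abstract):
# ICD-9-CM 571.6 (biliary cirrhosis), ICD-10 K74.3 (primary biliary cirrhosis).
PBC_CODES = {"571.6", "K74.3"}


def flag_cases(code_records, pbc_codes=PBC_CODES, min_uses=2):
    """Return patient IDs with at least `min_uses` records carrying a PBC code.

    `code_records` is an iterable of (patient_id, diagnosis_code) pairs.
    """
    counts = Counter(pid for pid, code in code_records if code in pbc_codes)
    return {pid for pid, n in counts.items() if n >= min_uses}


def sensitivity_and_ppv(flagged, reference_positive):
    """Compare algorithm-flagged patients with the chart-review reference."""
    true_pos = len(flagged & reference_positive)
    sensitivity = (
        true_pos / len(reference_positive) if reference_positive else float("nan")
    )
    ppv = true_pos / len(flagged) if flagged else float("nan")
    return sensitivity, ppv


# Toy, made-up records for illustration; these are not study data.
records = [
    ("A", "571.6"), ("A", "K74.3"),                  # two PBC codes: flagged
    ("B", "571.6"),                                  # one PBC code: not flagged
    ("C", "K74.3"), ("C", "K74.3"), ("C", "571.5"),  # two PBC codes: flagged
    ("D", "571.5"), ("D", "571.5"),                  # no PBC codes
]
chart_confirmed = {"A", "B"}  # hypothetical reference standard

flagged = flag_cases(records)
sens, ppv = sensitivity_and_ppv(flagged, chart_confirmed)
```

Raising `min_uses` generally trades sensitivity for positive predictive value. Confidence intervals such as those reported above would need an additional binomial interval calculation, which this sketch omits.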