Abstract

A fluent flow of health information is critical for health communication and decision making. However, this flow is fragmented by the large volume of textual records and their specialised jargon, which creates risks for both patient safety and the cost-effectiveness of health services. Language technology for the automated processing of textual health records is emerging. In this paper, we describe the development of methods for building topical overviews in Finnish intensive care. Our topical search methods are based on supervised multi-label classification and regression, as well as supervised and unsupervised multi-class classification. Our linguistic analysis methods are based on rule-based and statistical parsing, as well as the tailoring of a commercial morphological analyser. According to our experimental results, the supervised methods generalise across multiple topics and human annotators, and the unsupervised method enables ad hoc information search. Tailored linguistic analysis improves performance in the experiments and also improves text comprehensibility for health professionals and laypeople. In conclusion, the performance of our methods is promising for real-life applications.