Abstract

This paper reports a large-scale knowledge conversion and curation experiment. Biomedical domain knowledge from a semantically weak and shallow terminological resource, the UMLS, is transformed into a rigorous description logics format. This way, the broad coverage of the UMLS is combined with inference mechanisms for consistency and cycle checking. They are the key to proper cleansing of the knowledge directly imported from the UMLS, as well as subsequent updating, maintenance and refinement of large knowledge repositories. The emerging biomedical knowledge base currently comprises more than 240 000 conceptual entities and hence constitutes one of the largest formal knowledge repositories ever built.