Research Article

Recognition and Evaluation of Clinical Section Headings in Clinical Documents Using Token-Based Formulation with Conditional Random Fields

Table 2

Statistics of the section heading recognition corpus. Because the corpus contained only the topmost sections, each section heading category may cover several different concepts or representations. For instance, “Personal Histories” included occupation, amount of daily activity, substance history, and allergies.

| Section | Description | Number | Percentage |
| --- | --- | --- | --- |
| Chief Complaints | A statement describing the symptoms, problems, diagnoses, or other factors that are the reason for a medical encounter. | 803 | 5.7% |
| Present Illness | Separate paragraphs summarizing the history related to the chief complaints. | 843 | 6.0% |
| Personal Histories | A merged concept of individual-related histories, including past medical history, past surgical history, social history, and allergy. | 2701 | 19% |
| Family Histories | The health status of parents, children, siblings, and spouse, whether dead or alive. | 486 | 3.4% |
| Physical Examinations | The process by which a medical professional investigates the body of a patient for signs of disease. | 1104 | 7.9% |
| Laboratory Examinations | Biochemical studies performed in a clinical laboratory. | 401 | 2.8% |
| Radiology Reports | Image studies. Some examples are X-ray, CT, MRI, and PET. | 87 | <1.0% |
| Data | A merged concept including laboratory examinations and radiology reports. | 103 | <1.0% |
| Impression | Medical diagnoses judged by doctors, also called assessments. | 884 | 6.3% |
| Recommendations | Treatments addressing the impressions, also called plans. | 468 | 3.3% |
| Others | Other section headings not included in the categories above, for example, patient ID, doctor ID, and hospital ID. | 6081 | 43.6% |
| Total | | 13,962 | 100% |