Poster 135

Using Natural Language Processing for Identification of Cognitive Impairments in Schizophrenia from Electronic Health Records Data in the United States

Speaker: Mona Nili, PharmD, MBA, PhD

Psych Congress 2024

Background: Schizophrenia is a complex condition characterized by both positive and negative symptoms. Its cognitive impairments (CI) are often underrecognized. The objective of this study was to use natural language processing (NLP) to identify CI within the unstructured electronic health records (EHR) of patients with schizophrenia.
Methods: This retrospective cohort study used clinical notes from the Veradigm Network EHR database from January 2016 to February 2023 and included adult patients (≥18 years) with schizophrenia. A list of key terms for CI was created based on the MATRICS Consensus Cognitive Battery (MCCB) domains and applied to annotate and extract keywords from a random sample of notes. NLP was then used to identify CI in all unstructured text from patient records during the study period. Patient characteristics were assessed in the 12 months before the first documented schizophrenia diagnosis.
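As a rough illustration of the key-term step described in the Methods, the following is a minimal sketch of dictionary-based matching of CI terms grouped by cognitive domain. It is not the study's pipeline: the mapping of terms to domains, the function names, and the example note are all hypothetical, and the abstract does not specify which NLP tooling was used.

```python
import re
from collections import defaultdict

# Hypothetical term-to-domain mapping, for illustration only. The study's
# actual key-term list and domain assignments are not given in the abstract.
DOMAIN_TERMS = {
    "reasoning and problem solving": ["insight and judgment", "poverty of thought"],
    "working memory": ["poor memory"],
    "attention/vigilance": ["reduced attention"],
}


def compile_patterns(domain_terms):
    """Build one case-insensitive regex per domain from its key terms."""
    patterns = {}
    for domain, terms in domain_terms.items():
        alternation = "|".join(re.escape(term) for term in terms)
        patterns[domain] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return patterns


def find_ci_domains(note, patterns):
    """Return {domain: [matched terms]} for every domain found in the note."""
    hits = defaultdict(list)
    for domain, pattern in patterns.items():
        for match in pattern.finditer(note):
            hits[domain].append(match.group(0).lower())
    return dict(hits)


PATTERNS = compile_patterns(DOMAIN_TERMS)

# Made-up note text, not taken from any real record.
example_note = "Patient shows Poor Memory and reduced attention during the interview."
result = find_ci_domains(example_note, PATTERNS)
print(result)
```

Plain term matching of this kind does not handle negation (for example, "no evidence of poor memory") or spelling variants, so a real pipeline would need additional steps beyond this sketch.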
Results: The study included 79,326 patients with schizophrenia, of whom 19,974 (25.2%) had evidence of CI. The most commonly identified CI domains were “reasoning and problem solving” (70.4%) and “working memory” (27.1%). The most common key terms found included “insight and judgment”, “reduced attention”, “poor memory”, “poverty of thought”, and “difficulty in expressing themselves.” Patients with evidence of CI had slightly higher rates of anxiety, depression, and substance use disorder than those without CI (all p-values < 0.001).
Discussion: This study highlights the importance of advanced data analytics in psychiatric research, offering new insights into the characteristics of CI in schizophrenia. The findings point to a need for better documentation of cognitive impairments in the clinical notes of patients with schizophrenia.