This retrospective study analyzed free-text clinical notes from medical encounters for insomnia among a sample of deployed US military personnel. Topic modeling, a natural language processing technique, was used to identify thematic patterns in the clinical notes that were potentially related to insomnia diagnosis.
Clinical notes from patient encounters coded for insomnia in the US Department of Defense Military Health System Theater Medical Data Store were analyzed. After preprocessing of the free text, topic modeling was used to identify underlying topics, or themes, in the notes of 32,864 unique patients. The machine-learned topics were validated against human-coded potential etiological issues for insomnia.
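As a rough illustration of the preprocessing and topic modeling steps described above, the sketch below fits a small topic model to a handful of invented notes. The abstract does not name the algorithm or software used, so this sketch assumes latent Dirichlet allocation (LDA) fitted by collapsed Gibbs sampling. The stop-word list, the example notes, the hyperparameters, and all function names are hypothetical and are not taken from the study.

```python
import random
import re

# Hypothetical stop-word list; the study's actual preprocessing is not described.
STOP_WORDS = {"the", "a", "of", "and", "to", "at", "with", "for", "in", "on", "is", "has"}


def preprocess(note):
    """Lowercase a note, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", note.lower()) if t not in STOP_WORDS]


def fit_lda(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (assumed method, not the study's).

    Returns (doc_topic, topic_word, vocab), where each row of doc_topic and
    topic_word is a probability distribution.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    ndk = [[0] * n_topics for _ in docs]             # topic counts per document
    nkw = [[0] * n_words for _ in range(n_topics)]   # word counts per topic
    nk = [0] * n_topics                              # total tokens per topic
    z = []                                           # topic assignment per token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        assignments = []
        for w in doc:
            k = rng.randrange(n_topics)
            assignments.append(k)
            ndk[d][k] += 1
            nkw[k][word_id[w]] += 1
            nk[k] += 1
        z.append(assignments)

    # Resample each token's topic conditional on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = z[d][i]
                ndk[d][k] -= 1
                nkw[k][wid] -= 1
                nk[k] -= 1
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][wid] + beta) / (nk[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][wid] += 1
                nk[k] += 1

    doc_topic = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in ndk[d]]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(nkw[k][w] + beta) / (nk[k] + n_words * beta) for w in range(n_words)]
        for k in range(n_topics)
    ]
    return doc_topic, topic_word, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n highest-probability words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


# Invented example notes, for illustration only.
notes = [
    "Patient reports stress at work and trouble falling asleep.",
    "Drinks energy drinks and coffee late in the shift.",
    "Chronic back pain wakes the patient at night.",
    "Stress and anxiety about family; trouble staying asleep.",
    "Caffeine and energy drinks daily; rotating shift schedule.",
    "Knee pain and back pain disrupt sleep.",
]
docs = [preprocess(n) for n in notes]
doc_topic, topic_word, vocab = fit_lda(docs, n_topics=3)
for k, words in enumerate(top_words(topic_word, vocab)):
    print(f"topic {k}: {', '.join(words)}")
```

Each note receives a probability distribution over topics, and each topic a distribution over words. With a corpus this small the topics are not stable; in practice the number of topics would be chosen by comparing candidate models on quantitative metrics and interpretability.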
A 12-topic model was selected based on quantitative metrics, interpretability, and coherence of the terms comprising each topic. The topics were assigned the following labels: personal/family history, stimulants, stress, family/relationships, other sleep disorders, depression, schedule/environment, anxiety, other medication, headache/concussion, pain, and medication refill. Validation of the topics, excluding the two medication topics, against their corresponding human-coded potential etiological issues showed strong agreement.
Topic modeling of free-text clinical notes identified thematic patterns that largely mirrored known correlates of insomnia. These findings reveal multiple potential etiologies for deployment-related insomnia. The identified topics may augment electronic health record diagnostic codes and provide valuable information for sleep researchers and providers. As both civilian and military healthcare systems implement electronic health records, topic modeling may be a valuable tool for analyzing free-text data to investigate health outcomes.
Published by Elsevier Inc.