Ibrahim ZM, Wu H, Hamoud A, Stappen L, Dobson RJB, Agarossi A.
Journal of the American Medical Informatics Association, Pages 437–443
Objectives: Current machine learning models aiming to predict sepsis from electronic health records (EHR) do not account for the heterogeneity of the condition despite its emerging importance in prognosis and treatment. This work demonstrates the added value of stratifying the types of organ dysfunction observed in patients who develop sepsis in the intensive care unit (ICU) in improving the ability to recognize patients at risk of sepsis from their EHR data.
Materials and Methods: Using an ICU dataset of 13 728 records, we identify clinically significant sepsis subpopulations with distinct organ dysfunction patterns. We perform classification experiments with random forest, gradient boosted trees, and support vector machines, using the identified subpopulations to distinguish patients who develop sepsis in the ICU from those who do not.
Results: The classification results show that features selected using sepsis subpopulations as background knowledge yield superior performance in distinguishing septic from non-septic patients regardless of the classification model used. The improvement is especially pronounced in specificity, which is a current bottleneck in machine learning models for sepsis prediction.
Conclusion: Our findings can steer machine learning efforts toward more personalised models for complex conditions including sepsis.
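Specificity, the metric highlighted in the results, is the proportion of non-septic patients that a classifier correctly labels as non-septic, while sensitivity is the proportion of septic patients it correctly flags. The following is a minimal sketch of how both can be computed from binary predictions. The function name and the example labels are hypothetical and chosen for illustration only; they are not taken from the study or its data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding (illustrative): 1 = septic, 0 = non-septic.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # septic, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # septic, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-septic, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-septic, false alarm

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical labels for eight patients (not real data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example two of the five non-septic patients are falsely flagged, so specificity is lower than sensitivity; low specificity of this kind translates into false alarms at the bedside.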
Mapping multimorbidity in individuals with schizophrenia and bipolar disorders: evidence from the South London and Maudsley NHS Foundation Trust Biomedical Research Centre (SLAM BRC) case register
21 April 2022
People with severe mental illness, such as schizophrenia spectrum disorders or bipolar disorders, have higher death rates. It is difficult to study mental health records at scale, so a team of...
Evaluation of the ASSIGN open-source deterministic address-matching algorithm for allocating unique property reference numbers to general practitioner-recorded patient addresses
20 April 2022
Being able to link addresses across systems offers a valuable resource for health data science. However, they are often not standardised despite a government push towards this. Researchers at the...
A genome-wide association study with 1,126,563 individuals identifies new risk loci for Alzheimer’s disease
18 January 2022
Alzheimer’s disease (AD) is a highly prevalent form of dementia. The genetic variations underlying the disease are poorly understood and the number and effectiveness of drug...