The paper describes high-quality research into the development of a new approach for deriving patient phenotype profiles using clinical narrative texts. Its aim was to find ways to use valuable, but often untapped, data stored in electronic health records (EHRs) in order to improve the diagnosis of common diseases.
Efficient and accurate diagnoses are essential in providing the best and most appropriate care. A team, led by Dr Luke T Slater, a Research Fellow in the Gkoutos Lab at the University of Birmingham Centre for Computational Biology, aimed to find new ways to enhance the diagnosis of common diseases by unlocking the value in data held in un-curated text written by doctors, nurses and others.
Their paper, published in the journal Computers in Biology and Medicine in June 2021, describes three approaches for predicting patient diagnoses. The first was based on patient-to-patient comparisons, to see whether those with the greatest similarities could be confidently identified as having the same illness. The second was based on patient-to-disease comparisons, assessing data from patient records alongside phenotype-disease profiles contained in medical literature. The third was a combined approach, which made use of literature-derived phenotypes but extended them using phenotypes derived from patient records.
The research (supported by HDR UK) made extensive use of the MIMIC-III (Medical Information Mart for Intensive Care) database. The first two approaches involved sampling data and texts associated with 1,000 patient visits recorded in MIMIC-III. The third method, which synthesised the first two, involved a further set of 500 patient visits, whose text-derived patient phenotypes were used to extend the literature-derived phenotypes.
The third approach showed that there is considerable potential for improving, and semi-automating, the diagnosis of common illnesses by making use of un-curated texts.
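The idea behind the combined approach, extending a literature-derived disease profile with phenotypes found in patient records, can be illustrated with a minimal sketch. This is not the team's implementation: the function name extend_profile, the min_fraction threshold and the toy phenotype labels are all assumptions made for illustration, and the paper's actual selection criteria may differ.

```python
from collections import Counter


def extend_profile(literature_profile, patient_phenotype_sets, min_fraction=0.5):
    """Return the literature profile plus phenotypes that are frequent in patient records.

    Assumption: a phenotype is "frequent" when it appears in at least
    min_fraction of the patient records for this disease. This threshold
    rule is hypothetical and only illustrates the general idea.
    """
    if not patient_phenotype_sets:
        return set(literature_profile)
    # Count each phenotype at most once per patient.
    counts = Counter(p for record in patient_phenotype_sets for p in set(record))
    threshold = min_fraction * len(patient_phenotype_sets)
    frequent = {p for p, n in counts.items() if n >= threshold}
    return set(literature_profile) | frequent


# Toy data, invented for illustration (not taken from the paper or MIMIC-III).
literature = {"fever", "cough"}
patients = [
    {"fever", "hypoxia"},
    {"cough", "hypoxia", "fatigue"},
    {"hypoxia"},
]
extended = extend_profile(literature, patients, min_fraction=0.5)
print(sorted(extended))
```

In this toy case only "hypoxia" occurs in at least half of the patient records, so it is the only phenotype added to the literature-derived profile.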
Impact and outcomes
The team created what Dr Slater describes as a “pipeline method” that could use comparative profiles to provide clinicians with a ranked and scored list of possible diagnoses.
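As a rough illustration of what a ranked and scored list of possible diagnoses might look like, the sketch below compares a patient's phenotype set against a handful of disease profiles. It is a simplified stand-in rather than the team's method: it uses plain Jaccard overlap in place of the semantic similarity measures the research relies on, and the function names, disease profiles and phenotype labels are invented for the example.

```python
def jaccard(a, b):
    """Jaccard overlap between two phenotype sets (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_diagnoses(patient_phenotypes, disease_profiles):
    """Return (disease, score) pairs, highest score first.

    Ties are broken alphabetically by disease name so the output is stable.
    """
    scored = [
        (disease, jaccard(patient_phenotypes, profile))
        for disease, profile in disease_profiles.items()
    ]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Toy profiles and patient, invented for illustration only.
profiles = {
    "pneumonia": {"fever", "cough", "hypoxia"},
    "heart failure": {"dyspnoea", "oedema", "fatigue"},
    "sepsis": {"fever", "hypotension", "tachycardia"},
}
patient = {"fever", "cough", "fatigue"}

ranking = rank_diagnoses(patient, profiles)
for disease, score in ranking:
    print(f"{disease}: {score:.2f}")
```

A semantic similarity measure would also give credit for related but non-identical phenotypes, which exact set overlap cannot do; that is the main simplification here.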
They believe that their research could lead to powerful semantic-similarity-derived solutions for differential diagnosis of common diseases or outcomes, as well as text classification and cohort discovery tasks in a clinical context.
They also think that their methods, which were primarily based on the cases of patients needing critical care, could be applied more widely. The team is already working (in collaboration with the Centre for Rare Diseases at the Queen Elizabeth Hospital, Birmingham) with cardiologists to enhance the care of patients with congenital heart conditions.
It is hoped that their work can be used for many other tasks, including differential diagnosis, cohort discovery, and document and text classification. One potential application is a better understanding of likely patient outcomes, allowing clinicians to optimise care.
Insights from the Impact Committee
The paper was selected by the HDR UK Impact Committee for the quality of its research and its key impacts: the translation of discoveries and research for clinical use, its reach, and the reuse of its outputs.