When patients interact with physicians, or are admitted into hospital, information is collected electronically on their symptoms, diagnoses, laboratory test results, and prescriptions. This information is stored securely in Electronic Health Records (EHR) and is a valuable resource for researchers and clinicians for improving health and healthcare. EHRs are however of variable detail and quality and contain many inconsistencies. As a result, researchers and data providers spend considerable time creating complex computer programs to fix and statistically analyse the information in EHR and identify which patients have which disease. Currently, there is no means to share these tools across institutions in the UK resulting in duplication of effort. Reproducibility of research is also hampered as others do not have access to the precise methods and definitions used in a particular study. This project addresses these issues by creating an open resource for EHR users (researchers, clinicians, the NHS and data providers) to share their methods.
A primary reason for using data from EHR is the creation of phenotype algorithms to identify disease status, onset and progression. Phenotyping (describing the characteristics of disease) however is challenging as her data are collected for different purposes, have variable data quality and often require significant harmonisation. While considerable effort goes into these algorithms, there is no consistent methodology for creating and evaluating them and no centralised repository for depositing and sharing them.
We will create a national platform for dissemination of citable algorithms (incl. validations) and tools which will reduce duplication of effort and improve research reproducibility. We will explore methods for creating computable representations of algorithms for integration into actionable analytics for healthcare. Finally, we will fundamentally shift the EHR cultural landscape by a robust incentivisation programme, providing guidelines on best practices, cross-disciplinary training, and ensuring alignment with other international initiatives.
Impact and outcomes
Though this project, we will deliver a fundamental step-change in the current EHR community in the UK by bringing together health data scientists, clinicians, computer scientists, public health experts and data curators under the FAIR principles (www.force11.org). The National Phenomics Resource will facilitate the dissemination and re-use of algorithms, tools and methods by the community. By establishing a national standard for creating, evaluating and representing phenotypes, we will accelerate the impact of discovery through increased transparency and replicability and maximise the usability and value of existing data repositories to new users. Finally, we will take the first steps towards establishing computational biomedical knowledge objects (e.g. guidelines with embedded phenotypes endorsed by NICE) which will enable the creation of actionable health analytics in the NHS.
Spiros Denaxas, firstname.lastname@example.org
24 March 2021
CALIBER is an exemplar of HDR UK’s commitment to open source. The comprehensive, open-access resource led by Spiros Denaxas provides the research community with information, tools and...
New advances in understanding long COVID
20 February 2023
Researchers analysed the primary healthcare records of 7.4 million patients in England to develop a long COVID phenotype, strengthening further research to improve the understanding of the condition.
COVID-19 trajectories among 57 million adults in England: a cohort study using electronic health records
7 October 2022
In May, the HDR UK Impact Commitee selected Thygsen et al.'s paper as the Open Impact Publication of the Month for its impactful use of large datasets to track the impact of the COVID-19 pandemic