Sprint Exemplar: Creating one of the largest repositories of prostate cancer data in the world
27 February 2020
This Sprint Exemplar Project was funded by UK Research and Innovation’s Industrial Strategy Challenge Fund (ISCF) as part of the Digital Innovation Hub Programme. In 2019, eleven projects helped to develop proofs of concept for technology, methodology and research services that informed the design of the Digital Innovation Hub Programme. The projects also provided early use cases that demonstrated the programme’s distinctive approach, which focuses on research services and infrastructure across the NHS, academia and industry to enable the use of high-value linked datasets for UK-scale research.
A ‘go-to place’ for prostate cancer research has been created by researchers from the Nuffield Department of Surgical Sciences (NDS) at the University of Oxford, bringing together clinical trial datasets that were traditionally kept at different locations and were difficult to access. This collection of data on over 140,000 unique prostate cancer patients, with insights on the disease and its progression, will be used to aid earlier diagnosis and more effective treatments.
For the past 20 years, prostate cancer researchers at the University, in collaboration with nine other institutions in the UK and supported by HTA NIHR, have been testing treatment options to manage the most common cancer in men: prostate cancer. The ProtecT (Prostate testing for cancer and Treatment) trial is to date the largest clinical trial of treatment worldwide and has enrolled over 80,000 participants, comparing surgery, radiotherapy, and monitoring in men with clinically localised, screen-detected prostate cancer.
The results showed that early treatment reduces rates of the disease spreading, but at the cost of over-treatment and unpleasant side effects. To distinguish men who need treatment from those who could be harmed by interventions, the researchers needed to undertake long-term translational research on samples generously donated by these participants, to understand what makes a cancer ‘aggressive’ and in need of urgent treatment.
To enhance prostate cancer research for the benefit of scientists globally, a joint project team from NDS and Databiology Ltd undertook a project to curate the wide range of biomedical data collected within the trials, including clinical, longitudinal, phenotype, genotype and other ‘omics’ datasets, using a technologically advanced Biomedical Data Management and Process Orchestration platform provided by Databiology Ltd.
Researchers and scientists can now search this composite information and run AI analytics to further understand disease progression and provide novel algorithms to stratify men with prostate cancer at diagnosis. This will help to reduce over-detection, over-treatment and under-treatment of the disease, leading to improved outcomes.
All analyses are automatically recorded for data provenance and reproducibility.
The project has created a new, rich repository of searchable data, held at the Big Data Institute within Oxford University’s secure firewall. Kept in their original format and unchanged at the source, these datasets can and will be re-used indefinitely by scientists granted access.
This repository, covering over 140,000 men with prostate cancer and matched controls who have been well phenotyped, annotated and followed up for over 14 years, is one of the most comprehensive in the world. It enables scientists to share and analyse data and improves the efficiency of research within the field. The repository’s fully searchable capability is in active use by NDS and other collaborators, and it will be extended and enriched with a variety of different prostate cancer cohorts as they become available, to continue to improve early diagnostics and transform survival rates.
This project is an exemplar of a ‘technology enabled solution’ and can be used as a model for other cancers and benign conditions, whereby clinical data including follow-up, the natural history of diseases, and the effect of interventions are integrated with information from experimental analyses, to data-mine research findings. This will have a direct impact on the quality of the research undertaken and will speed up findings and the development of new therapeutics, helping the research community to collaborate and improve patient outcomes.
Partners: University of Oxford and Databiology Ltd