Health data and research are often held in local hospitals and institutes, recorded in different ways to suit local needs, and isolated from researchers in other locations. This can prevent big health questions from being answered, as crucial data are difficult to locate, interpret and analyse.

A team in Scotland, led by the University of Edinburgh, has sought to tackle this by creating a map to link up data across different geographical locations and developing a new computer language to help interpret related data.

The map gives an overarching view of the real-world data available for different conditions, where to look for the data, and how to understand it when it is found. This makes it easier to get a complete picture of a patient group or population and helps to form new insights to improve health outcomes.

Health data are collected routinely to support different aspects of healthcare or research. Information about the same thing might be recorded in slightly different ways, depending on how the group collecting the data routinely refers to it, much like a local dialect. For instance, a blood test could be recorded as ACT, Activated Coagulation Time or Activated Clotting Time, but all three names describe the same test.

As patient symptoms or test results could be described differently across multiple locations, the new computer language – or ‘interlingua’ – acts as a translator between the separate data collections. The map describes the data from each source using the interlingua, allowing related data to be identified more easily and quickly. Researchers can also ask health-related questions using the interlingua, which through the translation process indicates if, and how, their questions can be answered.
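The translation idea can be illustrated with a minimal sketch. It is not the team's interlingua: the source names, the shared concept label `activated_clotting_time` and both functions are hypothetical, chosen only to show how local labels might be described in one shared vocabulary so that a researcher can ask where a given test is recorded.

```python
# Hypothetical sketch: none of these names come from the project itself.

# Each source records the same test under its own local label,
# and each label is described using one shared (assumed) concept name.
LOCAL_TO_INTERLINGUA = {
    "hospital_a": {"ACT": "activated_clotting_time"},
    "hospital_b": {"Activated Coagulation Time": "activated_clotting_time"},
    "institute_c": {"Activated Clotting Time": "activated_clotting_time"},
}


def describe_sources(mappings):
    """Build the 'map': for each shared concept, which sources hold it and under what local name."""
    catalogue = {}
    for source, terms in mappings.items():
        for local_name, concept in terms.items():
            catalogue.setdefault(concept, {})[source] = local_name
    return catalogue


def where_to_look(catalogue, concept):
    """Return {source: local_name} for a concept, or an empty dict if no source records it."""
    return dict(catalogue.get(concept, {}))


catalogue = describe_sources(LOCAL_TO_INTERLINGUA)
print(where_to_look(catalogue, "activated_clotting_time"))
```

An empty result from `where_to_look` corresponds to the case where the question cannot be answered from the sources described so far.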

The mapping process does not change the original patient data. Because it describes the data rather than directly linking it, the map itself is a non-confidential resource. The communities who originally gathered the data are able to control how much of the confidential data is visible, depending on the permissions given to the data viewer.
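One way to picture this separation is a catalogue that holds only descriptions of fields, never patient records, with visibility decided by the viewer's permission level. The sketch below is an assumption for illustration only: the permission levels, the `FieldDescription` class and the `visible_fields` function are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass

# Hypothetical permission levels, ordered from least to most access.
LEVELS = {"public": 0, "approved_researcher": 1, "data_owner": 2}


@dataclass(frozen=True)
class FieldDescription:
    """Describes a field in a source; it holds no patient values."""
    concept: str     # shared term describing the field
    local_name: str  # label used by the source
    min_level: str   # lowest permission level allowed to see this entry


def visible_fields(fields, viewer_level):
    """Return only the descriptions the viewer's permission level allows."""
    rank = LEVELS[viewer_level]
    return [f for f in fields if LEVELS[f.min_level] <= rank]


fields = [
    FieldDescription("activated_clotting_time", "ACT", "public"),
    FieldDescription("date_of_birth", "DOB", "data_owner"),
]

print([f.concept for f in visible_fields(fields, "public")])
print([f.concept for f in visible_fields(fields, "data_owner")])
```

In this sketch the original records are never read or modified; only the descriptions are filtered.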

This new data language is an exciting advancement in health data research, and the team are looking to use it to link and translate diverse data sets not only from Scotland but from across the UK. The new language, and the linking process it supports, could lead to improved access to virtual UK-wide datasets and the exploration of health-related research questions.

In the long term, the team predict an improvement in healthcare, as researchers will be able to ask questions across multiple datasets, such as identifying groups of patients that might benefit from a treatment approach, or designing a clinical trial that would need multiple locations to be viable.

Partners: University of Edinburgh, University of Strathclyde, University of Glasgow, University of Aberdeen, University of St Andrews, NHS Education for Scotland Digital Platform, University of Dundee, NHS National Services Scotland, Microsoft Research, Platinum Informatics, Trento