Tibble H, Tsanas A, Horne E, Horne R, Mizani M, Simpson CR, Sheikh A
BMJ Open (2019) 9: e028375
Introduction: Asthma is a long-term condition characterised by rapid-onset worsening of symptoms (‘attacks’), which can be unpredictable and may prove fatal. Models predicting asthma attacks require high sensitivity to minimise mortality risk, and high specificity to avoid unnecessary prescribing of preventative medications that carry an associated risk of adverse events. We aim to create a risk score to predict asthma attacks in primary care using a statistical learning approach trained on routinely collected electronic health record data.
Methods and analysis: We will employ machine-learning classifiers (naïve Bayes, support vector machines, and random forests) to create an asthma attack risk prediction model, using the Asthma Learning Health System (ALHS) study patient registry comprising 500 000 individuals across 75 Scottish general practices, with linked longitudinal primary care prescribing records, primary care Read codes, accident and emergency records, hospital admissions and deaths. Models will be compared on a partition of the dataset reserved for validation, and the final model will be tested in both an unseen partition of the derivation dataset and an external dataset from the Seasonal Influenza Vaccination Effectiveness II (SIVE II) study.
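The partition-and-compare workflow described above can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the study's code: the split fractions, the function names, the toy labels and the use of Youden's J (sensitivity + specificity − 1) as the selection criterion are all assumptions, since the protocol summary does not state how models will be ranked.

```python
import random


def split_patients(patient_ids, frac_train=0.6, frac_valid=0.2, seed=0):
    """Shuffle patient IDs and cut them into train/validation/test partitions.

    The 60/20/20 fractions are hypothetical defaults, not taken from the protocol.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * frac_train)
    n_valid = int(len(ids) * frac_valid)
    return ids[:n_train], ids[n_train:n_train + n_valid], ids[n_train + n_valid:]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 = attack."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def pick_best(validation_predictions, y_valid):
    """Pick the model with the highest Youden's J on the validation partition.

    Youden's J is an assumed criterion chosen for illustration.
    """
    def youden(name):
        sens, spec = sensitivity_specificity(y_valid, validation_predictions[name])
        return sens + spec - 1
    return max(validation_predictions, key=youden)


# Toy validation labels and predictions (fabricated for illustration only).
y_valid = [1, 0, 0, 1, 0, 0, 1, 0]
preds = {
    "naive_bayes":   [1, 0, 1, 0, 0, 0, 1, 0],
    "svm":           [1, 0, 0, 1, 0, 1, 1, 0],
    "random_forest": [1, 1, 1, 1, 0, 0, 0, 0],
}
train_ids, valid_ids, test_ids = split_patients(range(10))
best = pick_best(preds, y_valid)
```

In a real analysis the selected model would then be evaluated once on the untouched test partition and on the external dataset, so that the reported sensitivity and specificity are not biased by the model-selection step.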
Ethics and dissemination: Permissions for the ALHS project were obtained from the South East Scotland Research Ethics Committee 02 [16/SS/0130] and the Public Benefit and Privacy Panel for Health and Social Care (1516–0489). Permissions for the SIVE II project were obtained from the Privacy Advisory Committee (National Services NHS Scotland) [68/14] and the National Research Ethics Committee West Midlands–Edgbaston [15/WM/0035]. The subsequent research paper will be submitted for publication to a peer-reviewed journal, and the code scripts used for all components of the data cleaning, compiling, and analysis will be made available on the open-source GitHub website (https://github.com/hollytibble).