HDR UK Event

Webinar series: Using Docker for Reproducible Health Data Research

Join HDR UK and DNAnexus for this series of three webinars on 13, 16 & 21 November, exploring the use of containers (Docker) for reproducibly interrogating multiple biobanks and other data enclaves.

21 November 2023

Overview:

This three-day webinar series on Docker is split into three sessions that look at containers (Docker) for reproducibly interrogating multiple biobanks and other data enclaves. The target audience is end users in biotech, covering pharma, diagnostics and data roles.

The first session on Monday 13 November 2023 will look at using Docker containers reproducibly in multiple computing environments. Reproducibly aggregating and comparing across Biobanks requires the use of reproducible software environments such as Docker Containers. In this webinar, we’ll review basic Docker concepts and show examples of utilizing existing Docker containers in both high performance computing (HPC) and cloud computing environments. We’ll learn tricks to effectively use Docker Image Files and specify them within existing workflows.

The second session on Thursday 16 November 2023 will look more closely at building and modifying containers for reproducible research. Building your own Docker containers for reproducible research involves modifying and rebuilding existing containers. In this webinar, we will cover the process of building your own container images using Dockerfiles. Once your image is built, it can be shared and distributed with others using container image libraries such as DockerHub.

The third session on Tuesday 21 November 2023 will be a roundtable discussion on phenotype mining and aggregation. Panellists in this session will discuss practical and technical aspects of summary data aggregation after analysis in distinct data enclaves. We will be joined by Hernando Sanches from The Hyve, Dr Tiffany J. Callahan from IBM Research, Deepak Unni from Swiss Institute of Bioinformatics (SIB) and DNAnexus’ Ben Busby, who will be acting as a moderator.

Discussion prompts will include:

Why should people care about data harmonization when they are integrating phenotypic datasets?
What are ontologies and why are they important?
What ontologies do you use in your work?
How do you use those ontologies?
Why are large public datasets important to you?
- UKB and similar governmentally supported biobanks
- UKB, NCBI, EBI, etc.
- Health insurance data
What new datasets are you excited about?
What emerging models are you excited about?

Book your free place via the links below:

Register here for Session 1: 13 November 16:00-17:00 GMT

Register here for Session 2: 16 November 16:00-17:00 GMT

Register here for Session 3: 21 November 16:00-17:00 GMT

Timetable

Time & Date:

13 November 2023, 16:00-17:00 GMT

Learning Objectives:
1. Explain basic Docker concepts, including containers, images, and DockerHub
2. Utilise Docker containers in a high performance computing environment
3. Utilise Docker containers in an existing workflow within a cloud computing environment such as the UK Biobank Research Analysis Platform
- Docker presentation* (the basics and using Docker Hub)
- Case study: using Docker on the UKBRAP
- Using Docker in a variety of environments
Register here for Session 1: 13 November 16:00-17:00 GMT
Time & Date:

16 November 2023, 16:00-17:00 GMT

Learning Objectives:
1. Review basic interactions with DockerHub to pull and push container images
2. Modify and extend existing containers by editing and building from Dockerfiles
3. Share containers and workflows across Biobanks for reproducible analysis
- Review: Docker Hub
- Modification of existing containers
  - Including base containers
- Sharing containers with colleagues and other scientists
- Scientific results aggregation
Register here for Session 2: 16 November 16:00-17:00 GMT
Time & Date:

21 November 2023, 16:00-17:00 GMT

Roundtable discussion on phenotype mining and aggregation. Panellists in this session will discuss practical and technical aspects of summary data aggregation after analysis in distinct data enclaves. Speakers are TBC but a range of organisations will be represented. Discussion prompts will include:
- Why should people care about data harmonization when they are integrating phenotypic datasets?
- What are ontologies and why are they important?
- What ontologies do you use in your work?
- How do you use those ontologies?
- Why are large public datasets important to you?
- What new datasets are you excited about?
- What emerging models are you excited about?
Register here for Session 3: 21 November 16:00-17:00 GMT

Interested in accessing more training from HDR UK? Sign up to HDR UK Futures to access our growing online curriculum.

HDR UK Futures

Speakers

Ted Laderas

Ted Laderas is a bioinformatics trainer for DNAnexus. He trains bioinformaticians in how to use the UKB Research Analysis Platform effectively in analysing the UK Biobank Data. He is passionate about collaboration, training, and team science. He believes that building psychologically safe and inclusive communities of practice in science and research is the key to doing better, more robust science.

Dr. Tiffany J. Callahan

Dr. Tiffany J. Callahan is currently a Postdoctoral Research Scientist at IBM Research. She recently received her PhD in Computational Biology from the University of Colorado Anschutz Medical Campus. Her PhD thesis leveraged graph representation learning and neural-symbolic reasoning of large-scale biological knowledge graphs in order to develop realistic estimates of human disease mechanisms. Her research interests include computational biophysics, multimodal generative AI, representation learning, and probabilistic mechanistic modeling. She is a lover of all things Semantic Web, an advocate of FAIR transparent science, and an avid supporter of open source development. The Health Data Compass, National COVID Cohort Collaborative, and the HuBMAP Consortium are some of the organizations currently utilizing her software.

Areas of work

Training

Event types

HDR UK Event

