THE WHOLE GENE: APPLICATION AND IMPLEMENTATION OF WHOLE GENE SEQUENCING IN THE CLINICAL LABORATORY.

1 THE WHOLE GENE: APPLICATION AND IMPLEMENTATION OF WHOLE GENE SEQUENCING IN THE CLINICAL LABORATORY.
Dr. Katy Latham, Anthony Nolan Research Institute, London, UK
EFI 2018: NGS teaching session

2 OBJECTIVES
- A focus on haematopoietic stem cell transplantation
- An overview of the Pacific Biosciences Single Molecule Real Time (SMRT) sequencing technique
- The development of an in-house system
- Practical considerations for the clinical laboratory
- Lessons learnt

3 WHY DO WE NEED NGS?

4 HLA ALLELE MATCHING ON OVERALL SURVIVAL
[Figure: overall survival curves. Legend as transcribed: P= /10 HLA match (n=494); 9/10 HLA match (n=230)]

5 An overview

6 SEQUENCING PROTOCOL
[Figure labels: Target Sequence; Short Read; Long Read]

7 PACIFIC BIOSCIENCES SINGLE MOLECULE REAL TIME (SMRT) DNA SEQUENCING
Double-stranded PCR amplicons (allele 1, allele 2) -> Ligation of SMRT Bell Adaptors (with polymerase) -> SMRT DNA sequencing -> Continuous Long Read (CLR) or Polymerase Read -> Sub-read sequences (Allele 1, Allele 2)

8 PRE-PCR: BARCODE
Amplification of the samples using barcoded primers

9 MULTIPLEXING
Double-stranded PCR amplicons (Person 1, Person 2) -> Ligation of SMRT Bell Adaptors -> SMRT DNA sequencing -> Continuous Long Read (CLR) or Polymerase Read -> Sub-read sequences
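The multiplexing idea can be illustrated with a minimal Python sketch: reads from several people share one sequencing run and are separated afterwards by the barcode carried on each amplicon. The barcode sequences, the read strings and the function name `demultiplex` are hypothetical, and the sketch assumes an exact barcode at the start of each read. Production demultiplexing also has to handle both read orientations and sequencing errors.

```python
def demultiplex(reads, barcode_to_sample):
    """Group reads by the barcode found at the start of each read.

    Returns (reads per sample with the barcode trimmed, unassigned reads).
    """
    by_sample = {sample: [] for sample in barcode_to_sample.values()}
    unassigned = []
    for read in reads:
        for barcode, sample in barcode_to_sample.items():
            if read.startswith(barcode):
                # Trim the barcode so only the amplicon sequence remains.
                by_sample[sample].append(read[len(barcode):])
                break
        else:
            # No known barcode matched this read.
            unassigned.append(read)
    return by_sample, unassigned


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ACGTAC": "Person 1", "TGCATG": "Person 2"}
reads = ["ACGTACGGGTTT", "TGCATGCCCAAA", "ACGTACGGGTTA", "NNNNNNGGGTTT"]
by_sample, unassigned = demultiplex(reads, barcodes)
```

Reads that match no barcode are kept separately rather than discarded, so the proportion of unassigned reads can be monitored.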

10 A SMRT CELL

11 Development of an in-house system

12 TYPING STRATEGY
- IMGT/HLA Database version (July 2015)
- Full-length HLA Class I genes amplified using an in-house method (5' UTR to 3' UTR)
  - HLA-A gene ~3.5 kb
  - HLA-B and -C amplicons ~3.3 kb
- Near full-length Class II genes amplified using an in-house method (5' UTR to 3' UTR; DRB1, DPB1, DQB1)
  - HLA-DRB1 ~4 kb (exons 2 and 3)
  - HLA-DPB1 amplicons ~5.5 kb (exons 2, 3 and 4)
  - HLA-DQB1 amplicons ~4.5 kb (exons 2, 3, 4 and 5)

13 WORKFLOW
DNA extraction -> Amplicon prep -> Library prep -> Sequencing

14 PRE-PCR: DNA EXTRACTION AND QUANTIFICATION
- Consider sample capacity
  - Every day, 96 samples per extraction
  - Multiple runs per day
- The extraction method needs to allow for the length of the PCR fragment
  - Full-length HLA-DPB1 or KIR is 16 kb
  - Avoid shearing, i.e. mix by rolling rather than vortexing
  - Ability to measure fragmentation

15 PRE-PCR: PCR
[Plate diagram: HLA-A plate, wells A1 to H12]
- One barcoded primer pair per well
- One DNA sample per well
- 48-plexing: 48 different barcoded primer pairs per PCR reaction
- Consider a long-range, high-fidelity Taq polymerase

16 POST-PCR
PCR Amplicons -> Pools -> EMP -> SMRT cell -> RSII Loading
- Class I Pools: HLA-A (A), HLA-B (B), HLA-C (C) -> Class I EMP -> Class I SMRT cell
- Class II Pools: HLA-DRB1 (DR), HLA-DQB1 (DQ), HLA-DPB1 (DP) -> Class II EMP -> Class II SMRT cell

17 PCR -> Thermal cyclers -> QC01 -> Equinanogram Pooling -> Amplicons Purification -> QC02 -> Equimolar Pooling -> libraries: DNA Rep. -> End Rep. -> Purification -> QC03 -> SMRT bell Adaptors Ligation -> Exonuclease Clean-up -> SMRT bell Templates purification -> QC04 -> MagBead Loading -> RSII Loading
(The slide shows this pipeline four times, spanning Day 1 to Day 8.)
- Four Hamilton MicroLab Star Line Workstations (1 Pre-PCR robotic platform, 3 Post-PCR robotic platforms)
- Running four different pipelines per week
- 1536 samples can potentially be processed in eight working days
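The equimolar pooling step can be sketched numerically. Amplicons of different lengths at the same mass concentration contain different numbers of molecules, so pooling by moles requires a length correction. The Python sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA. The concentrations, the target amount and the function names are hypothetical; only the approximate amplicon lengths (~3.5 kb for HLA-A, ~3.3 kb for HLA-B) come from the typing strategy slide. It is an illustration, not the laboratory's actual pooling calculation.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (common approximation)


def fmol_per_ul(conc_ng_per_ul, length_bp):
    """Convert a dsDNA mass concentration (ng/uL) to a molar one (fmol/uL)."""
    return conc_ng_per_ul * 1e6 / (length_bp * AVG_BP_MASS)


def equimolar_volumes(amplicons, target_fmol):
    """Volume (uL) of each amplicon needed to contribute target_fmol.

    amplicons maps a name to (concentration in ng/uL, length in bp).
    """
    return {
        name: target_fmol / fmol_per_ul(conc, length)
        for name, (conc, length) in amplicons.items()
    }


# Hypothetical concentrations; lengths approximated from the slides.
amplicons = {"HLA-A": (20.0, 3500), "HLA-B": (20.0, 3300)}
volumes = equimolar_volumes(amplicons, target_fmol=50.0)
for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} uL")
```

At equal mass concentration the longer amplicon needs the larger volume, which is why an equinanogram pool and an equimolar pool are not interchangeable when amplicon lengths differ.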

18 PCR SET-UP OPTIMISATION
We occasionally observed total PCR failure for all 96 samples on a plate. Although uncommon, this is high impact.
- Changing plastics to reduce surface area, reducing the evaporation that contributes to increased primer concentration
- Increasing the minimum pipetting volume of the robotics
- Increasing the dead volume during robotic set-up
- Monitoring pipetting drift to ensure accuracy of the robotics
These changes have led to a 22% reduction in locus-specific whole-plate PCR failures.
A robotics expert is required.
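The link between evaporation and primer concentration follows from conservation of solute: if only water is lost, concentration rises in proportion to the volume lost. A minimal Python sketch, with hypothetical volumes and concentrations chosen purely for illustration:

```python
def concentration_after_evaporation(conc, volume_ul, lost_ul):
    """Concentration after lost_ul of solvent evaporates (solute conserved).

    Uses C1 * V1 = C2 * V2, so C2 = C1 * V1 / (V1 - lost).
    """
    if not 0 <= lost_ul < volume_ul:
        raise ValueError("lost volume must be >= 0 and < starting volume")
    return conc * volume_ul / (volume_ul - lost_ul)


# Hypothetical example: 0.2 uM primer in a 10 uL reaction losing 2 uL.
new_conc = concentration_after_evaporation(0.2, 10.0, 2.0)
print(f"{new_conc:.3f} uM")
```

The same absolute loss matters more in a smaller reaction volume, which is consistent with the slide's emphasis on plastics and minimum pipetting volumes.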

19 REAGENT VARIABILITY
- A 37% P1 value is optimal
- Customising the loading concentration to the SMRT cell lot has increased optimal loading
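One plausible reading of the 37% figure is statistical. If P1 is taken to be the fraction of wells holding exactly one active template, and loading is assumed to follow a Poisson distribution, then the best achievable P1 is 1/e, about 36.8%. Both assumptions are an interpretation and are not stated on the slide. The Python sketch below scans the mean loading per well to show where single occupancy peaks.

```python
import math


def p_single(mean_per_well):
    """Poisson probability that a well holds exactly one template."""
    return mean_per_well * math.exp(-mean_per_well)


# Scan mean loadings from 0.01 to 3.00 templates per well.
grid = [i / 100 for i in range(1, 301)]
best_mean = max(grid, key=p_single)
best_p1 = p_single(best_mean)
print(f"best mean loading: {best_mean:.2f}, single occupancy: {best_p1:.1%}")
```

Under this model both under-loading and over-loading reduce single occupancy, which would explain why the loading concentration is tuned per SMRT cell lot.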

20 161,568,000 nucleotides per run

21 What do the results look like?

22 QUALITY INDICATORS
Quality indicators considered in data assessment:
- QV value
- Number of reads
- Predicted accuracy
- Linkage disequilibrium and haplotype data
- Read balance (intra- and inter-locus)
- Productivity (P1 value)
- RSII run matrix values
- Specific barcode performance data
- Control sample concordance
ISO and EFI accreditation
LIMS allows for a virtually paperless system
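As an illustration of one indicator, intra-locus read balance can be expressed as the ratio of reads supporting the less-covered allele to reads supporting the better-covered allele. This definition, the 0.2 threshold, the read counts and the function names are all assumptions made for the sketch; the slide does not define how the laboratory calculates read balance.

```python
def read_balance(reads_allele1, reads_allele2):
    """Ratio of minor to major allele read counts (1.0 = perfectly balanced)."""
    major = max(reads_allele1, reads_allele2)
    minor = min(reads_allele1, reads_allele2)
    if major == 0:
        return 0.0  # no reads at all for this locus
    return minor / major


def flag_imbalanced(counts, threshold=0.2):
    """Return the loci whose read balance falls below the threshold.

    counts maps a locus name to (reads for allele 1, reads for allele 2).
    """
    return [
        locus
        for locus, (a1, a2) in counts.items()
        if read_balance(a1, a2) < threshold
    ]


# Hypothetical read counts for one heterozygous sample.
counts = {"HLA-A": (480, 520), "HLA-B": (900, 90), "HLA-C": (300, 310)}
flagged = flag_imbalanced(counts)
```

A flagged locus would prompt review for allele dropout or preferential amplification rather than automatic rejection.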

23 FULL LENGTH SEQUENCES
Table columns: Gene; Number of Reference Sequences; Partial Reference Sequence; Full cDNA Reference Sequence; Full gDNA Reference Sequence
Table rows: A; B; C; DPB; DQB; DRB

24 MISSING OR PARTIAL REFERENCE SEQUENCES
[Figure labels: NGS; Allele 1 ref; Allele 2 ref; Full length reference sequence; Partial reference sequence]

25 HLA-B*07:02:01 ~ C*07:02: n= B*07:02:01 ~ C*07:02:01:03 B*07:02:01 ~ C*07:02:01: n=226 Total = 27,959

26 HLA-B*07:02:01 ~ C*07:02: % % 45% B*07:02:01 ~ C*07:02:01:03 B*07:02:01:03 ~ C*07:02:01:03 B*07:02:01 ~ C*07:02:01: % British-Irish African-Caribbean Where ethnicity was available

27 Extended LD groups class I
B alleles and C alleles (listed in transcription order): *08:01:01:01 *07:01:01:01 *08:01:01:02 *07:02:01:01 *15:01:01:01 *03:03:01:01 *03:04:01:01 *15:01:01:04 *04:01:01:06 *18:01:01:01 *05:01:01:01 *18:01:01:02 *44:02:01:01 *07:01:01:01 *07:01:01:04 *12:03:01:01 *05:01 *12:03 *16:04 *44:02:01:03 *07:04:01 *44:03:01 *04:01:01:01 *16:01:01:01 *04:09N *03:03:04 *44:03:02 *07:06 *57:03:01:01 *07:18 *57:03:01:02 *07:01:02

28 WHAT DOES THIS MEAN FOR THE PATIENT
HLA: Patient | Donor 1 | Donor 2
A: 31:01:02:01, 68:02:01:01 | 02:01:01:01, 31:01:02:01 | 03:01:01:01, 31:01:02:01
B: 07:02:01, 40:01:02 | 07:02:01, 40:01:02 | 07:02:01, 40:01:02
C: 03:04:01:01, 07:02:01:03 | 03:04:01:01, 07:02:01:03 | 03:04:01:01, 07:02:01:03
DRB1: 04:04:01 | 04:04:01 | 04:04:01
DQB1: 03:02:01:01 03:02:01:02 03:02:01:01 03:02:01:01
DPB1: 02:01, 04:01:01:01 | 02:01, 06:01:01 | 04:02:01:02, 06:01:01
Match grade? 9/10 8/10 9/10
- Clinician input is essential
- Agree reporting commentary
- Buy-in and support for the change
- Agree how this will be used clinically
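The question of match grade depends on how many fields of the allele name are compared. The Python sketch below counts matched alleles per locus at a chosen field resolution, using the HLA-A, -B and -C typings read from the table on this slide (the remaining loci are omitted because their column boundaries are less certain in the transcription). The function names and the rule of comparing names truncated to a fixed number of fields are assumptions made for illustration. This is not a clinical matching algorithm.

```python
def truncate(allele, fields):
    """Keep only the first `fields` colon-separated fields of an allele name."""
    return ":".join(allele.split(":")[:fields])


def locus_matches(patient, donor, fields=2):
    """Number of matched alleles (0 to 2) at one locus, best of both pairings."""
    p = [truncate(a, fields) for a in patient]
    d = [truncate(a, fields) for a in donor]
    direct = (p[0] == d[0]) + (p[1] == d[1])
    crossed = (p[0] == d[1]) + (p[1] == d[0])
    return max(direct, crossed)


def match_count(patient, donor, fields=2):
    """Total matched alleles across all loci present in the patient typing."""
    return sum(locus_matches(patient[l], donor[l], fields) for l in patient)


# Typings for HLA-A, -B and -C as read from the slide.
patient = {
    "A": ["31:01:02:01", "68:02:01:01"],
    "B": ["07:02:01", "40:01:02"],
    "C": ["03:04:01:01", "07:02:01:03"],
}
donor_1 = {
    "A": ["02:01:01:01", "31:01:02:01"],
    "B": ["07:02:01", "40:01:02"],
    "C": ["03:04:01:01", "07:02:01:03"],
}
donor_2 = {
    "A": ["03:01:01:01", "31:01:02:01"],
    "B": ["07:02:01", "40:01:02"],
    "C": ["03:04:01:01", "07:02:01:03"],
}

class_i_donor_1 = match_count(patient, donor_1, fields=2)
class_i_donor_2 = match_count(patient, donor_2, fields=2)
print(f"Donor 1: {class_i_donor_1}/6, Donor 2: {class_i_donor_2}/6 at class I")
```

Changing `fields` from 2 to 4 shows how a grade can shift once third- and fourth-field differences are counted, which is the point on which the slide asks for clinician agreement.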

29 THE CLINICAL IMPLICATIONS
O33: Dr Neema Mayor, Oral abstract session 4, Haematopoietic Stem Cell Transplantation

30 LESSONS LEARNT
- Can be used in a clinical setting
- Enables research into the impact of HLA match between patient and donor
- Frequency of novel alleles (in otherwise well-characterised alleles): 6.5 to 13%
- Must have an understanding of bioinformatics requirements
- Full control over the end-to-end pipeline leads to high confidence in the data
- Requires a dedicated R&D team to support implementation
- Robotic support is essential
- You will find discrepancies with previous typing

31

32 ACKNOWLEDGEMENTS
ALL THE ANTHONY NOLAN TEAM
Prof Steven Marsh, Dr Neema Mayor, James Robinson, Franco Tavarozzi, Davide Lepore, Dario Merlo, Kylara Hassell, Nicola Brosnan, Reetinder Grewal, Rebecca Goodall, Jex-Ray Sayno, Shem Wallis-Jones, Alasdair McWhinnie, Will Bultitude, Thomas Turner, Jeremy Stein, Cristina Guijarro, Dr Gayle Leen, Dan Hayward, Arthur Gymer, Alejandro Madrigal