Presentations

Assembly-based structural variation and haplotypes from targeted sub-Megabase DNA molecules

Presented at AGBT 2018 General Meeting by GiWon Shin

ABSTRACT

GiWon Shin (Presenter), Division of Oncology, Stanford University School of Medicine, Stanford, CA
Stephanie U. Greer, Division of Oncology, Stanford University School of Medicine, Stanford, CA
HoJoon Lee, Division of Oncology, Stanford University School of Medicine, Stanford, CA
T. Christian Boles, Sage Science, Inc., Beverly, MA
Jun Zhou, Sage Science, Inc., Beverly, MA
Hanlee P. Ji, Division of Oncology, Stanford University School of Medicine, Stanford, CA

Sequencing analysis of target DNA molecules of 0.1 Mb and larger has many advantages for delineating complex genomic features. These advantages include: increased target coverage, which raises the sensitivity for identifying SV breakpoints; reduced impact of off-target sequences; rapid local assembly of Mb-scale regions, given the reduction in data size and complexity; and cost-effectiveness that comes from examining only regions of interest rather than whole genomes. With such a method, one can generate large, near-Mb haplotypes, and structural variation can be readily characterized even when present at lower allelic fractions. However, current methods do not offer these features and typically involve molecules smaller than 0.1 Mb. As a solution, we have developed a genomic sequencing approach that offers all of the aforementioned improvements. Our approach takes live cells as input, which minimizes degradation of genomic material before the target enrichment process. By combining in vitro CRISPR-Cas9 segmentation with automated electrophoretic size selection, our approach efficiently enriches intact high-molecular-weight target fragments from multiple genomic origins with high specificity. We demonstrate that large segments of DNA can be targeted and sequenced efficiently. Moreover, an eluted fraction can be used directly, without any further treatment, for downstream sequencing library preparation (e.g., 10X Chromium whole genome sequencing), and the resulting library is already target-enriched without further target-capture steps. We designed three assays and tested them with the GM12878 cell line, for which abundant genomic information is available. The assays targeted BRCA1, 41 regions with different SV events, and the entire 4-Mb MHC region, respectively. gRNAs were designed to generate 100-kb or 200-kb fragments. Target coverage was more than 100X, while overall coverage, including non-target regions, was approximately 4X.
Our targeted linked-read sequencing provided complete phased haplotypes for the targets of interest with high cost-efficiency.
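
The figures quoted in the abstract can be put into rough arithmetic. The sketch below is illustrative only and is not the authors' design pipeline. It assumes idealized, evenly spaced cut positions, whereas real gRNA placement depends on available Cas9 sites and specificity. The function names and the zero-based coordinates are hypothetical. It tiles a 4-Mb region into 200-kb fragments and estimates the enrichment implied by 100X target coverage against roughly 4X overall coverage.

```python
def cut_sites(start, end, fragment_size):
    """Return idealized cut positions tiling [start, end) into
    fragments no longer than fragment_size (hypothetical helper)."""
    if fragment_size <= 0 or end <= start:
        raise ValueError("need fragment_size > 0 and end > start")
    sites = list(range(start, end, fragment_size))
    sites.append(end)  # closing cut at the region boundary
    return sites


def fold_enrichment(target_coverage, overall_coverage):
    """Ratio of on-target coverage to overall coverage."""
    if overall_coverage <= 0:
        raise ValueError("overall_coverage must be positive")
    return target_coverage / overall_coverage


# A 4-Mb region (placeholder coordinates) cut into 200-kb fragments.
sites = cut_sites(0, 4_000_000, 200_000)
n_fragments = len(sites) - 1

# 100X is the lower bound reported for target coverage; 4X is approximate.
enrichment = fold_enrichment(100, 4)

print(n_fragments, enrichment)
```

Because the reported target coverage is a lower bound and the overall coverage is approximate, the resulting ratio should be read as an order-of-magnitude estimate of at least about 25-fold.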

PRESENTATION