Single Molecule Based Rearrangement Analysis with Linked Read Sequencing


Large-scale genomic rearrangements include inversions, deletions and other structural events that can span megabase segments of the human genome. They can cause hereditary genetic disorders and diseases such as cancer. Currently, most rearrangements are characterized by whole genome sequencing with short reads and DNA inserts of several hundred bases, which are too short to preserve long-range genomic contiguity. Using barcode-linked read sequences, we have developed a new somatic and germline rearrangement caller, ZoomX, that detects large-scale (>200 kb intra-chromosomal, or inter-chromosomal) rearrangements. ZoomX uses single molecules inferred from linked-read data and a Poisson field scan to identify genomic junctions. It then applies Fisher’s exact test to single-molecule counts to identify statistically significant somatic junctions. ZoomX is designed for linked-read data and is optimized for analyzing somatic variants of varying allelic fractions. We benchmarked ZoomX on a well-characterized control genome and discovered variants that had eluded previous analyses. Our analysis of metastatic gastrointestinal cancers identified a series of complex somatic rearrangements that are composed of multiple classes of structural variants and are potentially related to oncogenesis.
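To illustrate the somatic significance step, the following is a minimal sketch of a one-sided Fisher's exact test on single-molecule counts at one candidate junction. It is not taken from ZoomX: the function name, the 2x2 table layout (molecules supporting versus not supporting the junction, in tumor versus matched normal) and the example counts are all assumptions made for illustration.

```python
from math import comb


def fisher_exact_greater(tumor_support, tumor_other, normal_support, normal_other):
    """One-sided Fisher's exact test p-value for a 2x2 table of molecule counts.

    Hypothetical table layout (an assumption, not ZoomX's documented input):
                  supporting   not supporting
        tumor     a            b
        normal    c            d
    Returns P(X >= a) under the hypergeometric null, i.e. the probability of
    seeing at least this many junction-supporting molecules in the tumor
    sample if support were unrelated to sample of origin.
    """
    tumor_total = tumor_support + tumor_other
    normal_total = normal_support + normal_other
    support_total = tumor_support + normal_support
    grand_total = tumor_total + normal_total

    # Number of ways to choose which molecules are the supporting ones.
    denominator = comb(grand_total, support_total)

    # Sum the probabilities of all tables at least as extreme as the observed one.
    max_k = min(tumor_total, support_total)
    numerator = sum(
        comb(tumor_total, k) * comb(normal_total, support_total - k)
        for k in range(tumor_support, max_k + 1)
    )
    return numerator / denominator


# Illustrative counts only: 8 of 40 tumor molecules span the junction,
# 0 of 40 normal molecules do.
p_value = fisher_exact_greater(8, 32, 0, 40)
print(f"one-sided p-value: {p_value:.3g}")
```

A small p-value here suggests that the junction is enriched in the tumor sample relative to the normal. In practice a caller would also need to correct for the number of junctions tested, which this sketch omits.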




Questions and comments shall be addressed to