Software

INTRODUCTION

Linked-read sequencing (10X Genomics or TELL-Seq) generates synthetic long reads that are useful for detecting and analyzing structural variants (SVs). Long Ranger, the software associated with linked-read sequencing, generates the essential output files (BAM, VCF, SV BEDPE) needed for downstream analyses. However, performing those analyses requires users to build their own tools to handle the unique features of linked-read sequencing data. Gemtools is a collection of tools for the downstream, in-depth analysis of SVs from linked-read data. It uses the barcoded aligned reads and the megabase-scale phase blocks to determine the haplotypes of SV breakpoints and to delineate complex breakpoint configurations at the resolution of single DNA molecules. The suite also gives users the flexibility to perform basic operations on their linked-read sequencing output, so that they can address additional questions.
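The idea of using barcoded reads to link the two breakpoints of an SV can be illustrated with a minimal sketch. This is not the gemtools implementation: the function names, the window size, and the toy reads below are all assumptions made for illustration. Reads are represented as plain (chromosome, position, barcode) tuples, where the barcode stands in for the value that linked-read BAM files typically carry in the BX tag. A barcode is treated as SV-spanning when it has reads near both breakpoints.

```python
def barcodes_in_window(reads, chrom, pos, window):
    """Return the set of barcodes with at least one read within `window` bp of `pos`."""
    return {
        barcode
        for (read_chrom, read_pos, barcode) in reads
        if read_chrom == chrom and abs(read_pos - pos) <= window
    }


def sv_spanning_barcodes(reads, breakpoint1, breakpoint2, window=50_000):
    """Return barcodes seen near both breakpoints (hypothetical helper).

    Each breakpoint is a (chromosome, position) pair. The default window
    is an arbitrary illustrative value, not a gemtools default.
    """
    chrom1, pos1 = breakpoint1
    chrom2, pos2 = breakpoint2
    near_first = barcodes_in_window(reads, chrom1, pos1, window)
    near_second = barcodes_in_window(reads, chrom2, pos2, window)
    return near_first & near_second


# Toy reads (invented for illustration): (chromosome, position, barcode)
reads = [
    ("chr9", 1_000_100, "BC01"),
    ("chr9", 1_900_200, "BC01"),
    ("chr9", 1_000_500, "BC02"),  # near the first breakpoint only
    ("chr9", 1_899_000, "BC03"),  # near the second breakpoint only
    ("chr9", 1_010_000, "BC04"),
    ("chr9", 1_905_000, "BC04"),
]

shared = sv_spanning_barcodes(
    reads, ("chr9", 1_000_000), ("chr9", 1_900_000), window=20_000
)
print(sorted(shared))
```

Note that this simple intersection is only informative when the two breakpoint windows do not overlap; for breakpoints that lie close together, a single unrearranged molecule could contribute reads to both windows.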

IMPLEMENTATION

Figure 1. Gemtools analysis of three SV events in the HCC1954 genome.

(A) The initial input for the gemtools SV analysis is a list of SVs in BEDPE format; here, three SV events (two deletions and one duplication) located within ∼2.2 Mb on chromosome 9.

(B) The genomic mapping locations of reads associated with SV-spanning barcodes for each event are plotted, where each row of the plot represents a unique barcode.

(C) Each of the two breakpoints of each of the three SV events is assigned to a haplotype generated by Long Ranger. Here, all of the SV breakpoints were assigned to the same Long Ranger phase block, denoted by the phase ID. For every breakpoint, more SV-spanning barcodes were assigned to haplotype 2 than to haplotype 1, indicating that all of the SV events occur in cis on haplotype 2 of phase block 123859090.

Figure from (Greer & Ji, 2019) by permission of Oxford University Press.
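The haplotype assignment described in panel (C) amounts to counting, for each breakpoint, how many SV-spanning barcodes fall on each haplotype of a phase block. The sketch below shows one way such a count could be done; it is an assumption-laden illustration rather than the gemtools code. The function name, the input dictionary, and the toy barcode counts are hypothetical. The per-barcode phasing is assumed to have been collected beforehand (Long Ranger BAM files are understood to carry this information per read, in phase-set and haplotype tags).

```python
from collections import Counter


def assign_breakpoint_haplotype(spanning_barcodes, barcode_phase):
    """Assign a breakpoint to a haplotype by majority vote (hypothetical helper).

    spanning_barcodes: iterable of barcodes supporting the breakpoint.
    barcode_phase: dict mapping barcode -> (phase_block_id, haplotype),
        with haplotype 1 or 2; unphased barcodes are simply absent.
    Returns a dict with the chosen phase block, the per-haplotype counts,
    and the winning haplotype (None on a tie), or None if no supporting
    barcode is phased.
    """
    counts = Counter(
        barcode_phase[bc] for bc in spanning_barcodes if bc in barcode_phase
    )
    if not counts:
        return None

    # Pick the phase block that holds the most phased supporting barcodes.
    block_totals = Counter()
    for (block, _haplotype), n in counts.items():
        block_totals[block] += n
    block = block_totals.most_common(1)[0][0]

    hap1 = counts[(block, 1)]
    hap2 = counts[(block, 2)]
    if hap1 == hap2:
        haplotype = None  # no majority, leave the breakpoint unassigned
    else:
        haplotype = 1 if hap1 > hap2 else 2
    return {"phase_block": block, "hap1": hap1, "hap2": hap2, "haplotype": haplotype}


# Toy example (counts invented; only the phase block ID comes from the figure).
barcode_phase = {
    "BC01": (123859090, 2),
    "BC02": (123859090, 2),
    "BC03": (123859090, 1),
    "BC04": (123859090, 2),
}
spanning = ["BC01", "BC02", "BC03", "BC04", "BC05"]  # BC05 is unphased

result = assign_breakpoint_haplotype(spanning, barcode_phase)
print(result)
```

When both breakpoints of an event, and the breakpoints of neighboring events, receive the same phase block and haplotype, the events can be interpreted as occurring in cis, as in the figure. A simple majority vote is used here for clarity; a real analysis would likely also want a minimum barcode count or a statistical test before making a call.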

AVAILABILITY

The gemtools package is freely available for download at: https://github.com/sgreer77/gemtools

CONTACTS

For queries, contact sgreer2@stanford.edu

CITATIONS

Greer SU, Ji HP. Structural variant analysis for linked-read sequencing data with gemtools. Bioinformatics. 2019;35(21):4397–99.