Whole-genome long-read TAPS deciphers DNA methylation patterns at base resolution using PacBio SMRT sequencing technology
Chen J., Cheng J., Chen X., Inoue M., Liu Y., Song C.
Long-read sequencing provides valuable information on difficult-to-map genomic regions and can complement short-read sequencing to improve genome assembly, yet few methods are available to accurately detect DNA methylation over long distances at a whole-genome scale. By combining our recently developed TET-assisted pyridine borane sequencing (TAPS) method, which enables direct detection of 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC), with PacBio Single-Molecule Real-Time (SMRT) sequencing, we present whole-genome long-read TAPS (wglrTAPS). As a proof of concept, we applied wglrTAPS to mouse embryonic stem cells (mESCs) and achieved an N50 read length of 3.5 kb. With wglrTAPS sequenced to 8.2x depth, we detected a significant proportion of CpG sites that were not covered by previous 27.5x short-read TAPS. Our results demonstrate that wglrTAPS facilitates methylation profiling in problematic genomic regions containing repetitive elements or structural variations, as well as in an allele-specific manner, all of which are extremely difficult to resolve with short-read sequencing methods. This method therefore enhances the applications of third-generation sequencing technologies in DNA epigenetics.
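For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how an N50 read length is conventionally computed: the length L such that reads of length L or longer contain at least half of all sequenced bases. The function name `read_n50` and the toy read lengths are hypothetical illustrations and are not taken from the study's data or pipeline.

```python
from typing import Iterable


def read_n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of read lengths.

    N50 is the largest length L such that reads of length >= L
    together account for at least half of the total bases.
    """
    ordered = sorted(lengths, reverse=True)  # longest reads first
    if not ordered:
        raise ValueError("at least one read length is required")

    total_bases = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first read that brings the cumulative sum to >= 50%.
        if 2 * running >= total_bases:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy example with made-up read lengths (in bases), not study data.
toy_lengths = [2000, 3000, 4000, 5000, 6000]
print(read_n50(toy_lengths))
```

Because N50 is weighted by bases and not by read count, a small number of long reads can raise it well above the median read length.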