Full-Length RNA-Seq Analysis using PacBio long reads: from reads to functional interpretation
Wednesday, 2nd September
17:00 to 20:00 (CEST)
Instructors and helpers
- Ana Conesa | University of Florida, United States
- Elizabeth Tseng | Pacific Biosciences, United States
- Angeles Arzalluz | Polytechnical University Valencia, Spain
- Francisco Pardo | Polytechnical University Valencia, Spain
The PacBio Single-Molecule Real-Time sequencing technology produces highly accurate long reads that are suitable for full-length RNA sequencing. The Iso-Seq method generates full-length transcript sequences of 10 kb or longer that do not require transcript assembly or error correction. The high accuracy (>99%) of Iso-Seq transcripts allows for unambiguous characterization of alternative splicing events, direct ORF prediction without a reference genome, and identification of single-cell barcodes.
The unique features of PacBio Iso-Seq data require a special set of bioinformatics tools that typical short-read RNA-seq tools fail to provide. The PacBio SMRT Analysis software processes raw sequencing data into full-length transcript sequences, which can then be analyzed with community tools that have been developed specifically for long-read data: SQANTI compares Iso-Seq transcripts against known annotations (e.g., GENCODE) to classify novel vs. known genes and transcripts and to remove artifacts; IsoAnnot functionally annotates Iso-Seq transcripts; tappAS compares multiple Iso-Seq samples to identify differential features. Existing short-read RNA-seq data are often paired with Iso-Seq data to strengthen the analysis.
Further, the Iso-Seq method can also be applied to single-cell analysis. Matching single-cell libraries of both long-read and short-read data can be generated and combined: the deeper coverage of the short reads is used to identify cell types, while the matching cell barcodes link the full-length isoforms generated from the long-read data back to individual cell types.
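The barcode-matching idea described above can be illustrated with a minimal Python sketch. It is not part of the Iso-Seq software or of any tool covered in the tutorial; the barcodes, cell-type labels, isoform IDs, and the function `count_isoforms_by_cell_type` are all hypothetical and serve only to show how a shared cell barcode ties a long-read isoform to a cell type assigned from short-read data.

```python
from collections import Counter, defaultdict

# Hypothetical cell-type assignments derived from short-read clustering:
# cell barcode -> cell type label.
cell_type_by_barcode = {
    "AAACCTGA": "T cell",
    "AAACGGGT": "B cell",
    "AAAGATGC": "T cell",
}

# Hypothetical long-read records: (isoform ID, cell barcode).
long_reads = [
    ("geneA.iso1", "AAACCTGA"),
    ("geneA.iso2", "AAACGGGT"),
    ("geneA.iso1", "AAAGATGC"),
    ("geneA.iso2", "TTTTTTTT"),  # barcode absent from the short-read data
]


def count_isoforms_by_cell_type(reads, barcode_to_type):
    """Count long-read isoforms per cell type using the shared cell barcode.

    Returns the per-cell-type isoform counts and the number of reads whose
    barcode has no cell-type assignment.
    """
    counts = defaultdict(Counter)
    unmatched = 0
    for isoform_id, barcode in reads:
        cell_type = barcode_to_type.get(barcode)
        if cell_type is None:
            # No matching barcode in the short-read data: cannot assign.
            unmatched += 1
            continue
        counts[cell_type][isoform_id] += 1
    return {cell_type: dict(c) for cell_type, c in counts.items()}, unmatched


counts, unmatched = count_isoforms_by_cell_type(long_reads, cell_type_by_barcode)
print(counts)
print("reads without a matching barcode:", unmatched)
```

In practice the barcode-to-cell-type table would come from a short-read clustering workflow and the long-read records from barcode-tagged Iso-Seq output; the sketch only shows the join on the barcode.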
In this tutorial, we provide an overview of the Iso-Seq tools for both bulk and single-cell RNA-seq analysis and guide the audience through hands-on analyses.
Beginner or intermediate. This tutorial will be of broad interest to researchers from academia or industry who want to understand the unique features and tool sets of long-read RNA sequencing (Iso-Seq) data generated with PacBio’s SMRT Technology.
Attendees are expected to have basic Unix command-line skills and some familiarity with R/RStudio. Programming knowledge is not required, though most of the tools are written in Python.
This tutorial is open to at most 30 attendees.
Attendees are expected to use their own laptops, with R/RStudio and the tappAS software already installed. We will use a shared instance on AWS for the first part of the analysis (Iso-Seq and SQANTI), then run tappAS locally on the attendees' laptops.
|17:00 - 17:15|Introduction|
|17:15 - 18:00|Demo & Hands-On Session: Iso-Seq using BioConda|
|18:00 - 18:45|Demo & Hands-On Session: Functional analysis of Iso-Seq data|
|18:45 - 19:00|Coffee Break|
|19:00 - 19:45|Single Cell Iso-Seq|
|19:45 - 20:00|Wrap Up|