We critically compare and evaluate state-of-the-art bioinformatics approaches and present a workflow that integrates the best-performing data analysis and data evaluation methods in a Transparent, Reproducible and Automated PipeLINE (TRAPLINE) for RNA sequencing data analysis. A comparative transcriptomics analysis with TRAPLINE yields a set of differentially expressed genes, their corresponding protein-protein interactions, an analysis of differential splicing and promoter testing, and an integrated miRNA target prediction. Ultimately, the user receives a ready-to-use file that can be imported into Cytoscape.
TRAPLINE supports NGS research by providing a workflow that requires no bioinformatics skills and decreases the processing time of the analysis. We also support the analysis of paired-end RNA sequencing data. The adapted TRAPLINE workflow can be obtained via: https://usegalaxy.org/u/mwolfien/w/rnaseqtraplinepaired.
Our pipeline is implemented in the biomedical research platform Galaxy and is freely accessible via:
o Run your sequencing experiments (Illumina, SOLiD, Solexa) and obtain the FASTQ files
Note: the analysis is predefined for the comparison of two experimental conditions with three replicates per condition
o Go to the Galaxy website https://usegalaxy.org
o If you are new to Galaxy, please create an account
o Import our analysis workflow TRAPLINE through www.sbi.uni-rostock.de/RNAseqTRAPLINE or use the “Shared Data – Published Workflows” section of Galaxy
o (Optional): Edit the settings or parameters; in particular, adjust the workflow if you want to use fewer than 3 replicates
o Upload your FASTQ datasets (6 slots are predefined, 2 conditions with 3 replicates per condition)
There are two ways to upload your data:
o Direct upload from your hard drive
o Upload data from an FTP server
o Upload a reference annotation set for your species as a .gtf file (here: mm9) and assign it to the “Reference annotation” input file of the workflow.
The latest annotation for your species can be obtained as a GTF file via http://geneontology.org/page/reference-genome-annotation-project
o (Optional): Upload a miRNA target file for your species of interest and assign it to the “miRNA target prediction” input file of the workflow.
We provide formatted, ready-to-use miRNA target prediction files for human, mouse, rat, fruit fly and nematode based on microRNA.org (Betel et al., 2010).
o (Optional): Upload a protein interaction file for your species of interest and assign it to the “Protein interaction” input file of the workflow.
We provide several formatted, ready-to-use protein-protein interaction files based on BioGRID (Chatr-Aryamontri et al., 2015).
o Go to the “Workflow” section, select “RNASeqTRAPLINE” and click on Run
o Assign your six datasets in the given order (see the annotation text) and choose your reference annotation file
o Select the reference genome of your species for each TopHat2 alignment from the Galaxy built-in genomes (mouse mm9 is predefined)
We use the default TopHat2 parameter settings recommended by Kim et al. (2013).
The single-end read mode is also predefined but can be changed in the TopHat2 settings.
Moreover, Trapnell et al. (2012) recommend against using a genome reference annotation in the genome alignment step, because doing so would prevent the identification of novel, as yet uncharacterized transcripts.
o Start the workflow
o Obtain your results
A list of all genes and, additionally, a list containing only the significantly differentially expressed genes
A list of differential splice variants of each primary transcript
A list of differential promoter use between the samples
A list of significantly upregulated / downregulated genes
A link to DAVID to further analyze the significantly differentially expressed genes with regard to their annotation and impact on the phenotype (rerun the module with the identifiers in column 3)
A list of significantly upregulated / downregulated miRNAs, including their predicted targets that are also significantly upregulated / downregulated
A list of protein-protein interactions based on upregulated mRNAs
A table containing all the obtained results, ready for import into Cytoscape; an example file can be seen here:
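Once a gene-level result table has been downloaded from Galaxy, the split into upregulated and downregulated genes described in the result list can also be reproduced offline. The following Python snippet is a minimal sketch of that idea and is not part of TRAPLINE itself. It assumes a tab-separated table with a header row and the columns `gene`, `log2(fold_change)` and `significant` (with the values `yes` / `no`); these column names are assumptions modeled on Cuffdiff-style output and should be checked against your actual files. The function name and the example genes are hypothetical.

```python
import csv
import io


def split_significant_genes(table_text, gene_col="gene",
                            fc_col="log2(fold_change)",
                            sig_col="significant"):
    """Split a tab-separated differential expression table into
    lists of significantly upregulated and downregulated genes.

    The column names are assumptions (Cuffdiff-style); adjust them
    to match the header of the table you downloaded.
    """
    up, down = [], []
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    for row in reader:
        # Keep only rows flagged as significant.
        if row[sig_col].strip().lower() != "yes":
            continue
        try:
            # float() also accepts "inf" and "-inf".
            log2_fc = float(row[fc_col])
        except ValueError:
            continue  # skip rows without a numeric fold change
        if log2_fc > 0:
            up.append(row[gene_col])
        elif log2_fc < 0:
            down.append(row[gene_col])
    return up, down


if __name__ == "__main__":
    # Hypothetical example rows, for illustration only.
    example = (
        "gene\tlog2(fold_change)\tsignificant\n"
        "GeneA\t2.1\tyes\n"
        "GeneB\t-1.7\tyes\n"
        "GeneC\t0.4\tno\n"
    )
    up_genes, down_genes = split_significant_genes(example)
    print("upregulated:", up_genes)
    print("downregulated:", down_genes)
```

To use it on a real file, read the file contents into a string and pass them to `split_significant_genes`, overriding the column arguments if your table uses different headers.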
TRAPLINE can be cited via:
Betel, D., Koppal, A., Agius, P., Sander, C., and Leslie, C. (2010). Comprehensive modeling of microRNA targets predicts functional non-conserved and non-canonical sites. Genome Biol 11, R90.
Chatr-Aryamontri, A., Breitkreutz, B.J., Oughtred, R., Boucher, L., Heinicke, S., Chen, D., Stark, C., Breitkreutz, A., Kolas, N., O'Donnell, L., et al. (2015). The BioGRID interaction database: 2015 update. Nucleic Acids Res 43, D470-478.
Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., and Salzberg, S.L. (2013). TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol 14.
Trapnell, C., Roberts, A., Goff, L., Pertea, G., Kim, D., Kelley, D.R., Pimentel, H., Salzberg, S.L., Rinn, J.L., and Pachter, L. (2012). Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc 7, 562-578.