Ultra Highthroughput Sequence Pipeline
Challenge: Ultra high-throughput sequencing (UHTS) technologies are rapidly changing the way we approach fundamental questions in biology. Advances in nanotechnology and sequencing chemistry have together revolutionized our ability to obtain low-cost, high-throughput DNA sequence data. As these technologies advance, they pose new challenges for standardizing sequence information and automating computational tasks.
The primary objective of the Ultra Highthroughput Working Group is to establish an informatics pipeline that allows members of the plant science community to process UHTS data (e.g., Illumina, SOLiD) using simple, user-friendly interfaces. iPlant developers work with the UHTS Working Group to build tools that let users import UHTS files and output data matrices using the algorithms the community commonly applies to DNA and RNA data sets.
iPlant’s sequence analysis effort enables users to upload DNA or RNA sequencing data from their desktop, a remote server, or the NCBI Sequence Read Archive, then view, manage, and perform basic analyses on the data in a user-centric workspace. Data management capabilities include annotation with metadata and pre-processing of sequence data to remove non-biological artifacts of sequence production (e.g., linkers and primers). Scientists can then run two basic analytical workflows on their post-processed sequence data in a relatively short time and without complex command-line utilities.
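To illustrate the kind of artifact removal described above, here is a minimal Python sketch of 3' adapter trimming. It is not iPlant's implementation: the function name `trim_adapter`, the `min_overlap` parameter, and the example adapter and read strings are all assumptions made for this example, and the sketch matches adapters exactly, without tolerating sequencing errors.

```python
def trim_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove a 3' adapter from a read (hypothetical helper, exact matching only)."""
    # Case 1: the whole adapter occurs in the read; cut everything from its start.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: only the beginning of the adapter was sequenced at the read's 3' end.
    # Try the longest adapter prefix first and stop at the first match.
    shortest = max(min_overlap, 1)
    longest = min(len(adapter), len(read)) - 1
    for k in range(longest, shortest - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No adapter found: return the read unchanged.
    return read


# Example adapter and reads, chosen for illustration only.
ADAPTER = "AGATCGGAAGAGC"
full = trim_adapter("ACGTACGTAGATCGGAAGAGCTT", ADAPTER)   # adapter fully present
partial = trim_adapter("ACGTACGTAGATCG", ADAPTER)         # adapter cut off by read end
clean = trim_adapter("ACGTACGT", ADAPTER)                 # no adapter
```

In all three example calls the result is the eight-base insert `ACGTACGT`. Requiring a minimum overlap guards against trimming a few bases that match the adapter start by chance.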
Workflows
Variant detection - Supports DNA sequence data and allows users to detect single nucleotide polymorphisms (SNPs) in a test sequence compared to a reference sequence. The workflow takes a library of short-read data and a reference sequence as input and outputs a list of SNPs.
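The input and output of variant detection can be sketched in a few lines of Python. This is a deliberately naive illustration, not the algorithm the workflow runs: it assumes reads have already been aligned to the reference without gaps, and the function name `call_snps`, the thresholds `min_depth` and `min_fraction`, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def call_snps(reference, aligned_reads, min_depth=3, min_fraction=0.8):
    """Naive SNP caller over ungapped alignments (illustrative assumption).

    aligned_reads: iterable of (start, sequence), with start a 0-based
    position on the reference. Returns (position, ref_base, alt_base, depth).
    """
    # Build a pileup: for each reference position, count the observed bases.
    pileup = defaultdict(Counter)
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    snps = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        base, n = counts.most_common(1)[0]
        # Report a SNP when coverage is sufficient and the dominant base
        # differs from the reference base at this position.
        if depth >= min_depth and base != reference[pos] and n / depth >= min_fraction:
            snps.append((pos, reference[pos], base, depth))
    return snps


# Toy data: four reads all carry an A where the reference has T (position 3).
reference = "ACGTACGTAC"
reads = [(0, "ACGAAC"), (1, "CGAACG"), (2, "GAACGT"), (3, "AACGTA")]
snps = call_snps(reference, reads)
print(snps)  # [(3, 'T', 'A', 4)]
```

Production pipelines additionally handle insertions and deletions, base and mapping qualities, and heterozygous sites, none of which this sketch attempts.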
Transcript quantification - Supports RNA sequence data and provides transcript quantification relative to a reference genome. Initially, users will be able to choose among several reference genomes (Arabidopsis thaliana, Zea mays, Arabidopsis lyrata, Brachypodium distachyon, Oryza sativa nipponbare, Oryza sativa indica, Populus trichocarpa, Sorghum bicolor, and Vitis vinifera) as the basis for their analyses.
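The text does not state which expression measure the workflow reports. As one common possibility, the sketch below computes RPKM (reads per kilobase of transcript per million mapped reads) from per-transcript read counts. The function name `rpkm`, the transcript identifiers, and the counts and lengths are hypothetical values for illustration.

```python
def rpkm(read_counts, lengths_bp):
    """Compute RPKM per transcript.

    read_counts: dict mapping transcript ID -> number of mapped reads
    lengths_bp:  dict mapping transcript ID -> transcript length in base pairs
    RPKM = reads * 1e9 / (length_bp * total_mapped_reads)
    """
    total = sum(read_counts.values())
    if total == 0:
        # No mapped reads at all: every transcript has zero expression.
        return {tx: 0.0 for tx in read_counts}
    return {
        tx: count * 1e9 / (lengths_bp[tx] * total)
        for tx, count in read_counts.items()
    }


# Hypothetical example: tx2 has three times the reads of tx1 but is also
# three times as long, so both have the same length-normalized expression.
counts = {"tx1": 500, "tx2": 1500}
lengths = {"tx1": 1000, "tx2": 3000}
expression = rpkm(counts, lengths)
```

Normalizing by transcript length and by total mapped reads makes values comparable across transcripts of different sizes and across libraries of different sequencing depth.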
Additional workflows are under development to allow discovery of novel RNA transcripts, comparative RNAseq analyses, and automated functional annotation of discovered polymorphisms.
Current cyberinfrastructure (CI) services available (with more coming online regularly)
- Bioinformatics software available through the iPlant Discovery Environment
  - Sequence alignments and phylogenetic tree building
  - Phylogenetic and evolutionary analyses
  - Ultra high-throughput sequence processing and variant detection
  - QTL mapping and genome-wide association studies
  - Functional analyses
  - Clustering and network analyses
  - ChIPseq studies
  - Utility tools and scripts
  - Full list at https://pods.iplantcollaborative.org/wiki/display/DEman0p4/Tools+list
- Access to collaboration tools
  - Public and private wiki spaces, mailing lists
  - Video conferencing setup and support
- Data hosting - Access to mirroring, backup, and recovery services at petascale
- Web and application hosting
- Access to persistent virtual machines
- Algorithm development
- Software prototyping
- Command-line access to production and experimental supercomputers, archive systems
- Access to an online bug tracking and issue system
- Git/svn code hosting within iPlant and through SourceForge and GitHub
Working Group Members
| Name | Role | Institution | |
|---|---|---|---|
| Tom Brutnell | Working Group Lead | Donald Danforth Plant Science Center | |
| Justin Borevitz | Collaborator | University of Chicago | |
| Todd Mockler | Collaborator | Oregon State University | |
| Pat Schnable | Collaborator | Iowa State University | |
| Michele Morgante | Collaborator | Università degli Studi di Udine | |
| Bob Schmitz | Collaborator | Salk Institute for Biological Studies | |
| Lin Wang | Collaborator | Cornell | |
| Blake Meyers | Collaborator | Delaware Biotechnology Institute | |
| Matt Hudson | Collaborator | University of Illinois | |
| Scott Jackson | Collaborator | Purdue University | |
| Brad Barbazuk | Collaborator | University of Florida | |
| Greg May | Collaborator | National Center for Genome Resources | |
| Zhenyuan (Jerry) Lu | Collaborator | iPlant Collaborative, Cold Spring Harbor Laboratory | |
| Liya Wang | Collaborator | iPlant Collaborative, Cold Spring Harbor Laboratory | |
| Chunlao Tang | Collaborator | iPlant Collaborative, Cold Spring Harbor Laboratory | |
| Chris Jordan | Collaborator | iPlant Collaborative, Texas Advanced Computing Center | |
