iPToL

Assembling the Tree of Life for the Plant Sciences (iPToL)

Knowledge of evolutionary relationships is fundamental to biology, yielding new insights across the plant sciences, from comparative genomics and molecular evolution, to plant development, to the study of adaptation, speciation, community assembly, and ecosystem functioning. Although our understanding of the phylogeny of the half million known species of green plants has expanded dramatically over the past two decades, the task of assembling a comprehensive "tree of life" for them presents a Grand Challenge. Its solution will require a significant intellectual investment at the developing intersection between phylogenetic biology and the computer sciences. We have brought together plant biologists and computer scientists to build the cyberinfrastructure needed to scale up phylogenetic methods by 100-fold or more, to enable the dissemination of data associated with such large trees, and to implement scalable "post-tree" analysis tools to foster integration of the plant tree of life with the rest of the botanical sciences.

The effort to unravel the evolutionary relationships among all living things, and to express them in the form of a phylogenetic tree of life, is one of the most profound scientific challenges ever undertaken, and represents a true "moonshot" for the life sciences. We anticipate that early success in addressing the plant phylogeny problem will be especially useful in connection with other Grand Challenge Projects supported through the iPlant Collaborative that involve comparisons between genes, genomes, or species, ensuring a broad impact of the project as a whole.

Finally, the plant tree of life provides exciting opportunities for training and outreach at all levels. Since Darwin, the tree of life has proven to be a very accessible visual metaphor for nonscientists, providing an elegant opening for communicating results in the plant sciences and evolutionary biology to people with diverse backgrounds.

Participants in the large organizational workshop that led to our white paper agreed on the fundamental importance of four main goals. In approximate order of increasing difficulty, they are:

  1. Developing scalable cyberinfrastructure for the analysis of two specific problems of widespread interest that require a phylogeny: inferring ancestral traits (and trait changes) on a tree, and inferring the history of gene duplication and loss in gene families.
  2. Developing database integration cyberinfrastructure to enable relatively seamless import of trees from existing tree databases (e.g., TreeBASE, Pfam).
  3. Developing high-performance computing tools to permit reconstruction of trees 1-2 orders of magnitude larger than is currently practical, and assembling data sets for plants to take advantage of this new cyberinfrastructure.
  4. Developing tree visualization tools that scale well to large trees and communicate evolutionary relationships and annotations of trees effectively for disparate end users.

Collaborative implementation is organized into working groups with focused development goals. Each group has an iPToL superuser or faculty member designated as the lead and point of contact.

The four main working groups are: Big Trees, Data Assembly, Tree Reconciliation, and Trait Evolution.
Two crosscutting working groups to develop shared data and compute infrastructure are Data Integration and Tree Visualization.

Education, Outreach and Training

The iPlant Collaborative offers opportunities for novel approaches to education, outreach, and training at multiple levels, from K-12 to the citizen naturalist to the scientifically literate layperson to the fledgling scientist in training. We envision creative ways to use cyberinfrastructure (CI) to teach about plant biology and new opportunities to train teachers and students in the use of CI. We propose cross-training in biology and computer science for students of all ages, and teacher workshops for training in the use and implementation of CI for teaching plant biology. Our basic, general goals for K-12 education and public outreach are:

  1. Develop CI for application to K-12 education and provide training to teachers to integrate the resulting tools into curricula.
  2. Facilitate access to effective educational materials for a broad public audience (e.g., through websites, YouTube, and new CI developed through this project).
  3. Facilitate access to journals, data, and other information for students and post-docs.

We propose to meet these goals through collaboration with personnel from iPlant and from other Grand Challenge projects.

  4. Develop video clips on plant evolutionary history for public outreach (for dissemination via YouTube, for example). These might be created de novo, or from existing programs such as Nova or Discovery. Surprisingly, existing entries under searches for "tree of life" or "phylogeny" show virtually nothing about phylogenetic relationships among organisms but instead return clips that present the "tree of life" in religious or creationist contexts. Interviews with scientists could provide an entrée into the realm of interactions with scientists: these same scientists could be virtual mentors through the social networking system described in (3).
  5. Hold teacher workshops for training in the use of CI; for example, how to implement the tools proposed above and how to integrate them into curricula. We propose to recruit and support both biology and computer science teachers for summer sessions to work with iPToL and iPlant personnel to develop curricula and to lead the workshops for other teachers. Varying state curricular standards and requirements will be kept in mind to make the teaching materials as broadly useful as possible.
  6. Cross-train students in biology and computer science at both undergraduate and graduate levels. We propose to develop courses and other approaches to integrating biology and computer science, both at our home institutions and centrally through iPlant.

Tree of Life Community Leaders

Main contact

Michael Sanderson, Department of Ecology and Evolutionary Biology, University of Arizona. Research interests: Computational phylogenetics, plant systematics; quantitative literacy in biology.

Plant Science Community Leaders

Michael Donoghue, Department of Ecology and Evolutionary Biology, Yale University. Research interests: Diversity and evolution of flowering plants, using phylogenetic trees to understand patterns of diversification, character evolution, biogeography, and ecology. As Director of Yale's Peabody Museum of Natural History, he was directly involved in K-12 and family education and outreach activities, including the production of a museum exhibition entitled "Travels in the Great Tree of Life".

Pamela Soltis, Florida Museum of Natural History, University of Florida. Research and outreach interests: Angiosperm phylogeny, polyploidy (both ancient and recent), and the origin and evolution of the flower; student mentoring and public outreach through teacher education and museum exhibits and programs.

Douglas Soltis, Department of Botany, University of Florida. Research and outreach interests: Angiosperm phylogeny, genetic and genomic consequences of genome doubling (both ancient and recent), phylogeography, conservation genetics, and the origin and subsequent diversification of the flower.

Computational Science Community Leaders

Val Tannen, Department of Computer and Information Science, University of Pennsylvania. Research and outreach interests: Databases and bioinformatics; systems for data integration and sharing between collaborating scientists; data provenance; phylogenetic data modeling; and the integration of AToL data resources.

Alexandros Stamatakis, Department of Computer Science, Technische Universität München. Research and outreach interests: Design of algorithmic and HPC solutions for large-scale phylogenetic inference; fostering communication and collaboration between computer scientists and biologists.

Todd Vision, Department of Biology, University of North Carolina. Research and outreach interests: Computational genomics and genome evolution. He teaches courses in computational and evolutionary genetics at UNC Chapel Hill, and has been an organizer since 2007 of the Phyloinformatics Summer Courses and Phyloinformatics Summer of Code through the National Evolutionary Synthesis Center. Vision has worked with the Destiny Science Bus program to bring inquiry-driven bioinformatics, plant biology and evolutionary biology educational opportunities to underserved secondary students in North Carolina.

iPToL Engagement Team

Sheldon McKay, Scientific Lead, Cold Spring Harbor Laboratory
Michael Gonzales, Project Manager, TACC, University of Texas
Natalie Henriques, Administrative Assistant, TACC, University of Texas
Mary Margaret Sprinkle, Administrative Assistant, University of Arizona
Adam Kubach, Analyst, University of Texas
Bernice Rogowitz, Analyst, TACC, University of Texas
Zhenyuan (Jerry) Lu, Analyst, Cold Spring Harbor Laboratory
Liya Wang, Analyst, Cold Spring Harbor Laboratory
Sharon Wei, Analyst, Cold Spring Harbor Laboratory