The iPlant Leaflet 09-03

Issue 09-3

Greetings from the iPlant Collaborative! As we ease into the long days of summer, we invite you to read about how we are moving forward on iPlant's initial Grand Challenge Project, what this summer's Teacher Fellows will be researching, and other iPlant activities.

A Vision for iPlant's Cyberinfrastructure: An Interview with Dan Stanzione

Conducted and edited by Tina Lee, Spec. Asst., tinal@iplantcollaborative.org


What's the biggest challenge in building cyberinfrastructure for the Grand Challenges?

We actually face several challenges. One is that the community-at-large has no clear picture of what cyberinfrastructure or "CI" is. It's a lot of different things to different people, so it's difficult to set and meet expectations. For some it's supercomputing, for others it's hardware and software, but it's probably somewhere in between. Given our mandate--to build cyberinfrastructure for plant science Grand Challenges--it's a significant philosophical challenge out of the gate. The technical challenge is that in plant sciences, there's less definition of what proper cyberinfrastructure is; there are no central models. One big difference between cyberinfrastructure for the plant sciences and other fields, say cosmology, is that in most other fields, CI is driven by simulations--you can do simulations, modeling, and predictive analyses. But plant biology is more of a data-driven, observational science, where analysis of large data sets is needed. It's a different set of problems than CI projects have tackled in the past. It's not a scale issue; it's disparate datasets and analysis, and figuring out where computing fits into the research workflow.

What is the iPlant Collaborative's vision for the cyberinfrastructure and what drives that vision?

Let me take the second part first, because what's driving our vision is very clear. The major driver is the needs of the Grand Challenge problems that the community has identified. Saying that we are going to build "a plant science cyberinfrastructure" is too broad a vision, so what we did instead is bring the plant science community together to select the most important questions to solve and identify the requirements of those challenges, and now we are working to build the system based on those requirements. Too often cyberinfrastructure projects have the mentality of "if you build it, they will come" - we've taken the totally opposite approach, and I'm convinced our Grand Challenge users will adopt what we build because we are responding to their needs.

Regarding iPlant's cyberinfrastructure, it's clear we need to reuse many components that are out there and apply existing best practices to minimize reinventing the wheel. We need to integrate and improve what's already out there. Next, we will have a very open and flexible architecture with the ability to incorporate datasets and new tools. Any decisions we make now are going to be outdated before the end of the project, so a flexible architecture will allow us to be agile enough to quickly make changes. And last - data integration. It's the largest of our challenges in building cyberinfrastructure for plant sciences. So we start with the most important datasets, keeping in mind that new data will become available all the time. So we leverage what's there rather than replace it. For example, we might have existing phylogenetic trees that we store, we will have trees that we build, and public users may want to bring their own trees to the party. Of course, we also want the ability to get data out in a usable format, which is an important aspect of collaboration, whether for working teams or just for communication.

What does 'success' look like to you?

I think success will have several layers. Certainly, it's people using what we build--if the CI is widely adopted by the community, that would be the first layer. Second is if we accelerate discoveries in science and enable solutions to Grand Challenges--help build a tree of 50,000 green plants, understand how crops adapt to low water use. It'll be very hard to say we 'enabled' discoveries, but there are ways to measure success, such as how many scientists are attacking Grand Challenge questions, and the quality of the uses of our cyberinfrastructure. If no one uses it, that's failure for sure. Shallow use would be failure too, but if a small number of users are doing high quality work with it, that would be success. Quality, not just quantity.

How is your team organized and how will you integrate their work across iPlant's different sites?

We've got a number of different functions we need to do. For each Grand Challenge question, we have an 'engagement team' to work with the users to develop the detailed requirements for the discovery environments. Right now, we've got Project Manager Karla Gendler and Scientific Lead Sheldon McKay leading those efforts with the Phylogenetics/iPToL Grand Challenge. To design the particular discovery environments, they must figure out the tools, the datasets that need to be incorporated, the functionalities, and the output formats, do the prototyping, and work iteratively to refine the tools. The engagement teams are the bridges to the core developers who will build the usable production versions of the discovery environments. Physically, our developers are spread across four sites: University of Arizona, Cold Spring Harbor Laboratory, Texas Advanced Computing Center at UT-Austin, and Arizona State University. We specifically did not try to compartmentalize the functions by each site. It's a little more difficult, but we wanted to be more collaborative.

When will we see something that we can try?

The first prototype discovery environments will be available this fall, around November. I hope to announce them at Supercomputing '09 in Portland. We will probably have online versions in early fall for the beta-testing community, where people can request to be an early user and give us feedback. We also have a couple of things on the Education, Outreach, and Training side that are going to be demonstrated at the Gene Annotation Workshop that iPlant is hosting at Washington University. We are using a parallelized BLAST service that will allow people to submit jobs from a web interface. We also have some little tools from some of our pre-project work that enhance existing tools for visualization. People won't have to wait too much longer before there'll be iPlant tools to test out.

Phylogenetics Grand Challenge Team Kickoff Meeting

By Karla Gendler, iPlant Project Manager, gendlerk@iplantcollaborative.org

iPlant officially kicked off the Phylogenetics Grand Challenge project, meeting with the seven iPlant Tree of Life team leads at NESCent in Durham, NC, on May 7-8, 2009. The threefold goals of the meeting were ambitious:
1) define the collaborative relationship between iPlant and the Principal Investigators of the iPlant Tree of Life (iPToL) team
2) develop a management plan for the collaboration project
3) create a roadmap for work to be carried out over the next two years.

To achieve the project goals in the 2-year timeframe, four collaborative working groups were proposed, each having focused development goals. Each group will comprise an iPToL superuser or faculty member designated as the lead and point of contact, post-docs and/or graduate students, and members of iPlant staff assigned to the Grand Challenge project. The four main groups are: 1) Big Trees, focusing on construction of large phylogenetic trees of more than 50,000 species; 2) Data Assembly, focusing on assembling the data to produce large trees; 3) Tree Reconciliation; and 4) Ancestral Character State Reconstruction. Additionally, two crosscutting working groups, Data Integration and Visualization, were proposed to develop shared data and compute infrastructure. Working groups will self-organize and meet by phone as required. While actual implementation of the four main working groups will be somewhat staggered, organization of all working groups will commence as soon as possible.

To oversee the progress of the working groups and manage overall project coordination, a Steering Committee comprising Michael Sanderson, Michael Donoghue, Pam Soltis, Dan Stanzione, Steve Goff, Sheldon McKay, and Karla Gendler will meet monthly.

The kickoff meeting also tackled the very important issue of compensation. Although the budget for the Phylogenetics Grand Challenge is still under consideration, compensation methods for community members involved in the project will include a competitive fellowship program for post-docs and graduate students, along with summer salary support for faculty. Support for workshops, meetings, and travel will also be included.

The team also proposed two workshops over the next two years. The first workshop, to be held late fall/early winter 2009, will be on Data Assembly. This workshop will bring scientists together to begin to identify the gaps in current data so that new data could be generated, on separate funding, when tools are ready to run a new large tree. The second workshop, proposed for 2010 in Europe, would present an opportunity to engage a broader set of international collaborators.

In summing up the kickoff, iPToL team member Michael J. Sanderson said, "We had to re-formulate and re-focus in some novel ways, but we are pretty confident that these activities will lead to something exciting."

From Classroom to Field and Back Again: iPlant Summer Teacher Fellowships

By Lisa Howells, iPlant K12 Administrator, lhowells@iplantcollaborative.org

Building upon last year's successful iPlant fellowship program, nine biology, mathematics, and computer science teachers from high schools in Arizona, Ohio, Wisconsin, and Texas will spend six weeks this summer at the University of Arizona (UA) in Tucson. The nine Teacher Fellows, selected from a pool of 55 applicants nationwide, have been paired with UA researchers to assist with research ranging from gene annotation to field ecology and ecoinformatics. Drawing upon their summer research experiences, Teacher Fellows will also develop content lessons and modules for integrating "computational thinking" into science and math curricula. These materials will then be made accessible to teachers seeking to incorporate "computational thinking" into their secondary school curricula.

The iPlant Collaborative welcomes the 2009 Summer Teacher Fellows:

Don Moore (Biology), South Shore Jr-Sr High School, Port Wing, WI
Don will conduct field ecology studies related to shrub seedling establishment or the decomposition of plant material in desert grasslands under Dr. Steve Archer, University of Arizona (UA) Natural Resources.

Dean Keller (Math), Cholla Magnet High School, Tucson, AZ
Also assigned to Dr. Steve Archer's lab, Dean will help mine data from existing studies in the refereed ecological literature, looking for patterns and trends across a range of geographic regions.

Steven Beall (Biology) and Daniela Figueroa (Mathematics), Luz Academy, Tucson, AZ
This math-biology team will assist Dr. Judith Bronstein (UA Ecology & Evolutionary Biology) in exploring the ecology of interactions between plants and moths, focusing on a moth species whose adults pollinate the sacred datura (Datura wrightii) plant while its caterpillars feed on datura leaves.

Cheryl Dunham (Biology) and Stephen Geislinger (Mathematics), Arcadia HS, Phoenix, AZ
Under Dr. David Galbraith (UA Plant Sciences, BIO5 Institute), this math-biology team will use microarrays to compare two sequenced rice genomes, Nipponbare and 9311.

Michael Fritz (Computer Science), Columbus School for Girls, Columbus, OH
Working under Dr. Brian McGill (UA Natural Resources) with a large pre-existing census dataset of tree species across the U.S., Michael will help apply a newly developed model to predict species ranges.

Cheryl Parks (Biology), Longview Global High School, Longview, TX
Cheryl will assist Dr. Ravi Palanivelu (UA Plant Sciences, BIO5 Institute) in setting up, acquiring, and processing time-lapse images of pollen tube-ovule interactions and quantifying the pollen tube behaviors with a variety of computational tools.

Tun Liang Ong (Biology), Chavez High School, Houston, TX
Using comparative genomics, Tun Liang and two high school KEYS interns will perform genome annotation and evolutionary analysis in the lab of Dr. Rod Wing (UA Plant Sciences, BIO5 Institute).


Tracking the Progress and Development of the iPlant Collaborative

By Barbara Heath (East Main Educational Consulting) and Susan Brown (iPlant Collaborative)

In an effort to understand the development of the iPlant Collaborative and the community that forms around it, a social science team and an external evaluation team have been included during the early stages of program planning. The social science research team is led by Dr. Susan Brown (suebrown@eller.arizona.edu) of the University of Arizona and the evaluation team is led by Dr. Barbara Heath (bheath@emeconline.com) of East Main Educational Consulting.

As part of the evaluation and research plans, community members are invited to participate in various data collection activities depending upon the iPlant event. At this time, we are reaching out to the community in the form of a short survey that will require no more than 10 minutes of your time. Your responses will be used in many different ways, including:

Studying adoption, use, and impacts of technology-related changes in society;

Furthering our understanding of how and why people choose to participate in cyberinfrastructure initiatives, as well as the ways in which collaborative relationships are strengthened and expanded;

Applying what is learned to the iPlant project to help make the experience a positive one;

Independently measuring the success of the iPlant Collaborative in relation to the project's defined goals and outcomes;

Generating formative reports so that the project team and Board of Directors can make data-based project decisions;

Creating a feedback loop that includes the community of users.

In order to ensure that each survey is concise and takes up a minimal amount of your time, we have combined the surveys from the social science research and the evaluation teams. This means that each survey has more value, as it will be used in several different capacities. We hope that you will assist us by responding to the survey. Both the feedback to iPlant and social science theory and practice are stronger when more participants respond. We thank you in advance for your cooperation! Please visit http://www.emeconline.com/survey3/TakeSurvey.aspx?SurveyID=7lKI5p2 to take the online survey.

 

iPlant Website User Survey

The iPlant Collaborative is undertaking a redesign of its website (www.iplantcollaborative.org), in part to lay the groundwork for future Discovery Environments. With a focus on improving the user experience, we've created a short survey to get your feedback on the site's structure, functionalities, and content. So here's an open invitation to the community: please visit us at Survey Monkey: http://www.surveymonkey.com/s.aspx?sm=WvY0kBMKRJd5bKeuIzN4Wg_3d_3d

Tell Us Exactly What You Think!

iPlant Welcomes New Board Chair: Gwen Jacobs

iPlant is pleased to announce that Gwen Jacobs (gwenajacobs@gmail.com) is the new Chair of the Board of Directors. Many thanks to iPlant's first Chair, Rob Last, whose insights and dedication to seeing iPlant succeed helped steer iPlant through its first year and a half. Rob remains on the Board of Directors as Associate Chair.

Upcoming iPlant Activities and Events

The iPlant Collaborative Workshop on "Genomics in Education: Gene Annotation and Comparison"
Washington University, St. Louis, MO, June 16 - 19, 2009. This iPlant workshop will focus on gene annotation and comparison in educational settings. Although the workshop is full, the workshop program is available to view online.

New Process for Grand Challenge Workshop and Collaboration Proposals

With the current engagement of the Phylogenetics and the Genotype-to-Phenotype Grand Challenge teams, anyone interested in submitting a Workshop or Collaboration proposal should first contact iPlant's Project Director, Steve Goff, sgoff@iplantcollaborative.org or 520.626.4224.

Upcoming Conferences Where You Can Find the iPlant Collaborative

Evolution 2009
June 12 - 16, 2009, Moscow, ID (S. McKay)

TeraGrid Outreach Meeting
June 26, 2009, Washington, D.C. (L. Howells)

Computer Science Teachers Association
June 27, 2009, Washington, D.C. (L. Howells)

Earth Science Information Partners Federation
July 7 - 10, 2009, Santa Barbara, CA (N. Merchant, "View from the Field" session, July 8, 1-3 PM)

Generic Model Organism Database Summer School
July 13 - 16, 2009, NESCent, Durham, NC (S. McKay)

Society for Developmental Biology (Satellite Meeting)
July 23-24, 2009, San Francisco, CA (R. Jorgensen)

EBI, Synteny Visualization in Comparative Genomics
July 31, 2009, Cambridge, UK (S. McKay)

GMOD Summer School
August 3-6, 2009, Oxford, UK (S. McKay, Instructor)

Get "Linked in" to the iPlant Collaborative

We invite you to join the professional networking group that iPlant has created on LinkedIn: http://www.linkedin.com/groups?gid=1887479. We will use this group to announce upcoming iPlant events, conduct community discussions, and provide a forum for the community to share discoveries, resources, and ideas as we move forward together to build cyberinfrastructure to help solve Grand Challenge questions.

Subscribe to The Leaflet

We value your feedback. To subscribe, leave a comment, or suggest a topic for The iPlant Leaflet, click here.

For more information: Visit us at http://www.iplantcollaborative.org or contact Steve Goff, Project Director, sgoff@iplantcollaborative.org
