The iPlant Cyberinfrastructure
The iPlant Collaborative will have a core infrastructure of hardware, software, and staff to support the goals of Grand Challenge teams and the larger plant sciences community. In addition to building the infrastructure to support the science and technology necessary to help answer Grand Challenge questions, a key goal of the iPlant Collaborative will be to reuse and integrate with relevant technology that already exists in the science community, and to collaborate with other researchers when building new infrastructure is necessary.
The iPlant computational facilities will be designed to support software development as well as the computing and visualization requirements of scientists doing computational modeling, analysis, data discovery, and other computing-intensive experiments. The core will contain shared-memory multiprocessors and clusters and provide an interface to grid computing facilities. It will also contain large storage systems to provide persistent, reliable, and effectively unbounded storage for plant science data. The repository will ensure that key data sets are preserved beyond the lifetime of the projects that produced them. It will also support reproducibility of experimental results by providing mechanisms to archive snapshots of experimental configurations, including all software and data used to generate a given set of results. The infrastructure will be developed and managed so that it is kept at the leading edge of technologies required to solve Grand Challenge problems in plant science.
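As a rough illustration of what archiving a snapshot of an experimental configuration could involve, the sketch below records a checksum for every file in an experiment directory together with the software versions used, and later checks whether anything has changed. It is a minimal sketch under stated assumptions, not a description of iPlant's design: the function names, the manifest layout, and the choice of SHA-256 are all hypothetical, and it covers only integrity checking of files, not the harder problem of capturing a complete software environment.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path, software_versions: dict) -> dict:
    """Record a checksum for every file under root, plus the software versions used.

    The manifest layout ("software" and "files" keys) is a hypothetical choice.
    """
    files = {
        p.relative_to(root).as_posix(): sha256_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    return {"software": dict(software_versions), "files": files}


def verify_manifest(root: Path, manifest: dict) -> list:
    """Return the relative paths that are missing or whose contents have changed."""
    changed = []
    for rel, expected in manifest["files"].items():
        target = root / rel
        if not target.is_file() or sha256_of_file(target) != expected:
            changed.append(rel)
    return changed


def dump_manifest(manifest: dict) -> str:
    """Serialize the manifest to JSON so it can be stored alongside the archive."""
    return json.dumps(manifest, indent=2, sort_keys=True)
```

A manifest built when results are generated could be stored with the archived data; running verify_manifest against a restored copy then reports any file that no longer matches what was originally recorded.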
The iPlant staff is organized into three teams. The Grand Challenge Engagement Teams, one per Grand Challenge Project, interface with the plant science community to help cultivate the science and collaborations necessary to begin answering Grand Challenge questions and to help identify the requirements of the iPlant “Discovery Environments” (DE). The Core Software team is responsible for identifying, designing, building, and testing core assets (libraries, components, and applications) for use within the iPlant DEs. The Core Services team designs and implements the software, networking, storage, hardware, and services infrastructure essential to support the iPlant Cyberinfrastructure.
The Grand Challenge Engagement Team
The Grand Challenge Engagement Team is responsible for helping to identify the science and algorithms that will form the foundation for Discovery Environments. The Grand Challenge Engagement Team members have experience in biology, computer science, enterprise data management, bioinformatics, genomics, and software engineering. Other key GC Engagement personnel add expertise in image analysis, machine learning, workflow management, cluster computing, statistical analysis, and large-scale data management.
The Core Software Team
Core Software engineering staff are professional software engineers whose role is to provide support to research staff for data mining, algorithm implementation, data management, and application development. These software engineers have experience in agile software development, a paradigm that emphasizes early development, flexible requirements analysis, and extensive testing. Software engineers with more traditional training are often frustrated by biological applications because the problems are fluid and underspecified. Agile software developers, who often have open source software development backgrounds, tend to be better suited to the fluid environment of the GC projects.
The Core Services Team
The iPlant Core Services team has a full-time staff to install, develop, document, and maintain software tools in support of Grand Challenge teams, administer the physical infrastructure, and provide help-desk support for users. The team will also have a small research and development group to design, prototype, and eventually deploy software systems that are needed but not available elsewhere. For example, creating a ‘reproducible experiment’ archive as described above is a hard and as yet unresolved problem. One key function of the infrastructure staff will be to ensure that the software scales to both large numbers of processors and large datasets.