
Comprehensive data infrastructure for plant bioinformatics

  • University of Texas at Austin
  • Cold Spring Harbor Laboratory

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

The iPlant Collaborative is a 5-year, National Science Foundation-funded effort to develop cyberinfrastructure to address a series of grand challenges in plant science. The second of these grand challenges is the Genotype-to-Phenotype project, which seeks to provide tools, in the form of a web-based Discovery Environment, for understanding the developmental process from DNA to a full-grown plant. Addressing this challenge requires the integration of multiple data types that may be stored in multiple formats, with varying levels of standardization. Providing for reproducibility requires detailed information documenting the experimental provenance of data and the computational transformations applied to data once it is brought into the iPlant environment. Handling the large quantities of data involved in high-throughput sequencing and other experimental sources of bioinformatics data requires a robust infrastructure for storing and reusing large data objects. We describe the currently planned workflows to be developed for the Genotype-to-Phenotype Discovery Environment, the data types and formats that must be imported and manipulated within the environment, and the data model that has been developed to express and exchange data within the Discovery Environment, along with the provenance model defined for capturing experimental source and digital transformation descriptions. Capabilities for interaction with reference databases are addressed, focusing not just on the ability to retrieve data from such data sources, but on the ability to use the iPlant Discovery Environment to further populate these important resources. Future activities and the challenges they will present to the data infrastructure of the iPlant Collaborative are also described.

Original language: English
Title of host publication: 2010 IEEE International Conference on Cluster Computing Workshops and Posters, Cluster Workshops 2010
State: Published - 2010
Event: 2010 IEEE International Conference on Cluster Computing Workshops and Posters, Cluster Workshops 2010 - Heraklion, Crete, Greece
Duration: Sep 20 2010 → Sep 24 2010

Publication series

Name: 2010 IEEE International Conference on Cluster Computing Workshops and Posters, Cluster Workshops 2010

Conference

Conference: 2010 IEEE International Conference on Cluster Computing Workshops and Posters, Cluster Workshops 2010
Country/Territory: Greece
City: Heraklion, Crete
Period: 09/20/10 → 09/24/10

Keywords

  • Bioinformatics
  • Component
  • Data
  • Gateways
  • Metadata
  • Provenance
  • Standards
