The Planteome Project, an international collaborative effort that is Powered by CyVerse, recently announced the first full release of its database and ontology browser. The browser serves as a centralized portal where common reference ontologies (structured, controlled vocabularies) for plants are used to annotate gene expression, traits, phenotypes, genomes, and genetic diversity across a wide range of plant taxa.
The Planteome Project is funded by the National Science Foundation and leverages infrastructure provided by CyVerse to develop data standards and reference vocabularies that can be used universally to describe plant gene and phenotype annotation. With numerous institutional collaborations, the project's goal is to aid scientists in developing improved plant genotypes leading to agricultural crops capable of delivering higher yields and withstanding climate pressures.
"The Planteome platform leverages CyVerse cyberinfrastructure for server, data, and service hosting, reducing the need for redundant infrastructure and increasing service integration opportunities with other CyVerse-hosted and external projects," said Pankaj Jaiswal, Associate Professor in the Department of Botany and Plant Pathology at Oregon State University and leader of the Planteome Project.
"Various protein annotation tools housed by the CyVerse Discovery Environment were used extensively to annotate the proteome of 63 plant species with ontology-based function and phenotype assignments. Annotating a single proteome can take days or weeks on a small local computing infrastructure," he added. "Considering the number of species, and over one million proteins in our proteome set, annotation would not have been possible without the cyberinfrastructure provided by CyVerse."
In addition, Jaiswal noted, hosting the Planteome database on the CyVerse cloud ensures free and ready access for researchers worldwide. The Planteome database currently includes 67,272 ontology terms with links to approximately 1.9 million bio-entities, including proteins, genes, RNA transcripts and gene models, and germplasm. Annotated data were sourced from 24 unique database resources and cover 86 different plant taxa.