The Workbench design defines a set of core data structures for describing pathways, chemical compounds, interactions, and so forth. Data may be imported into, and exported from, these data structures in a variety of file formats.
File formats will be supported through Workbench plugins, each of which imports and exports data for one specific file format, such as SBML, BioPax, or KGML. A file format plugin handles the translation between the Workbench's generic data structures and the format's own view of the data.
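The plugin arrangement described above can be sketched as follows. This is a minimal illustration, not the Workbench's actual API: the names `Pathway`, `FormatPlugin`, `SifPlugin`, `import_data`, `export_data`, and `PLUGINS` are all hypothetical, and the toy plugin handles only SIF-style lines of the form "source relation target".

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Pathway:
    """Hypothetical stand-in for the Workbench's generic pathway structure."""
    name: str
    compounds: list = field(default_factory=list)
    # Each interaction is a (source, relation, target) triple.
    interactions: list = field(default_factory=list)


class FormatPlugin(ABC):
    """Hypothetical plugin interface: one subclass per file format."""
    format_name = ""

    @abstractmethod
    def import_data(self, text):
        """Translate the format's text into a generic Pathway."""

    @abstractmethod
    def export_data(self, pathway):
        """Translate a generic Pathway into the format's text."""


class SifPlugin(FormatPlugin):
    """Toy plugin for SIF-style lines: 'source relation target'."""
    format_name = "SIF"

    def import_data(self, text):
        pathway = Pathway(name="imported")
        for line in text.splitlines():
            parts = line.split()
            if len(parts) != 3:
                continue  # skip blank or malformed lines in this sketch
            source, relation, target = parts
            for compound in (source, target):
                if compound not in pathway.compounds:
                    pathway.compounds.append(compound)
            pathway.interactions.append((source, relation, target))
        return pathway

    def export_data(self, pathway):
        return "\n".join(" ".join(triple) for triple in pathway.interactions)


# Registry keyed by format name, so a caller can pick a plugin at run time.
PLUGINS = {plugin.format_name: plugin for plugin in (SifPlugin(),)}
```

A caller would look up a plugin by format name, for example `PLUGINS["SIF"].import_data(text)`, and work only with the generic `Pathway` afterwards. Keeping the format-specific parsing inside the plugin is what lets the rest of the application stay independent of any one file format.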
To support a wide variety of file formats and pathway applications, the Workbench's data structures contain a richer set of data than that supported by most existing file formats. For instance, the Workbench will support simulation parameters not supported by the BioPax format, and ontology data not supported by SBML files. Export to these formats must omit this data, and import from them will produce sparsely filled data structures in the Workbench.
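The lossy export and sparse import described above might look like the following sketch. Everything here is an assumption made for illustration: the `Reaction` record, its field names, the `SUPPORTED_FIELDS` capability sets (which simply mirror the two examples in the text), and the functions `export_record` and `import_record` are hypothetical and do not come from the Workbench design.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Reaction:
    """Hypothetical generic record; None marks data that is absent."""
    name: str
    rate_constant: Optional[float] = None  # a simulation parameter
    ontology_term: Optional[str] = None    # an ontology classification


# Assumed capability sets, mirroring the two examples given in the text.
SUPPORTED_FIELDS = {
    "BioPax": {"name", "ontology_term"},
    "SBML": {"name", "rate_constant"},
}


def export_record(reaction, format_name):
    """Return (exported fields, names of populated fields that were omitted)."""
    supported = SUPPORTED_FIELDS[format_name]
    exported, omitted = {}, []
    for f in fields(reaction):
        value = getattr(reaction, f.name)
        if value is None:
            continue  # nothing to export for an unset field
        if f.name in supported:
            exported[f.name] = value
        else:
            omitted.append(f.name)
    return exported, omitted


def import_record(data):
    """Build a Reaction from a dict; fields the format lacks stay None."""
    return Reaction(**data)
```

Exporting a fully populated record to "SBML" in this sketch drops `ontology_term` and reports it in the omitted list; importing the result back yields a record whose `ontology_term` is `None`, which is the sparsely filled structure the text refers to.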
To support the full range of data in the Workbench's data structures, the Workbench defines a new PATH file format. The format describes a pathway, ontologies and other typing schemes, simulation parameters, presentation and layout values, and database-specific IDs.
The following table characterizes the range of data storable in each of the file formats intended for the Workbench:
File formats: Workbench PATH, BioPax, CellML, GML, KGML, PSI-MI, SBML, SIF

Network data
Structure: yes yes yes basic yes yes yes basic
Geometry: yes basic
Simulation: yes yes yes
Presentation: yes basic basic

Domain knowledge
Cell types: yes yes yes
Compartment types: yes yes yes
Network types: yes
Organism types: yes yes yes
Rate law types: yes
Reaction types: yes
Compound types: yes yes basic yes
Units of measure: yes yes yes
Comments:
The Workbench will recognize several standard ontologies for chemical compounds, reactions, organisms, compartments, and so forth. The Gene Ontology project, for instance, provides classifications for cellular components, biological processes, and molecular functions, while the NCBI Taxonomy database classifies organisms. These ontologies provide type names and features for various components in a reaction network and enable the Workbench to provide better network validation and visualization.
The Workbench design supports the following specific ontologies and their file formats:
Data type                 Ontology                                         Source
Cell types                Cell type                                        Open Biomedical Ontologies
Compartment types         Cellular component ontology                      Gene Ontology Project
Pathway types             Biological process ontology                      Gene Ontology Project
Organism types            Organism taxonomy                                National Center for Biotechnology Information
Rate law types
Interaction types         Molecular function ontology                      Gene Ontology Project
Chemical compound types   Protein types, Small molecule types              Alliance for Cell Signaling, Molecule Pages
Units of measure          International System of Units                    Bureau International des Poids et Mesures
Units of measure          Reference on Constants, Units, and Uncertainty   National Institute of Standards and Technology (NIST)