Data Management for Integrative Biology: Guest Editorial

OMICS: A Journal of Integrative Biology

Volume 7, Number 1, 2003 © Mary Ann Liebert, Inc.

Data Management for Integrative Biology

This first issue of Volume 7 of OMICS: A Journal of Integrative Biology is devoted to the special topic of Data Management. The issue arose from the Workshop on Data Management for Molecular and Cell Biology held at the National Library of Medicine on February 2–3, 2003. The purpose of the workshop was to formulate a research agenda for the data management community to develop better technology for supporting bioinformatics applications. This present issue of the Journal contains both a summary of the workshop report and a collection of the white papers submitted by attendees of the workshop.

The impetus for this workshop was the increased demand for data management systems occasioned by the industrialization of molecular and cell biology over the past 15 years. The development of high-throughput technologies for sample preparation, sequencing, microarrays, proteomics, and combinatorial chemistry has led to an explosion in the amount and types of data available for biomedical research. Metabolic models provide a means for linking genomics to pharmacology. It is widely anticipated that these technologies will find clinical applications within a few years, leading to further increases in data volumes. Data management systems are needed to manage such large datasets. Without suitable tools for storing and querying these large datasets, many of the benefits of these massive investments in data collection will be delayed or squandered. Consider the limited utility of the human genome sequence if our only access to it were by reference to printed copies, rather than online approximate sequence matching.

Bioinformaticists have been largely dependent on hand-me-down relational database technology from the business sector. But bioinformatics applications have distinct data management requirements: a wide diversity of data types (sequences, graphs, 3D structures, etc.), extensive use of similarity and pattern matching queries, and a need for data provenance tracking. Furthermore, there is a need to support large-scale (e.g., 500 databases) data integration, the associated terminology management, and rapid schema evolution. The problems of integrating large numbers of databases cannot be solved without the assistance of the individual database providers in the form of machine-processable schemas, ontologies, terminologies, and accompanying query APIs, query languages, and standardized data exchange formats (e.g., XML) and the associated data definitions. We need data management systems better suited to bioinformatics applications. Such systems will not appear spontaneously. The research and development for such bioinformatics data management technology will require federal funding of targeted interdisciplinary research.

We anticipate that bioinformatics applications will be one of the major drivers for innovation in data management technology over the next decade. The resulting data management technologies should greatly facilitate the development of bioinformatics applications, and hence the conduct of biological and biomedical research over the next quarter century. To read the full workshop report, please go to the workshop website: http://www.lbl.gov/~olken/wdmbio/

F. Olken
Lawrence Berkeley National Laboratory
Berkeley, California

H.V. Jagadish
University of Michigan
Ann Arbor, Michigan
