Chapter 12 – The promise of data in e-research: Many challenges, multiple solutions, diverse outcomes

Ann Zimmerman, Nathan Bos, Judith Olson, and Gary Olson

The need to share data and to exchange knowledge about data is a primary driver behind many visions of e-science. Yet efforts to share data face considerable social, organizational, legal, scientific, and technical challenges. Large-scale solutions to these barriers have been slow to develop, although some progress is being made in the technological arena. This chapter reports findings from an analysis of the data sharing approaches used by large collaborations in several scientific disciplines. These results are based on research conducted as part of the Science of Collaboratories project, a five-year study funded by the National Science Foundation to investigate large, distributed collaborations across many domains. One type of solution for data sharing that we identify allows researchers to work as they always have, while the labor necessary to prepare data for sharing and to support their reuse is handled by others. In contrast, a second approach forces scientists to consider barriers to data sharing and aggregation at the outset of data collection and to develop solutions to these issues in advance. Our results show that different types of data sharing solutions place different demands on those who produce data and on those who collect and manage data and make them available for others to use. In addition, individuals or small teams of researchers can often conduct their work privately, whereas large-scale collaborations are subject to increased accountability, greater interdependencies, and intensified needs for standardization. We discuss the ways in which these factors affect the production, organization, and sharing of data and the implications they have for the institutions and individuals that produce, manage, provide, and preserve scientific data.