3 March 2016

Open data: Creating a culture of transparency and reproducibility in science

Posted by Nanci Bompey

By Rebecca Fowler

This is part of a new series of posts that highlight the importance of Earth and space science data and its contributions to society. Posts in this series showcase data facilities and data scientists; explain how Earth and space science data is collected, managed and used; explore what this data tells us about the planet; and delve into the challenges and issues involved in managing and using data. This series is intended to demystify Earth and space science data, and share how this data shapes our understanding of the world.

Shifting to an open-data culture in science poses challenges, such as moving handwritten notes in the standard lab notebook to an electronic, easy-to-share format. Credit: Rebecca Fowler

Researchers in the field sciences, such as geology, ecology and oceanography, collect an ever-increasing amount of data and samples. These include physical samples, information about how and where the data were collected, and the petabytes of digital data streaming in from sensors and satellites.

An article published today in Science urges stakeholders in the field sciences—funders, researchers, publishers and data repositories—to promote open, reproducible science through the sharing of all data and materials. Allowing research results to be replicated and data to be reused fosters innovation, high-quality research and public confidence in science. There are considerable benefits for scientists who make their data open too.

“Research results that are transparent will be valued more in terms of integrity, and cited more. In addition, doing the right level of linking and data curation, for small and large data sets, upfront, will save much work later; and even more work if the curation is so untransparent or poor that the data need to be collected again or are effectively useless,” said Brooks Hanson, a co-author of the article and Publications Director at the American Geophysical Union.

The sharing of information and materials is known as “open data,” which refers to making the information generated by publicly funded research accessible in a digital format that’s free and without restrictions. This requires transparency in experiment methodology, observations and collection of data; that this information and its outputs, such as data, standards and software, be publicly available and reusable; and that web-based tools be used to facilitate this sharing and reuse.

Research that is open is research that is easy to build upon, cite and replicate, but making data open calls for significant changes in science. In the current culture, data creators often receive little recognition for sharing their work; many data repositories do not yet offer value-added services that would train researchers on data stewardship and deposition of data; and stakeholders, from researchers to funding agencies, are concerned about the costs and resources involved.

The authors acknowledge these issues and are keen to point out the ways the scientific community is responding. For example, the non-profit Center for Open Science advances transparent science by enabling scientists to manage and archive their research using the Open Science Framework, and find a permanent home for their data in the proper domain-specific repository. Community partnerships like the new Coalition for Publishing Data in the Earth Sciences (COPDESS) aim to connect scholarly publication more firmly with Earth science data facilities.

Though moving toward a culture of open data requires investment from stakeholders in the field sciences, Hanson believes the changing nature of science demands it.

“These data and linking are important for integrity and the main use of the literature to advance and document science,” Hanson said. “That is, the data are as or more important in some cases than the particular analysis and have great value for further science. Scientists and funders are also supporting or returning value to society, who also deserve integrity in that society is increasingly dependent on science to inform decisions. So it’s part of our contract with society.”

Hanson’s co-authors on the article are Marcia McNutt of Science, Kerstin Lehnert of Columbia University, Brian Nosek of the University of Virginia, Aaron M. Ellison of Harvard University and John Leslie King of the University of Michigan.

— Rebecca Fowler is a science communicator and the Director of Communications and Outreach at the Federation of Earth Science Information Partners (ESIP).