9 December 2015

Making Earth and Space Science Data Matter

Posted by lhwang

By Rebecca Fowler

 

This is part of a new series of posts that highlight the importance of Earth and space science data and its contributions to society. Posts in this series showcase data facilities and data scientists; explain how Earth and space science data is collected, managed and used; explore what this data tells us about the planet; and delve into the challenges and issues involved in managing and using data. This series is intended to demystify Earth and space science data, and share how this data shapes our understanding of the world.

Dr. Bruce Howe and Bill Felton of the University of Washington prepare to deploy a Seaglider, a type of autonomous underwater vehicle, equipped with sensors for measuring salinity, nutrient concentrations and other parameters. Credit: Matthew Grund

Satellites carrying cameras and sensors orbit Earth collecting information about clouds, oceans, land and ice; autonomous underwater vehicles festooned with instruments map the dynamic features of the seas; and weather stations consisting of grids of still more sensors measure atmospheric conditions. These are just some of the technologies that generate the raw data used by scientists to predict weather and climate, by emergency workers to respond to natural disasters and by policymakers to address global challenges.

Delegates from over 195 countries are currently gathered in Paris at the 21st Conference of the Parties (COP21) to discuss one of these challenges. Their goal is to finalize a legally binding international agreement that will limit global carbon emissions. If participants are able to reach such a deal, COP21 will be the starting point for a long-term effort to prevent global temperatures from rising more than 2°C. The Paris negotiations are based on the position of the Intergovernmental Panel on Climate Change (IPCC), the international organization formed by the United Nations that assesses and reports on the state of climate change science by aggregating the peer-reviewed work of some 2,000 climate scientists.

The data that informs the IPCC reports, and thus the COP21 delegates’ decision-making, is a product of the Earth-observing satellites, tools and sensors that federal agencies such as NASA and NOAA, along with other groups, use to monitor global and U.S. climate. These instruments generate petabytes of data each year. A story aired by NPR on 30 November 2015 provides a look at how much data these tools and technologies yield, and how much data comes out of one climate model simulation. In the piece, Dan Duffy, the high-performance-computing lead at NASA’s Center for Climate Simulation, explains that if a person had the same amount of storage on their computer that the Center has, “you could have a music playlist that was 190,000 years long before you would have to listen to the same song twice.”

If Earth and space science data is the raw material used to understand the planet, then the management of this data grows ever more important as the amount of data generated increases. The transformation of raw data into reliable climate records and models that inform decision-making involves a series of critical steps. After data is collected, it is transmitted, processed, organized, analyzed, archived, published and stored at data facilities. Data also requires proper management so that it is accessible, discoverable and usable across diverse platforms.

“Data management is a discipline that ensures that the information about the world is well cared for and understood,” said Shelley Stall, Assistant Director of AGU’s Enterprise Data Management Program. “Librarians have recognized the importance of data management forever. You can’t just put data out there, you have to have metadata and best practices to make it meaningful.”

The goal of data management is to increase the use, and thus the value, of scientific data, which in turn improves our understanding of Earth and its systems. Given that observational data and model simulations are the primary tools scientists use to understand our climate system and global environmental change, the better Earth and space data is managed, the more we stand to learn. The abundance of Earth and space data, and the need to figure out how to handle it, has led to the development of a new discipline: Earth and Space Science Informatics, which deals with issues of data management and analysis.

In recognition of the challenges and opportunities presented by the growing importance of data in the Earth and space sciences, a Data Fair will be held 14-17 December 2015 at the American Geophysical Union Fall Meeting. During a special Town Hall, the U.S. Chief Data Scientist and Deputy Chief Technology Officer for Data Policy, DJ Patil, will share his perspective on Earth and space science data and technology. Those attending the Fall Meeting are invited to participate in Data Fair activities, and everyone is encouraged to stay up to date on data by reading posts in this series.

— Rebecca Fowler is a science communicator and the Director of Communications and Outreach at the Federation of Earth Science Information Partners (ESIP).