24 May 2016
By Shelley Stall
This is part of a new series of posts that highlight the importance of Earth and space science data and its contributions to society. Posts in this series showcase data facilities and data scientists; explain how Earth and space science data is collected, managed and used; explore what this data tells us about the planet; and delve into the challenges and issues involved in managing and using data. This series is intended to demystify Earth and space science data, and share how this data shapes our understanding of the world.
Within the scientific data lifecycle, from data acquisition to publication and preservation, the data manager (also known as a data steward) plays an increasingly important and often unappreciated role. The role is growing in importance because of the rapid growth in the volume of data (growth that is not matched by the funds to manage it), the need for interoperability of these data, and new regulations regarding open access and long-term preservation. Data managers are driven by the conviction that well-documented, citable and preserved data is an investment in science, one that is critical to future discoveries.
A Call for Recognition
Data managers play increasingly central roles across the entire data lifecycle (and should continue to do so), and they deserve better recognition in publications. When managed well, your data is an asset that the scientific community can trust beyond your lifetime, and a better resource for you and for science. When managed poorly, your data is more likely to have a short life and to contribute little or nothing to future scientific understanding.
To establish a common point of view, the role of a data manager includes activities such as:
- receiving submitted data from scientists, researchers or teams
- ensuring that the received data is complete
- validating that the data can be understood through the metadata
- adjusting the format of the data so that it is compliant with the repository standards
- making the data accessible and discoverable at the right time
- preserving and archiving the data into the future
Researchers increasingly recognize the value of involving data experts in the collection and organization of their data, and that doing so improves both efficiency and data quality. If data managers are so important to the scientific community, their contributions must be recognized. Further discussion on this subject can be found in this new D-Lib publication on the roles and responsibilities of data stewards.
Building a Career in Data Management
What drives members of our community to become data managers? There is no degree program in Scientific Repository Data Management. The best education available for those who wish to become data experts is a data management workshop, a data curation workshop, or on-the-job training from people who already do the job. Some certificates exist to show completion of classes, and some skills are taught in Library Management course work, but there is very little training of researchers in how, and why, to work with data experts. Such training is becoming an important part of best practices and ethics around data.
If data management is such a worthy task, how do we make it a career goal for students? Historically, data management has been a secondary career after learning about a data domain. This is also true outside the science community in areas like healthcare and finance. Yet we are at a turning point where we must demonstrate that this vital role of data curation and preservation is a rewarding career path—and facilitate understanding of what it takes to become a data manager.
Beyond curation and preservation tasks, data managers typically have the following:
- Excellent data management practice knowledge that helps them be efficient and consistent.
- Excellent customer service that is timely and knowledgeable.
- Expert skills in data management tools supporting data quality assessment.
- Ability to support and adhere to multiple data and metadata standards.
- Experience in data infrastructure implementation that includes reliable archive storage, and data store backup.
- Savvy understanding of grants, proposals and access to other funding.
- Ability to accurately estimate costs and manage to a budget.
- Communication skills to participate in, and contribute to, community awareness of good data management and of the scientific community's future needs.
- Eagerness to collaborate with other scientific data managers who can assist with improving their practices.
The list is nearly a complete job description. Add “works well in a team” and “has the patience of a saint” and you get the picture.
Connecting the Dots for Improved Impact
So, I wish to raise these questions. How do we acknowledge our data managers?
To what extent should we include them as co-authors on submitted data and papers, so that when the data are published and cited, their contributions are recognized?
Are there other ways to develop data management skills and provide viable paths that will encourage others to take on this important role?
At the very least, next time you talk with your data manager, let them know you appreciate them and their contributions to the scientific community.
Thank you to the Earth and space science data managers!
—Shelley Stall is Assistant Director for the American Geophysical Union’s Enterprise Data Management Program.