The University of Notre Dame Hesburgh Libraries received a grant from the Institute of Museum and Library Services (IMLS) National Leadership Grants for Libraries Program to conduct a collaborative planning effort to develop an open source Research Data & Software Preservation Quality Tool that addresses a universal need for preserving data and software.
As computation is increasingly interwoven with science, today’s researchers can explore and analyze data and possible scenarios more quickly than ever before. The associated software, data, and platforms of these scientific endeavors can foster rapid progress when shared between scientists and information systems. However, preserving and sharing the massive volume of research has become an increasingly challenging effort, and existing solutions are disjointed and vary dramatically across institutions and disciplines. This collaborative project will garner broad institutional and researcher input toward creating a framework of new and existing tools that addresses the critical need for data and software preservation.
The Notre Dame research team is led by Zheng (John) Wang, Associate University Librarian for the Hesburgh Libraries. Wang will be supported by co-PIs Richard Johnson, Co-Director of Digital Initiatives and Scholarship, and Natalie K. Meyers, E-Research Librarian, both of the Hesburgh Libraries, as well as co-PI Sandra Gesing, Ph.D., Computational Scientist at Notre Dame’s Center for Research Computing (CRC).
“The digital age presents significant challenges for libraries and their partners across the research enterprise when it comes to preserving and sharing data and related software in a timely and streamlined manner,” said Wang. “It is imperative that Libraries take a collaborative leadership role with research faculty to develop open source tools that integrate research workflows and library processes to preserve data, software, and methods throughout the research lifecycle.”
The proposed Research Data & Software Preservation Quality Tool will allow reuse of preserved software applications, improve technical infrastructure, and build upon existing data preservation services. Additional outcomes include captured digital workflows and methods, improved data and software provenance, automatically enhanced metadata, and improved file format recognition and data integrity. The planning design allows for input, consensus building, and support from regional, domestic, and international stakeholders. This collaborative approach will ensure that the tool is flexible enough to fit a wide range of existing preservation tools and workflow systems. It will also broaden awareness and adoption of the tool across user communities.
“The project promises to strengthen international opportunities for collaborative software development, help like-minded organizations develop solutions across national and disciplinary borders, and empower the research data repository,” said Sandra Gesing.
The Center for Open Science (COS) joins the project team as a dedicated partner organization. The center’s role will be focused on reproducibility and interoperable data sharing aspects of the project. COS will also provision and support the project’s use of the Open Science Framework (OSF) to store, share, and collaborate on project components. “Data sharing, access and collaboration among researchers are some of our most important priorities at COS,” said Rusty Speidel, Marketing Director at COS. “We are pleased to be involved in developing these critical tools and in furthering the preservation and sharing of open and transparent research.”
Several organizations are project participants, including: the Scientific Information Service at CERN, The Research Data Alliance (RDA) Interest Group on Virtual Research Environments (VRE IG), RDA Interest Group on Metadata (Metadata IG), the Science Automation Technology Laboratory at the USC Information Sciences Institute, as well as Cal Poly’s Project Jupyter. The project team is pleased to have pledges of participation from the Confederation of Open Access Repositories (COAR), SHARE, DataCite, re3data.org registry of research data repositories, the Digital Research and Curation Center at Johns Hopkins University, Yale Libraries, and NCSA’s Midwest Big Data Hub.
Information gathered during the grant-funded work and a detailed project development proposal will be shared transparently on the Open Science Framework (https://osf.io/d3jx7, DOI: 10.17605/OSF.IO/D3JX7) and archived at the project’s end in Notre Dame’s research repository, CurateND (curate.nd.edu).
Contact: Natalie Meyers, Hesburgh Libraries, 574-631-1546, natalie.meyers@nd.edu
Originally published by Notre Dame Research.