About

     The Data Conservancy (DC) embraces a shared vision: scientific data curation is a means to collect, organize, validate and preserve data so that scientists can find new ways to address the grand research challenges that face society.  DC infrastructure development will proceed iteratively, with testing and prototyping of systems informed directly by user-centered design and by feedback from information science and computer science research.  At every stage, the Data Conservancy will deeply engage domain scientists, generate substantive broader impacts and focus on sustainability planning.  DC embraces the concept of principles of navigation rather than a rigid road map for infrastructure development.  Our initial prototyping efforts have resulted in:

  • A well-designed, modular architecture that follows the Open Archival Information System (OAIS) reference model.
  • A data model modified from the PLANETS project.
  • A storage framework abstraction that demonstrates seamless use of two different storage systems.
  • Content from the Sloan Digital Sky Survey and Dry Valleys prepared for preservation and ingested into the DC digital archive.
  • Pilot projects that build upon DC APIs including:
    • Interoperability between DC and the National Snow and Ice Data Center glacier photo service.
    • Connection of data and publications through arXiv.org.
    • Access to DC data through the Sakai collaboration and learning environment.
    • Integration of DC data with an existing science research framework through the International Virtual Observatory Alliance (IVOA).
  • A proof of concept focused on ice road development in Alaska that demonstrates data synthesis and integration from distributed sources.
  • A meta-analysis of urban vulnerability that may lead to a new, holistic science of inherent urban vulnerability.

     Perhaps most importantly, DC's initial prototyping efforts have illuminated the complex, large-scale socio-technical dimensions of infrastructure development, particularly as they relate to multi-disciplinary or trans-disciplinary fourth paradigm science.  Even at this early stage, the Data Conservancy has developed the foundation for full-fledged preservation, improved conduct of science, greater insights into current science and frameworks for new forms of science.  However, much work remains, both to achieve operational support for data management and to investigate further the full problem space of infrastructure development.

 

     The Data Conservancy will address its goals with a comprehensive program comprising four interdependent objectives and teams: infrastructure research and development (IRD), information science and computer science research (IS/CS), broader impacts (BI), and sustainability (S).  In addition to the four objective-based teams, DC will begin with four Scientific Working Groups (SWGs) in the areas of astronomy, earth sciences, life sciences and social sciences.  Each SWG will be charged with specific tasks aimed at ensuring that Data Conservancy development provides appropriate support and resources for the scientific research and education communities.

 

 

     The Data Conservancy is sponsored by the National Science Foundation under DataNET award OCI0830976.