
Item


Conference Paper

Released

Workflows for ingest of research data into digital archives : tests with Archivematica

Kirchner, I., Bertelmann, R., Gebauer, P., Hasler, T., Mettig, N., Klump, J., Peters-Kottig, W., Rusch, B., Ulbricht, D. (2013): Workflows for ingest of research data into digital archives: tests with Archivematica, AGU 2013 Fall Meeting (San Francisco, USA 2013).



http://gfzpublic.gfz-potsdam.de/pubman/item/escidoc:356936
Resources

356936.pdf
(Publisher version), 4MB

Authors
http://gfzpublic.gfz-potsdam.de/cone/persons/resource/jklump

Kirchner, Ingo
External Organizations;
EWIG, External Organizations;

http://gfzpublic.gfz-potsdam.de/cone/persons/resource/rab

Bertelmann, Roland
Library, Scientific Infrastructure and Plattforms, GFZ Publication Database, Deutsches GeoForschungsZentrum;
EWIG, External Organizations;

Gebauer, Petra

Hasler, Tim

Mettig, Nora

http://gfzpublic.gfz-potsdam.de/cone/persons/resource/jklump

Klump, Jens
CeGIT Centre for GeoInformation Technology, Geoengineering Centres, GFZ Publication Database, Deutsches GeoForschungsZentrum;
EWIG, External Organizations;

Peters-Kottig, Wolfgang

Rusch, Beate

http://gfzpublic.gfz-potsdam.de/cone/persons/resource/ulbricht

Ulbricht, Damian
CeGIT Centre for GeoInformation Technology, Geoengineering Centres, GFZ Publication Database, Deutsches GeoForschungsZentrum;
EWIG, External Organizations;

Abstract
Publication of research data and future re-use of measured data require the long-term preservation of digital objects. The ISO OAIS reference model defines responsibilities for the long-term preservation of digital objects, and although software is available to support the preservation of digital data, problems remain to be solved. A key task in preservation is to make datasets ready for ingest into the archive, which the OAIS model calls the creation of Submission Information Packages (SIPs). This includes the creation of appropriate preservation metadata. Scientists need to be trained to deal with different types of data and to heighten their awareness of quality metadata. Other problems arise during the assembly of SIPs and during ingest into the archive, because file format validators may produce conflicting output for identical data files, and these conflicts are difficult to resolve automatically. Validation and identification tools are also notorious for their poor performance. In the EWIG project, the Zuse Institute Berlin acts as an infrastructure facility, while the Institute for Meteorology at FU Berlin and the German Research Centre for Geosciences (GFZ) act as two different data producers. The aim of the project is to develop workflows for the transfer of research data into digital archives and for the future re-use of data from long-term archives, with an emphasis on data from the geosciences. The technical work is supplemented by interviews with data practitioners at several institutions to identify problems in digital preservation workflows, and by the development of university teaching materials to train students in the curation of research data and metadata. The free and open-source software Archivematica [1] is used as the digital preservation system. The creation and ingest of SIPs has to meet several archival standards and be compatible with the Metadata Encoding and Transmission Standard (METS).
The two data producers use different software in their workflows to test the assembly of SIPs and the ingest of SIPs into the archive. GFZ Potsdam uses a combination of eSciDoc [2], panMetaDocs [3], and bagit [4] to collect research data and assemble SIPs for ingest into Archivematica, while the Institute for Meteorology at FU Berlin evaluates a variety of software solutions to describe data and publications and to generate SIPs.

[1] http://www.archivematica.org
[2] http://www.escidoc.org
[3] http://panmetadocs.sf.net
[4] http://sourceforge.net/projects/loc-xferutils/
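The abstract mentions bagit as one of the tools used to assemble SIPs. As a rough illustration of what packaging a dataset as a BagIt bag involves, the following minimal sketch uses only the Python standard library. It is not the bagit tool referenced in [4] and not the project's actual workflow. The function names make_bag and verify_bag are hypothetical. A real SIP for Archivematica would also need preservation metadata, which this sketch omits.

```python
import hashlib
import shutil
from pathlib import Path


def make_bag(source_dir: Path, bag_dir: Path) -> Path:
    """Copy source_dir into bag_dir/data and write the BagIt tag files.

    Hypothetical helper for illustration; bag_dir/data must not exist yet.
    """
    data_dir = bag_dir / "data"
    shutil.copytree(source_dir, data_dir)  # payload files live under data/

    # Bag declaration as described in the BagIt specification (RFC 8493).
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # Payload manifest: one "<checksum>  <path relative to bag>" line per file.
    lines = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}\n")
    manifest = bag_dir / "manifest-sha256.txt"
    manifest.write_text("".join(lines), encoding="utf-8")
    return manifest


def verify_bag(bag_dir: Path) -> list[str]:
    """Return payload paths whose checksum no longer matches the manifest."""
    mismatches = []
    manifest = bag_dir / "manifest-sha256.txt"
    for line in manifest.read_text(encoding="utf-8").splitlines():
        expected, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != expected:
            mismatches.append(rel_path)
    return mismatches
```

Calling make_bag(Path("dataset"), Path("dataset_bag")) on a hypothetical dataset directory would produce a bag whose fixity can later be rechecked with verify_bag, which is the kind of check an archive can repeat after transfer.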