WP8: Data archiving and distribution
February 12, 2010
This work package was assigned the task of designing a data-handling system capable of coping with the very large data volumes expected from EISCAT_3D. It included both short-term storage and long-term archiving, as well as the consideration of data visualisation and user interactions with the various archiving systems.
Some of the recommendations that originate from this Work Package are:
- A mix of storage technologies should be utilised to meet the proposed design goals of the archive and distribution system.
- There should be no encryption in the data system. The data are public, and the overhead of encryption is costly in resources. Security will be provided through access control at the session level.
- Further investigation into the power requirements of a MAID-like (Massive Array of Idle Disks) system compared to conventional disk technology is needed. However, we recommend MAID as it matches the survey-use mode.
- The EISCAT_3D staff responsible for the e-infrastructure should become more involved in discussions with industry, as network and infrastructure performance is critical to the data distribution design and the selection of hardware.
- A test or prototype system should be purchased for the remote sites, and a test deployment should be made over winter.
- We were impressed by Union Solution and DataDirect Networks and recommend that they be encouraged to tender for any contracts arising from this Work Package.
- At least one vendor with a specialisation in telecoms equipment should be included in any further research or tenders in data centre construction or integration.
- The remote site should have a hardware solution that allows all channels to be handled simultaneously.
- Increased communication between the Work Packages in the EISCAT_3D Design Study is needed, through joint meetings whose outcome would be the initial draft of interface control documentation. This closer collaboration between the Work Packages should be encouraged by the EISCAT_3D project management.
- A contract should be awarded to a single oversight organisation for the EISCAT_3D data centre and data handling system, so that there are no “responsibility gaps”, which can result in extended periods of downtime.
- Analysis of the reviews and investigations made by digital curation groups should be performed and their recommendations for data storage policy should be acted upon. EISCAT_3D should collaborate with these projects so that accepted standards are used, allowing EISCAT_3D to integrate with the general scientific archive community.
- Additional funding should be allocated to pursue the formulation of a data curation policy that mitigates potential drains on archive performance and data reliability.
- No factory tests by vendors should be done. These are contrived and are a waste of time for both the vendor and for us.
- As the acquisition, analysis and archiving of data constitute such a large part of the final EISCAT_3D project, we recommend the provision of a data system prototyping budget to match the staff effort and capital costs devoted to the Demonstrator Array.
The work in this Work Package was performed by the Rutherford Appleton Laboratory and the University of Tromsø.
Data archiving and distribution in the context of the entire EISCAT_3D project (September 2008) (This is a large file [9.91 MB], so it could not fit in the list of attachments. It is available here instead.)