Think about how you or scientists in related fields might use your data in the future.
Are the data reproducible?
If the data are reproducible, how much time, money, and effort would be required to reproduce the data?
For example:
- Ecological and environmental field data collected at a given time and place are often considered to be unreproducible -- so consider whether future ecologists might want to use your data as a historical comparison to present conditions.
- Modeling data, by comparison, are often considered to be reproducible -- if you save the model and the input data, then the modeling results can theoretically be recreated at any time (unless the model depends on software that becomes obsolete). However, if running the model required a great deal of computing power, then you would want to compare the cost of storing the results with the cost of re-computing them in the future.
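The storage-versus-recomputation comparison for modeling results can be roughed out with simple arithmetic. The sketch below is only an illustration: the function name and every figure in it (dataset size, storage price, retention period, recomputation cost) are hypothetical, so substitute the numbers that apply to your own project.

```python
def cheaper_to_store(size_gb, storage_cost_per_gb_year, years, recompute_cost):
    """Return True if storing the results costs less than re-computing them once."""
    storage_total = size_gb * storage_cost_per_gb_year * years
    return storage_total < recompute_cost


# Hypothetical figures, for illustration only:
# 500 GB of model output, 0.25 per GB per year, kept for 10 years,
# versus a one-time recomputation cost of 2000 (same currency).
print(cheaper_to_store(500, 0.25, 10, 2000))
```

A comparison like this leaves out costs that are harder to quantify, such as staff time or the risk that the model software becomes obsolete, so treat the result as a starting point rather than a decision.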
If you’ve deposited your data in a repository:
The repository should have a retention policy in place stating how long it will preserve the data (you might see a policy stating that a dataset will be "reappraised" after a certain number of years to determine whether the data are still of interest and/or accessible to other researchers).
If you want to preserve your data for a number of years but do not wish to deposit the data in a repository:
You'll need to periodically test the data to be sure they are still accessible.
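One way to make this periodic test routine is to record a checksum for each file when you first store the data, then re-compute the checksums later and compare. The sketch below uses Python's standard `hashlib` module; the function names and the idea of keeping a "manifest" dictionary are assumptions made for illustration, not a prescribed procedure.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to the folder) to its checksum."""
    folder = Path(folder)
    return {
        str(p.relative_to(folder)): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def find_problems(folder, manifest):
    """List files that are missing, unreadable, or changed since the manifest was built."""
    problems = []
    for name, expected in manifest.items():
        try:
            if sha256_of(Path(folder) / name) != expected:
                problems.append((name, "changed"))
        except OSError:
            problems.append((name, "missing or unreadable"))
    return problems
```

A matching checksum shows only that a file can still be read and has not changed. It does not show that current software can still open the format, so it is worth opening a sample of files as well.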
Even if your file formats were open and accessible when you created them, as technologies change the data might need to be migrated to a new format to remain accessible. Before you migrate data to a new file format, always consider what data could be lost along the way, and how you can ensure that your documentation/metadata remains associated with the new file.
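As an illustration of checking a migration, the sketch below converts a small CSV table to JSON, keeps the metadata inside the new file, and then confirms that every row and column survived. The choice of CSV and JSON, the function names, and the sample data are all hypothetical; your own formats and checks will differ.

```python
import csv
import io
import json


def migrate_csv_to_json(csv_text, metadata):
    """Convert CSV text to a JSON document that embeds the metadata with the records."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps({"metadata": metadata, "records": rows}, indent=2)


def migration_is_lossless(csv_text, json_text):
    """Check that every row and column of the original appears in the migrated copy."""
    original = list(csv.DictReader(io.StringIO(csv_text)))
    migrated = json.loads(json_text)["records"]
    return original == migrated


# Hypothetical sample data, for illustration only.
csv_text = "site,temp_c\nA,12.5\nB,13.1\n"
json_text = migrate_csv_to_json(csv_text, {"units": "degrees Celsius"})
print(migration_is_lossless(csv_text, json_text))
```

This check compares values as text, so it would not catch a loss of numeric precision introduced by a conversion that re-interprets the numbers. Decide which properties matter for your data and test for those explicitly.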
Questions? Contact us!