Thinking about the type of data and the average file size can help you plan for your data organization and storage needs (for example: image data usually require a great deal of storage space).
How much data will you collect, and at what growth rate?
Assess the potential growth rate of data for your project and plan for this growth. For example, are you gathering data by hand or using sophisticated instrumentation that is able to capture a lot of data at once? Will larger amounts of data be collected as the project progresses?
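As a rough illustration of planning for growth, the sketch below projects cumulative storage month by month. The function name, its parameters, and the sample figures are all hypothetical; substitute estimates from your own project.

```python
def project_storage_gb(initial_gb, first_month_gb, monthly_growth_rate, months):
    """Return the cumulative storage (in GB) at the end of each month.

    first_month_gb is the volume collected in month 1; each later month
    collects (1 + monthly_growth_rate) times the previous month's volume.
    All inputs are illustrative assumptions, not measured values.
    """
    totals = []
    total = initial_gb
    new_data = first_month_gb
    for _ in range(months):
        total += new_data
        totals.append(total)
        # Collection volume grows (or stays flat if the rate is 0).
        new_data *= 1 + monthly_growth_rate
    return totals


if __name__ == "__main__":
    # Hypothetical project: 50 GB on hand, 10 GB collected in the first
    # month, collection growing 5% per month, over a 3-year grant.
    projection = project_storage_gb(50, 10, 0.05, 36)
    print(f"Estimated storage after 36 months: {projection[-1]:.0f} GB")
```

Comparing a flat rate (`monthly_growth_rate=0`) with a growing one shows how quickly instrument-driven collection can outpace an estimate based only on the first few months.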
Remember that you can request funding for data management when you write your grant -- so request enough funding to pay for data storage for the life of the project and beyond, if necessary.
Will the data change frequently?
If your dataset undergoes frequent changes, you should implement a "versioning system" to keep track of which file contains the latest iteration of your data, and to help you remember how the various versions differ.
To distinguish between versions:
- Implement a systematic naming convention for your files.
- Decide in advance how many versions should be kept -- identify and keep versions that qualify as "milestones" for your dataset.
- Keep the milestone versions and your current/master version in one location, in order to track how many versions exist at any one point in time.
- Google Docs includes simple version control; for more extensive version control, try a program such as Subversion.
For more information about version control, please see the UK Data Archive's Version Control & Authenticity.
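A systematic naming convention like the one recommended above can be sketched in a few lines. The pattern used here (`stem_YYYYMMDD_vNN.ext`) and the helper names are assumptions chosen for illustration, not a prescribed standard; any convention works as long as it is applied consistently.

```python
import re
from datetime import date

# Hypothetical convention: <stem>_<YYYYMMDD>_v<NN><.ext>
VERSION_PATTERN = re.compile(
    r"^(?P<stem>.+)_(?P<date>\d{8})_v(?P<version>\d{2,})(?P<ext>\.\w+)$"
)


def versioned_name(stem, version, ext, on=None):
    """Build a filename that follows the assumed convention."""
    on = on or date.today()
    return f"{stem}_{on:%Y%m%d}_v{version:02d}{ext}"


def latest_version(filenames):
    """Return the filename with the highest version number, or None.

    Names that do not follow the convention are ignored.
    """
    best, best_version = None, -1
    for name in filenames:
        match = VERSION_PATTERN.match(name)
        if match and int(match.group("version")) > best_version:
            best, best_version = name, int(match.group("version"))
    return best


if __name__ == "__main__":
    files = [
        versioned_name("survey", 1, ".csv", date(2024, 1, 15)),
        versioned_name("survey", 2, ".csv", date(2024, 3, 1)),
        "notes.txt",
    ]
    print(latest_version(files))
```

Zero-padding the version number keeps files in order when sorted alphabetically, and putting the date in `YYYYMMDD` form does the same for dates.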
How can you structure data and name files to facilitate both your own analysis and future data sharing?
If you know in advance what statistical tests or software you want to use to analyze your data, this might affect how you decide to structure the data you record.
Similarly, if you choose a data repository for data sharing before you start the project, you can structure the data in a format the repository accepts. This will let you share data easily after the project ends, without spending time reformatting or restructuring it.
How long should your data be retained? (e.g. 3-5 years, 10-20 years, permanently)
Not all data needs to be retained indefinitely. Figure out what's important to keep, and make sure your plan for those datasets is solid and complies with funder or journal requirements.
Recommendations are based on the MIT Libraries Data Management Guide about evaluating your data needs.