
Data Management @ NAU

Ask yourself:

What type(s) of data will be produced?

Thinking about the type of data and the average file size can help you plan for your data organization and storage needs (for example: image data usually require a great deal of storage space).

How much data will you collect, and at what growth rate?

Assess the potential growth rate of data for your project and plan for this growth. For example, are you gathering data by hand or using sophisticated instrumentation that is able to capture a lot of data at once? Will larger amounts of data be collected as the project progresses?

Remember that you can request funding for data management when you write your grant -- so request enough funding to pay for data storage for the life of the project and beyond, if necessary.
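One rough way to put numbers on this is to project cumulative storage month by month. The sketch below is only an illustration: the function name, the starting volume of 50 GB per month, the 5% monthly growth rate, and the 36-month project length are all assumed values, not recommendations. Substitute figures from your own instruments and collection schedule.

```python
def project_storage_gb(initial_gb_per_month, monthly_growth_rate, months):
    """Return cumulative storage (in GB) at the end of each month.

    Assumes the amount collected each month grows by a fixed
    percentage (compound growth). Real projects may grow in steps,
    so treat the result as a planning estimate only.
    """
    totals = []
    total = 0.0
    monthly = initial_gb_per_month
    for _ in range(months):
        total += monthly                    # add this month's new data
        totals.append(total)
        monthly *= 1 + monthly_growth_rate  # next month collects more
    return totals


# Hypothetical project: 50 GB in the first month, growing 5% per month,
# over a three-year award period.
totals = project_storage_gb(initial_gb_per_month=50,
                            monthly_growth_rate=0.05,
                            months=36)
print(f"Estimated storage at end of project: {totals[-1]:.0f} GB")
```

Comparing the final figure with the first-year figure shows how quickly needs can outpace an estimate based only on early collection, which is useful when budgeting storage costs in a grant proposal.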

Will the data change frequently?

If your dataset undergoes frequent changes, you should implement a "versioning system" to keep track of which file contains the latest iteration of your data, and to help remember how various versions differ.

  • To distinguish between versions:
    • Implement a systematic naming convention for your files.
    • Decide in advance how many versions should be kept -- identify and keep versions that qualify as "milestones" for your dataset.
    • Keep the milestone versions and your current/master version in one location, in order to track how many versions exist at any one point in time.
  • Google Docs includes simple version control; for more extensive version control, try a program such as Subversion.
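As an illustration of a systematic naming convention, the sketch below builds file names of the form stem_YYYYMMDD_vNN.ext and picks out the highest version from a list of names. The pattern itself, the helper names, and the example file names are assumptions made for this example; any convention works as long as it is applied consistently.

```python
import re
from datetime import date

# Assumed convention: <stem>_<YYYYMMDD>_v<NN>.<ext>
VERSION_PATTERN = re.compile(
    r"^(?P<stem>.+)_(?P<date>\d{8})_v(?P<version>\d{2,})\.(?P<ext>\w+)$"
)


def versioned_name(stem, version, ext, on=None):
    """Build a file name that follows the assumed convention."""
    on = on or date.today()
    return f"{stem}_{on:%Y%m%d}_v{version:02d}.{ext}"


def latest_version(filenames):
    """Return the name with the highest version number, or None.

    Names that do not follow the convention are ignored.
    """
    best_name, best_version = None, -1
    for name in filenames:
        match = VERSION_PATTERN.match(name)
        if match and int(match.group("version")) > best_version:
            best_name, best_version = name, int(match.group("version"))
    return best_name


# Hypothetical file names for a single dataset.
files = [
    "soil_samples_20240115_v03.csv",
    "soil_samples_20240301_v10.csv",
    "notes.txt",
]
print(versioned_name("soil_samples", 11, "csv", on=date(2024, 4, 2)))
print(latest_version(files))
```

Zero-padding the version number and writing the date as year-month-day keeps files in the correct order when a folder is sorted by name.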

For more information about version control, please see the UK Data Archive's Version Control & Authenticity.

How can you structure data and name files to facilitate both your own analysis and future data sharing?

If you know in advance what statistical tests or software you want to use to analyze your data, this might affect how you decide to structure the data you record.

Similarly, if you choose a data repository for data sharing before you start the project, you can structure the data in a format accepted by that repository. This lets you share data after the project without spending time reformatting or restructuring it.

How long should your data be retained? (e.g. 3-5 years, 10-20 years, permanently)

Not all data need to be retained indefinitely. Figure out what's important to keep, and make sure your plan for those datasets is solid and complies with funder or journal requirements.


Recommendations are based on the MIT Libraries Data Management Guide about evaluating your data needs.

Need funding for data management?

When creating your data management plan, be sure to include cost estimates for data sharing.

Both the NIH and the NSF allow you to include line-item requests for resources to support your data management plans.

*NIH: see "Funds for data sharing"

*NSF: see "Other direct costs -- Publication/Documentation/Dissemination" (Grant Proposal Guide, Chapter II.C.g(vi)(b))

Need to submit supplemental data with a publication?

Joint Data Archiving Policy (JDAP)

Some journals now require authors to archive the data required to support the claims made in their publication. In order to coordinate their requirements, many of these journals have adopted the Joint Data Archiving Policy (JDAP).

Check the journal's official website to see if the journal requires you to make supporting data publicly available.

Where to archive data?

Individual journals vary in their deposit requirements and may specify a particular repository or set requirements for archived data (e.g., a DOI).

NAU's institutional repository, OpenKnowledge@NAU, can be used to archive small data sets; however, we do not currently provide DOIs or other permanent identifiers, which some publishers and funders require. Check their requirements before using OpenKnowledge@NAU.

You can find other data repositories through the Registry of Research Data Repositories or by looking at your publisher's/funder's recommended repository lists.