Metadata for Data: Data Management Plan




What is metadata?

Metadata is structured information describing a resource, such as the dates, title, and creators associated with a dataset.  Metadata needs vary across scientific fields, but a record ideally covers general descriptive information, access and use policies, data characteristics, and archive terms. A metadata record consists of a set of predefined elements that describe specific attributes of a resource.  Here's an example of a record for a dataset. For datasets, metadata can also refer to codebooks.  See this sample codebook, and check out the Codebook Cookbook for more information.
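As a rough illustration of what "a set of predefined elements" can look like in practice, the sketch below models a record as a small Python structure and reports which elements were left empty. The class name, element names, and sample values are hypothetical and chosen only for illustration; they are not drawn from any particular standard.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class DatasetRecord:
    """A hypothetical metadata record with a fixed set of elements."""
    title: str
    creators: List[str]
    date_created: str  # ISO 8601 date, e.g. "2020-01-31"
    keywords: List[str] = field(default_factory=list)
    access_policy: str = ""


def missing_elements(record: DatasetRecord) -> List[str]:
    """Return the names of elements that were left empty."""
    return [name for name, value in asdict(record).items() if not value]


# Made-up sample values for illustration only.
record = DatasetRecord(
    title="Stream temperature readings",
    creators=["Doe, Jane"],
    date_created="2020-01-31",
)
print(missing_elements(record))  # ['keywords', 'access_policy']
```

A check like this is one way to notice incomplete records early, while the information is still easy to recover.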

Why document data?

Metadata is a type of documentation.  Documenting data:

  • Allows you to easily find and reuse your own data
  • Enables you to discover, evaluate, and reuse the data of others
  • Helps others discover, reproduce, reuse, and cite your data
  • Aids digital preservation by documenting files as software and formats evolve over time

Establishing a metadata strategy that sufficiently describes your data and meets your data management needs is an important part of a data management plan. Metadata planning should occur at the beginning of a research project. Doing so makes it easier to automate metadata creation and reduces the need for time-consuming metadata capture or cleanup later.

Metadata standards

There are a number of metadata standards for datasets.  While it is not necessary to have an in-depth understanding of metadata standards, it is important to create metadata that will be interoperable with recognized standards. Some scientific communities have their own widely used standards (e.g., Astronomy Visualization Metadata Standard, Darwin Core, Ecological Metadata Language).  The Digital Curation Centre provides a disciplinary metadata guide that lists metadata standards by discipline.  Some general dataset standards that are discipline agnostic include DataCite, Project Open Data, and DDI.
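One way to think about interoperability is as a crosswalk from the field names you use locally to the property names of a recognized standard. The sketch below is illustrative only: the local field names are hypothetical, the mapping is a simplified assumption rather than an official crosswalk, and the target names are intended to match DataCite Metadata Schema property names, which should be checked against the current schema documentation.

```python
from typing import Dict, List, Tuple

# Hypothetical local field names mapped to DataCite property names.
# This mapping is a simplified assumption, not an official crosswalk.
LOCAL_TO_DATACITE: Dict[str, str] = {
    "title": "Title",
    "creator": "Creator",
    "dates": "Date",
    "funding_agencies": "FundingReference",
    "keywords": "Subject",
    "identifier": "Identifier",
    "coverage": "GeoLocation",
    "access_restrictions": "Rights",
    "file_formats": "Format",
}


def crosswalk(local_record: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Rename mapped fields; collect the names of fields with no mapping."""
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for name, value in local_record.items():
        target = LOCAL_TO_DATACITE.get(name)
        if target is None:
            unmapped.append(name)
        else:
            mapped[target] = value
    return mapped, unmapped


# Made-up sample record for illustration only.
mapped, unmapped = crosswalk({
    "title": "Stream temperature readings",
    "keywords": "hydrology; temperature",
    "lab_notebook_page": "42",
})
print(mapped)
print(unmapped)
```

Fields that end up in the unmapped list are candidates for accompanying documentation or for a discipline-specific standard.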


The best way to ensure your dataset will be discovered is to publish it to the web with good metadata.  One easy way to do that is to deposit your data into a repository.  The repository you choose may depend on your funder, publisher, or discipline.  For guidance in choosing a repository, contact the NCSU Libraries.

If you don’t deposit your data in a repository, you can still submit metadata about your datasets to a registry or catalog, for example DataCite.

Wherever you publish your data, use persistent and unique identifiers whenever possible.  Datasets should have identifiers like a DOI to create linkages to and from publications and other datasets.  Researchers should have an ORCID iD to connect them to publications, datasets, and other researchers.  Funders and grants may also have identifiers to collocate related research.
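Identifiers are easy to mistype when they are entered by hand. As one example of a check that can be automated, the final character of an ORCID iD is a check character computed with the ISO 7064 MOD 11-2 algorithm. The sketch below assumes the iD is given in the usual 16-character form, with or without hyphens; the function names are hypothetical. The check only catches typing errors and does not confirm that the iD is registered to anyone.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def looks_like_valid_orcid(orcid: str) -> bool:
    """Return True if the string has the right shape and check character."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16 or not compact.isascii():
        return False
    if not compact[:15].isdigit():
        return False
    return orcid_check_character(compact[:15]) == compact[15]


# 0000-0002-1825-0097 is a commonly cited sample iD.
print(looks_like_valid_orcid("0000-0002-1825-0097"))  # True
print(looks_like_valid_orcid("0000-0002-1825-0096"))  # False
```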

Best practices

The NCSU Libraries suggests the following set of standard metadata elements that should be captured to describe the content of your data resources as well as the nature of the files.

| Category | Field | Description |
|---|---|---|
| General information | Title | Name of the collection of datasets or of the project that produced them |
| General information | Creator | Names and institutions of the people who created the data |
| General information | Dates | Key dates associated with the data, such as the date span covered by the data or date of creation |
| General information | Funding agencies/period | Organizations or agencies who funded the research and the periods of funding |
| General information | Keywords | Keywords or phrases describing the subject or content of the data |
| General information | Identifier | Unique number or alphanumeric string used to identify the data |
| General information | Coverage (if applicable) | Geographic coverage of the dataset |
| Access information | Access restrictions | Where and how your data can be accessed by other researchers |
| Technical details | File formats | Stata, Excel, tab delimited text, TIFF images, WAV audio, etc. |
| Technical details | List of files | |
| Technical details | Count of files | |

At a minimum, metadata records should be kept in a fielded form, such as a spreadsheet, CSV file, or tab-delimited file. Auxiliary information necessary to interpret the metadata - such as explanations of codes, abbreviations, or algorithms used - should be included as accompanying documentation.
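A minimal sketch of a fielded record kept as CSV, using Python's standard csv module. The column names are loosely based on the elements suggested above; the exact names, the sample values, and the in-memory buffer are assumptions made for illustration. In practice you would write to a file stored alongside the data.

```python
import csv
import io

# Column names based on the suggested elements; adapt them to your project.
FIELDS = ["title", "creator", "dates", "funding", "keywords",
          "identifier", "coverage", "access_restrictions",
          "file_formats", "file_list", "file_count"]

# Made-up sample values for illustration only.
record = {
    "title": "Stream temperature readings",
    "creator": "Doe, Jane (Example University)",
    "dates": "2019-06-01/2019-08-31",
    "keywords": "hydrology; temperature",
    "file_formats": "CSV",
    "file_list": "site_a.csv; site_b.csv",
    "file_count": 2,
}

buffer = io.StringIO()  # stands in for open("metadata.csv", "w", newline="")
writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="")
writer.writeheader()
writer.writerow(record)  # elements not supplied are written as empty cells

# Read the record back to confirm it round-trips by field name.
buffer.seek(0)
rows = list(csv.DictReader(buffer))
print(rows[0]["title"], rows[0]["file_count"])
```

Note that values come back as text, so a count written as the number 2 is read back as the string "2". Codes or abbreviations used in the cells would still need to be explained in the accompanying documentation.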

Who can you contact if you need help or have questions?

Please contact us for assistance identifying the metadata your project needs to support both data management and search and discovery.