Describing your Data

What is metadata?

Metadata is structured information that describes a resource, such as the dates, title, and creators associated with a dataset. Metadata needs vary across scientific fields, but ideally cover general descriptive information, access and use policies, data characteristics, and preservation plans. A metadata record consists of a set of predefined elements, each describing a specific attribute of a resource. For datasets, metadata can also include codebooks and readme files. See this sample codebook, and check out the Codebook Cookbook for more information. Cornell has a comprehensive web guide on how to create readme files.
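As a rough illustration of a metadata record built from predefined elements, the Python sketch below stores one record as a dictionary and reports which required elements are still empty. The element names, the required set, and the sample dataset are all hypothetical and chosen only for this example; a real project would take its elements from the metadata standard or repository it uses.

```python
# Hypothetical element set for illustration only; a real project would use
# the elements defined by its chosen metadata standard or repository.
REQUIRED_ELEMENTS = ("title", "creators", "date_created", "description", "license")


def missing_elements(record, required=REQUIRED_ELEMENTS):
    """Return the required elements that are absent or empty in the record."""
    return [name for name in required if not record.get(name)]


# A made-up record describing a made-up dataset.
record = {
    "title": "Example stream temperature readings",
    "creators": ["A. Researcher"],
    "date_created": "2020-01-15",
    "description": "Hourly water temperature readings from a single sensor.",
    "license": "",  # left blank on purpose to show the check
}

print(missing_elements(record))  # prints ['license']
```

A check like this could run whenever a dataset is saved or shared, so that gaps in the description are noticed early rather than at deposit time.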

Why describe your data?


  • Allows you to easily find and reuse your own data
  • Enables you to discover, evaluate, and reuse the data of others
  • Helps others discover, reproduce, reuse, and cite your data
  • Aids you in updating file and software formats to ensure that they are accessible in the future

Establishing a metadata workflow that sufficiently describes your data is an important part of data management. Metadata planning should occur at the beginning of a research project. 

Metadata standards

There are a number of metadata standards for datasets. Some scientific communities have their own widely used standards (e.g., Astronomy Visualization Metadata Standard, Darwin Core, Ecological Metadata Language), and some data repositories have their own metadata guidelines and requirements. The Digital Curation Centre provides a guide that lists metadata standards by discipline. If a widely used metadata standard does not exist in your field, or if you do not know whether one exists, please contact the Libraries’ data management team.


The best way to ensure your dataset will be discovered is to deposit it in a repository that is findable and searchable on the web. The repository you choose may depend on your funder, publisher, or discipline. For guidance in choosing a repository, start with our data repository support or contact us.

If you don’t deposit your data in a repository, you can still submit metadata about your datasets to a registry or catalog, for example DataCite.

Your dataset should be assigned a DOI wherever it is published. Researchers should have an ORCID iD to connect them to publications, datasets, and other researchers. Funders and grants may also have identifiers that help collocate related research.
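Because identifiers are easy to mistype when entered into a metadata record, a basic sanity check can help. The sketch below is a minimal example, not an official validator: it verifies the final check character of an ORCID iD using the ISO 7064 MOD 11-2 calculation that ORCID documents, and tests whether a string has the general shape of a DOI. The DOI pattern is a deliberately loose assumption, the sample DOI is made up, and neither check confirms that an identifier is actually registered.

```python
import re

# Loose, assumed pattern for the general shape of a DOI ("10.", a numeric
# prefix, a slash, then a suffix). It does not confirm the DOI resolves.
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")


def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Check the format and check character of an ORCID iD such as 0000-0002-1825-0097."""
    if not re.fullmatch(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


def looks_like_doi(doi):
    """Return True if the string has the general shape of a DOI."""
    return bool(DOI_SHAPE.match(doi))


print(is_valid_orcid("0000-0002-1825-0097"))  # ORCID's documented sample iD
print(is_valid_orcid("0000-0002-1825-0098"))  # same iD with a wrong final digit
print(looks_like_doi("10.1234/example.dataset.1"))  # made-up DOI
```

The first call should report a valid iD and the second an invalid one, since only the final check character differs.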

Who can you contact if you need help or have questions?

Please contact us for help identifying the metadata your project needs to support data management, search, and discovery.